Re: Literal syntax
Dear Till,
> may I remind you that there is a proposal for the concrete syntax
> of literals (numbers, string, lists, ...) in CASL, available under
>
>
http://www.informatik.uni-bremen.de/~cofi/LiteralExtension/literal.[tex,dvi,ps,ps.gz]
The title, date, and label of the above document are the same as for
the original Summary-Changes document, which is a bit confusing.
The full text of the proposal for literals is reproduced below, to
facilitate citing parts of it in any further e-mail discussion.
> The deadline for comments is (as for the list of changes to the summary)
>
> Monday, July the 5th
>
> So far we have not received any comments, but we would like
> to have a discussion on literal syntax: it is not likely
> that our proposal cannot be improved.
Sorry for the lateness of this response...
In summary: I'm sympathetic towards most of the proposed syntax for
literals themselves, but I'd much prefer to use annotations (rather
than attributes and new forms of datatype declarations) for linking
the literal syntax to declarations of operations.
In more detail (my comments are delimited by solid lines like this):
________________________________________________________________________
CASL
The Common Algebraic Specification Language
Summary
-- DRAFT LIST OF CHANGES: LITERALS --
by Till Mossakowski
May 1999
Changes
App. C.2.1, p. C-4 and p. C-6
Add:
TERM ::= ... | LITERAL
TOKEN ::= ... | DIGIT | ""
NUMBER ::= DIGIT | NNUMBER
App. C.4
Replace:
A WORD consists of a sequence of upper- and/or lower- case letters
(A,...,Z and all the ISO Latin-1 national and accented letters
except for the Icelandic `eth' and `thorn'), digits (0,...,9), and
primes (').
by:
A WORD consists of a sequence of upper- and/or lower- case letters
(A,...,Z and all the ISO Latin-1 national and accented letters
except for the Icelandic `eth' and `thorn'), digits (0,...,9), and
primes ('). It may not entirely consist of digits, nor may it
contain an `E' preceding or following a digit. This special syntax
allows the binary operation E to be used within floating point numbers
without the need to insert spaces between the E and the digits.
________________________________________________________________________
I agree that it would be nice to allow standard notation like 2.7E6 to
be written without spaces. But I don't see that this requires banning
the use of `E' following or preceding a digit in all words. My
suggestion is to replace the last two sentences above by:
It may not start with a digit (various sequences of characters
starting with a digit are reserved for literal numbers).
This should simplify the grammar for WORD below considerably. Perhaps
NNUMBER can then be eliminated altogether too, with single digits
being reserved for literal numbers?
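To make the difference between the two restrictions concrete, here is a minimal Python sketch of both rules as they are worded in prose. It is an illustration only: the function names are hypothetical, only ASCII letters are handled (the ISO Latin-1 letters are left out), and the first function follows the prose of the proposal rather than its grammar.

```python
import re

# Assumption: ASCII letters, digits and primes only; Latin-1 letters omitted.
WORD_CHARS = re.compile(r"[A-Za-z0-9']+")

def is_word_proposal(s: str) -> bool:
    """Prose rule of the proposal: not all digits, no `E' next to a digit."""
    if not WORD_CHARS.fullmatch(s):
        return False
    if all(c in "0123456789" for c in s):
        return False
    return re.search(r"E[0-9]|[0-9]E", s) is None

def is_word_suggested(s: str) -> bool:
    """Suggested rule: a word may not start with a digit."""
    return bool(WORD_CHARS.fullmatch(s)) and s[0] not in "0123456789"
```

Under these assumptions, `E1` is rejected by the first rule but accepted by the second, while `2x` is accepted by the first and rejected by the second.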
One may then add something like:
FLOATING ::= NUMBER . NUMBER 'E' OPT-SIGN NUMBER
OPT-SIGN ::= + | - |
Use of this notation would require that an operation for E has been
declared. (Without the above addition of FLOATING, 2.7E6 would have
to be written as `2.7E 6' or `2.7E+6'.)
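As a rough check of the FLOATING production above, the following Python sketch recognizes it with a regular expression and splits a literal into its parts. The function name `split_floating` is hypothetical, and DIGIT is assumed to mean 0,...,9.

```python
import re

# FLOATING ::= NUMBER . NUMBER 'E' OPT-SIGN NUMBER
# OPT-SIGN ::= + | - | (empty)
FLOATING = re.compile(r"([0-9]+)\.([0-9]+)E([+-]?)([0-9]+)")

def split_floating(s: str):
    """Return (integer part, fraction part, sign, exponent), or None."""
    m = FLOATING.fullmatch(s)
    return m.groups() if m else None
```

With this sketch, `2.7E6` and `2.7E+6` are both accepted as single tokens, whereas `2.7E 6` (with a space) is not.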
It might be good to cater also for binary, octal, and hexadecimal
notation for numbers. Isn't there a rather standard notation for
indicating the base, by prefixing a sequence of digits by `2#', `8#',
or `16#'? It should be easy to add this (now or in some later
extension), provided that words are not allowed to start with a digit;
otherwise, one might need to remove further patterns of letters and
digits from WORD...
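A minimal Python sketch of how such base-prefixed numbers might be recognized is given below. The exact notation is an assumption (a base of 2, 8, or 16, then `#', then digits, with letters A to F allowed as hexadecimal digits), and the name `based_value` is hypothetical.

```python
import re

# Assumed form: BASE '#' DIGITS, where BASE is 2, 8 or 16.
BASED = re.compile(r"(2|8|16)#([0-9A-Fa-f]+)")

def based_value(s: str):
    """Return the integer denoted by a based literal, or None if ill-formed."""
    m = BASED.fullmatch(s)
    if m is None:
        return None
    base, digits = int(m.group(1)), m.group(2)
    try:
        return int(digits, base)  # rejects digits that are too large for the base
    except ValueError:
        return None
```

Since every such literal starts with a digit, it cannot clash with WORD once words are forbidden to start with a digit.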
________________________________________________________________________
This is specified by the grammar:
WORD ::= E NON-DIGIT-WORD
| NON-E-WORD
NON-E-WORD ::= NON-DIGIT-E-WORD
| DIGIT NON-E-WORD
NON-DIGIT-E-WORD ::= NON-E-LETTER NNUMBER
| NON-E-LETTER DIGIT
| NON-E-LETTER WORD
| NON-E-LETTER
NON-DIGIT-WORD ::= NON-DIGIT-E-WORD
| E NON-DIGIT-WORD
Furthermore, add:
FRACTION ::= NUMBER . NUMBER
NNUMBER ::= DIGIT DIGIT | DIGIT NNUMBER
NUMBER ::= DIGIT | NNUMBER
CHAR ::= " " | ! | '\"' | # | $ | ... | '\'' | ... | '\\' | ...
| \n | \t | \r | ...
| \000 ... | \255
QUOTED-CHAR ::= ' CHAR '
STRING ::= '"' CLOSE-STRING
CLOSE-STRING ::= CHAR '"'
| CHAR CLOSE-STRING
Only WORD, FRACTION, NNUMBER, QUOTED-CHAR, STRING are the
non-terminals of the lexical syntax. The other non-terminals are
just auxiliary.
(N.B. There is a difficulty in using double quotation marks in the
grammar: sometimes, they have to stand for themselves. In this
case, I enclosed them in single quotes. Till)
The following section is meant to be added as a separate section to
appendix C.
SYNTAX FOR LITERALS
In this section, several attributes for operations are introduced that can
be used to provide a literal syntax for numbers, strings and lists.
Literal syntax for numbers
LITERAL ::= NNUMBER
________________________________________________________________________
Each single digit could be reserved too - or is it important to allow
the use of the constants `0' and `1' in non-numerical specifications?
________________________________________________________________________
OP-ATTR ::= ... | concatdigits
The attribute for declaring an operation to be used for concatenation of
digits within a number is written `concatdigits'. Only one binary operation
f within a specification SPEC is allowed to have this attribute, otherwise,
the specification is ill-formed.
The attribute has the effect that an NNUMBER of form d1 ...dn (where n>1 and
each di is a DIGIT) is translated to the (abstract syntax of) the term
f(f(...f(t_1,t_2) ...,t_n-1),t_n) where t_i is the abstract syntax tree for
di.
Vice versa, an abstract syntax tree corresponding to a term of the above
form which is maximal (i.e., it is not a subterm of a larger term of the
same form) is expected to be printed as d1 ...dn.
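The translation described above is a left-nested fold over the digits. A minimal Python sketch follows, rendering terms as strings; the function name `translate_nnumber` is hypothetical, and `f` stands for whichever operation carries the `concatdigits' attribute.

```python
from functools import reduce

def translate_nnumber(digits: str, f: str = "f") -> str:
    """Translate d1...dn (n>1) to f(f(...f(d1,d2)...,dn-1),dn)."""
    if len(digits) < 2 or any(c not in "0123456789" for c in digits):
        raise ValueError("an NNUMBER consists of at least two digits")
    # Left fold: the term built so far becomes the first argument of f.
    return reduce(lambda acc, d: f"{f}({acc},{d})", digits[1:], digits[0])
```

For example, `translate_nnumber("123")` yields `f(f(1,2),3)`.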
________________________________________________________________________
It seems that the proposed attribute cannot be merely part of concrete
syntax, but must be added to the abstract syntax of CASL too. Thus
corresponding changes would be needed in Apps. A and B, and in the
text of the body of the summary. This should perhaps have been stated
explicitly in the proposal.
An alternative might be to regard the proposed attribute as an
ANNOTATION. Parsers and formatters could take account of it -
complaining if they found more than one such annotation - and there
would be no need to let it affect well-formedness: the abstract syntax
tree produced by the above translation would determine the entire
semantics of the specification.
If there are vital reasons for preferring attributes to annotations
here, kindly explain them.
Similar objections apply to the other parts of the proposal below.
________________________________________________________________________
If there is no operation with a `concatdigits' attribute, then an NNUMBER is
not recognized as a well-formed LITERAL.
LITERAL ::= ... | FRACTION
OP-ATTR ::= ... | decimalpoint
The attribute for declaring an operation to be used for evaluating the
decimal point within a sequence of digits within a number is written
`decimalpoint'. Only one binary operation f within a specification SPEC is
allowed to have this attribute, otherwise, the specification is ill-formed.
The attribute has the effect that a FRACTION of form n1.n2 (where each ni is
a NUMBER) is translated to the (abstract syntax of) the term f(t_1,t_2)
where t_i is the abstract syntax tree for ni, i=1,2.
Vice versa, an abstract syntax tree corresponding to a term of the above
form which is maximal (i.e., it is not a subterm of a larger term of the
same form) is expected to be printed as n1.n2.
If there is no operation with a `decimalpoint' attribute, then a FRACTION is
not recognized as a well-formed LITERAL.
Due to the special status of the character E in the lexical syntax of WORDS,
it is possible without introducing further syntax to write exponentiations of
form FRACTION E NUMBER in the case that there is a binary infix operation E
in the signature.
________________________________________________________________________
The point above is that you want to avoid the need for spaces
around the E.
If E is also used as a variable (as in E=mc^2 :-), the declaration of
E as an infix operation may give ambiguities in the presence of an
invisible `__ __' operation. Since many users may want to import the
basic datatype for numbers, one should be careful to avoid making
assumptions about their general style of notation.
If there is no special status of E in the lexical syntax for words,
one could identify the binary operation to use to interpret E in
FLOATING (see above) in the same way as for the dot in FRACTION.
This would give extra uniformity to the proposal.
________________________________________________________________________
Literal syntax for strings
LITERAL ::= ... | STRING
OP-ATTR ::= ... | concatchars
The attribute for declaring an operation to be used for concatenation of
characters within a string is written `concatchars'. Only one binary operation f
within a specification SPEC is allowed to have this attribute, otherwise,
the specification is ill-formed.
The attribute has the effect that a STRING of form "c1 ...cn" (where n>1
and each ci is a CHAR) is translated to the (abstract syntax of) the term
f(t_1,f(t_2, ...f(t_n,t)...)) where ti is the abstract syntax tree for ci,
and t is the abstract syntax tree for "".
________________________________________________________________________
Is "" supposed to be declared as a constant? It would seem more
uniform to let the declared constant have an arbitrary name, and
identify it in the same way as the concatenation operation.
________________________________________________________________________
Vice versa, an abstract syntax tree corresponding to a term of the above
form which is maximal (i.e., it is not a subterm of a larger term of the
same form) is expected to be printed as "c1 ...cn".
If there is no operation with a `concatchars' attribute, then a STRING is
not recognized as a well-formed LITERAL.
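In contrast to numbers, the string translation nests to the right and ends in the term for the empty string. A minimal Python sketch follows, rendering terms as strings; the name `translate_string` is hypothetical, `f` stands for the operation with the `concatchars' attribute, and escape sequences are ignored. The proposal states the rule for n>1; applying the same fold to shorter strings is an assumption of the sketch.

```python
from functools import reduce

def translate_string(chars: str, f: str = "f", empty: str = '""') -> str:
    """Translate "c1...cn" to f('c1',f('c2',...f('cn',"")...))."""
    # Right fold: start from the empty string and wrap from the last character.
    return reduce(lambda acc, c: f"{f}('{c}',{acc})", reversed(chars), empty)
```

For example, `translate_string("ab")` yields `f('a',f('b',""))`.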
Literal syntax for lists
DATATYPE-DECL ::= SORT "::=" ALTERNATIVE "|"..."|"
ALTERNATIVE , LIST-BRACKETS
LIST-BRACKETS ::= brackets SIGNS-BRACKETS PLACE SIGNS-BRACKETS
The attribute for declaring a list-like syntax for a datatype is written
`brackets b1 __ b2'. The enclosing datatype declaration declaring the sort
s must consist of exactly two ALTERNATIVEs: the first has to be a constant
c of arbitrary argument type, while the second is a constructor f with
exactly two argument sorts, the second of which has to be s.
________________________________________________________________________
Although the above proposal allows the introduction of neat notation
for literals of various sorts of collections (e.g. {e1,...,en} for
sets, as well as [e1,...,en] for lists), I find the means used to
determine the translation quite ugly. In particular, it seems strange
to REQUIRE the use of a DATATYPE-DECL; what about sub-languages that
don't support this construct?
Couldn't one instead use annotations for indicating that any outfix
operation `b1 __ b2' is a unary bracketing constructor, determining at
the same time the intended `nil' and `cons' (or `conc') operations
used in translating non-unary applications? How the latter
constructors have been declared is then irrelevant.
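A minimal Python sketch of what such an annotation-driven translation might look like is given below. The table `BRACKETS` stands in for the information the annotations would carry; its contents and all operation names (`nil`, `cons`, `empty`, `add`) are hypothetical, as is the function name.

```python
# Hypothetical annotation table: bracket pair -> (`nil' operation, `cons' operation).
BRACKETS = {
    ("[", "]"): ("nil", "cons"),
    ("{", "}"): ("empty", "add"),
}

def translate_brackets(b1: str, terms: list, b2: str) -> str:
    """Translate b1 t1,...,tn b2 to cons(t1,cons(t2,...cons(tn,nil)...))."""
    nil, cons = BRACKETS[(b1, b2)]
    result = nil
    for t in reversed(terms):  # wrap from the last element outwards
        result = f"{cons}({t},{result})"
    return result
```

The translation consults only the table, so it does not matter whether the operations were introduced by a DATATYPE-DECL or by ordinary operation declarations.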
________________________________________________________________________
The attribute leads to an extension of the syntax for LITERALs:
LITERAL ::= ... | b1 b2
| b1 TERM , ... , TERM b2
A list of form b1 t1,...,tn b2 (where n>=0 and each ti is a TERM) is
translated to the (abstract syntax of) the term f(u_1,f(u_2,
...f(u_n,d)...)) where u_i is the abstract syntax tree for ti, and d is the
abstract syntax tree for c.
Vice versa, an abstract syntax tree corresponding to a term of the above
form which is maximal (i.e., it is not a subterm of a larger term of the
same form) is expected to be printed as "b1 t1,...,tn b2".
------------------------------------------------------------------------
I hope that the above may help you clarify your proposal. I'm
intending to finalize the new release of the Summary by this coming
weekend, and announce it early next week. I'll ask Bernd to decide
whether your (perhaps revised) proposal for literals should be
included in the new release or not. In the meantime, I look forward
to your reactions to my comments.
Cheers,
-- Peter
_________________________________________________________
Dr. Peter D. Mosses International Fellow (*)
Computer Science Laboratory mailto:mosses@csl.sri.com
SRI International phone: +1 (650) 859-2200
333 Ravenswood Avenue fax: +1 (650) 859-2844
Menlo Park, CA 94025, USA http://www.brics.dk/~pdm/
(*) on leave from DAIMI & BRICS, University of Aarhus, DK
also affiliated to CS Department, Stanford University
_________________________________________________________