The abstract syntax is central to the definition of a formal language. It stands between the concrete representations of documents, such as marks on paper or images on screens, and the abstract entities, semantic relations, and semantic functions used for defining their meaning.
The abstract syntax has the following objectives:
The abstract syntax is presented as a set of production rules in which each sort of entity is defined in terms of its subsorts:
    SOME-SORT ::= SUBSORT-1 | ... | SUBSORT-n
or in terms of its constructor and components:
    SOME-CONSTRUCT ::= some-construct COMPONENT-1 ... COMPONENT-n
The productions form a context-free grammar; algebraically, the nonterminal symbols of the grammar correspond to sorts (of trees), and the terminal symbols correspond to constructor operations. The notation COMPONENT* indicates repetition of COMPONENT any number of times; COMPONENT+ indicates repetition at least once. (These repetitions could be replaced by auxiliary sorts and constructs, after which it would be straightforward to transform the grammar into a CASL FREE-DATATYPE specification.)
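As a rough illustration of the algebraic reading, the two forms of production can be mirrored by ordinary data types: a sort becomes a union of its subsorts, and a construct becomes a record whose fields are its components. The sketch below uses a hypothetical toy grammar (ITEM, LEAF, GROUP, and the function size are invented for this example and are not part of the grammar defined here), and models the lexical sort WORDS as a plain string.

```python
from dataclasses import dataclass
from typing import Tuple, Union

# Hypothetical toy grammar, for illustration only:
#   ITEM  ::= LEAF | GROUP
#   LEAF  ::= leaf WORDS
#   GROUP ::= group ITEM+
# WORDS is lexical syntax; it is modelled here as a Python string.


@dataclass(frozen=True)
class Leaf:
    """The construct 'leaf' with a single WORDS component."""
    words: str


@dataclass(frozen=True)
class Group:
    """The construct 'group' with a repeated component ITEM+."""
    items: Tuple["Item", ...]

    def __post_init__(self):
        # ITEM+ means repetition at least once, so reject the empty case.
        if len(self.items) == 0:
            raise ValueError("ITEM+ requires at least one component")


# The sort ITEM is the union of its subsorts.
Item = Union[Leaf, Group]


def size(item: Item) -> int:
    """Count the constructor applications in an abstract syntax tree."""
    if isinstance(item, Leaf):
        return 1
    return 1 + sum(size(child) for child in item.items)


tree = Group((Leaf("a"), Group((Leaf("b"),))))
```

Here tree is built from four constructor applications (two group, two leaf). A component written ITEM* would be modelled in the same way, without the non-emptiness check.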
The context conditions for well-formedness of specifications are not determined by the grammar (these are considered as part of semantics).
The grammar here has the property that there is a sort for each construct (although an exception is made for constant constructs with no components). Appendix B provides an abbreviated grammar defining the same abstract syntax. It was obtained by eliminating each sort that corresponds to a single construct, when this sort occurs only once as a subsort of another sort.
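The abbreviation step can be sketched as a small transformation on a grammar held as two tables. This is only one reading of the rule, written for a hypothetical toy grammar: the function abbreviate, the table layout, and the extra condition that an eliminated sort must not also be used as a component are all assumptions made for this example, and the sketch does not reproduce how Appendix B was actually derived.

```python
from collections import Counter


def abbreviate(sorts, constructs):
    """Inline construct sorts that occur exactly once as a subsort.

    sorts:      name -> list of subsort names        (SOME-SORT productions)
    constructs: name -> (constructor, [components])  (SOME-CONSTRUCT productions)

    Assumption: a sort that is also used as a component elsewhere is kept,
    since removing it would leave that component undefined.
    """
    subsort_uses = Counter(s for alts in sorts.values() for s in alts)
    component_uses = {
        comp.rstrip("*+")  # ignore repetition markers
        for _, comps in constructs.values()
        for comp in comps
    }
    removable = {
        name
        for name in constructs
        if subsort_uses[name] == 1 and name not in component_uses
    }
    # Replace each removable subsort by the construct it stands for.
    new_sorts = {
        name: [constructs[a] if a in removable else a for a in alts]
        for name, alts in sorts.items()
    }
    new_constructs = {n: c for n, c in constructs.items() if n not in removable}
    return new_sorts, new_constructs


# Hypothetical toy grammar, for illustration only.
sorts = {"ITEM": ["LEAF", "GROUP"]}
constructs = {
    "LEAF": ("leaf", ["WORDS"]),
    "GROUP": ("group", ["ITEM+"]),
    "WRAP": ("wrap", ["LEAF"]),
}

short_sorts, short_constructs = abbreviate(sorts, constructs)
```

In this toy grammar GROUP is inlined, giving ITEM ::= LEAF | group ITEM+, while LEAF is kept because WRAP still refers to it as a component.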
The following nonterminal symbols correspond to lexical syntax, and are left unspecified in the abstract syntax: WORDS, SIGNS, DOT-WORDS, PLACE, URL, and NUMBER.