Camlp5 Documentation

Introduction

Camlp5 is a preprocessor and pretty-printer for OCaml programs. It also provides parsing and printing tools.

As a preprocessor, it allows to:

  • extend the syntax of OCaml,
  • redefine the whole syntax of the language.

As a pretty printer, it allows to:

  • display OCaml programs in an elegant way,
  • convert from one syntax to another,
  • check the results of syntax extensions.

Camlp5 also provides some parsing and pretty printing tools:

  • extensible grammars
  • extensible printers
  • stream parsers and lexers
  • pretty print module

It works as a shell command and can also be used in the OCaml toplevel.

Quickstart with ocamlfind

The easiest (and the one I use) way to use camlp5 is with ocamlfind. Once Camlp5 is installed, ocamlfind list | grep camlp5 produces a long list of packages: each of these corresponds to a feature of camlp5, and this documentation will explain how to use them. Camlp5 can be used in the ocaml toplevel, and the tutorials in this documentation will use this capability also.

But overall, the general idea is that we compile ocaml files, specifying their syntax (typically camlp5o, but perhaps camlp5r), and requiring the camlp5 package and maybe other packages. So for instance, to compile an original-syntax ocaml file:

ocamlfind ocamlc -package camlp5,fmt -syntax camlp5o \
    -linkpkg streams.ml -o streams

Let’s unpack that:

  1. use ocamlfind ocamlc
  2. specify packages camlp5 and fmt (for some pretty-printing in the program)
  3. specify original syntax: -syntax camlp5o
  4. link-flags: -linkpkg

If the file were in revised syntax, we would compile it thus:

ocamlfind ocamlc -package camlp5,fmt -syntax camlp5r \
    -linkpkg streams.ml -o streams

And if it contained code that worked with extensible grammars, we’d use:

ocamlfind ocamlc -package camlp5,fmt,camlp5.extend -syntax camlp5r \
    -linkpkg streams.ml -o streams

Note the package camlp5.extend (that provides extensible grammar syntax support).

For users of camlp5 who do not wish to write new syntax-manipulating code, but only use existing packages, this is typically enough: there are packages like pa_ppx that use camlp5, and those packages also provide their functionality as a collection of ocamlfind packages, which are used in exactly the same manner as above.

Camlp5 Package Naming and Overview

Briefly, there is a camlp5 package, and a number of sub-packages. Below is a list, and basically everything in Camlp5 is exposed as an ocamlfind package. Here, categorized (a bit) are the available packages:

  1. the main package
  • camlp5: the main camlp5 package
  1. grammar support: this is packages that setup the grammar for parsing ML files
  • camlp5.pa_o: grammar for original syntax
  • camlp5.pa_op: stream parser grammar support for original syntax
  • camlp5.pa_r: grammar for revised syntax
  • camlp5.pa_scheme: grammar for scheme syntax (written in scheme syntax)
  • camlp5.pa_schemer: grammar support for scheme syntax (written in revised syntax)
  1. printer support: this is packages that setup the printer used to print out ML ASTs
  • camlp5.pr_depend: printing support for depend generation (deprecated)
  • camlp5.pr_dump: printer that dumps out official Ocaml AST (the default way camlp5 outputs its result AST)
  • camlp5.pr_o: ML AST printer to original syntax (implemented using Camlp5’s printer machinery, hence extensible)
  • camlp5.pr_official: ML AST printer using Ocaml’s official compiler-libs printer machinery
  • camlp5.pr_r: ML AST printer to revised syntax (again, implemented using Camlp5’s printery machinery)
  • camlp5.pr_scheme: ML AST printer to Scheme syntax
  1. Syntax extensions for writing AST-manipulating code: these are packages that aid the programmer in writing new AST-manipulating code (like the code in the above packages):
  • camlp5.extend: extensible grammars
  • camlp5.extfun: extensible functions
  • camlp5.extprint: extensible printers
  • camlp5.fstream: functional streams
  • camlp5.gramlib: the grammar-interpreter machinery, as a separate package
  • camlp5.pa_lexer: syntax extension for writing lexers
  • camlp5.macro: IFDEF-like syntax extension
  • camlp5.phony_quotations: grammar support for phony quotations (only for developers)
  • camlp5.pprintf: syntax extensions for Camlp5’s ``pprintf’’ pretty-printer
  • camlp5.pragma: experimental pragma support (don’t use this)
  • camlp5.quotations: support for quotations and anti-quotations in ML code

There are three ways that a piece of Ocaml code can be used, and this applies equally to Camlp5 packages. So, for a camlp5 package X above, we can do one of:

  1. load into the preprocessor:

    ocamlfind ocamlc -package X ....
    ocamlfind ocamlopt -package X ....
    
  2. load into the toplevel (and used to preprocess there, but also linked-in):

    #require "X" ;;
    
  3. link with the program (e.g. with a final link-command using ocamlc):

    ocamlfind ocamlc -package X.link ....
    ocamlfind ocamlopt -package X.link ....
    

Notice that for use #3, we supply the name X.link instead of X. For example, to link revised-syntax grammar support into a program, we’d use package camlp5.pa_r.link.

The Ocaml Toplevel

A warning for users who use some frontend to interact with the Ocaml toplevel: many frontends have a baked-in understanding of Ocaml’s syntax, and specifically that toplevel phrases always end with ;; (e.g. tuareg-mode in Emacs). If you load the revised syntax into an Ocaml toplevel accessed via one of these front-ends (which is almost-never necessary), you will find that it doesn’t work: you may various find that you get no response back to input, or that the front-end inserts extra semicolons, or other weirdness. When I use Emacs with revised syntax Ocaml, I typically do so in a M-x shell RET window.

NOTE: It would be useful to fix tuareg-mode to understand revised syntax.

To use camlp5 from the toplevel, first decide which syntax you wish to use. Then

  1. Start the ocaml toplevel.
  2. “use” the findlib/camlp5 include file.
  3. then select your syntax.
  4. Proceed to use the toplevel.

For original syntax:

        OCaml version 4.10.0

# #use "topfind.camlp5";;
- : unit = ()
Findlib has been successfully loaded. Additional directives:
  #require "package";;      to load a package
  #list;;                   to list the available packages
  #camlp4o;;                to load camlp4 (standard syntax)
  #camlp4r;;                to load camlp4 (revised syntax)
  #predicates "p,q,...";;   to set these predicates
  Topfind.reset();;         to force that packages will be reloaded
  #thread;;                 to enable threads

- : unit = ()
Additional Camlp5 directives:
  #camlp5o;;                to load camlp5 (standard syntax)
  #camlp5r;;                to load camlp5 (revised syntax)

- : unit = ()
# #camlp5o ;;
/home/chetsky/Hack/Ocaml/GENERIC/4.10.0/lib/camlp5: added to search path
/home/chetsky/Hack/Ocaml/GENERIC/4.10.0/lib/camlp5/camlp5o.cma: loaded
     Camlp5 parsing version 8.00

#

Again, just the commands:

#use "topfind.camlp5";;
#camlp5o ;;

For the revised syntax, just replace the last line with #camlp5r ;; The tutorial has examples of loading packages and code into a toplevel using camlp5. [Again, I reiterate that revised syntax and (e.g.) the tuareg-mode front-end will not interact well.]

Parsing and Printing kits

Parsing and printing extensions are (of course) OCaml object files, i.e. files with the extension “.cmo” or “.cma”. But one almost never has to deal with them in this way; instead, one uses standard ocamlfind package-names as described in Camlp5 Package Naming and Overview.

For instance, in Tutorial: Extending OCaml with New Syntax the parsing kits camlp5.extend, camlp5.extfold, and camlp5.quotations are used both on the command-line and in the toplevel. Typically this is how all kits are used: it is rare to need to reference the .cmo/.cma files directly.

Extending syntax

There is a detailed example of extending the syntax of Ocaml in
Tutorial: Extending OCaml with New Syntax.

Pretty printing

It is oftentimes really useful to see the result of camlp5 processing (for debugging). Camlp5 pretty-printing kits are designed for this purpose. Just as parsing kits are named and used via findlib packages, so are pretty printing kits. For instance, the file tutorials/streams/streams.ml is in revised syntax. We can parse it with camlp5 and pretty-print it in original syntax:

not-ocamlfind preprocess -package camlp5.pr_o -syntax camlp5r \
    tutorials/streams/streams.ml

And we can pretty-print the original-syntax version of the example:

not-ocamlfind preprocess -package camlp5.pr_r -syntax camlp5o \
    tutorials/streams-original/streams.ml

It is possible to use lower-level access to the camlp5 command-line executables, but typically using not-ocamlfind and findlib packages is both easier and more compatible with the syntax used for building code:

1. take the ocamlfind ocamlc line 2.remove non-preprocessing options 3. replace the prefix with not-ocamlfind preprocess 4. add a pretty-printing kit package (e.g. camlp5.pr_r)

and you get a commandline for preprocessing a file and seeing the output.

Note: the revised syntax

The revised syntax is a specific syntax whose aim is to resolve some problems and inconsistencies of the normal OCaml syntax. A chapter will explain the differences between the normal and the revised syntax.

The one place in Camlp5 where revised syntax is mandatory is in quotations – bits of syntax that are converted into patterns and expressions in Ocaml. This is because Ocaml’s original syntax has gaps that make inserting anti-quotations in some places difficult; revised syntax was designed to remedy these gaps.

Many examples in this documentation are written using revised syntax, but over time we’ll convert all that are possible, to original syntax. The tutorial examples are all available in both revised syntax and original syntax, and use many of the syntax-extensions provided by Camlp5: nothing prevents users from writing extensions in original syntax, and of course applying Camlp5 extensions to code written in original syntax.

Even if you don’t know revised syntax, it is not difficult to understand. And as mentioned above, it is almost never necessary to use revised syntax to use Camlp5 (again, aside from quotations).

Building and Installing Camlp5

Most Camlp5 users will install via opam, but for developers and those investigating bugs, as well as those wishing to build documentation and the testsuite, here are instructions.

Requirements

To build Camlp5 you need the ocaml compiler installed. It must be accessible in the path. The Sphinx/RST documentation and testsuite require other packages (Ocaml, Python, Perl) to be installed, and this is described below.

Installation

In the nearly-ubiquitious fashion of open-source, you will run (in the shell)

./configure
make
make install

configure has arguments allowing you to specify installation directories and other options.

Detailed Steps (that are almost never needed)

Alternatively, you can decompose the operations into:

$ make out

Then, to make camlp5 check itself:

$ make bootstrap

Further, to make the native-code library:

$ make opt

At end, to make more native-code programs:

$ make opt.opt
  1. The binaries are installed by typing:
$ make install

or, if you install to a system location but are not an administrator

$ sudo make install

Building Sphinx Documentation

There are two forms of documentation: HTML and RST (ReStructured Text) (for Sphinx). From this release forward, the Sphinx documentation will be updated and improved; the HTML documentation will soon be removed.

The directory doc/rst contains the sources of the RST documentation. To build it requires:

  • python3 (tested with version 3.6.9)
  • Sphinx (tested with version 3.0.3)
  • sphinx-rtd-theme (tested with 0.4.3)

The easiest way to do that is to use Python’s venv and install them with pip3 (e.g.):

python3 -m venv ~/Sphinx
source ~/Sphinx/bin/activate
pip3 install sphinx-rtd-theme

Then

make -C doc/rst all

This creates the Sphinx documentation in doc/rst/_build.

Testsuite

There is a rather extensive testsuite (to which any suggestions or contributions are welcome) in testsuite. It assumes only that camlp5 has been successfully built, but not installed. To build/run it, one must install a number of packages:

  • rresult
  • fmt
  • pcre
  • ounit2
  • patdiff
  • camlp5-buildscripts

and using opam:

opam install rresult fmt pcre ounit2 patdiff

These packages are not required to build/install camlp5, but only to build and run the testsuite. However, for add-on packages (like pa_ppx) they are required, and so it is probably wisest to go ahead and install them now.

Then to build/run it:

make -C testsuite all-tests

Tutorial examples

There are a number of tutorial examples in the directory tutorials; some come with tests. They assume that camlp5 has been installed (and hence, do not reference the build-tree). They also assume that the testsuite prereqs have been installed. To build (and run) the tutorial examples:

make -C tutorials all
make -C tutorials test

The Sphinx/RST documentation uses these examples to explain how to write code using Camlp5.

For Camlp5 Developers

The file DEVEL gives information for people who want to make changes in Camlp5, or who are just curious of how it is implemented. The same explanations are also in the chapter “Camlp5 sources” in the documentation.

Tutorial: Extending OCaml with New Syntax

In this tutorial we’ll explain how to extend Camlp5’s with new syntax and logic to convert that syntax into standard Ocaml AST. Since most users will prefer to use PPX rewriters, I’ll just note that you can find PPX rewriters based on Camlp5 in the pa_ppx project.

The code for this tutorial can be found in the directory tutorials, and in each case, you’ll find a buildable tree (with Makefile), and (sometimes) simple tests.

A simple example: sum 1 ; 2 end

The code for this example is in tutorials/sum. In this example, we’ll add new syntax of the form

sum 1 ; 2 ; 3 end

to the Ocaml syntax. Since “;” is an infix expression operator in the original syntax, this necessarily means that the grammar will be different for revised syntax, than it is for original syntax. We’ll describe both.

The new syntactic form

We’re going to extend Ocaml’s syntax to accept

sum end --> 0
sum 5 end --> 5
sum 4 ; 6 end --> 10
let x = 2 in sum x ; x end --> 4
... etc ...

Code for original syntax

Here’s the code to extend the original syntax with this new syntactic form. (For grins) It’s written in original syntax, but could have been written in revised syntax: the syntax of the extension’s source code has nothing to do with it’s usage.

open Pcaml
EXTEND
  expr: BEFORE "expr1"
  [ [ "sum";
      e =
      FOLD0 (fun e1 e2 -> <:expr< $e2$ + $e1$ >>) <:expr< 0 >>
        expr LEVEL "expr1" SEP ";";
      "end" -> e ] ] ;
END

To build:

ocamlfind ocamlc -package fmt,camlp5.extend,camlp5.extfold,camlp5.quotations \
    -syntax camlp5o -c sum_original.ml

To use it in a toplevel:

#use "topfind.camlp5";;
#camlp5o ;;
#load "sum_original.cmo";;
let x = 1 in sum x ; x ; 2 end ;;

with the obvious output

- : int = 4

Instead of loading the object, we could load the source (but this requires loading parsing kits):

#use "topfind.camlp5";;
#camlp5o ;;
#require "camlp5.extend,camlp5.extfold,camlp5.quotations";;
#use "sum_original.ml";;
let x = 1 in sum x ; x ; 2 end ;;

Code for revised syntax

Here’s the code to extend the revised syntax with this new syntactic form. It’s written in revised syntax, but could have been written in original syntax: the syntax of the extension’s source code has nothing to do with it’s usage.

open Pcaml ;
EXTEND
  expr:
  [ [ "sum";
      e =
      FOLD0 (fun e1 e2 -> <:expr< $e2$ + $e1$ >>) <:expr< 0 >>
        expr SEP ";";
      "end" -> e ] ] ;
END;

To build:

ocamlfind ocamlc -package fmt,camlp5.extend,camlp5.extfold,camlp5.quotations \
    -syntax camlp5r -c sum_revised.ml

To use it in a toplevel:

#use "topfind.camlp5";;
#camlp5r ;;
#load "sum_revised.cmo";
let x = 1 in sum x ; x ; 2 end ;

with the obvious output

- : int = 4

Instead of loading the object, we could load the source (but this requires loading parsing kits):

#use "topfind.camlp5";;
#camlp5r ;;
#require "camlp5.extend,camlp5.extfold,camlp5.quotations";;
#use "sum_revised.ml";;
let x = 1 in sum x ; x ; 2 end ;;

Discussion

The differences between the two versions are only in that, since original syntax already has the “;” operator as an infix operator on expressions, adding this new operator at the precedence level of expressions would not change that “1;2” is already an expression: hence, sum 1 ; 2 end would be parsed as sum (1 ; 2) end. By inserting this new syntax at level "expr" (which binds more tightly than “;”) we get the desired parsing behaviour.

Since in the revised syntax “;” is not an infix operator between expressions (it is only operational in “do” blocks) there’s no problem there.

There are three parsing kits used here. Briefly,

  • camlp5.extend: grammar extension syntax support
  • camlp5.extfold: support for FOLD operators in grammars
  • camlp5.quotations: support for quotations (e.g. <<:expr< 0 >>)

Using camlp5 from the commandline and toplevel is discussed in Quickstart with ocamlfind and The Ocaml Toplevel .

Tutorial: Writing Language Processors with Camlp5

In this tutorial, we’ll start with the simplest program that uses Camlp5 features, and progressively move to more and more complicated programs, until we arrive at a parser/pretty-printer for Google Protocol Buffers IDL.

For users who only wish to use Camlp5 packages (and packages built on Camlp5), it suffices to work thru the “streams-original” and “streams” examples: after that, all the examples are really for users who wish to actually write programs with grammars, extensible printers, and other Camlp5 features.

Camlp5’s syntax-extensions were meant to work with revised syntax, but typically they work just as well with original syntax. To demonstrate that, the most-complicated example (using lexer, grammar, and printer syntax-extensions) is available in both revised and original syntax (though only the revised-syntax version is described herein). A quick diff of the two versions will show that the differences are entirely superficial.

Here is a list of all the calculator examples with their locations. In each case, you’ll find a buildable tree (with Makefile), and simple tests.

  1. Stream parser (original syntax): tutorials/streams-original
  2. Stream parser (revised syntax): tutorials/streams
  3. Extensible grammars (built-in Camlp5 lexer): tutorials/calc
  4. AST, Extensible grammars, extensible printers, evaluator (ocamllex lexer): tutorials/calc+ocamllex
  5. Extensible grammars (functorial interface) and ocamllex lexer: tutorials/calc+functorial
  6. like #4, but with pa_lexer-implementation of a lexer: tutorials/calc+pa_lexer
  7. a translation of #6 into original syntax.

Stream parsers in original syntax: the simplest program using Camlp5 features

The code for this example is in tutorials/streams-original.

A very simple program that uses Camlp5 features is one that uses Stream parsers . In this example you’ll find a parser for integer expressions (addition, subtraction, multiplication, division). It is written using the “original syntax stream parsers” syntax: the code is in tutorials/streams-original/streams.ml, and the critical bit, the parser is as follows:

1. First a left-associative parser-combinator:

let pleft f sep =
let rec paux v1 = parser
  [< op = sep ; v2 = f ; rv = paux (op v1 v2) >] -> rv
| [< >] -> v1
in parser
  [< v = f ; rv = paux v >] -> rv

2. Then parser combinators for the operators:

let additives = parser
  [< 'Kwd"+" >] -> (fun x y -> x + y)
| [< 'Kwd"-" >] -> (fun x y -> x - y)

let multiplicatives = parser
  [< 'Kwd"*" >] -> (fun x y -> x * y)
| [< 'Kwd"/" >] -> (fun x y -> x / y)
  1. And finally, the main parser, which is written in typical operator-precedence style:
let rec expr strm = expr1 strm
and expr1 = parser
  [< rv = pleft expr2 additives >] -> rv

and expr2 = parser
  [< rv = pleft expr3 multiplicatives >] -> rv

and expr3 = parser
  [< 'Int n >] -> n
| [< 'Kwd"(" ; e = expr1 ; 'Kwd")" >] -> e

The documentation section Stream parsers explains the syntax and semantics of stream parsers. To compile this file, we invoke:

ocamlfind ocamlc -package camlp5,fmt -linkpkg \
    -syntax camlp5o streams.ml -o streams

The camlp5 package has built-in support for stream-parsers.

To run this example:

./streams '1+1' '1 - 1' '1 + (2*6)'

Loading and testing in the toplevel

To load and test this example in the toplevel:

$ ocaml
     OCaml version 4.10.0

#use "topfind.camlp5";;
#camlp5o ;;
#require "fmt";;
#use "streams.ml";;

val lexer : char Stream.t -> Genlex.token Stream.t = <fun>
val list_of_stream : 'a Stream.t -> 'a list = <fun>
val pleft :
  ('a Stream.t -> 'b) -> ('a Stream.t -> 'b -> 'b -> 'b) -> 'a Stream.t -> 'b =
  <fun>
val additives : Genlex.token Stream.t -> int -> int -> int = <fun>
val multiplicatives : Genlex.token Stream.t -> int -> int -> int = <fun>
val expr : Genlex.token Stream.t -> int = <fun>
val expr1 : Genlex.token Stream.t -> int = <fun>
val expr2 : Genlex.token Stream.t -> int = <fun>
val expr3 : Genlex.token Stream.t -> int = <fun>
- : unit = ()

And to calculate:

# {| 1 + 1 |} |> Stream.of_string |> lexer |> expr ;;
  • : int = 2

etc.

Stream parsers in revised syntax: the simplest program using Camlp5 features

The code for this example is in tutorials/streams.

Since the rest of the tutorial will be written in The revised syntax , we have transliterated (it’s not very hard) from original to revised syntax. Most of the changes are straightforward: I’ll include only the left-associative parser combinator here:

value pleft f sep =
let rec paux v1 = parser [
  [: op = sep ; v2 = f ; rv = paux (op v1 v2) :] -> rv
| [: :] -> v1
]
in parser [
  [: v = f ; rv = paux v :] -> rv
]
;

To compile this file:

ocamlfind ocamlc -package camlp5,fmt -linkpkg -syntax camlp5r \
    streams.ml -o streams

Note that the only change in the compile line is to replace -syntax camlp5o with -syntax camlp5r. Of course, there are changes in streams.ml from original to revised syntax.

This example runs precisely as the previous one.

A Calculator using Extensible Grammars

The code for this example is in tutorials/calc.

Next, we can replace the stream-parser (and Genlex lexer) with a grammar written using Camlp5’s extensible-grammar support, and Camlp5’s built-in lexer. You can find thie example in tutorials/calc/calc.ml. The grammar is very compact:

value g = Grammar.gcreate (Plexer.gmake ());
value e = Grammar.Entry.create g "expression";

EXTEND
  e:
    [ [ x = e; "+"; y = e -> x + y
      | x = e; "-"; y = e -> x - y ]
    | [ x = e; "*"; y = e -> x * y
      | x = e; "/"; y = e -> x / y ]
    | [ x = INT -> int_of_string x
      | "("; x = e; ")" -> x ] ]
  ;
END;

The syntax and semanatics of extensible grammars are explained in Extensible grammars . The compactness of the code comes from two things:

  1. implicit and explicit support for associativity (left-associativity is the default)
  2. explicit support for precedence via “levels” in the grammar-rules.

To compile this example:

ocamlfind ocamlc -package camlp5,fmt,camlp5.extend -linkpkg \
    -syntax camlp5r calc.ml -o calc

and the only difference is that we have to add the Camlp5 package camlp5.extend which provides extensible-grammar syntax support (for the new syntax we used above, that is most definitely not normal Ocaml!)

A Calculator using Extensible Grammars & Printers (Ocamllex-based Lexer)

The code for this example is found in tutorials/calc+ocamllex.

The previous example used the built-in Camlp5 lexer, which supports Ocaml-style comments. That is, in the text which is parsed by the calculator, ocanl-style comments would be treated as comments and ignored. In this example, we’ll use an ocamllex-generated lexer, which handles C++-ctyle comments instead. To refresh, C++-style comments are thus:

int x = 1 ; // any text to end of line

and we’ll augment the language we parse with variables and assignment-statements, in addition to expressions. We’ll also add a real parse-tree and evaluator. And finally, when pretty-printing, let’s print out comments that appear immediately before statements.

Because this example will be pretty involved, we’ll go thru it step-by-step, explaining each block of code and what it does, with pointers to the relevant bits of documentation.

The Lexer

The lexer is a standard ocamllex lexer. We define regular expressions:

let ws = [' ' '\t' '\r' '\n']+
let decimal_digit = ['0'-'9']
let decimal = decimal_digit+
let comment = "//" [^ '\n']* '\n'
let ident = ['a'-'z' 'A'-'Z' '_'] ['a'-'z' 'A'-'Z' '_' '0'-'9']*

and a tokenizer that accumulates comments (notice they’re C++-style) before a token:

rule _token comments = parse
| comment { _token (comments^(Lexing.lexeme lexbuf)) lexbuf }
| ws     { _token (comments^(Lexing.lexeme lexbuf)) lexbuf }
| "(" { locate ~comments lexbuf ("","(") }
| ")" { locate ~comments lexbuf ("",")") }
| "+" { locate ~comments lexbuf ("","+") }
| "-" { locate ~comments lexbuf ("","-") }
| "*" { locate ~comments lexbuf ("","*") }
| "/" { locate ~comments lexbuf ("","/") }
| ":=" { locate ~comments lexbuf ("",":=") }
| ";" { locate ~comments lexbuf ("",";") }
| decimal as dec { locate ~comments lexbuf ("INT",dec) }
| ident as id { locate ~comments lexbuf ("IDENT",id) }
| eof { locate ~comments lexbuf ("EOI","") }

At end-of-input, we return the special token ("EOI",""), so that the grammar can explicitly require that parsing consume all the input. Notice the way we’re wrapping each return with a locate function-call. This function takes the current lexbuf and comments/whitespace so far accumulated before the token, and builds a Camlp5 location (Ploc.t) to return along with the token:

let locate ~comments lb v =
  let loc = Ploc.make_unlined (Lexing.lexeme_start lb, Lexing.lexeme_end lb) in
  let loc = Ploc.with_comment loc comments in
 (v, loc)

Also, as you can see a token (for Camlp5’s grammar engine) is always a pair of its class (a string) and the text of the token.

To make an ocamllex lexer available to Camlp5’s grammar-interpreter, there is a little bit of special sauce:

value lexer = Plexing.lexer_func_of_ocamllex_located Calclexer.token ;
value lexer = {Plexing.tok_func = lexer;
 Plexing.tok_using _ = (); Plexing.tok_removing _ = ();
 Plexing.tok_match = Plexing.default_match;
 Plexing.tok_text = Plexing.lexer_text;
 Plexing.tok_comm = None} ;

The AST

The AST is straightforward. There are expressions with unary and binary operators, integer constants, and variable-names. There are statements of two kinds: expression-statements and assignment-statements. We will see later an “environment” mapping identifiers to integers, to support these variables and assignments. Notice that most AST nodes also have a Ploc.t. In a real language-processor, this would allow to print locations in error-messages (as we’ll do in the evaluator).

type binop = [ ADD | SUB | DIV | MUL ] ;
type unop = [ PLUS | MINUS ] ;
type expr = [
  BINOP of Ploc.t and binop and expr and expr
| UNOP of Ploc.t and unop and expr
| INT of Ploc.t and int
| VAR of Ploc.t and string ]
and stmt = [
  ASSIGN of Ploc.t and string and expr
| EXPR of Ploc.t and expr
]
;

The Grammar

The grammar is what we’d expect: there are nonterminals for statements, expressions, and a list of statements that consume all the input. For nodes other than toplevel statements, we strip comments from the location. Also, Camlp5’s grammar-interpreter is a classic LL(1) engine, but there is one ambiguity which would require work to resolve: when we see an input like “x”, we don’t know if it will continue as an expression-statement, or an assignment-statement. There are standard ways (in LL(1) grammars) of resolving this, but here I’m just going to do a little bit of lookahead (one token) to check whether the next token is a “:=” (in the function check_id_coloneq). This is something pretty common in writing LL(1) parsers: instead of working hard to make the grammar LL(1), go ahead and use some lookahead.

value g = Grammar.gcreate lexer;
value expr = Grammar.Entry.create g "expression";
value stmt = Grammar.Entry.create g "statement";
value stmts = Grammar.Entry.create g "statements";
value stmts_eoi = Grammar.Entry.create g "statements_eoi";

value loc_strip_comment loc = Ploc.with_comment loc "" ;

value check_id_coloneq =
  Grammar.Entry.of_parser g "check_id_coloneq"
    (fun strm ->
       match Stream.npeek 2 strm with
       [ [("IDENT", _); ("", ":=")] -> ()
       | _ -> raise Stream.Failure ])
;

EXTEND
  GLOBAL: expr stmt stmts stmts_eoi check_id_coloneq ;
  expr:
    [ [ x = expr; "+"; y = expr -> BINOP (loc_strip_comment loc) ADD x y
      | x = expr; "-"; y = expr -> BINOP (loc_strip_comment loc) SUB x y ]
    | [ x = expr; "*"; y = expr -> BINOP (loc_strip_comment loc) MUL x y
      | x = expr; "/"; y = expr -> BINOP (loc_strip_comment loc) DIV x  y ]
    | [ "-" ; x = expr -> UNOP loc MINUS x
      | "+" ; x = expr -> UNOP loc PLUS x ]
    | [ x = INT -> INT loc (int_of_string x)
      | x = IDENT -> VAR loc x
      | "("; x = expr; ")" -> x
      ]
    ]
  ;
  stmt:
    [ [ check_id_coloneq ; id = IDENT ; ":=" ; x = expr -> ASSIGN loc id x
      | x = expr -> EXPR loc x ]
    ]
  ;
  stmts : [ [ l = LIST1 stmt SEP ";" -> l ] ] ;
  stmts_eoi : [ [ l = stmts ; EOI -> l ] ] ;
END;

The Pretty-Printer

We could write the pretty-printer as a recursive function over the types expr and stmt. But instead, we’ll write it using Ocaml’s Extensible printers support. This allows to extend a printer with new rules after it’s been defined (though we won’t do that here). Please consult the documentation on the Pretty module and pprintf to understand how the pretty-printing actually works.

NOTE: this actually really ugly pretty-printing. I haven’t completely figured out how to use pprintf to get nice indentation; when I do, this tutorial will be updated.

First, some setup (defining the printers, and convenience functions that call them:

value parse_expr = Grammar.Entry.parse expr ;
value parse_stmt = Grammar.Entry.parse stmt ;
value parse_stmts = Grammar.Entry.parse stmts ;
value parse_stmts_eoi = Grammar.Entry.parse stmts_eoi ;

value pr_expr = Eprinter.make "expr";
value pr_stmt = Eprinter.make "stmt";
value pr_stmts = Eprinter.make "stmts";

value print_expr = Eprinter.apply pr_expr;
value print_stmt = Eprinter.apply pr_stmt;

Here’s a function that prints out statement, and the comment prior to it, if that comment string is nonempty:

value print_commented_stmt pc stmt =
  let loc = loc_of_stmt stmt in
  let comment = Ploc.comment loc in
  let comment = if has_nonws comment then comment else "" in
  let pp = (fun () -> pprintf pc "%s%p" comment print_stmt stmt) in
    Pretty.horiz_vertic pp pp
;

value print_stmts = Eprinter.apply pr_stmts;

value plist_semi f sh pc l =
  let l = List.map (fun s -> (s, ";")) l in
  pprintf pc "%p" (Prtools.plist f sh) l
;

And finally the printers themselves. Just as with the grammar, it’s defined in precedence levels. Each level has pattern-matching, and the default is to proceed to the next level.

::

EXTEND_PRINTER
  pr_expr:
    [ "add"
      [ BINOP _ ADD x y -> pprintf pc "%p + %p" curr x next y
      | BINOP _ SUB x y -> pprintf pc "%p - %p" curr x next y ]
    | "mul"
      [ BINOP _ MUL x y -> pprintf pc "%p * %p" curr x next y
      | BINOP _ DIV x y -> pprintf pc "%p / %p" curr x next y ]
    | "uminus"
      [ UNOP _ PLUS x -> pprintf pc "+ %p" curr x
      | UNOP _ MINUS x -> pprintf pc "- %p" curr x ]
    | "simple"
      [ INT _ x -> pprintf pc "%d" x
      | x -> pprintf pc "(%p)" print_expr x ]
    ] ;
  pr_stmt:
    [ [ ASSIGN _ id e -> pprintf pc "@[%s := %p@]" id print_expr e
      | EXPR _ e -> pprintf pc "@[%p@]" print_expr e ]
    ]
  ;
  pr_stmts:
    [ [ l -> pprintf pc "{@;%p@;}" (plist_semi print_commented_stmt 0) l ]
    ]
  ;
END;

The Evaluator

The evaluator is bog-standard, but with the one nuance that when it cannot locate a variable in the environment, it raises an exception wrapped with a Ploc.t.

module Eval = struct
value expr env e =
  let rec erec = fun [
    BINOP _ ADD x y -> (erec x)+(erec y)
  | BINOP _ SUB x y -> (erec x)-(erec y)
  | BINOP _ DIV x y -> (erec x)/(erec y)
  | BINOP _ MUL x y -> (erec x)*(erec y)
  | UNOP _ MINUS x -> -(erec x)
  | UNOP _ PLUS x -> erec x
  | INT _ x -> x
  | VAR loc s -> match List.assoc s env with [
      x -> x
    | exception Not_found -> Ploc.raise loc (Failure (Printf.sprintf "variable %s not found in environment" s)) ]
  ]
  in erec e
;
value stmt env = fun [
  ASSIGN _ s e ->
    let v = expr env e in ([ (s, v) :: env ], v)
| EXPR _ e -> (env, expr env e)
]
;

value stmts env l =
  List.fold_left (fun (env, acc) s -> let (env, v) = stmt env s in (env, [v :: acc])) (env, []) l ;
end
;

The Results

On the input:

// foo
1+2 ;
// bar
x := 3

the output is:

{
  // foo
1 + 2;

// bar
x := 3
  } =
  [3; 3]

As I noted above, I haven’t completely figured-out the way pprintf is supposed to be used. Now how about an erroneous input?

echo "1+2+y" | ./calc
File "", line 1, characters 5-6:
Failure("variable y not found in environment")

When we pretty-print the exception, we can pretty-print the location:

try
    let l = parse_stmts_eoi (Stream.of_channel stdin) in do {
      let print_int pc n = pprintf pc "%d" n in
      printf "%s" (pprintf Pprintf.empty_pc "%p =@;@[[%p]@]\n"
                     print_stmts l
                     (plist_semi print_int 2) (List.rev(snd(Eval.stmts [] l))))
    }
with [ Ploc.Exc loc exc ->
    Fmt.(pf stderr "%s%a@.%!" (Ploc.string_of_location loc) exn exc)
  | exc -> Fmt.(pf stderr "%a@.%!" exn exc)
]

A Calculator using Extensible Grammars (the Functorial Interface)

The code for this example is in tutorials/calc+functorial. This example is in the style of “A Calculator using Extensible Grammars”_, but with the functorial interface to grammars. It also uses an ocamllex-based lexer. Here’s the code for the functorial bits:

module Ocamllex_L = struct
type te = (string * string) ;
value lexer = Plexing.lexer_func_of_ocamllex Calclexer.token ;
value lexer = {Plexing.tok_func = lexer;
 Plexing.tok_using _ = (); Plexing.tok_removing _ = ();
 Plexing.tok_match = Plexing.default_match;
 Plexing.tok_text = Plexing.lexer_text;
 Plexing.tok_comm = None} ;
end ;

module Gram = Grammar.GMake(Ocamllex_L) ;

The rest is pretty straightforward and just like the tutorials/calc example.

A Calculator using Extensible Grammars & Printers (pa_lexer-based Lexer)

The code for this example is in tutorials/calc+pa_lexer.

This example replaces the ocamllex-based lexer with one using Camlp5’s builtin pa_lexer syntax extension.

TODO: finish this sub-section.

Transitional and Strict modes

Transitional and Strict modes

Since version 5.00, Camlp5 has been able to be installed in two modes: the transitional mode and the strict mode. When Camlp5 is installed, it works with one only of these modes (the two modes contain indeed different definitions of some interfaces and are incompatible with one another). The user must choose in which mode he wants to use Camlp5.

This notion has been introduced to ensure backward compatibility of the Camlp5 syntax tree, together with the usage of a new quotation kit “q_ast.cmo”, which allows to use Camlp5 syntax tree quotations in user syntax (with all its possible extensions).

A short example of these syntax tree quotations:

If the syntax of the extensible grammars has been added, it is possible to write things like:

<:expr< EXTEND a: [ [ c = d -> $e$ ] ]; END >>;

representing the syntax tree of this statement: this is not possible with the classical quotation kit “q_MLast.cmo” because all quotations must be there only in revised syntax and without syntax extensions.

Here are the differences between the two modes:

Transitional
Compatibility
The syntax tree is fully compatible with the previous versions of Camlp5, no changes has to be done in the users’ programs.
Quotation kit “q_ast.cmo
The antiquotations are not available: when used, a syntax error message is displayed.
Strict
Compatibility
The syntax tree is different, users’ programs may have to be modified, but not necessarily.
Quotation kit “q_ast.cmo
All antiquotations are available.

In strict mode, the programs have more chances to be compatible with the previous versions if they use syntax tree quotations rather than syntax tree nodes. A solution is therefore to change the expressions and patterns using nodes into expressions and patterns using quotations (which is backward compatible).

Which mode is installed ?

To determine the mode of an installed version of Camlp5, type:

camlp5 -pmode

Selecting mode when compiling Camlp5

When compiling Camlp5 from source, the mode must first be selected at configuration time. The configure script must be run with one of these options:

./configure -strict
./configure -transitional

The default is “transitional”, i.e. without option, the sources are compiled in transitional mode.

Parsing and Printing tools

Parsing and Printing tools

Camlp5 provides two parsing tools:

  • stream parsers
  • extensible grammars

The first parsing tool, the stream parsers, is the elementary system. It is pure syntactic sugar, i.e. the code is directly converted into basic OCaml statements: essentially functions, pattern matchings, try. A stream parser is just a function. But the system does not manage associativity, nor parsing level. Left recursion results on infinite loops, just like functions whose first action would be a call to itself.

The second parsing tool, the extensible grammars, are more sophisticated. A grammar written with them is more readable, and look like grammars written with tools like “yacc”. They take care of associativity, left recursion, and level of parsing. They are dynamically extensible, what allows the syntax extensions what Camlp5 provides for OCaml syntax.

In both cases, the input data are streams.

Camlp5 also provides:

  • a pretty printing module
  • extensible printers

The next sections give an overview of the parsing and printing tools.

Stream parsers

The stream parsers is a system of recursive descendant parsing. Streams are actually lazy lists. At each step, the head of the list is compared against a stream pattern. There are three kinds of streams parsers:

  • The imperative streams parsers, where the elements are removed from the stream as long as they are parsed. Parsers return either:
    • A value, in case of success,
    • The exception “Stream.Failure” when the parser does not apply and no elements have been removed from the stream, indicating that, possibly, other parsers may apply,
    • The exception “Stream.Error” when the parser does not apply, but one or several elements have been removed from the stream, indicating that nothing can to be done to make up the error.
  • The functional stream parsers where the elements are not removed from the stream during the parsing. These parsers return a value of type “option”, i.e either:
    • “Some” a value and the remaining stream, in case of success,
    • “None”, in case of failure.
  • The backtracking stream parsers which are like the functional stream parsers but with a backtracking algorithm, testing all possibilities. These parsers also return a value of type “option” different from the functional stream parsers, i.e either:
    • “Some” a value, the remaining stream and a continuation, in case of success,
    • “None”, in case of failure.

The differences are about:

  • Syntax errors: in the imperative version, the location of the error is clear, it is at the current position of the stream, and the system provides a specific error message (typically, that some “element” was “expected”). On the other hand, in the functional and backtracking version, the position is not clear since it returns nothing and the initial stream is unaffected. The only solution to know where the error happened is to analyze that stream to see how many elements have be unfrozen. No clear error message is available, just “syntax error” (but this could be improved in a future version).
  • Power: in the imperative version, when a rule raises the exception “Stream.Error”, the parsing cannot continue. In the functional version, the parsing can continue by analyzing the next rule with the initial unaffected stream: this is limited backtrack. In the backtracking version, more powerful, the parsing continues by analyzing the next case of the previous symbol of the rule; moreover it is possible to get the list of all possible solutions.
  • Neatness: functional streams are neater, just like functional programming is neater than imperative programming.

The imperative parsers implement what is called “predictive parsing”, i.e. recursive descendant parsing without backtrack.

In the imperative version, there also exist lexers, a shorter syntax when the stream elements are of the specific type ‘char’.

Extensible grammars

Extensible grammars manipulate grammar entries. Grammar entries are abstract values internally containing mutable stream parsers. When a grammar entry is created, its internal parser is empty, i.e. it always fails when used. A specific syntactic construction, with the keyword “EXTEND” allows one to extend grammar entries with new grammar rules.

In opposition to stream parsers, grammar entries manage associativity, left factorization, and levels. Moreover, these grammars allow optional calls, lists and lists with separators. They are not however functions and hence cannot have parameters.

Since the internal system is stream parsers, extensible grammars use recursive descendant parsing.

The parsers of the OCaml language in Camlp5 are written with extensible grammars.

Pretty module

The “Pretty” module is an original tool allowing control over the displaying of lines. The user must specify two functions where:

  • the data is printed on a single line
  • the data is printed on several lines

The system first tries to print on a single line. At any time, if the line overflows, i.e. if its size is greater than some “line length” specified in the module interface, or if it contains newlines, the function is aborted and control is given to the second function, to print on several lines.

This is a basic, but powerful, system. It supposes that the programmer takes care of the current indentation, and the beginning and the end of its lines.

The module will be extended in the future to hide the management of indendations and line continuations, and by the supply of functions combinating the two cases above, in which the programmer can specify the possible places where newlines can be inserted.

Extensible printers

The extensible printers are symmetric to the extensible grammars. The extensible grammars take syntax rules and return syntax trees. The extensible printers are actually extensible functions taking syntax trees as parameters and returning the pretty printed statements in strings.

The extensible printers can have printing levels, just like grammars have parsing levels, and it is possible to take the associativity into account by provided functions to call either the current level or the next level.

The printers of the OCaml language are written with extensible printers.

Parsing tools

Stream parsers

Stream parsers

We describe here the syntax and the semantics of the parsers of streams of Camlp5. Streams are kinds of lazy lists. The parsers of these streams use recursive descendent method without backtracking, which is the most natural one in functional languages. In particular, parsers are normal functions.

Notice that the parsers have existed in OCaml since many years (the beginning of the 90ies), but some new features have been added in 2007 (lookahead, “no error” optimization, let..in statement and left factorization) in Camlp5 distribution. This chapter describes them also.

Introduction

Parsers apply to values of type “Stream.t” defined in the module “Stream” of the standard library of OCaml. Like the type “list”, the type “Stream.t” has a type parameter, indicating the type of its elements. They differ from the lists that they are lazy (the elements are evaluated as long as the parser need them for its actions), and imperative (parsers deletes their first elements when they take their parsing decisions): notice that purely functional parsers exist in Camlp5, where the corresponding streams are lazy and functional, the analyzed elements remaining in the initial stream and the semantic action returning the resulting stream together with the normal result, which allow natural limited backtrack but have the drawback that it is not easy to find the position of parsing errors when they happen.

Parsers of lazy+imperative streams, which are described here, use a method named “recursive descendent”: they look at the first element, they decide what to do in function of its value, and continue the parsing with the remaining elements. Parsers can call other parsers, and can be recursive, like normal functions.

Actually, parsers are just pure syntactic sugar. When writing a parser in the syntax of the parser, Camlp5 transforms them into normal call to functions, use of patterns matchings and try..with statements. The pretty printer of Camlp5, by default, displays this expanded result, without syntax of parsers. A pretty printing kit, when added, can rebuild the parsers in their initial syntax and display it.

Syntax

The syntax of the parsers, when loading “pa_rp.cmo” (or already included in the command “camlp5r”), is the following:

          expression ::= parser
                       | match-with-parser
              parser ::= "parser" pos-opt "[" parser-cases "]"
                       | "parser" pos-opt parser-case
   match-with-parser ::= "match" expression "with" parser
        parser-cases ::= parser-cases parser-case
                       | <nothing>
         parser-case ::= "[:" stream-pattern ":]" pos-opt "->" expression
      stream-pattern ::= stream-patt-comp
                       | stream-patt-comp ";" stream-patt-cont
                       | "let" LIDENT "=" expression "in" stream-pattern
                       | <nothing>
    stream-patt-cont ::= stream-patt-comp-err
                       | stream-patt-comp-err ";" stream-patt-cont
                       | "let" LIDENT "=" expression "in" stream-patt-cont
stream-patt-comp-err ::= stream-patt-comp
                       | stream-patt-comp "?" expression
                       | stream-patt-comp "!"
    stream-patt-comp ::= "`" pattern
                       | "`" pattern "when" expression
                       | "?=" lookaheads
                       | pattern "=" expression
                       | pattern
          lookaheads ::= lookaheads "|" lookahead
                       | lookahead
           lookahead ::= "[" patterns "]"
            patterns ::= patterns pattern
                       | pattern
             pos-opt ::= pattern
                       | <nothing>

Streams

The parsers are functions taking streams as parameter. Streams are are values of type “Stream.t a” for some type “a”. It is possible to build streams using the functions defined in the module “Stream”:

Stream.from

Stream.from f” returns a stream built from the function “f”. To create a new stream element, the function “f” is called with the current stream count, starting with zero. The user function “f” must return either “Some <value>” for a value or “None” to specify the end of the stream.

Stream.of_list

Return a stream built from the list in the same order.

Stream.of_string

Return a stream of the characters of the string parameter.

Stream.of_channel

Return a stream of the characters read from the input channel parameter.

Semantics of parsers

Parser

A parser, defined with the syntax “parser” above, is of type “Stream.t a -> b” where “a” is the type of the elements of the streams and “b” the type of the result. The parser cases are tested in the order they are defined until one of them applies. The result is the semantic action of the parser case which applies. If no parser case applies, the exception “Stream.Failure” is raised.

When testing a parser case, if the first stream pattern component matches, all remaining stream pattern components of the stream pattern must match also. If one does not match, the parser raises the exception “Stream.Error” which has a parameter of type string: by default, this string is the empty string, but if the stream pattern component which does not match is followed by a question mark and an expression, this expression is evaluated and given as parameter to “Stream.Error”.

In short, a parser can return with three ways:

  • A normal result, of type “b” for a parser of type “Stream.t a -> b”.
  • Raising the exception “Stream.Failure”.
  • Raising the exception “Stream.Error”.

Fundamentally, the exception “Stream.Failure” means “this parser does not apply and no element have been removed from the initial stream”. This is a normal case when parsing: the parser locally fails, but the parsing can continue.

Conversely, the exception “Stream.Error” means that “this parser encountered a syntax error and elements have probably been removed from the stream”. In this case, there is no way to recover the parsing, and it definitively fails.

Left factorization

In parsers, consecutive rules starting with the same components are left factorized. It means that they are transformed into one only rule starting with the common path, and continuing with a call to a parser separating the two cases. The order is kept, except that the possible empty rule is inserted at the end.

For example, the parser:

parser
[ [: `If; e1 = expr; `Then; e2 = expr; `Else; e3 = expr :] -> f e1 e2 e3
| [: `If; e1 = expr; `Then; e2 = expr :] -> g e1 e2 ]

is transformed into:

parser
  [: `If; e1 = expr; `Then; e2 = expr;
     a =
       parser
       [ [: `Else; e3 = expr :] -> f e1 e2 e3
       | [: :] -> g e1 e2 ] :] -> a

The version where rules are inverted:

parser
[ [: `If; e1 = expr; `Then; e2 = expr :] -> g e1 e2
| [: `If; e1 = expr; `Then; e2 = expr; `Else; e3 = expr :] -> f e1 e2 e3 ]

is transformed into the same parser.

Notice that:

  • Only consecutive rules are left factorized. In the following parser:

    parser
    [ [: `If; e1 = expr; `Then; e2 = expr; `Else; e3 = expr :] -> ...
    | [: a = b :] -> a
    | [: `If; e1 = expr; `Then; e2 = expr :] -> ... ]
    

    the two rules starting with “If” are not left factorized, and the second “If” rule will never work.

  • The components which are not identical are not factorized. In the following parser:

    parser
    [ [: `If; e1 = expr; `Then; e2 = expr; `Else; e3 = expr :] -> ...
    | [: `If; e4 = expr; `Then; e2 = expr :] -> ... ]
    

    only the first component, “If” is factorized, the second one being different because of different patterns (”e1” and “e4”).

Match with parser

The syntax “match expression with parser” allows to match a stream against a parser. It is, for “parser”, the equivalent of “match expression with” for “fun”. The same way we could say:

match expression with ...

could be considered as an equivalent to:

(fun ...) expression

we could consider that:

match expression with parser ...

is an equivalent to:

(parser ...) expression

Error messages

A “Stream.Error” exception is raised when a stream pattern component does not match and that it is not the first one of the parser case. This exception has a parameter of type string, useful to specify the error message. By default, this is the empty string. To specify an error message, add a question mark and an expression after the stream pattern component. A typical error message is “that stream pattern component expected”. Example with the parser of “if..then..else..” above:

parser
  [: `If; e1 = expr ? "expression expected after 'if'";
     `Then ? "'then' expected";
     e2 = expr ? "expression expected after 'then'";
     a =
       parser
       [ [: `Else; e3 = expr ? "expression expected" :] -> f e1 e2 e3
       | [: :] -> g e1 e2 ] :] -> a

Notice that the expression after the question mark is evaluated only in case of syntax error. Therefore, it can be a complicated call to a complicated function without slowing down the normal parsing.

Stream pattern component

In a stream pattern (starting with “[:” and ending with “:]”), the stream pattern components are separated with the semicolon character. There are three cases of stream pattern components with some sub-cases for some of them, and an extra syntax can be used with a “let..in” construction. The three cases are:

  • A direct test of one or several stream elements (called terminal symbol), in three ways:

    1. The character “backquote” followed by a pattern, meaning: if the stream starts with an element which is matched by this pattern, the stream pattern component matches, and the stream element is removed from the stream.
    2. The character “backquote” followed by a pattern, the keyword “when” and an expression of type “bool”, meaning: if the stream starts with an element which is matched by this pattern and if the evaluation of the expression is “True”, the stream pattern component matches, and the first element of the stream is removed.
    3. The character “question mark” followed by the character “equal” and a lookahead expression (see further), meaning: if the lookahead applies, the stream pattern component matches. The lookahead may unfreeze one or several elements on the stream, but does not remove them.
  • A pattern followed by the “equal” sign and an expression of type “Stream.t x -> y” for some types “x” and “y”. This expression is called a non terminal symbol. It means: call the expression (which is a parser) with the current stream. If this sub-parser:

    1. Returns an element, the pattern is bound to this result and the next stream pattern component is tested.

    2. Raises the exception “Stream.Failure”, there are two cases:

      • if the stream pattern component is the first one of the stream case, the current parser also fails with the exception “Stream.Failure”.
      • if the stream pattern component is not the first one of the stream case, the current parser fails with the exception “Stream.Error”.

      In this second case:

      • If the stream pattern component is followed by a “question mark” and an expression (which must be of type “string”), the expression is evaluated and given as parameter of the exception “Stream.Error”.
      • If the expression is followed by an “exclamation mark”, the test and conversion from “Stream.Failure” to “Stream.Error” is not done, and the parser just raises “Stream.Failure” again. This is an optimization which must be assumed by the programmer, in general when he knows that the sub-parser called never raises “Stream.Failure” (for example if the called parser ends with a parser case containing an empty stream pattern). See “no error optionization” below.
      • Otherwise the exception parameter is the empty string.
  • A pattern, which is bound to the current stream.

Notice that patterns are bound immediately and can be used in the next stream pattern component.

Let statement

Between stream pattern components, it is possible to use the “let..in” construction. This is not considered as a real stream pattern component, in the fact that is is not tested against the exception “Stream.Failure” it may raise. It can be useful for intermediate computation. In particular, it is used internally by the lexers (see chapter about lexers as character stream parsers).

Example of use, when an expression have to be used several times (in the example, “d a”, which is bound to the variable “c”):

parser
  [: a = b;
     let c = d a in
     e =
       parser
       [ [: f = g :] -> h c
       | [: :] -> c ] :] -> e

Lookahead

The lookahead feature allows to look at several terminals in the stream without removing them, in order to take decisions when more than one terminal is necessary.

For example, when parsing the normal syntax of the OCaml language, there is a problem, in recursing descendent parsing, for the cases where to treat and differentiate the following inputs:

(-x+1)
(-)

The first case is treated in a rule, telling: “a left parenthesis, followed by an expression, and a right parenthesis”. The second one is “a left parenthesis, an operator, a right parenthesis”. Programming it like this (left factorizing the first parenthesis):

parser
  [: `Lparen;
     e =
       parser
       [ [: e = expr; `Rparen :] -> e
       | [: `Minus; `Rparen :] -> minus_op ] :] -> e

does not work if the input is “(-)” because the rule “e = expr” accepts the minus sign as expression start, removing it from the input stream and fails as parsing error, while encountering the right parenthesis.

Conversely, writing it this way:

parser
  [: `Lparen;
     e =
       parser
       [ [: `Minus; `Rparen :] -> minus_op
       | [: e = expr; `Rparen :] -> e ] :] -> e

does not help, because if the input is “(-x+1)” the rule above starting with “`Minus” is accepted and the exception “Stream.Error” is raised while encountering the variable “x” since a right parenthesis is expected.

In general, this kind of situation is best resolved by a left factorization of the parser cases (see the section “Semantics” above), but that is not possible in this case. The solution is to test whether the character after the minus sign is a right parenthesis:

parser
  [: `Lparen;
     e =
       parser
       [ [: ?= [ _ ; Rparen ]; `Minus; `Rparen :] -> minus_op
       | [: e = expr; `Rparen :] -> e ] :] -> e

It is possible to put several lists of (semicolon-separated) patterns separated by a vertical bar in the lookahead construction, but with a limitation (due to the implementation): all lists of patterns must have the same number of elements.

No error optimization

The “no error optimization” is the fact to end a stream pattern component of kind “non-terminal” (“pattern” “equal” “expression”) by the character “exclamation mark”. Like said above, this inhibits the transformation of the exception “Stream.Failure”, possibly raised by the called parser, into the exception “Stream.Error”.

The code:

parser [: a = b; c = d ! :] -> e

is equivalent to:

parser [: a = b; s :] -> let c = d s in e

One interest of the first syntax is that it shows to readers that “d” is indeed a syntactic sub-parser. In the second syntax, it is called in the semantic action, which makes the parser case not so clear, as far as readability is concerned.

If the stream pattern component is at end of the stream pattern, this allow possible tail recursion by the OCaml compiler, in the following case:

parser [: a = b; c = d ! :] -> c

since it is equivalent (with the fact that “c” is at the same time the pattern of the last case and the expression of the parser case semantic action) to:

parser [: a = b; s :] -> d s

The call to “d s” can be a tail recursive call. Without the use of the “exclamation mark” in the rule, the equivalent code is:

parser [: a = b; s :] ->
  try d s with [ Stream.Failure -> raise (Stream.Error "") ]

which is not tail recursive (due to the “try..with” construction pushes a context), preventing the compiler to optimize its code. This can be important when many recursive calls happen, since it can overflow the OCaml stack.

Position

The optional “pattern” before and after a stream pattern is bound to the current stream count. Indeed, streams internally contain a count of their elements. At the beginning the count is zero. When an element is removed, the count is incremented. The example:

parser [: a = b :] ep -> c

is equivalent to:

parser [: a = b; s :] -> let ep = Stream.count s in c

There is no direct syntax equivalent to the optional pattern at beginning of the stream pattern:

parser bp [: a = b :] -> c

These optional patterns allow disposal of the stream count at the beginning and at the end of the parser case, allowing to compute locations of the rule in the source. In particular, if the stream is a stream of characters, these counts are the source location in number of characters.

Semantic action

In a parser case, after the stream pattern, there is an “arrow” and an expression, called the “semantic action”. If the parser case is matched the parser returns with the evaluated expression whose environment contains all values bound in the stream pattern.

Remarks

Simplicity vs Associativity

This parsing technology has the advantage of simplicity of use and understanding, but it does not treat the associativity of operators. For example, if you write a parser like this (to compute arithmetic expressions):

value rec expr =
  parser
  [ [: e1 = expr; `'+'; e2 = expr :] -> e1 + e2
  | [: `('0'..'9' as c) :] -> Char.code c - Char.code '0' ]

this would loop endlessly, exactly as if you wrote code starting with:

value rec expr e =
  let e1 = expr e in
  ...

One solution is to treat the associativity “by hand”: by reading a sub-expression, then looping with a parser which parses the operator and another sub-expression, and so on.

An alternative solution is to write parsing “combinators”. Indeed, parsers being normal functions, it is possible to make a function which takes a parser as parameter and returning a parser using it. For example, left and right associativity parsing combinators:

value rec left_assoc op elem =
  let rec op_elem x =
    parser
    [ [: t = op; y = elem; r = op_elem (t x y) :] -> r
    | [: :] -> x ]
  in
  parser [: x = elem; r = op_elem x :] -> r
;

value rec right_assoc op elem =
  let rec op_elem x =
    parser
    [ [: t = op; y = elem; r = op_elem y :] -> t x r
    | [: :] -> x ]
  in
  parser [: x = elem; r = op_elem x :] -> r
;

which can be used, e.g. like this:

value expr =
  List.fold_right (fun op elem -> op elem)
    [left_assoc (parser [: `'+' :] -> fun x y -> x +. y);
     left_assoc (parser [: `'*' :] -> fun x y -> x *. y);
     right_assoc (parser [: `'^' :] -> fun x y -> x ** y)]
    (parser [: `('0'..'9' as c) :] -> float (Char.code c - Char.code '0'))
;

and tested, e.g. in the toplevel, like that:

expr (Stream.of_string "2^3^2+1");

The same way, it is possible to parse non-context free grammars, by programming parsers returning other parsers.

A third solution, to resolve the problem of associativity, is to use the grammars of Camlp5, which have the other advantage that they are extensible.

Lexing vs Parsing

In general, while analyzing a language, there are two levels:

  • The level where the input, considered as a stream of characters, is read to make a stream of tokens (for example “words”, if it is a human language, or punctuation). This level is generally called “lexing”.
  • The level where the input is a stream of tokens where grammar rules are parsed. This level is generally called “parsing”.

The “parser” construction described here can be used for both, thanks to the polymorphism of OCaml:

  • The lexing level is a “parser” of streams of characters returning tokens.
  • The parsing level is a “parser” of streams of tokens returning syntax trees.

By comparison, the programs “lex” and “yacc” use two different technologies. With “parser”s, it is possible to use the same one for both.

Lexer syntax vs Parser syntax

For “lexers”, i.e. for the specific case of parsers when the input is a stream of characters, it is possible to use a shorter syntax. See the chapter on lexers. They have another syntax, shorter and adapted for the specific type “char”. But they still are internally parsers of streams with the same semantics.

Purely functional parsers

This system of parsers is imperative: while parsing, the stream advances and the already parsed terminals disappear from the stream structure. This is useful because it is not necessary to return the remaining stream together with the normal result. This is the reason there is this “Stream.Error” exception: when it is raised, it means that some terminals have been consummed from the stream, which are definitively lost, and therefore that are no more possible parser cases to try.

An alternative is to use functional parsers which use a new stream type, lazy but not destructive. Their advantage is that they use a limited backtrack: the case of “if..then..else..” and the shorter “if..then..” work without having to left factorize the parser cases, and there is no need to lookahead. They have no equivalent to the exception “Stream.Error”: when all cases are tested, and have failed, the parsers return the value “None”. The drawback is that, when a parsing error happens, it is not easily possible to know the location of the error in the input, as the initial stream has not been modified: the system would indicate a failure at the first character of the first line: this is a general drawback of backtracking parsers. See the solutions found to this problem in the chapter about purely functional parsers.

A second alternative is to use the backtracking parsers. They use the same stream type as the functional parsers, but they test more cases than them. They have the same advantages and drawbacks than the functional parsers.

Stream lexers

Stream lexers

The file “pa_lexer.cmo” is a Camlp5 syntax extension kit for parsers of streams of the type ‘char’. This syntax is shorter and more readable than its equivalent version written with classical stream parsers. Like classical parsers, they use recursive descendant parsing. They are also pure syntax sugar, and each lexer written with this syntax can be written using normal parsers syntax.

(An old version, named “pa_lex.cmo” was provided before with a different syntax. It is no longer distributed, its proposed syntax being confusing.)

Introduction

Classical parsers in OCaml apply to streams of any type of values. For the specific type “char”, it has been possible to shorten the encoding in several ways, in particular by using strings to group several characters together, and by hidding the management of a “lexing buffer”, a data structure recording the matched characters.

Let us take an example. The following function parses a left bracket followed by a less, a colon or nothing, and record the result in a buffer. In classical parsers syntax, this could be written like this:

fun buf ->
  parser
  [ [: `'['; `'<' :] ->
      Plexing.Lexbuf.add '<' (Plexing.Lexbuf.add '[' buf)
  | [: `'['; `':' :] ->
      Plexing.Lexbuf.add ':' (Plexing.Lexbuf.add '[' buf)
  | [: `'[' :] ->
      Plexing.Lexbuf '[' buf ]

With the new syntax, it is possible to write it as:

lexer [ "[<" | "[:" | "[" ]

The two codes are strictly equivalent, but the lexer version is easier to write and understand, and is much shorter.

Syntax

When loading the syntax extension pa_lexer.cmo, the OCaml syntax is extended as follows:

        expression ::= lexer
             lexer ::= "lexer" "[" rules "]"
             rules ::= ne-rules rule
                     | <nothing>
          ne-rules ::= ne-rules "|" rule
                     | rule
              rule ::= symbols [ "->" action ]
           symbols ::= symbols symbol err
                     | <nothing>
            symbol ::= "_" no-record-opt
                     | CHAR no-record-opt
                     | CHAR "-" CHAR no-record-opt
                     | STRING no-record-opt
                     | simple-expression
                     | "?=" "[" lookaheads "]"
                     | "[" rules "]"
     no-record-opt ::= "/"
                     | <nothing>
 simple-expression ::= LIDENT
                     | "(" <expression> ")"
        lookaheads ::= lookaheads "|" lookahead-sequence
                     | lookahead-sequence
lookahead-sequence ::= lookahead-symbols
                     | STRING
 lookahead-symbols ::= lookahead-symbols lookahead-symbol
                     | lookahead-symbol
  lookahead-symbol ::= CHAR
                     | CHAR "-" CHAR
                     | "_"
               err ::= "?" simple-expression
                     | "!"
                     | <nothing>
            action ::= expression

The identifiers “STRING”, “CHAR” and “LIDENT” above represent the OCaml tokens corresponding to string, character and lowercase identifier (identifier starting with a lowercase character).

Moreover, together with that syntax extension, another extension is added the entry expression, typically for the semantics actions of the “lexer” statement above, but not only. It is:

expression ::= "$" "add" STRING
             | "$" "buf"
             | "$" "empty"
             | "$" "pos"

Remark: the identifiers “add”, “buf”, “empty” and “pos” are not keywords (they are not reserved words) but just identifiers. On the contrary, the identifier “lexer”, which introduces the syntax, is a new keyword and cannot be used as variable identifier any more.

Semantics

A lexer defined in the syntax above is a shortcut version of a parser applied to the specific case of streams of characters. It could be written with a normal parser. The proposed syntax is much shorter, easier to use and to understand, and silently takes care of the lexing buffer for the programmer. The lexing buffers are data structures, which are passed as parameters to called lexers and returned by them.

Our lexers are of the type:

Plexing.Lexbuf.t -> Stream.t char -> u

where “u” is a type which depends on what the lexer returns. If there is no semantic action (since it it optional), this type is automatically “Plexing.Lexbuf.t” also.

A lexer is, actually, a function with two implicit parameters: the first one is the lexing buffer itself, and the second one the stream. When called, it tries to match the stream against its first rule. If it fails, it tries its second rule, and so on, up to its last rule. If the last rule fails, the lexer fails by raising the exception “Stream.Failure”. All of this is the usual behaviour of stream parsers.

In a rule, when a character is matched, it is inserted into the lexing buffer, except if the “no record” feature is used (see further).

Rules which have no semantic action return the lexing buffer itself.

Symbols

The different kinds or symbols in a rule are:

  • The token “underscore”, which represents any character. Fails only if the stream is empty.
  • A character which represents a matching of this character.
  • A character followed by the minus sign and by another character which represent all characters in the range between the two characters in question.
  • A string with represents a matching of all its characters, one after the other.
  • An expression corresponding to a call to another lexer, which takes the buffer as first parameter and returns another lexing buffer with all characters found in the stream added to the initial lexing buffer.
  • The sequence “?=” introducing lookahead characters.
  • A rule, recursively, between brackets, inlining a lexer.

In the cases matching characters (namely underscore, character, characters range and string), the symbol can be optionally followed by the “no record” character “slash” specifying that the found character(s) are not added into the lexing buffer. By default, they are. This feature is useful, for example, writing a lexer which parses strings, when the initial double quote and final double quote should not be part of the string itself.

Moreover, a symbol can be followed by an optional error indicator, which can be:

  • The character ? (question mark) followed by a string expression, telling that, if there is a syntax error at this point (i.e. the symbol is not matched although the beginning of the rule was), the exception Stream.Error is raised with that string as parameter. Without this indicator, it is raised with the empty string. This is the same behaviour than with classical stream parsers.
  • The character ! (exclamation mark), which is just an indicator to let the syntax expander optimize the code. If the programmer is sure that the symbol never fails (i.e. never raises Stream.Failure), in particular if this symbol recognizes the empty rule, he can add this exclamation mark. If it is used correctly (the compiler cannot check it), the behaviour is identical as without the !, except that the code is shorter and faster, and can sometimes be tail recursive. If the indication is not correct, the behaviour of the lexer is undefined.

Specific expressions

When loading this syntax extension, the entry <expression>, at level labelled “simple” of the OCaml language is extended with the following rules:

  • $add followed by a string, specifing that the programmer wants to add all characters of the string in the lexing buffer. It returns the new lexing buffer. It corresponds to an iteration of calls to Plexing.Lexbuf.add with all characters of the string with the current lexing buffer as initial parameter.
  • $buf which returns the lexing buffer converted into string.
  • $empty which returns an empty lexing buffer.
  • $pos which returns the current position of the stream in number of characters (starting at zero).

Lookahead

Lookahead is useful in some cases, when factorization of rules is impossible. To understand how it is useful, a first remark must be done, about the usual behaviour of Camlp5 stream parsers.

Stream parsers (including these lexers) use a limited parsing algorithm, in a way that when the first symbol of a rule is matched, all the following symbols of the same rule must apply, otherwise it is a syntax error. There is no backtrack. In most of the cases, left factorization of rules resolve conflicting problems. For example, in parsers of tokens (which is not our case here, since we parse only characters), when one writes a parser to recognize both typical grammar rules “if..then..else” and the shorter “if..then..”, the system transforms them into a single rule starting with “if..then..” followed by a call to a parser recognizing “else..” or nothing.

Sometimes, however, this left factorization is not possible. A lookahead of the stream to check the presence of some elements (these elements being characters, if we are using this “lexer” syntax) might be necessary to decide if is a good idea to start the rule. This lookahead feature may unfreeze several characters from the input stream but without removing them.

Syntactically, a lookahead starts with ?= and is followed by one or several lookahead sequences separated by the vertical bar |, the whole list being enclosed by braces.

If there are several lookaheads, they must all be of the same size (contain the same number of characters).

If the lookahead sequence is just a string, it corresponds to all characters of this string in the order (which is different for strings outside lookahead sequences, representing a choice of all characters).

Examples of lookaheads:

?= [ _ ''' | '\\' _ ]
?= [ "<<" | "<:" ]

The first line above matches a stream whose second character is a quote or a stream whose first character is a backslash (real example in the lexer of OCaml, in the library of Camlp5, named “plexer.ml”). The second line matches a stream starting with the two characters < and < or starting with the two characters < and : (this is another example in the same file).

Semantic actions of rules

By default, the result of a “lexer” is the current lexing buffer, which is of type “Plexing.Lexbuf.t”. But it is possible to return other values, by adding “->” at end of rules followed by the expression you want to return, as in usual pattern matching in OCaml.

An interesting result, for example, could be the string corresponding to the characters of the lexing buffer. This can be obtained by returning the value “$buf”.

A complete example

A complete example can be seen in the sources of Camlp5, file “lib/plexer.ml”. This is the lexer of OCaml, either “normal” or “revised” syntax.

Compiling

To compile a file containing lexers, just load pa_lexer.cmo using one of the following methods:

  • Either by adding pa_lexer.cmo among the Camlp5 options. See the Camlp5 manual page or documentation.
  • Or by adding #load "pa_lexer.cmo"; anywhere in the file, before the usages of this “lexer” syntax.

How to display the generated code

You can see the generated code, for a file “bar.ml” containing lexers, by typing in a command line:

camlp5r pa_lexer.cmo pr_r.cmo bar.ml

To see the equivalent code with stream parsers, use:

camlp5r pa_lexer.cmo pr_r.cmo pr_rp.cmo bar.ml

Functional parsers

Functional parsers

Purely functional parsers are an alternative of stream parsers where the used stream type is a lazy non-destructive type. These streams are lazy values, as in classical stream parsers, but the values are not removed as long as the parsing advances.

To make them work, the parsers of purely functional streams return, not the simple values, but a value of type option : “None” meaning “no match” (the equivalent of the exception “Parse.Failure” of normal streams) and “Some (r, s)” meaning “the result is r and the remaining stream is s”.

Syntax

The syntax of purely functional parsers, when loading “pa_fstream.cmo”, is the following:

        expression ::= fparser
                     | match-with-fparser
           fparser ::= "fparser" pos-opt "[" parser-cases "]"
                     | "fparser" pos-opt parser-case
match-with-fparser ::= "match" expression "with" fparser
      parser-cases ::= parser-cases parser-case
                     | <nothing>
       parser-case ::= "[:" stream-pattern ":]" pos-opt "->" expression
                     | "[:" ":]" pos-opt "->" expression
    stream-pattern ::= stream-patt-comp
                     | stream-patt-comp ";" stream-pattern
  stream-patt-comp ::= "`" pattern
                     | "`" pattern "when" expression
                     | pattern "=" expression
                     | pattern
                     | "when" expression
           pos-opt ::= pattern
                     | <nothing>

Notice that, unlike classical parsers, there is no difference, in a stream pattern, between the first stream pattern component and the other ones. In particular, there is no “question mark” syntax and expression optionnally ending those components. Moreover, the “lookahead” case is not necessary, we see further why. The syntaxes “pattern when” and “let..in” inside stream patterns we see in classical parsers are not implemented.

Streams

The functional parsers are functions taking as parameters functional streams, which are values of type “Fstream.t   a” for some type “a”. It is possible to build functional streams using the functions defined in the module “Fstream”:

Fstream.from

Fstream.from f” returns a stream built from the function “f”. To create a new stream element, the function “f” is called with the current stream count, starting with zero. The user function “f” must return either “Some <value>” for a value or “None” to specify the end of the stream.

Fstream.of_list

Return a stream built from the list in the same order.

Fstream.of_string

Return a stream of the characters of the string parameter.

Fstream.of_channel

Return a stream of the characters read from the input channel parameter.

Semantics of parsers

Fparser

The purely functional parsers act like classical parsers, with a recursive descent algorithm, except that:

  • If the first stream pattern component matches the beginning of the stream, there is no error if the following stream patterns components do not match: the control simply passes to the next parser case with the initial stream.
  • If the semantic actions are of type “t”, the result of the parser is of type “option (t * Fstream.t)”, not just “t” like in classical parsers. If a stream pattern matches, the semantic action is evaluated, giving some result “e” and the result of the parser is “Some (e, strm)” where “strm” is the remaining stream.
  • If no parser case matches, the result of the parser is “None”.

Error position

A difficulty, with purely functional parsers, is how to find the position of the syntax error, when the input is wrong. Since the system tries all parsers cases before returning “None”, and that the initial stream is not affected, it is not possible to directly find where the error happened. This is a problem for parsing using backtracking (here, it is limited backtracking, but the problem is the same).

The solution is to use the function “Fstream.count_unfrozen” applied to the initial stream. Like its name says, it returns the number of unfrozen elements of the stream, which is exactly the longest match found. If the input is a stream of characters, the return of this function is exactly the position in number of characters from the beginning of the stream.

However, it is not possible to know directly which rule failed and therefore it is not possible, as in classical parsers, to specify and get clear error messages. Future versions of purely functional parsers may propose solutions to resolve this problem.

Notice that, if using the “count_unfrozen” method, it is not possible to reuse that same stream to call another parser, and hope to get the right position of the error, if another error happens, since it may test less terminals than the first parser. Use a fresh stream in this case, if possible.

Backtracking parsers

Backtracking parsers

Backtracking parsers are a second alternative of stream parsers and functional parsers.

Backtracking parsers are close to functional parsers: they use the same stream type, “Fstream.t”, and their syntax is almost identical, its introducing keyword being “bparser” instead of “fparser”.

The difference is that they are implemented with full backtracking and that they return values of the type “option” of the triplet: 1/ value, 2/ remaining stream and 3/ continuation.

Syntax

The syntax of backtracking parsers is added together with the syntax of functional parsers, when the kit “pa_fstream.cmo” is loaded. It is:

        expression ::= bparser
                     | match-with-bparser
           bparser ::= "bparser" pos-opt "[" parser-cases "]"
                     | "bparser" pos-opt parser-case
match-with-bparser ::= "match" expression "with" bparser
      parser-cases ::= parser-cases parser-case
                     | <nothing>
       parser-case ::= "[:" stream-pattern ":]" pos-opt "->" expression
                     | "[:" ":]" pos-opt "->" expression
    stream-pattern ::= stream-patt-comp
                     | stream-patt-comp ";" stream-pattern
  stream-patt-comp ::= "`" pattern
                     | "`" pattern "when" expression
                     | pattern "=" expression
                     | pattern
                     | "when" expression
           pos-opt ::= pattern
                     | <nothing>

Semantics

Algorithm

The backtracking parsers, like classical parsers and functional parsers, use a recursive descent algorithm. But:

  • If a stream pattern component does not match the current position of the input stream, the control is given to the next case of the stream pattern component before it. If it is the first stream pattern component, the rule (the stream pattern) is left and the next rule is tested.

For example, the following grammar:

E -> X Y
X -> a b | a
Y -> b

works, with the backtracking algorithm, for the input “a   b”.

Parsing with the non-terminal “E”, the non-terminal “X” first accepts the input “a b” with its first rule. Then when “Y” is called, the parsing fails since nothing remains in the input stream.

In the rule “X Y” of the non-terminal “E”, the non-terminal “Y” having failed, the control is given the the continuation of the non-terminal “X”. This continuation is its second rule containing only “a”. Then “Y” is called and accepted.

This case does not work with functional parsers since when the rule “a b” of the non-terminal “X” is accepted, it is definitive. If the input starts with “a b”, there is no way to apply its second rule.

Backtracking parsers are strictly more powerful than functional parsers.

Type

A backtracking parser whose stream elements are of type “t1”, and whose semantic actions are of some type “t2”, is of type:

Fstream.t t1 -> option (t * Fstream.t t1 * Fstream.kont t1 t2)

If the backtracking parsers fails, its returning value is “None”.

If it succeeds, its returning value is “Some (x, strm, k)” where “x” is its result, “strm” the remaining stream, and “k” the continuation.

The continuation is internally used in the backtracking algorithm, but is can also be used in the main call to compute the next solution, using the function “Fstream.bcontinue”.

It is also possible to directly get the list of all solutions by calling the function “Fstream.bparse_all”.

Syntax errors

Like for functional parsers, in case of syntax error, the error position can be found by using the function “Fstream.count_unfrozen”, the token in error being the last unfrozen element of the stream.

A syntax error is not really an error: for the backtracking parsers, like for functional parsers, it is viewed as a “non-working” case and another solution is searched.

In the backtracking algorithm, depending on the grammar and the input, the search of the next solution can be very long. A solution is proposed for that in the extensible grammars system when the parsing algorithm is set to “backtracking”.

Example

Here is an example which just shows the backtracking algorithm but without parsing, an empty stream being given as parameter and never referred.

It creates a list of three strings, each of them being choosen between "A", "B" and "C".

The code is:

#load "pa_fstream.cmo";
value choice = bparser [ [: :] -> "A" | [: :] -> "B" | [: :] -> "C" ];
value combine = bparser [: x = choice; y = choice; z = choice :] -> [x; y; z];

The function “combine” returns the first solution:

# combine (fstream [: :]);
- : option (list string * Fstream.t '_a * Fstream.kont '_a (list string)) =
Some (["A"; "A"; "A"], <abstr>, Fstream.K <fun>)

The function “Fstream.bparse_all” returns the list of all solutions, showing the interest of the backtracking:

# Fstream.bparse_all combine (fstream [: :]);
- : list (list string) =
[["A"; "A"; "A"]; ["A"; "A"; "B"]; ["A"; "A"; "C"]; ["A"; "B"; "A"];
 ["A"; "B"; "B"]; ["A"; "B"; "C"]; ["A"; "C"; "A"]; ["A"; "C"; "B"];
 ["A"; "C"; "C"]; ["B"; "A"; "A"]; ["B"; "A"; "B"]; ["B"; "A"; "C"];
 ["B"; "B"; "A"]; ["B"; "B"; "B"]; ["B"; "B"; "C"]; ["B"; "C"; "A"];
 ["B"; "C"; "B"]; ["B"; "C"; "C"]; ["C"; "A"; "A"]; ["C"; "A"; "B"];
 ["C"; "A"; "C"]; ["C"; "B"; "A"]; ["C"; "B"; "B"]; ["C"; "B"; "C"];
 ["C"; "C"; "A"]; ["C"; "C"; "B"]; ["C"; "C"; "C"]]

Extensible grammars

Extensible grammars

This chapter describes the syntax and semantics of the extensible grammars of Camlp5.

The extensible grammars are the most advanced parsing tool of Camlp5. They apply to streams of characters using a lexer which has to be previously defined by the programmer. In Camlp5, the syntax of the OCaml language is defined with extensible grammars, which makes Camlp5 a bootstrapped system (it compiles its own features by itself).

Getting started

The extensible grammars are a system to build grammar entries which can be extended dynamically. A grammar entry is an abstract value internally containing a stream parser. The type of a grammar entry is "Grammar.Entry.e t" where "t" is the type of the values returned by the grammar entry.

To start with extensible grammars, it is necessary to build a grammar, a value of type “Grammar.g”, using the function “Grammar.gcreate”:

value g = Grammar.gcreate lexer;

where “lexer” is a lexer previously defined. See the section explaining the interface with lexers. In a first time, it is possible to use a lexer of the module “Plexer” provided by Camlp5:

value g = Grammar.gcreate (Plexer.gmake ());

Each grammar entry is associated with a grammar. Only grammar entries of the same grammar can call each other. To create a grammar entry, one has to use the function “Grammar.Entry.create” with takes the grammar as first parameter and a name as second parameter. This name is used in case of syntax errors. For example:

value exp = Grammar.Entry.create g "expression";

To apply a grammar entry, the function “Grammar.Entry.parse” can be used. Its first parameter is the grammar entry, the second one a stream of characters:

Grammar.Entry.parse exp (Stream.of_string "hello");

But if you experiment this, since the entry was just created without any rules, you receive an error message:

Stream.Error "entry [expression] is empty"

To add grammar rules to the grammar entry, it is necessary to extend it, using a specific syntactic statement: “EXTEND”.

Syntax of the EXTEND statement

The “EXTEND” statement is added in the expressions of the OCaml language when the syntax extension kit “pa_extend.cmo” is loaded. Its syntax is:

  expression ::= extend
      extend ::= "EXTEND" extend-body "END"
 extend-body ::= global-opt entries
  global-opt ::= "GLOBAL" ":" entry-names ";"
               | <nothing>
 entry-names ::= entry-name entry-names
               | entry-name
       entry ::= entry-name ":" position-opt "[" levels "]"
position-opt ::= "FIRST"
               | "LAST"
               | "BEFORE" label
               | "AFTER" label
               | "LIKE" string
               | "LEVEL" label
               | <nothing>
      levels ::= level "|" levels
               | level
       level ::= label-opt assoc-opt "[" rules "]"
   label-opt ::= label
               | <nothing>
   assoc-opt ::= "LEFTA"
               | "RIGHTA"
               | "NONA"
               | <nothing>
       rules ::= rule "|" rules
               | rule
        rule ::= psymbols-opt "->" expression
               | psymbols-opt
psymbols-opt ::= psymbols
               | <nothing>
    psymbols ::= psymbol ";" psymbols
               | psymbol
     psymbol ::= symbol
               | pattern "=" symbol
      symbol ::= keyword
               | token
               | token string
               | entry-name
               | entry-name "LEVEL" label
               | "SELF"
               | "NEXT"
               | "LIST0" symbol
               | "LIST0" symbol "SEP" symbol opt-opt-sep
               | "LIST1" symbol
               | "LIST1" symbol "SEP" symbol opt-opt-sep
               | "OPT" symbol
               | "FLAG" symbol
               | "V" symbol opt-strings
               | "[" rules "]"
               | "(" symbol ")"
 opt-opt-sep ::= "OPT_SEP"
               | <nothing>
 opt-strings ::= string opt-strings
               | <nothing>
     keyword ::= string
       token ::= uident
       label ::= string
  entry-name ::= qualid
      qualid ::= qualid "." qualid
               | uident
               | lident
      uident ::= 'A'-'Z' ident
      lident ::= ('a'-'z' | '_' | misc-letter) ident
       ident ::= ident-char*
  ident-char ::= ('a'-'a' | 'A'-'Z' | '0'-'9' | '_' | ''' | misc-letter)
 misc-letter ::= '\128'-'\255'

Other statements, “GEXTEND”, “DELETE_RULE”, “GDELETE_RULE” are also defined by the same syntax extension kit. See further.

In the description above, only “EXTEND” and “END” are new keywords (reserved words which cannot be used in variables, constructors or module names). The other strings (e.g. “GLOBAL”, “LEVEL”, “LIST0”, “LEFTA”, etc.) are not reserved.

Semantics of the EXTEND statement

The EXTEND statement starts with the “EXTEND” keyword and ends with the “END” keyword.

GLOBAL indicator

After the first keyword, it is possible to see the identifier “GLOBAL” followed by a colon, a list of entries names and a semicolon. It says that these entries correspond to visible (previously defined) entry variables, in the context of the EXTEND statement, the other ones being locally and silently defined inside.

  • If an entry, which is extended in the EXTEND statement, is in the GLOBAL list, but is not defined in the context of the EXTEND statement, the OCaml compiler will fail with the error “unbound value”.
  • If there is no GLOBAL indicator, and an entry, which is extended in the EXTEND statement, is not defined in the contex of the EXTEND statement, the OCaml compiler will also fail with the error “unbound value”.

Example:

value exp = Grammar.Entry.create g "exp";
EXTEND
  GLOBAL: exp;
  exp: [ [ x = foo; y = bar ] ];
  foo: [ [ "foo" ] ];
  bar: [ [ "bar" ] ];
END;

The entry “exp” is an existing variable (defined by value exp = …). On the other hand, the entries “foo” and “bar” have not been defined. Because of the GLOBAL indicator, the system define them locally.

Without the GLOBAL indicator, the three entries would have been considered as global variables, therefore the OCaml compiler would say “unbound variable” under the first undefined entry, “foo”.

Entries list

Then the list of entries extensions follow. An entry extension starts with the entry name followed by a colon. An entry may have several levels corresponding to several stream parsers which call the ones the others (see further).

Optional position

After the colon, it is possible to specify a where to insert the defined levels:

  • The identifier “FIRST” (resp. “LAST”) indicates that the level must be inserted before (resp. after) all possibly existing levels of the entry. They become their first (resp. last) levels.
  • The identifier “BEFORE” (resp. “AFTER”) followed by a level label (a string) indicates that the levels must be inserted before (resp. after) that level, if it exists. If it does not exist, the extend statement fails at run time.
  • The identifier “LIKE” followed by a string indicates that the first level defined in the extend statement must be inserted in the first already existing level with a rule containing this string as keyword or token name. For example, “LIKE "match"” is the first level having “match” as keyword. If there is no level with this string, the extend statement fails at run time.
  • The identifier “LEVEL” followed by a level label indicates that the first level defined in the extend statement must be inserted at the given level, extending and modifying it. The other levels defined in the statement are inserted after this level, and before the possible levels following this level. If there is no level with this label, the extend statement fails at run time.
  • By default, if the entry has no level, the levels defined in the statement are inserted in the entry. Otherwise the first defined level is inserted at the first level of the entry, extending or modifying it. The other levels are inserted afterwards (before the possible second level which may previously exist in the entry).

Levels

After the optional “position”, the level list follow. The levels are separated by vertical bars, the whole list being between brackets.

A level starts with an optional label, which corresponds to its name. This label is useful to specify this level in case of future extensions, using the position (see previous section) or for possible direct calls to this specific level.

The level continues with an optional associativity indicator, which can be:

  • LEFTA for left associativity (default),
  • RIGHTA for right associativity,
  • NONA for no associativity.

Rules

At last, the grammar rule list appear. The rules are separated by vertical bars, the whole list being brackets.

A rule looks like a match case in the “match” statement or a parser case in the “parser” statement: a list of psymbols (see next paragraph) separated by semicolons, followed by a right arrow and an expression, the semantic action. Actually, the right arrow and expression are optional: in this case, it is equivalent to an expression which would be the unit “()” constructor.

A psymbol is either a pattern, followed with the equal sign and a symbol, or by a symbol alone. It corresponds to a test of this symbol, whose value is bound to the pattern if any.

Symbols

A symbol is an item in a grammar rule. It is either:

  • a keyword (a string): the input must match this keyword,
  • a token name (an identifier starting with an uppercase character), optionally followed by a string: the input must match this token (any value if no string, or that string if a string follows the token name), the list of the available tokens depending on the associated lexer (the list of tokens available with “Plexer.gmake ()” is: LIDENT, UIDENT, TILDEIDENT, TILDEIDENTCOLON, QUESTIONIDENT, INT, INT_l, INT_L, INT_n, FLOAT, CHAR, STRING, QUOTATION, ANTIQUOT and EOI; other lexers may propose other lists of tokens),
  • an entry name, which correspond to a call to this entry,
  • an entry name followed by the identifier “LEVEL” and a level label, which correspond to the call to this entry at that level,
  • the identifier “SELF” which is a recursive call to the present entry, according to the associativity (i.e. it may be a call at the current level, to the next level, or to the top level of the entry): “SELF” is equivalent to the name of the entry itself,
  • the identifier “NEXT”, which is a call to the next level of the current entry,
  • a left brace, followed by a list of rules separated by vertical bars, and a right brace: equivalent to a call to an entry, with these rules, inlined,
  • a meta symbol (see further),
  • a symbol between parentheses.

The syntactic analysis follow the list of symbols. If it fails, depending on the first items of the rule (see the section about the kind of grammars recognized):

  • the parsing may fail by raising the exception “Stream.Error
  • the parsing may continue with the next rule.

Meta symbols

Extra symbols exist, allowing to manipulate lists or optional symbols. They are:

  • LIST0 followed by a symbol: this is a list of this symbol, possibly empty,
  • LIST0 followed by a symbol, SEP and another symbol, and optional OPT_SEP: this is a list, possibly empty, of the first symbol separated by the second one, possibly ended with the separator if OPT_SEP is present,
  • LIST1 followed by a symbol: this is a list of this symbol, with at least one element,
  • LIST1 followed by a symbol, SEP and another symbol, and optional OPT_SEP: this is a list, with at least one element, of the first symbol separated by the second one, possibly ended with the separator if OPT_SEP is present,
  • OPT followed by a symbol: equivalent to “this symbol or nothing” returning a value of type “option”.
  • FLAG followed by a symbol: equivalent to “this symbol or nothing”, returning a boolean.

The V meta symbol

The V meta symbol is destinated to allow antiquotations while using the syntax tree quotation kit `q_ast.cmo <q_ast.html>`__. It works only in strict mode. In transitional mode, it is just equivalent to its symbol parameter.

Antiquotation kind

The antiquotation kind is the optional identifier between the starting “$” (dollar) and the “:” (colon) in a quotation of syntax tree (see the chapter syntax tree).

The optional list of strings following the “V” meta symbol and its symbol parameter gives the allowed antiquotations kinds.

By default, this string list, i.e. the available antiquotation kinds, is:

  • ["flag"] for FLAG
  • ["list"] for LIST0 and LIST1
  • ["opt"] for OPT

For example, the symbol:

V (FLAG "rec")

is like “FLAG” while normally parsing, allowing to parse the keyword “rec”. While using it in quotations, also allows the parse the keyword “rec” but, moreover, the antiquotation “$flag:..$” where “..” is an expression or a pattern depending on the position of the quotation.

There are also default antiquotations kinds for the tokens used in the OCaml language predefined parsers “pa_r.cmo” (revised syntax) and “pa_o.cmo” (normal syntax), actually all parsers using the provided lexer “Plexer” (see the chapter Library). They are:

  • ["chr"] for CHAR
  • ["flo"] for FLOAT
  • ["int"] for INT
  • ["int32"] for INT_l
  • ["int64"] for INT_L
  • ["nativeint"] for INT_n
  • ["lid"] for LIDENT
  • ["str"] for STRING
  • ["uid"] for UIDENT

It is also possible to use the “V” meta symbol over non-terminals (grammars entries), but there is no default antiquotation kind. For example, while parsing a quotation, the symbol:

V foo "bar" "oops"

corresponds to either a call to the grammar entry “foo”, or to the antiquotations “$bar:...$” or “$oops:...$”.

Type

The type of the value returned by a V meta symbol is:

  • in transitional mode, the type of its symbol parameter,
  • in strict mode, “Ploc.vala t”, where “t” is its symbol parameter.

In strict mode, if the symbol parameter is found, whose value is, say, “x”, the result is “Ploc.VaVal x”. If an antiquotation is found the result is “Ploc.VaAnt s” where “s” is some string containing the antiquotation text and some other internal information.

Rules insertion

Remember that “EXTEND” is a statement, not a declaration: the rules are added in the entries at run time. Each rule is internally inserted in a tree, allowing the left factorization of the rule. For example, with this list of rules (borrowed from the Camlp5 sources):

"method"; "private"; "virtual"; l = label; ":"; t = poly_type
"method"; "virtual"; "private"; l = label; ":"; t = poly_type
"method"; "virtual"; l = label; ":"; t = poly_type
"method"; "private"; l = label; ":"; t = poly_type; "="; e = expr
"method"; "private"; l = label; sb = fun_binding
"method"; l = label; ":"; t = poly_type; "="; e = expr
"method"; l = label; sb = fun_binding

the rules are inserted in a tree and the result looks like:

"method"
   |-- "private"
   |       |-- "virtual"
   |       |       |-- label
   |       |             |-- ":"
   |       |                  |-- poly_type
   |       |-- label
   |             |-- ":"
   |             |    |-- poly_type
   |             |            |-- ":="
   |             |                 |-- expr
   |             |-- fun_binding
   |-- "virtual"
   |       |-- "private"
   |       |       |-- label
   |       |             |-- ":"
   |       |                  |-- poly_type
   |       |-- label
   |             |-- ":"
   |                  |-- poly_type
   |-- label
         |-- ":"
         |    |-- poly_type
         |            |-- "="
         |                 |-- expr
         |-- fun_binding

This tree is built as long as rules are inserted. When used, by applying the function “Grammar.Entry.parse” to the current entry, the input is matched with that tree, starting from the tree root, descending on it as long as the parsing advances.

There is a different tree by entry level.

Semantic action

The semantic action, i.e. the expression following the right arrow in rules, contains in its environment:

  • the variables bound by the patterns of the symbols found in the rules,
  • the specific variable “loc” which contain the location of the whole rule in the source.

The location is an abstract type defined in the module “Ploc” of Camlp5.

It is possible to change the name of this variable by using the option “-loc” of Camlp5. For example, compiling a file like this:

camlp5r -loc foobar file.ml

the variable name, for the location will be “foobar” instead of “loc”.

The DELETE_RULE statement

The “DELETE_RULE” statement is also added in the expressions of the OCaml language when the syntax extension kit “pa_extend.cmo” is loaded. Its syntax is:

      expression ::= delete-rule
     delete-rule ::= "DELETE_RULE" delete-rule-body "END"
delete-rule-body ::= entry-name ":" symbols
         symbols ::= symbol symbols
                   | symbol

See the syntax of the EXTEND statement for the meaning of the syntax entries not defined above.

The entry is scanned for a rule matching the giving symbol list. When found, the rule is removed. If no rule is found, the exception “Not_found” is raised.

Extensions FOLD0 and FOLD1

When loading “pa_extfold.cmo” after “pa_extend.cmo”, the entry “symbol” of the EXTEND statement is extended with what is named the fold iterators, like this:

     symbol ::= "FOLD0" simple_expr simple_expr symbol
              | "FOLD1" simple_expr simple_expr symbol
              | "FOLD0" simple_expr simple_expr symbol "SEP" symbol
              | "FOLD1" simple_expr simple_expr symbol "SEP" symbol
simple_expr ::= expr (level "simple")

Like their equivalent with the lists iterators: “LIST0”, “LIST1”, “LIST0SEP”, “LIST1SEP”, they read a sequence of symbols, possibly with the separators, but instead of building the list of these symbols, apply a fold function to each symbol, starting at the second “expr” (which must be a expression node) and continuing with the first “expr” (which must be a function taking two expressions and returing a new expression).

The list iterators can be seen almost as a specific case of these fold iterators where the initial “expr” would be:

<:expr< [] >>

and the fold function would be:

fun e1 e2 -> <:expr< [$e1$ :: $e2$ ] >>

except that, implemented like that, they would return the list in reverse order.

Actually, a program using them can be written with the lists iterators with the semantic action applying the function “List.fold_left” to the returned list, except that with the fold iterators, this operation is done as long as the symbols are read on the input, no intermediate list being built.

Example, file “sum.ml”:

#load "pa_extend.cmo";
#load "pa_extfold.cmo";
#load "q_MLast.cmo";
let loc = Ploc.dummy in
EXTEND
  Pcaml.expr:
    [ [ "sum";
        e =
          FOLD0 (fun e1 e2 -> <:expr< $e2$ + $e1$ >>) <:expr< 0 >>
            Pcaml.expr SEP ";";
        "end" -> e ] ]
  ;
END;

which can be compiled like this:

ocamlc -pp camlp5r -I +camlp5 -c sum.ml

and tested:

  ocaml -I +camlp5 camlp5r.cma sum.cmo
          Objective Caml version ...

          Camlp5 Parsing version ...

  # sum 3;4;5 end;
- : int = 12

Grammar machinery

We explain here the detail of the mechanism of the parsing of an entry.

Start and Continue

At each entry level, the rules are separated into two trees:

  • The tree of the rules not starting with the current entry name nor by “SELF”.
  • The tree of the rules starting with the current entry name or by the identifier “SELF”, this symbol not being included in the tree.

They determine two functions:

  • The function named “start”, analyzing the first tree.
  • The function named “continue”, taking, as parameter, a value previously parsed, and analyzing the second tree.

A call to an entry, using “Grammar.Entry.parse” correspond to a call to the “start” function of the first level of the entry.

The “start” function tries its associated tree. If it works, it calls the “continue” function of the same level, giving the result of “start” as parameter. If this “continue” function fails, this parameter is simply returned. If the “start” function fails, the “start” function of the next level is tested. If there is no more levels, the parsing fails.

The “continue” function first tries the “continue” function of the next level. If it fails, or if it is the last level, it tries its associated tree, then calls itself again, giving the result as parameter. If its associated tree fails, it returns its extra parameter.

Associativity

While testing the tree, there is a special case for rules ending with SELF or with the current entry name. For this last symbol, there is a call to the “start” function: of the current level if the level is right associative, or of the next level otherwise.

There is no behaviour difference between left and non associative, because, in case of syntax error, the system attempts to recover the error by applying the “continue” function of the previous symbol (if this symbol is a call to an entry).

When a SELF or the current entry name is encountered in the middle of the rule (i.e. if it is neither the first nor the last symbol), there is a call to the “start” function of the first level of the current entry.

Example. Let us consider the following grammar:

EXTEND
  expr:
    [ "minus" LEFTA
      [ x = SELF; "-"; y = SELF -> x -. y ]
    | "power" RIGHTA
      [ x = SELF; "**"; y = SELF -> x ** y ]
    | "simple"
      [ "("; x = SELF; ")" -> x
      | x = INT -> float_of_int x ] ]
  ;
END

The left “SELF”s of the two levels “minus” and “power” correspond to a call to the next level. In the level “minus”, the right “SELF” also, and the left associativity is treated by the fact that the “continue” function is called (starting with the keyword “-” since the left “SELF” is not part of the tree). On the other hand, for the level “power”, the right “SELF” corresponds to a call to the current level, i.e. the level “power” again. At end, the “SELF” between parentheses of the level “simple” correspond to a call to the first level, namely “minus” in this grammar.

Parsing algorithm

By default, the kind of grammar is predictive parsing grammar, i.e. recursive descent parsing without backtrack. But with some nuances, due to the improvements (error recovery and token starting rules) indicated in the next sections.

However, it is possible to change the parsing algorithm, by calling the function “Grammar.set_algorithm”. The possible values are:

Grammar.Predictive
internally using normal parsers, with a predictive (recursive descent without backtracking) algorithm.
Grammar.Functional
internally using functional parsers, with a limited backtracking algorithm,
Grammar.Backtracking
internally using backtracking parsers, with a full backtracking algorithm,
Grammar.DefaultAlgorithm
the parsing algorithm is determined by the environment variable “CAMLP5PARAM”. If this environment variable exists and contains “f”, the parsing algorithm is “functional”; if it it “b”, the parsing algorithm is “backtracking”. Otherwise it is “predictive”.

An interesting function, when using then backtracking algorithm, is “Grammar.Entry.parse_all” which returns all solutions of a given input.

See details in the chapter Library, section “Grammar module”.

Errors and recovery

In extensible grammars, the exceptions are encapsulated with the exception “Ploc.Exc” giving the location of the error together with the exception itself.

If the parsing algorithm is “Grammar.Predictive”, the system internally uses stream parsers. Two exceptions may happen: “Stream.Failure” or “Stream.Error”. “Stream.Failure” indicates that the parsing just could not start. “Stream.Error” indicates that the parsing started but failed further.

With this algorithm, when the first symbol of a rule has been accepted, all the symbols of the same rule must be accepted, otherwise the exception “Stream.Error” is raised.

If the parsing algorithm is “Grammar.Functional” (resp. “Grammar.Backtracking”), the system internally uses functional parsers (resp backtracking parsers. If no solution is found, the exception “Stream.Error” is raised and the location of the error is the location of the last unfrozen token, i.e. where the stream advanced the farthest.

In extensible grammars, unlike stream parsers, before the “Stream.Error” exception, the system attempts to recover the error by the following trick: if the previous symbol of the rule was a call to another entry, the system calls the “continue” function of that entry, which may resolve the problem.

Tokens starting rules

Another improvement (other than error recovery) is that when a rule starts with several tokens and/or keywords, all these tokens and keywords are tested in one time, and the possible “Stream.Error” may happen, only from the symbol following them on, if any.

The Grammar module

See its section in the chapter “Library”.

Interface with the lexer

To create a grammar, the function “Grammar.gcreate” must be called, with a lexer as parameter.

A simple solution, as possible lexer, is the predefined lexer built by “Plexer.gmake ()”, lexer used for the OCaml grammar of Camlp5. In this case, you can just put it as parameter of “Grammar.gcreate” and it is not necessary to read this section.

The section first introduces the notion of “token patterns” which are the way the tokens and keywords symbols in the EXTEND statement are represented. Then follow the description of the type of the parameter of “Grammar.gcreate”.

Token patterns

A token pattern is a value of the type defined like this:

type pattern = (string * string);

This type represents values of the token and keywords symbols in the grammar rules.

For a token symbol in the grammar rules, the first string is the token constructor name (starting with an uppercase character), the second string indicates whether the match is “any” (the empty string) or some specific value of the token (an non-empty string).

For a keyword symbol, the first string is empty and the second string is the keyword itself.

For example, given this grammar rule:

"for"; i = LIDENT; "="; e1 = SELF; "to"; e2 = SELF

the different symbols and keywords are represented by the following couples of strings:

  • the keyword “for” is represented by ("", "for"),
  • the keyword “=” by ("", "="),
  • the keyword “to” by ("", "to")),
  • and the token symbol LIDENT by ("LIDENT", "").

The symbol UIDENT "Foo" in a rule would be represented by the token pattern:

("UIDENT", "Foo")

Notice that the symbol “SELF” is a specific symbol of the EXTEND syntax: it does not correspond to a token pattern and is represented differently. A token constructor name must not belong to the specific symbols: SELF, NEXT, LIST0, LIST1, OPT and FLAG.

The lexer record

The type of the parameter of the function “Grammar.gcreate” is “lexer”, defined in the module “Plexing”. It is a record type with the following fields:

tok_func

It is the lexer itself. Its type is:

Stream.t char -> (Stream.t (string * string) * location_function);

The lexer takes a character stream as parameter and return a couple of containing: a token stream (the tokens being represented by a couple of strings), and a location function.

The location function is a function taking, as parameter, a integer corresponding to a token number in the stream (starting from zero), and returning the location of this token in the source. This is important to get good locations in the semantic actions of the grammar rules.

Notice that, despite the lexer taking a character stream as parameter, it is not mandatory to use the stream parsers technology to write the lexer. What is important is that it does the job.

tok_using

Is a function of type:

pattern -> unit

The parameter of this function is the representation of a token symbol or a keyword symbol in grammar rules. See the section about token patterns.

This function is called for each token symbol and each keyword encountered in the grammar rules of the EXTEND statement. Its goal is to allow the lexer to check that the tokens and keywords do respect the lexer rules. It checks that the tokens exist and are not mispelled. It can be also used to enter the keywords in the lexer keyword tables.

Setting it as the function that does nothing is possible, but the check of correctness of tokens is not done.

In case or error, the function must raise the exception “Plexing.Error” with an error message as parameter.

tok_removing

Is a function of type:

pattern -> unit

It is possibly called by the DELETE_RULE statement for tokens and keywords no longer used in the grammar. The grammar system maintains a number of usages of all tokens and keywords and calls this function only when this number reaches zero. This can be interesting for keywords: the lexer can remove them from its tables.

tok_match

Is a function of type:

pattern -> ((string * string) -> unit)

The function tells how a token of the input stream is matched against a token pattern. Both are represented by a couple of strings.

This function takes a token pattern as parameter and return a function matching a token, returning the matched string or raising the exception “Stream.Failure” if the token does not match.

Notice that, for efficiency, it is necessary to write this function as a match of token patterns returning, for each case, the function which matches the token, not a function matching the token pattern and the token together and returning a string for each case.

An acceptable function is provided in the module “Plexing” and is named “default_match”. Its code looks like this:

value default_match =
  fun
  [ (p_con, "") ->
      fun (con, prm) -> if con = p_con then prm else raise Stream.Failure
  | (p_con, p_prm) ->
      fun (con, prm) ->
        if con = p_con && prm = p_prm then prm else raise Stream.Failure ]
;

tok_text

Is a function of type:

pattern -> string

Designed for error messages, it takes a token pattern as parameter and returns the string giving its name.

It is possible to use the predefined function “lexer_text” of the Plexing module. This function just returns the name of the token pattern constructor and its parameter if any.

For example, with this default function, the token symbol IDENT would be written as IDENT in error message (e.g. “IDENT expected”). The “text” function may decide to print it differently, e.g., as “identifier”.

tok_comm

Is a mutable field of type:

option (list location)

It asks the lexer (the lexer function should do it) to record the locations of the comments in the program. Setting this field to “None” indicates that the lexer must not record them. Setting it to “Some []” indicated that the lexer must put the comments location list in the field, which is mutable.

Minimalist version

If a lexer have been written, named “lexer”, here is the minimalist version of the value suitable as parameter to “Grammar.gcreate”:

{Plexing.tok_func = lexer;
 Plexing.tok_using _ = (); Plexing.tok_removing _ = ();
 Plexing.tok_match = Plexing.default_match;
 Plexing.tok_text = Plexing.lexer_text;
 Plexing.tok_comm = None}

Functorial interface

The normal interface for grammars described in the previous sections has two drawbacks:

  • First, the type of tokens of the lexers must be “(string *       string)
  • Second, since the entry type has no parameter to specify the grammar it is bound to, there is no static check that entries are compatible, i.e. belong to the same grammar. The check is done at run time.

The functorial interface resolve these two problems. The functor takes a module as parameter where the token type has to be defined, together with the lexer returning streams of tokens of this type. The resulting module define entries compatible the ones to the other, and this is controlled by the OCaml type checker.

The syntax extension must be done with the statement GEXTEND, instead of EXTEND, and deletion by GDELETE_RULE instead of DELETE_RULE.

The lexer type

In the section about the interface with the lexer, we presented the “Plexing.lexer” type as a record without type parameter. Actually, this type is defined as:

type lexer 'te =
  { tok_func : lexer_func 'te;
    tok_using : pattern -> unit;
    tok_removing : pattern -> unit;
    tok_match : pattern -> 'te -> string;
    tok_text : pattern -> string;
    tok_comm : mutable option (list location) }
;

where the type parameter is the type of the token, which can be any type, different from “(string * string)”, providing the lexer function (tok_func) returns a stream of this token type and the match function (tok_match) indicates how to match values of this token type against the token patterns (which remain defined as “(string * string)”).

Here is an example of an user token type and the associated match function:

type mytoken =
  [ Ident of string
  | Int of int
  | Comma | Equal
  | Keyw of string  ]
;

value mymatch =
  fun
  [ ("IDENT", "") ->
      fun [ Ident s -> s | _ -> raise Stream.Failure ]
  | ("INT", "") ->
      fun [ Int i -> string_of_int i | _ -> raise Stream.Failure ]
  | ("", ",") ->
      fun [ Comma -> "" | _ -> raise Stream.Failure ]
  | ("", "=") ->
      fun [ Equal -> "" | _ -> raise Stream.Failure ]
  | ("", s) ->
      fun
      [ Keyw k -> if k = s then "" else raise Stream.Failure
      | _ -> raise Stream.Failure ]
  | _ -> raise (Plexing.Error "bad token in match function") ]
;

The functor parameter

The type of the functor parameter is defined as:

module type GLexerType =
  sig
    type te = 'x;
    value lexer : Plexing.lexer te;
  end;

The token type must be specified (type “te”) and the lexer also, with the interface for lexers, of the lexer type defined above, the record fields being described in the section “interface with the lexer”, but with a general token type.

The resulting grammar module

Once a module of type “GLexerType” has been built (previous section), it is possible to create a grammar module by applying the functor “Grammar.GMake”. For example:

module MyGram = Grammar.GMake MyLexer;

Notice that the function “Entry.parse” of this resulting module does not take a character stream as parameter, but a value of type “parsable”. This function is equivalent to the function “parse_parsable” of the non functorial interface. In short, the parsing of some character stream “cs” by some entry “e” of the example grammar above, must be done by:

MyGram.Entry.parse e (MyGram.parsable cs)

instead of:

MyGram.Entry.parse e cs

GEXTEND and GDELETE_RULE

The “GEXTEND” and “GDELETE_RULE” statements are also added in the expressions of the OCaml language when the syntax extension kit “pa_extend.cmo” is loaded. They must be used for grammars defined with the functorial interface. Their syntax is:

         expression ::= gextend
                      | gdelete-rule
       gdelete-rule ::= "GDELETE_RULE" gdelete-rule-body "END"
            gextend ::= "GEXTEND" gextend-body "END"
       gextend-body ::= grammar-module-name extend-body
  gdelete-rule-body ::= grammar-module-name delete-rule-body
grammar-module-name ::= qualid

See the syntax of the EXTEND statement for the meaning of the syntax entries not defined above.

An example: arithmetic calculator

Here is a small calculator of expressions. They are given as parameters of the command.

File “calc.ml”:

#load "pa_extend.cmo";

value g = Grammar.gcreate (Plexer.gmake ());
value e = Grammar.Entry.create g "expression";

EXTEND
  e:
    [ [ x = e; "+"; y = e -> x + y
      | x = e; "-"; y = e -> x - y ]
    | [ x = e; "*"; y = e -> x * y
      | x = e; "/"; y = e -> x / y ]
    | [ x = INT -> int_of_string x
      | "("; x = e; ")" -> x ] ]
  ;
END;

open Printf;

for i = 1 to Array.length Sys.argv - 1 do {
  let r = Grammar.Entry.parse e (Stream.of_string Sys.argv.(i)) in
  printf "%s = %d\n" Sys.argv.(i) r;
  flush stdout;
};

The link needs the library “gramlib.cma” provided with Camlp5:

ocamlc -pp camlp5r -I +camlp5 gramlib.cma test/calc.ml -o calc

Examples:

$ ./calc '239*4649'
239*4649 = 1111111
$ ./calc '(47+2)/3'
(47+2)/3 = 16

Printing tools

Extensible printers

Extensible printers

This chapter describes the syntax and semantics of the extensible printers of Camlp5.

Similar to the extensible grammars, the extensible printers allow to define and extend printers of data or programs. A specific statement “EXTEND_PRINTER” allow to define these extensions.

Getting started

A printer is a value of type “Eprinter.t a” where “a” is the type of the item to be printed. When applied, a printer returns a string, representing the printed item.

To create a printer, one must use the function “Eprinter.make” with, as parameter, the name of the printer, (used in error messages). A printer is created empty, i.e. it fails if it is applied.

As with grammar entries, printers may have several levels. When the function “Eprinter.apply” is applied to a printer, the first level is called. The function “Eprinter.apply_level” allows to call a printer at some specific level possibly different from the first one. When a level does not match any value of the printed item, the next level is tested. If there is no more levels, the printer fails.

In semantic actions of printers, functions are provided to recursively call the current level and the next level. Moreover, a printing context variable is also given, giving the current indentation, what should be printed before in the same line and what should be printed after in the same line (it is not mandatory to use them).

The extension of printers can be done with the “EXTEND_PRINTER” statement added by the parsing kitpa_extprint.cmo”.

Syntax of the EXTEND_PRINTER statement

      expression ::= extend-statement
extend-statement ::= "EXTEND_PRINTER" extend-body "END"
     extend-body ::= extend-printers
 extend-printers ::= extend-printer extend-printers
                   | <nothing>
  extend-printer ::= printer-name ":" position-opt "[" levels "]"
    position-opt ::= "FIRST"
                   | "LAST"
                   | "BEFORE" label
                   | "AFTER" label
                   | "LEVEL" label
                   | <nothing>
          levels ::= level "|" levels
                   | level
           level ::= label-opt "[" rules "]"
       label-opt ::= label
                   | <nothing>
           rules ::= rule "|" rules
                   | rule
            rule ::= pattern "->" expression
                   | pattern "when" expression "->" expression
    printer-name ::= qualid
          qualid ::= qualid "." qualid
                   | uident
                   | lident
          uident ::= 'A'-'Z' ident
          lident ::= ('a'-'z' | '_' | misc-byte) ident
           ident ::= ident-char*
      ident-char ::= ('a'-'a' | 'A'-'Z' | '0'-'9' | '_' | ''' | misc-byte)
       misc-byte ::= '\128'-'\255'

Semantics of EXTEND_PRINTER

Printers definition list

All printers are extended according to their corresponding definitions which start with an optional “position” and follow with the “levels” definition.

Optional position

After the colon, it is possible to specify where to insert the defined levels:

  • The identifier “FIRST” (resp. “LAST”) indicates that the level must be inserted before (resp. after) all possibly existing levels of the entry. They become their first (resp. last) levels.
  • The identifier “BEFORE” (resp. “AFTER”) followed by a level label (a string) indicates that the levels must be inserted before (resp. after) that level, if it exists. If it does not exist, the extend statement fails at run time.
  • The identifier “LEVEL” followed by a label indicates that the first level defined in the extend statement must be inserted at the given level, extending and modifying it. The other levels defined in the statement are inserted after this level, and before the possible levels following this level. If there is no level with this label, the extend statement fails at run time.
  • By default, if the entry has no level, the levels defined in the statement are inserted in the entry. Otherwise the first defined level is inserted at the first level of the entry, extending or modifying it. The other levels are inserted afterwards (before the possible second level which may previously exist in the entry).

Levels

After the optional “position”, the level list follow. The levels are separated by vertical bars, the whole list being between brackets.

A level starts with an optional label, which corresponds to its name. This label is useful to specify this level in case of future extensions, using the position (see previous section) or for possible direct calls to this specific level.

Rules

A level is a list of rules separated by vertical bars, the whole list being between brackets.

A rule is an usual pattern association (in a function or in the “match” statement), i.e. a pattern, an arrow and an expression. The expression is the semantic action which must be of type “string”.

Rules insertion

The rules are sorted by their patterns, according to the rules of the extensible functions.

Semantic action

The semantic action, i.e. the expression following the right arrow in rules, contains in its environment the variables bound by the pattern and three more variables:

  • The variable “curr” which is a function which can be called to recursively invoke the printer at the current level,
  • The variable “next” which is a function which can be called to invoke the printer at the next level,
  • The variable “pc” which contains the printing context of type “Pprintf.pr_context” (see chapter Pprintf).

The variables “curr” and “next” are of type:

pr_context -> t -> string

where “t” is the type of the printer (i.e. the type of its patterns).

The variable “curr”, “next” and “pc” have predefined names and can hide the possible identifiers having the same names in the pattern or in the environment of the “EXTEND_PRINTER” statement.

The Eprinter module

See its section in the chapter “Library”.

Examples

Parser and Printer of expressions

This example illustrates the symmetry between parsers and printers. A simple type of expressions is defined. A parser converts a string to a value of this type, and a printer converts a value of this type to a string.

In the printer, there is no use of the “pc” parameter and no use of the “Pretty” module. The strings are printed on a single line.

Here is the source (file “foo.ml”):

#load "pa_extend.cmo";
#load "pa_extprint.cmo";

open Printf;

type expr =
  [ Op of string and expr and expr
  | Int of int
  | Var of string ]
;

value g = Grammar.gcreate (Plexer.gmake ());
value pa_e = Grammar.Entry.create g "expr";
value pr_e = Eprinter.make "expr";

EXTEND
  pa_e:
    [ [ x = SELF; "+"; y = SELF -> Op "+" x y
      | x = SELF; "-"; y = SELF -> Op "-" x y ]
    | [ x = SELF; "*"; y = SELF -> Op "*" x y
      | x = SELF; "/"; y = SELF -> Op "/" x y ]
    | [ x = INT -> Int (int_of_string x)
      | x = LIDENT -> Var x
      | "("; x = SELF; ")" -> x ] ]
  ;
END;

EXTEND_PRINTER
  pr_e:
    [ [ Op "+" x y -> sprintf "%s + %s" (curr pc x) (next pc y)
      | Op "-" x y -> sprintf "%s - %s" (curr pc x) (next pc y) ]
    | [ Op "*" x y -> sprintf "%s * %s" (curr pc x) (next pc y)
      | Op "/" x y -> sprintf "%s / %s" (curr pc x) (next pc y) ]
    | [ Int x -> string_of_int x
      | Var x -> x
      | x -> sprintf "(%s)" (Eprinter.apply pr_e pc x) ] ]
  ;
END;

value parse s = Grammar.Entry.parse pa_e (Stream.of_string s);
value print e = Eprinter.apply pr_e Pprintf.empty_pc e;

if Sys.interactive.val then ()
else print_endline (print (parse Sys.argv.(1)));

Remark on the use of “curr” and “next” while printing operators: due to left associativity, the first operand uses “curr” and the second operand uses “next”. For right associativity operators, they should be inverted. For no associativity, both should use “next”.

The last line of the file allows use in either the OCaml toplevel or as standalone program, taking the string to be printed as parameter. It can be compiled this way:

ocamlc -pp camlp5r -I +camlp5 gramlib.cma foo.ml

Examples of use (notice the redundant parentheses automatically removed by the printing algorithm):

$ ./a.out "(3 * x) + (2 / y)"
3 * x + 2 / y
$ ./a.out "(x+y)*(x-y)"
(x + y) * (x - y)
$ ./a.out "x + y - z"
x + y - z
$ ./a.out "(x + y) - z"
x + y - z
$ ./a.out "x + (y - z)"
x + (y - z)

Printing OCaml programs

Complete examples of usage of extensible printers are the printers in syntaxes and extended syntaxes provided by Camlp5 in the pretty printing kits:

  • pr_r.cmo: pretty print in revised syntax
  • pr_o.cmo: pretty print in normal syntax
  • pr_rp.cmo: also pretty print the parsers in revised syntax
  • pr_op.cmo: also pretty print the parsers in normal syntax

See the chapter entitled “Printing programs”.

Pprintf

Pprintf

This chapter describes “pprintf”, a statement to pretty print data. It looks like the “sprintf” function of the OCaml library, and borrows some ideas of the Format OCaml library. Another statement, “lprintf”, is a slightly different version of “pprintf” handling with locations.

Syntax of the pprintf statement

The “pprintf” statement is added by the parsing kitpa_pprintf.cmo”.

Notice that, in opposition to “printf”, “fprintf”, “sprintf”, and all its variants, which are functions, this “pprintf” is a statement, not a function: “pprintf” is a keyword and the expander analyzes its string format parameter to generate specific statements. In particular, it cannot be used alone and has no type by itself.

       expression ::= pprintf-statement
pprintf-statement ::= "pprintf" qualid format expressions
           qualid ::= qualid "." qualid
                    | uident
                    | lident
           format ::= string
      expressions ::= expression expressions
                    | <nothing>

Semantics of pprintf

The “pprintf” statement converts the format string into a string like the “sprintf” of the OCaml library “Printf” does (see the OCaml manual for details). The string format can accept new conversion specifications, “%p” and “%q”, and some pretty printing annotations, starting with “@” like in the OCaml library “Format”.

The “pprintf” statement takes as first parameter, a value of type “pr_context” defined below. Its second parameter is the extended format string. It can take other parameters, depending on the format, like “sprintf”.

The result of “pprintf” is always a string. There is no versions applying to files or buffers.

The strings built by “pprintf” are concatened by the function “Pretty.sprintf” (see the chapter entitled “Pretty Print”) which controls the line length and prevents overflowing.

Printing context

The “pprintf” statement takes, as first parameter, a printing context. It is a value of the following type:

type pr_context =
  { ind : int;
    bef : string;
    aft : string;
    dang : string };

The fields are:

  • ind” : the current indendation
  • bef” : what should be printed before, in the same line
  • aft” : what should be printed after, in the same line
  • dang” : the dangling token to know whether parentheses are necessary

Basically, the “pprintf” statement concats the “bef” string, the formatted string and the “aft” string. The example:

pprintf pc "hello world"

is equivalent to (and indeed generates):

Pretty.sprintf "%shello world%s" pc.bef pc.aft

But if the format string contains conversion specifications “%p” or “%q”, the “bef” and “aft” strings are actually transmitted to the corresponding functions:

pprintf pc "hello %p world" f x

is equivalent to:

f {(pc) with
   bef = Pretty.sprintf "%shello " pc.bef;
   aft = Pretty.sprintf " world%s" pc.aft}
  x

Thus, the decision of including the “bef” and the “aft” strings are delayed to the called function, allowing this function to possibly concatenate “bef” and “aft” to its own strings.

A typical case is, while printing programs, when an expression needs to be printed between parentheses. The code which does that looks like:

pprintf pc "(%p)" expr e

The right parenthesis of this string format is included in the “aft” string transmitted to the function “expr”. In a situation when several right parentheses are concatened this way, the fact that all these parentheses are grouped together allows the function which eventually print them to decide to break the line or not, these parentheses being taken into account in the line length.

For example, if the code contains a print of an program containing an application whose source is:

myfunction myarg

and if the “aft” contains “))))))”, the decision of printing in one line as:

myfunction myarg))))))

or in two lines as:

myfunction
  myarg))))))

is exact, the right parentheses being added to “myarg” to determine whether the line overflows or not.

Extended format

The extended format used by “pprintf” may contain any strings and conversion specifications allowed by the “sprintf” function (see module “Printf” of the OCaml library), plus:

  • the conversion specifications: “%p” and “q”,

  • the pretty printing annotations introduced by, “@” and followed by:

    • the character “;” (semicolon), optionally followed by “<”, two numbers and “>”,

    • the character “`` ``” (space),

    • the character “[”, optionally followed by the character “<” and either:

      • the character “a
      • the character “b
      • a number

      and the character “>”, then followed by format string, and ended with “@]

The format string is applied like in the “sprintf” function. Specific actions are done for the extended features. The result is a string like for the “sprintf” function. The “string before” and “string after” defined by the fields “bef” and “aft” of the printing context are taken into account and it is not necessary to add them in the format.

Example:

pprintf pc "hello, world"

generates:

Pretty.sprintf "%shello, world%s" pc.bef pc.aft;

An empty format:

pprintf pc "";

just prints the “before” and “after” strings:

Pretty.sprintf "%s%s" pc.bef pc.aft;

Line length

The function “pprintf” uses the Camlp5 “Pretty” module. The line length can be set by changing the value of the reference “Pretty.line_length”.

The conversion specifications “p” and “q”

The “%p” conversion specification works like the “%a” of the printf statement. It takes two arguments and applies the first one to the printing context and to the second argument. The first argument must therefore have type “pr_context -> t -> unit” (for some type “t”) and the second one “t”.

Notice that this function can be called twice: one to test whether the resulting string holds in the line, and another one to possibly recall this function to print it in several lines. In the two cases, the printing context given as parameter is different.

It uses the functions defined in the “Pretty” module.

Example: the following statement:

pprintf pc "hello, %p, world" f x

is equivalent to:

f {(pc) with
   bef = Pretty.sprintf "%shello, " pc.bef;
   aft = Pretty.sprintf ", world%s" pc.aft}
  x

The “%q” conversion specification is like “%p” except that it takes a third argument which is the value of the “dang” field, useful when the syntax has “dangling” problems requiring parentheses. See chapter Extensions of printing for more explanations about dangling problems.

The same example with “%q”:

pprintf pc "hello, %q, world" f x "abc"

is equivalent to:

f {(pc) with
   bef = Pretty.sprintf "%shello, " pc.bef;
   aft = Pretty.sprintf ", world%s" pc.aft;
   dang = "abc"}
  x

The pretty printing annotations

Breaks

The pretty printing annotations allow to indicate places where lines can be broken. They all start with the “at” sign “@”. The main ones are called breaks and are:

  • @;” specifying: write a space or ‘a newline and an indentation incremented by 2 spaces’
  • ````” specifying: write a space or ‘a newline and the indentation’

Example - where “pc” is a variable of type “pr_context” (for example “Pprintf.empty_pc”):

pprintf pc "hello,@;world"

builds the string, if it holds in the line:

hello, world

and if it does not:

hello,
  world

The second form:

pprintf pc "hello,@ world"

is printed the same way, if it holds in the line, and if it does not, as:

hello,
world

The general form is:

  • @;<s o>”, which is a break with “s” spaces if the string holds in the line, or an indentation offset (incrementation of the indentation) of “o” spaces if the string does not hold in the line.

The break “@;” is therefore equivalent to “@;<1   2>” and “@ ``" is equivalent to "``@;<1   0>”.

Parentheses

A second form of the pretty printing annotations is the parenthesization of format strings possibly containing other pretty printing annotations. They start with “@[” and end with “@]”.

It allows to change the associativity of the breaks. For example:

pprintf pc "@[the quick brown fox@;jumps@]@;over the lazy dog"

If the whole string holds on the line, it is printed:

the quick brown fox jumps over the lazy dog

If the whole string does not hold on the line, but “the quick brow fox jumps” does, it is printed:

the quick brown fox jumps
  over the lazy dog

If the string “the quick brown fox jumps” does not hold on the line, the whole string is printed:

the quick brown fox
  jumps
  over the lazy dog

Conversely, if the code is right associated:

pprintf pc "the quick brown fox@;@[jumps@;over the lazy dog@]"

It can be printed:

the quick brown fox jumps over the lazy dog

or:

the quick brown fox
  jumps over the lazy dog

or:

the quick brown fox
  jumps
    over the lazy dog

The default is left associativity: without parentheses, it is printed like in the first example.

Incrementation of indentation

The open parenthesis of the parenthesized form, “@[” can be followed by “<n>” where “n” is a number. It increments the current indentation (for possible newlines in the parenthesized text) with this number.

Example:

pprintf pc "@[<4>Incrementation@;actually of six characters@]"

makes the string (if not holding in the line):

Incrementation
      actually of six characters

Break all or nothing

The open parenthesis of the parenthesized form, “@[” can be followed by “<a>”. It specifies that if the string does not hold in the line, all breaks between the parentheses (at one only level) are printed in two lines, even if sub-strings could hold on the line. For example:

pprintf pc "@[<a>the quick brown fox@;jumps@;over the lazy dog@]"

can be printed only as:

the quick brown fox jumps over the lazy dog

or as:

the quick brown fox
  jumps
  over the lazy dog

Break all

The open parenthesis of the parenthesized form, “@[” can be followed by “<b>”. It specifies that all breaks are always printed in two lines. For example:

pprintf pc "@[<b>the quick brown fox@;jumps@;over the lazy dog@]"

is printed in all circumstances:

the quick brown fox
  jumps
  over the lazy dog

Break all if

The open parenthesis of the parenthesized form, “@[” can be followed by “<i>”. Depending on the value of the boolean variable of the argument list, the breaks are all printed in two lines like with the “break all” option above, or not. For example:

pprintf pc "%s@;@[<i>%s,@;%s@]" "good" True "morning" "everybody";
pprintf pc "%s@;@[<i>%s,@;%s@]" "good" False "morning" "everybody";

are printed:

good
  morning,
    everybody
good morning, everybody

Parentheses not neighbours of breaks

In the examples above, we can remark that the left parentheses are always the begin of the string or are preceeded by a break, and that the right parentheses are always the end of the string or followed by a break.

When the parentheses “@[” and “@]” are not preceeded or followed by the string begin nor end, nor preceeded or followed by breaks, they are considered as the “bef” or “aft” part of the neighbour string. For example, the following forms:

pprintf pc "the quick brown fox@[ jumps over@]"

and:

pprintf pc "@[the quick brown fox @]jumps over"

are respectively equivalent to:

let pc = {(pc) with aft = sprintf " jumps over%s" pc.aft} in
Pretty.sprintf "%sthe quick brown fox%s" pc.bef pc.aft

and:

let pc = {(pc) with bef = sprintf "%sthe quick brown fox" pc.bef} in
Pretty.sprintf "%sjumps over%s" pc.bef pc.aft

In these examples, the results are identical, but it can be important if the non-parenthesized part contain one or several “%p”. In this case, the corresponding function receives the “bef” or “aft” part in its pr_context variable and can take it into account when printing its data.

Lprintf

“Lprintf” is like “pprintf” with the same parameters. It is equivalent to an call to the function “expand_lprintf”:

lprintf pc "..."

is equivalent to:

expand_lprintf pc loc (fun pc -< pprintf pc "...")

The function “expand_lprintf” and the variable “loc” must be defined by the user in the environment where “lprintf” is used.

“Lprintf” is used in predefined printers “pr_r.ml” and “pr_o.ml” to allow optional insertions of location comments in the output.

Comparison with the OCaml modules Printf and Format

Pprintf and Printf

The statement “pprintf” acts like the function “Printf.sprintf”. But since it requires this extra parameter of type “pr_context” and needs the “%p” and “%q” conversions specifications (which do not exist in “Printf”), it was not possible to use the “Printf” machinery directly and a new statement had to be added.

The principle of “pprintf” and “sprintf” are the same. However, “pprintf” is a syntax extension and has no type by itself. It cannot be used alone or without all its required parameters.

Pprintf and Format

The pretty printing annotations look like the ones of the OCaml module Format. Actually, they have different semantics. They do not use boxes like “Format” does. In “pprintf” statement, the machinery acts only on indentations.

Notice that, with “pprintf”, it is always possible to know the current indentation (it is the field “ind” of the “pr_context” variable) and it is therefore possible to take decisions before printing.

For example, it is possible, in a printer of OCaml statements, to decide to print all match cases symmetrically, i.e. all with one line for each case or all with newlines after the patterns.

It is what is done in the option “-flag E” added by the pretty printing kits “pr_r.cmo” (pretty print in revised syntax) and “pr_o.cmo” (pretty print in normal syntax). See chapter Commands and Files or type “camlp5 pr_r.cmo -help” or “camlp5 pr_o.cmo   -help”.

Another difference is that the internal behaviour of this printing system is accessible, and it is always possible to use the basic functions of the “Pretty” module (“horiz_vertic” and “sprintf”) if the behaviour of “pprintf” is not what is desired by the programmer.

Relation with the Camlp5 extensible printers

The extensible printers of Camlp5 (see its corresponding chapter) use the type “pr_context” of “pprintf”. It is therefore possible to use “pprintf” in the semantic actions of the extensible printers. But it is not mandatory. An extensible printer can just use the “Pretty” module or even neither “pprintf” nor “Pretty”.

The printing kits “pr_r.ml” and “pr_o.ml” (respectively pretty print in revised and in normal syntax) and some other related to them, are examples of usage of the “pprintf” statement.

The Pprintf module

See its section in the chapter “Library”.

Pretty print

Pretty print

A pretty print system is provided in the library module Pretty. It allows one to pretty print data or programs. The Pretty module contains:

  • The function “horiz_vertic” to specify how data must be printed.
  • The function “sprintf” to format strings.
  • The variable “line_length” which is a reference specifying the maximum lines lengths.

Module description

horiz_vertic

The function “horiz_vertic” takes two functions as parameters. When invoked, it calls its first function. If that function fails with some specific internal error (that the function “sprintf” below may raise), the second function is called.

The type of “horiz_vertic” is:

(unit -> 'a) -> (unit -> 'a) -> 'a

the horizontal function

The first function is said to be the “horizontal” function. It tries to pretty print the data on a single line. In the context of this function, if the strings built by the function “sprintf” (see below) contain newlines or have lengths greater than “line_length”, the function fails (with a internal exception local to the module).

the vertical function

In case of failure of the “horizontal function”, the second function of “horiz_vertic”, the “vertical” function, is called. In the context of that function, the “sprintf” function behaves like the normal “sprintf” function of the OCaml library module “Printf”.

sprintf

The function “sprintf” works like its equivalent in the module “Printf” of the OCaml library, and takes the same parameters. Its difference is that if it is called in the context of the first function (the “horizontal” function) of the function “horiz_vertic” (above), all strings built by “sprintf” are checked for newlines or length greater than the maximum line length. If either occurs, the “sprintf” function fails and the horizontal function fails.

If “sprintf” is not in the context of the horizontal function, it behaves like the usual “sprintf” function.

line_length

The variable “line_length” is a reference holding the maximum line length of lines printed horizontally. Its default is 78. This can be changed by the user before using “horiz_vertic”.

horizontally

The call “horizontally ()” returns a boolean telling whether the context is horizontal.

Example

Suppose you want to pretty print the XML code "<li>something</li>". If the “something” is short, you want to see:

<li>something</li>

If the “something” has several lines, you want to see that:

<li>
  something
</li>

A possible implementation is:

open Pretty;
horiz_vertic
  (fun () -> sprintf "<li>something</li>")
  (fun () -> sprintf "<li>\n  something\n</li>");

Notice that the “sprintf” above is the one of the library Pretty.

Notice also that, in a program displaying XML code, this “something” may contain other XML tags, and is therefore generally the result of other pretty printing functions, and the program should rather look like:

horiz_vertic
  (fun () -> sprintf "<li>%s</li>" (print something))
  (fun () -> sprintf "<li>\n  %s\n</li>" (print something))

Parts of this “something” can be printed horizontally and other vertically using other calls to “horiz_vertic” in the user function “print” above. But it is important to remark that if they are called in the context of the first function parameter of “horiz_vertic” above, only horizontal functions are accepted: the first failing “horizontal” function triggers the failure of the horizontal pretty printing.

Programming with Pretty

Hints

Just start with a call to “horiz_vertic”.

As its first function, use “sprintf” just to concat the strings without putting any newlines or indentations, e.g. just using spaces to separate pieces of data.

As its second function, consider how you want your data to be cut. At the cutting point or points, add newlines. Notice that you probably need to give the current indentation string as parameter of the called functions because they need to be taken into account in the called “horizontal” functions.

In the example below, don’t put the indentation in the sprintf function but give it as parameter of your “print” function:

horiz_vertic
  (fun () -> sprintf "<li>%s</li>" (print "" something))
  (fun () -> sprintf "<li>\n%s\n</li>" (print "  " something))

Now, the “print” function could look like, supposing you print other things with “other” of the current indentation and “things” with a new shifted one:

value print ind something =
  horiz_vertic
    (fun () -> sprintf "%sother things..." ind)
    (fun () -> sprintf "%sother\n%s  things..." ind ind);

Supposing than “other” and “things” are the result of two other functions “print_other” and “print_things”, your program could look like:

value print ind (x, y) =
  horiz_vertic
    (fun () -> sprintf "%s%s %s" ind (print_other 0 x) (print_things 0 y))
    (fun () -> sprintf "%s\n%s" (print_other ind x) (print_things (ind ^ "  ") y));

How to cancel a horizontal print

If you want to prevent a pretty printing function from being called in a horizontal context, constraining the pretty print to be on several lines in the calling function, just do:

horiz_vertic
  (fun () -> sprintf "\n")
  (fun () -> ... (* your normal pretty print *))

In this case, the horizontal print always fails, due to the newline character in the sprintf format.

Remarks

Kernel

The module “Pretty” is intended to be basic, a “kernel” module to pretty print data. It presumes that the user takes care of the indentation. Programs using “Pretty” are not as short as the ones using “Format” of the OCaml library, but are more flexible. To pretty print with a shorter syntax like in the OCaml module “Format” (with the “@” convention), see statement “pprintf” (which internally uses the module “Pretty”).

Strings vs Channels

In “Pretty”, the pretty printing is done only on strings, not on files. To pretty print files, just build the strings and print them afterwards with the usual output functions. Notice that OCaml allocates and frees strings quickly, and if pretty printed values are not huge, which is generally the case, it is not a real problem, memory sizes these days being more than enough for this job.

Strings or other types

The “horiz_vertic” function can return values of types other than “string”. For example, if you are interested only in the result of horizontal context and not on the vertical one, it is perfectly correct to write:

horiz_vertic
  (fun () -> Some (sprintf "I hold on a single line")
  (fun () -> None)

Why raising exceptions ?

One could ask why this pretty print system raises internal exceptions. Why not simply write the pretty printing program like this:

  1. first build the data horizontally (without newlines)
  2. if the string length is lower than the maximum line length, return it
  3. if not, build the string by adding newlines in the specific places

This method works but is generally very slow (exponential in time) because while printing horizontally, many useless strings are built. If, for example, the final printed data holds on 50 lines, tens of lines may be built uselessly again and again before the overflowing is corrected.

Language extensions

Locations

Locations

The location is a concept often used in Camlp5, bound to where errors occur in the source. The basic type is “Ploc.t” which is an abstract type.

Definitions

Internally a location is a pair of source positions: the beginning and the end of an element in the source (file or interactive). A located element can be a character (the end is just the beginning plus one), a token, or a longer sequence generally corresponding to a grammar rule.

A position is a count of characters since the beginning of the file, starting at zero. When a couple of positions define a location, the first position is the position of the first character of the element, and the last position is the first character not part of the element. The location length is the difference between those two numbers. Notice that the position corresponds exactly to the character count in the streams of characters.

In the extensible grammars, a variable with the specific name “loc” is predefined in all semantic actions: it is the location of the associated rule. Since the syntax tree quotations generate nodes with “loc” as location part, this allow to generate grammars without having to consider source locations.

It is possible to change the name “loc” to another name, through the parameter “-loc” of the Camlp5 commands.

Remark: the reason why the type “location” is abstract is that in future versions, it may contain other informations, such as the associated comments, the type (for expressions nodes), things like that, without having to change the already written programs.

Building locations

Tools are provided in the module “Ploc” to manage locations.

First, “Ploc.dummy” is a dummy location used when the element does not correspond to any source, or if the programmer does not want to worry about locations.

The function “Ploc.make” builds a location from three parameters:

  • the line number, starting at 1
  • the position of the first column of the line
  • a couple of positions of the location: the first one belonging to the given line, the second one being able to belong to another line, further.

If the line number is not known, it is possible to use the function “Ploc.make_unlined” taking only the couple of positions of the location. In this case, error messages may indicate the first line and a big count of characters from this line (actually from the beginning of the file). With a good text editor, it is possible, to find the good location, anyway.

If the location is built with “Ploc.make_unlined”, and if your program displays a source location itself, it is possible to use the function “Ploc.from_file” which takes the file name and the location as parameters and return, by reading that file, the line number, and the character positions of the location.

Raising with a location

The function “Ploc.raise” allows one to raise an exception together with a location. All exceptions raised in the extensible grammars use “Ploc.raise”. The raised exception is “Ploc.Exc” with two parameters: the location and the exception itself.

Notice that “Ploc.raise” just reraises the exception if it is already the exception “Ploc.Exc”, ignoring then the new given location.

A paradigm to print exceptions possibly enclosed by “Ploc.Exc” is to write the “try..with” statement like this:

try ... with exn ->
  let exn =
    match exn with
    [ Ploc.Exc loc exn -> do { ... print the location ...; exn }
    | _ -> exn ]
  in
  match exn with
  ...print the exception which is *not* located...

Other functions

Some other functions are provided:

Ploc.first_pos
returns the first position (an integer) of the location.
Ploc.last_pos
returns the last position (an integer) of the location (position of the first character not belonging to the element.
Ploc.line_nb
returns the line number of the location or -1 if the location does not contain a line number (i.e. built by “Ploc.make_unlined”).
Ploc.bol_pos
returns the position of the beginning of the line of the location. It is zero if the location does not contain a line number (i.e. built by “Ploc.make_unlined”).

And still other ones used in Camlp5 sources:

Ploc.encl
Ploc.encl loc1 loc2” returns the location starting at the smallest begin of “loc1” and “loc2” and ending at their greatest end.. In simple words, it is the location enclosing “loc1” and “loc2” and all what is between them.
Ploc.shift
Ploc.shift sh loc” returns the location “loc” shifted with “sh” characters. The line number is not recomputed.
Ploc.sub
Ploc.sub loc sh len” is the location “loc” shifted with “sh” characters and with length “len”. The previous ending position of the location is lost.
Ploc.after
Ploc.after loc sh len” is the location just after “loc” (i.e. starting at the end position of “loc”), shifted with “sh” characters, and of length “len”.

Syntax tree

Syntax tree

In Camlp5, one often uses syntax trees. For example, in grammars of the language (semantic actions), in pretty printing (as patterns), in optimizing syntax code (typically stream parsers). Syntax trees are mainly defined by sum types, one for each kind of tree: “expr” for expressions, “patt” for patterns, “ctyp” for types, “str_item” for structure items, and so on. Each node corresponds to a possible value of this type.

Transitional and Strict modes

Since version 5.00 of Camlp5, the definition of the syntax tree has been different according to the mode Camlp5 has been installed:

  • In transitional mode, this definition is the same than in the previous Camlp5 versions.
  • In strict mode, many constructor parameters have a type enclosed by the predefined type “Ploc.vala”.

The advantage of the transitional mode is that the abstract syntax tree is fully compatible with previous versions of Camlp5. Its drawback is that when using the syntax tree quotations in user syntax, it is not possible to use antiquotations, a significatant limitation.

In strict mode, the abstract syntax is not compatible with versions of Camlp5 previous to 5.00. Most of the parameters of the constructor are enclosed with the type “Ploc.vala” whose aim is to allow nodes to be either of the type argument, or an antiquotation. In this mode, the syntax tree quotations in user syntax can be used, with the same power of the previous syntax tree quotations provided by Camlp5.

Compatibility

As there is a problem of compatibility in strict mode, a good solution, for the programmer, is to always use syntax trees using quotations, which is backward compatible. See the chapter about syntax tree in strict mode.

For example, if the program made a value of the syntax tree of the “let” statement, like this:

ExLet loc rf pel e

In strict mode, to be equivalent, this expression should be rewritten like this:

ExLet loc (Ploc.VaVal rf) (Ploc.VaVal pel) e

where “Ploc.VaVal” is a value of the type “vala” defined in the module Ploc (see its section “pervasives”).

This necessary conversion is a drawback if the programmer wants that his programs remain compilable with previous versions of Camlp5.

The recommended solution is to always write this code with quotations, namely, in this example, like this:

<:expr< let $flag:rf$ $list:pel$ in $e$ >>

The quotation expanders ensure that, in strict mode, the variable “rf” is still of type “bool”, and that the variable “pel” of type “list (patt * expr)”, by enclosing them around “Ploc.VaVal”.

In transitional mode, it is equivalent to the first form above. In strict mode, it is equivalent to the second form. And the previous versions of Camlp5 also recognizes this form.

Two quotations expanders

Camlp5 provides two quotations expanders of syntax trees: “q_MLast.cmo” and “q_ast.cmo”.

Both allow writing syntax trees in concrete syntax as explained in the previous section.

The first one, “q_MLast.cmo” requires that the contents of the quotation be in revised syntax without any syntax extension (even the stream parsers). It works in transitional and in strict modes.

The second one, “q_ast.cmo” requires that the contents of the quotation be in the current user syntax (normal, revised, lisp, scheme, or other) and can accept all the syntax extensions he added to compile his program. It fully works only in strict mode. In transitional mode, the antiquotations are not available.

Syntax tree and Quotations in the two modes

For the detail of the syntax tree and the quotations forms, see the chapters about the syntax tree in transitional mode and the syntax tree in strict mode.

Syntax tree - transitional mode

Syntax tree - transitional mode

This chapter presents the Camlp5 syntax tree when Camlp5 is installed in transitional mode.

Introduction

This syntax tree is defined in the module “MLast” provided by Camlp5. Each node corresponds to a syntactic entity of the corresponding type.

For example, the syntax tree of the statement “if” can be written:

MLast.ExIfe loc e1 e2 e3

where “loc” is the location in the source, and “e1”, “e2” and “e3” are respectively the expression after the “if”, the one after the “then” and the one after the “else”.

If a program needs to manipulate syntax trees, it can use the nodes defined in the module “MLast”. The programmer must know how the concrete syntax is transformed into this abstract syntax.

A simpler solution is to use one of the quotation kit “q_MLast.cmo”. It proposes quotations which represent the abstract syntax (the nodes of the module “MLast”) into concrete syntax with antiquotations to bind variables inside. The example above can be written:

<:expr< if $e1$ then $e2$ else $e3$ >>

This representation is very interesting when one wants to manipulate complicated syntax trees. Here is an example taken from the Camlp5 sources themselves:

<:expr<
  match try Some $f$ with [ Stream.Failure -> None ] with
  [ Some $p$ -> $e$
  | _ -> raise (Stream.Error $e2$) ]
>>

This example was in a position of a pattern. In abstract syntax, it should have been written:

MLast.ExMat _
  (MLast.ExTry _ (MLast.ExApp _ (MLast.ExUid _ "Some") f)
     [(MLast.PaAcc _ (MLast.PaUid _ "Stream") (MLast.PaUid _ "Failure"),
       None, MLast.ExUid _ "None")])
  [(MLast.PaApp _ (MLast.PaUid _ "Some") p, None, e);
   (MLast.PaAny _, None,
    MLast.ExApp _ (MLast.ExLid _ "raise")
      (MLast.ExApp _
         (MLast.ExAcc _ (MLast.ExUid _ "Stream") (MLast.ExUid _ "Error"))
         e2))]

Which is less readable and much more complicated to build and update.

Instead of thinking of “a syntax tree”, the programmer can think of “a piece of program”.

Location

In all syntax tree nodes, the first parameter is the source location of the node.

In expressions

When a quotation is in the context of an expression, the location parameter is “loc” in the node and in all its possible sub-nodes. Example: if we consider the quotation:

<:sig_item< value foo : int -> bool >>

This quotation, in a context of an expression, is equivalent to:

MLast.SgVal loc "foo"
  (MLast.TyArr loc (MLast.TyLid loc "int") (MLast.TyLid loc "bool"));

The name “loc” is predefined. However, it is possible to change it, using the argument “-loc” of the Camlp5 shell commands.

Consequently, if there is no variable “loc” defined in the context of the quotation, or if it is not of the correct type, a semantic error occur in the OCaml compiler (“Unbound value loc”).

Note that in the extensible grammars, the variable “loc” is bound, in all semantic actions, to the location of the rule.

If the created node has no location, the programmer can define a variable named “loc” equal to “Ploc.dummy”.

In patterns

When a quotation is in the context of a pattern, the location parameter of all nodes and possible sub-nodes is set to the wildcard (”_”). The same example above:

<:sig_item< value foo : int -> bool >>

is equivalent, in a pattern, to:

MLast.SgVal _ "foo"
  (MLast.TyArr _ (MLast.TyLid _ "int") (MLast.TyLid _ "bool"))

However, it is possible to generate a binding of the variable “loc” on the top node by writing a “colon” before the “less” in the quotation. The same example:

<:sig_item:< value foo : int -> bool >>

is equivalent to:

MLast.SgVal loc "foo"
  (MLast.TyArr _ (MLast.TyLid _ "int") (MLast.TyLid _ "bool"))

Antiquotations

The expressions or patterns between dollar ($) characters are called antiquotations. In opposition to quotations which has its own syntax rules, the antiquotation is an area in the syntax of the enclosing context (expression or pattern). See the chapter about quotations.

If a quotation is in the context of an expression, the antiquotation must be an expression. It can be any expression, including function calls. Examples:

value f e el = <:expr< [$e$ :: $loop False el$] >>;
value patt_list p pl = <:patt< ( $list:[p::pl]$) >>;

If a quotation is in the context of an pattern, the antiquotation is a pattern. Any pattern is possible, including the wildcard character (”_”). Examples:

fun [ <:expr< $lid:op$ $_$ $_$ >> -> op ]
match p with [ <:patt< $_$ | $_$ >> -> Some p ]

Nodes and Quotations

This section describes all nodes defined in the module “MLast” of Camlp5 and how to write them with quotations. Notice that, inside quotations, one is not restricted to these elementary cases, but any complex value can be used, resulting on possibly complex combined nodes.

Variables names give information of their types:

  • e, e1, e2, e3: expr
  • p, p1, p2, p3: patt
  • t, t1, t2, e3: ctyp
  • s: string
  • b: bool
  • me, me1, me2: module_expr
  • mt, mt1, mt2: module_type
  • le: list expr
  • lp: list patt
  • lt: list ctyp
  • ls: list string
  • lse: list (string * expr)
  • lpe: list (patt * expr)
  • lpp: list (patt * patt)
  • lpee: list (patt * option expr * expr)
  • op: option patt
  • lcsi: list class_str_item
  • lcsi: list class_sig_item

expr

Expressions of the language.

ExAcc loc e1 e2 $e1$ . $e2$ access ExAnt loc e $anti:e$ antiquotation (1) ExApp loc e1 e2 $e1$ $e2$ application ExAre loc e1 e2 $e1$ .( $e2$ ) array element ExArr loc le [| $list:le$ |] array ExAsr loc e assert $e$ assert ExAss loc e1 e2 $e1$ := $e2$ assignment ExBae loc e le $e$ .{ $list:le$ } big array element ExChr loc s $chr:s$ character constant ExCoe loc e None t2 ($e$ :> $t2$) coercion ExCoe loc e (Some t1) t2 ($e$ : $t1$ :> $t2$) coercion ExFlo loc s $flo:s$ float constant ExFor loc s e1 e2 True le for $lid:s$ = $e1$ to $e2$ do { $list:le$ } for (increasing) ExFor loc s e1 e2 False le for $lid:s$ = $e1$ downto $e2$ do { $list:le$ } for (decreasing) ExFor loc s e1 e2 b le for $lid:s$ = $e1$ $to:b$ $e2$ do { $list:le$ } for ExFun loc lpee fun [ $list:lpee$ ] function (2) ExIfe loc e1 e2 e3 if $e1$ then $e2$ else $e3$ if ExInt loc s1 "" $int:s1$ integer constant ExInt loc s1 "l" $int32:s1$ integer 32 bits ExInt loc s1 "L" $int64:s1$ integer 64 bits ExInt loc s1 "n" $nativeint:s1$ native integer ExLab loc p None ~{$p$} label ExLab loc p (Some e) ~{$p$ = $e$} label ExLab loc p oe ~{$p$ $opt:oe$} label ExLaz loc e lazy $e$ lazy ExLet loc True lpe e let rec $list:lpe$ in $e$ let rec ExLet loc False lpe e let $list:lpe$ in $e$ let not rec ExLet loc b lpe e let $flag:b$ $list:lpe$ in $e$ let ExLid loc s $lid:s$ lowercase identifier ExLmd loc s me e let module $uid:s$ = $me$ in $e$ let module ExMat loc e lpee match $e$ with [ $list:lpee$ ] match (2) ExNew loc ls new $list:ls$ new ExObj loc None lcsi object $list:lcsi$ end object expression ExObj loc (Some p) lcsi object ($p$) $list:lcsi$ end object expression ExObj loc op lcsi object $opt:op$ $list:lcsi$ end object expression ExOlb loc p None ?{$p$} option label ExOlb loc p (Some e) ?{$p$ = $e$} option label ExOlb loc p oe ?{$p$ $opt:oe$} option label ExOvr loc lse {< $list:lse$ >} override ExPck loc me None (module $me$) module packing ExPck loc me (Some mt) (module $me$ : $mt$) module packing ExRec loc lpe None {$list:lpe$} record ExRec loc lpe (Some e) {($e$) with $list:lpe$} record ExSeq loc le do { $list:le$ } sequence ExSnd loc e s $e$ # $s$ method call ExSte loc e1 e2 $e1$ .[ $e2$ ] string element ExStr loc s $str:s$ string ExTry loc e lpee try $e$ with [ $list:lpee$ ] try (2) ExTup loc le ($list:le$) t-uple ExTyc loc e t ($e$ : $t$) type constraint ExUid loc s $uid:s$ uppercase identifier ExVrn loc s ` $s$ variant ExWhi loc e le while $e$ do { $list:le$ } while ============================== =================================================== ===============================

(1) Node used in the quotation expanders to tells at conversion to OCaml compiler syntax tree time, that all locations of the sub-tree is correcty located in the quotation. By default, in quotations, the locations of all generated nodes are the location of the whole quotation. This node allows to make an exception to this rule, since we know that the antiquotation belongs to the universe of the enclosing program. See the chapter about quotations and, in particular, its section about antiquotations.

(2) The variable “lpee” found in “function”, “match” and “try” statements correspond to a list of “(patt * option expr *       expr)” where the “option expr” is the “when” optionally following the pattern:

p -> e

is represented by:

(p, None, e)

and

p when e1 -> e

is represented by:

(p, Some e1, e)

patt

Patterns of the language.

Node <:patt< ... >> Comment ========================== =========================== =============================== PaAcc loc p1 p2 $p1$ . $p2$ access PaAli loc p1 p2 ($p1$ as $p2$) alias PaAnt loc p $anti:p$ antiquotation (1) PaAny loc _ wildcard PaApp loc p1 p2 $p1$ $p2$ application PaArr loc lp [| $list:lp$ |] array PaChr loc s $chr:s$ character PaFlo loc s $flo:s$ float PaInt loc s1 "" $int:s1$ integer constant PaInt loc s1 "l" $int32:s1$ integer 32 bits PaInt loc s1 "L" $int64:s1$ integer 64 bits PaInt loc s1 "n" $nativeint:s1$ native integer PaLab loc p1 None ~{$p1$} label PaLab loc p1 (Some p2) ~{$p1$ = $p2$} label PaLab loc p1 op2 ~{$p1$ $opt:op2$} label PaLaz loc p lazy $p$ lazy PaLid loc s $lid:s$ lowercase identifier PaNty loc s (type $lid:s$) new type PaOlb loc p None ?{$p$} option label PaOlb loc p (Some e) ?{$p$ = $e$} option label PaOlb loc p oe ?{$p$ $opt:oe$} option label PaOrp loc p1 p2 $p1$ | $p2$ or PaRec loc lpp { $list:lpp$ } record PaRng loc p1 p2 $p1$ .. $p2$ range PaStr loc s $str:s$ string PaTup loc lp ($list:lp$) t-uple PaTyc loc p t ($p$ : $t$) type constraint PaTyp loc ls # $list:ls$ type pattern PaUid loc s $uid:s$ uppercase identifier PaUnp loc s None (module $uid:s$) module unpacking PaUnp loc s (Some mt) (module $uid:s$ : $mt$) module unpacking PaVrn loc s ` $s$ variant ========================== =========================== ===============================

(1) Node used to specify an antiquotation area, like for the equivalent node in expressions. See above.

ctyp

Type expressions of the language.

Node <:ctyp< ... >> Comment ================================== ================================ ==================== TyAcc loc t1 t2 $t1$ . $t2$ access TyAli loc t1 t2 $t1$ as $t2$ alias TyAny loc _ wildcard TyApp loc t1 t2 $t1$ $t2$ application TyArr loc t1 t2 $t1$ -> $t2$ arrow TyCls loc ls # $list:ls$ class TyLab loc s t ~$s$: $t$ label TyLid loc s $lid:s$ lowercase identifier TyMan loc t1 True t2 $t1$ == private $t2$ manifest TyMan loc t1 False t2 $t1$ == $t2$ manifest TyMan loc t1 b t2 $t1$ == $priv:b$ $t2$ manifest TyObj loc lst True < $list:lst$ .. > object TyObj loc lst False < $list:lst$ > object TyObj loc lst b < $list:lst$ $flag:b$ > object TyOlb loc s t ?$s$: $t$ option label TyPck loc mt (module $mt$) package TyPol loc ls t ! $list:ls$ . $t$ polymorph TyQuo loc s '$s$ variable TyRec loc llsbt { $list:llsbt$ } record TySum loc llslt [ $list:llslt$ ] sum TyTup loc lt ( $list:lt$ ) t-uple TyUid loc s $uid:s$ uppercase identifier TyVrn loc lpv None [ = $list:lpv$ ] variant TyVrn loc lpv (Some None) [ > $list:lpv$ ] variant TyVrn loc lpv (Some (Some [])) [ < $list:lpv$ ] variant TyVrn loc lpv (Some (Some ls)) [ < $list:lpv$ > $list:ls$ ] variant ================================== ================================ ====================

modules…

str_item

Structure items, i.e. phrases in a “.ml” file or “struct” elements.

StCls loc lcice class $list:lcice$ class declaration StClt loc lcict class type $list:lcict$ class type declaration StDcl loc lsi declare $list:lsi$ end declare StDir loc s None # $lid:s$ directive StDir loc s (Some e) # $lid:s$ $e$ directive StDir loc s oe # $lid:s$ $opt:oe$ directive StExc loc s [] [] exception $uid:s$ exception StExc loc s lt [] exception $uid:s$ of $list:lt$ exception StExc loc s [] ls exception $uid:s$ = $list:ls$ exception StExc loc s lt ls exception $uid:s$ of $list:lt$ = $list:ls$ exception StExp loc e $exp:e$ expression StExt loc s t ls external $s$ : $t$ = $list:ls$ external StInc loc me include $me$ include StMod loc True lsme module rec $list:lsme$ module rec StMod loc False lsme module $list:lsme$ module non rec StMod loc b lsme module $flag:b$ $list:lsme$ module StMty loc s mt module type $s$ = $mt$ module type StOpn loc ls open $list:ls$ open StTyp loc ltd type $list:ltd$ type declaration StUse loc s lsil # $str:s$ $list:lsil$ … internal use … (1) StVal loc True lpe value rec $list:lpe$ value rec StVal loc False lpe value $list:lpe$ value non rec StVal loc b lpe value $flag:b$ $list:lpe$ value ======================== ============================================== ============================================

(1) Node internally used to specify a different file name applying to the whole subtree. This is generated by the directive “use” and used when converting to the OCaml syntax tree which needs the file name in its location type.

sig_item

Signature items, i.e. phrases in a “.mli” file or elements inside “sig … end”.

Node <:sig_item< ... >> Comment ======================== ================================== ============================================ SgCls loc lcict class $list:lcict$ class SgClt loc lcict class type $list:lcict$ class type SgDcl loc lsi declare $list:lsi$ end declare SgDir loc s None # $lid:s$ directive SgDir loc s (Some e) # $lid:s$ $e$ directive SgDir loc s oe # $lid:s$ $opt:oe$ directive SgExc loc s [] exception $s$ exception SgExc loc s lt exception $s$ of $list:lt$ exception SgExt loc s t ls external $s$ : $t$ = $list:ls$ external SgInc loc mt include $mt$ include SgMod loc True lsmt module rec $list:lsmt$ module rec SgMod loc False lsmt module $list:lsmt$ module non rec SgMod loc b lsmt module $flag:b$ $list:lsmt$ module SgMty loc s mt module type $s$ = $mt$ module type SgOpn loc ls open $list:ls$ open SgTyp loc ltd type $list:ltd$ type declaration SgUse loc s lsil # $str:s$ $list:lsil$ … internal use … (1) SgVal loc s t value $s$ : $t$ value ======================== ================================== ============================================

(1) Same remark as for “str_item” above.

module_expr

Node <:module_expr< ... >> Comment ========================= ================================ ====================== MeAcc loc me1 me2 $me1$ . $me2$ access MeApp loc me1 me2 $me1$ $me2$ application MeFun loc s mt me functor ($s$ : $mt$) -> $me$ functor MeStr loc lsi struct $list:lsi$ end struct MeTyc loc me mt ($me$ : $mt$) module type constraint MeUid loc s $uid:s$ uppercase identifier MeUnp loc e None (value $e$) module unpacking MeUnp loc e (Some mt) (value $e$ : $mt$) module unpacking ========================= ================================ ======================

module_type

Node <:module_type< ... >> Comment ======================= ================================== ==================== MtAcc loc mt1 mt2 $mt1$ . $mt2$ access MtApp loc mt1 mt2 $mt1$ $mt2$ application MtFun loc s mt1 mt2 functor ($s$ : $mt1$) -> $mt2$ functor MtLid loc s $lid:s$ lowercase identifier MtQuo loc s ' $s$ abstract MtSig loc lsi sig $list:lsi$ end signature MtTyo loc me module type of $me$ of module expression MtUid loc s $uid:s$ uppercase identifier MtWit loc mt lwc $mt$ with $list:lwc$ with construction ======================= ================================== ====================

classes…

class_expr

Node <:class_expr< ... >> Comment =========================== =================================== ===================== CeApp loc ce e $ce$ $e$ application CeCon loc ls lt [ $list:lt$ ] $list:ls$ constructor CeFun loc p ce fun $p$ -> $ce$ function CeLet loc True lpe ce let rec $list:lpe$ in $ce$ let rec CeLet loc False lpe ce let $list:lpe$ in $ce$ let non rec CeLet loc b lpe ce let $flag:b$ $list:lpe$ in $ce$ let CeStr loc None lcsi object $list:lcsi$ end object CeStr loc (Some p) lcsi object ($p$) $list:lcsi$ end object CeStr loc op lcsi object $opt:op$ $list:lcsi$ end object CeTyc loc ce ct ($ce$ : $ct$) class type constraint =========================== =================================== =====================

class_type

Node <:class_type< ... >> Comment =========================== =================================== =========== CtAcc loc ct1 ct2 $ct1$ . $ct2$ access CtApp loc ct1 ct2 $ct1$ $ct2$ application CtCon loc ct lt $ct$ [ $list:lt$ ] constructor CtFun loc t ct [ $t$ ] -> $ct$ arrow CtIde loc s $id:s$ identifier CtSig loc None lcsi object $list:lcsi$ end object CtSig loc (Some t) lcsi object ($t$) $list:lcsi$ end object CtSig loc ot lcsi object $opt:ot$ $list:lcsi$ end object =========================== =================================== ===========

class_str_item

Node <:class_sig_item< ... >> Comment ======================= ========================================= =============== CgCtr loc t1 t2 type $t1$ = $t2$ type constraint CgDcl loc lcsi declare $list:lcsi$ end declare CgInh loc ct inherit $ct$ inheritance CgMth loc True s t method private $lid:s$ : $t$ method CgMth loc False s t method $lid:s$ : $t$ method CgMth loc b s t method $flag:b$ $lid:s$ : $t$ method CgVal loc True s t value mutable $lid:s$ : $t$ value CgVal loc False s t value $lid:s$ : $t$ value CgVal loc b s t value $flag:b$ $lid:s$ : $t$ value CgVir loc True s t method virtual private $lid:s$ : $t$ virtual method CgVir loc False s t method virtual $lid:s$ : $t$ virtual method CgVir loc b s t method virtual $flag:b$ $lid:s$ : $t$ virtual method ======================= ========================================= ===============

other

type_decl

What is after ‘type’ or ‘and’ in a type declaration.

{tdNam=ls;tdPrm=ltv;tdPrv=True;tdDef=t;tdCon=ltt} $tp:ls$ $list:ltv$ = private $t$ $list:ltt$ {tdNam=ls;tdPrm=ltv;tdPrv=False;tdDef=t;tdCon=ltt} $tp:ls$ $list:ltv$ = $t$ $list:ltt$ {tdNam=ls;tdPrm=ltv;tdPrv=b;tdDef=t;tdCon=ltt} $tp:ls$ $list:ltv$ = $priv:b$ $t$ $list:ltt$ ====================================================== ================================================

with_constr

“With” possibly following a module type.

Node <:poly_variant< ... >> Comment ======================== ====================================== =========== PvTag loc s True [] `$s$ constructor PvTag loc s True lt `$s$ of & $list:lt$ constructor PvTag loc s False lt `$s$ of $list:lt$ constructor PvTag loc s b lt `$s$ of $flag:b$ $list:lt$ constructor PvInh loc t $t$ type ======================== ====================================== ===========

Syntax tree - strict mode

Syntax tree - strict mode

This chapter presents the Camlp5 syntax tree when Camlp5 is installed in strict mode.

Introduction

This syntax tree is defined in the module “MLast” provided by Camlp5. Each node corresponds to a syntactic entity of the corresponding type.

For example, the syntax tree of the statement “if” can be written:

MLast.ExIfe loc e1 e2 e3

where “loc” is the location in the source, and “e1”, “e2” and “e3” are respectively the expression after the “if”, the one after the “then” and the one after the “else”.

If a program needs to manipulate syntax trees, it can use the nodes defined in the module “MLast”. The programmer must know how the concrete syntax is transformed into this abstract syntax.

A simpler solution is to use one of the quotation kits “q_MLast.cmo” or “q_ast.cmo”. Both propose quotations which represent the abstract syntax (the nodes of the module “MLast”) into concrete syntax with antiquotations to bind variables inside. The example above can be written:

<:expr< if $e1$ then $e2$ else $e3$ >>

This representation is very interesting when one wants to manipulate complicated syntax trees. Here is an example taken from the Camlp5 sources themselves:

<:expr<
  match try Some $f$ with [ Stream.Failure -> None ] with
  [ Some $p$ -> $e$
  | _ -> raise (Stream.Error $e2$) ]
>>

This example was in a position of a pattern. In abstract syntax, it should have been written:

MLast.ExMat _
  (MLast.ExTry _ (MLast.ExApp _ (MLast.ExUid _ (Ploc.VaVal "Some")) f)
     (Ploc.VaVal
        [(MLast.PaAcc _ (MLast.PaUid _ (Ploc.VaVal "Stream"))
           (MLast.PaUid _ (Ploc.VaVal "Failure")),
          Ploc.VaVal None, MLast.ExUid _ (Ploc.VaVal "None"))]))
  (Ploc.VaVal
     [(MLast.PaApp _ (MLast.PaUid _ (Ploc.VaVal "Some")) p,
       Ploc.VaVal None, e);
      (MLast.PaAny _, Ploc.VaVal None,
       MLast.ExApp _ (MLast.ExLid _ (Ploc.VaVal "raise"))
         (MLast.ExApp _
            (MLast.ExAcc _ (MLast.ExUid _ (Ploc.VaVal "Stream"))
               (MLast.ExUid _ (Ploc.VaVal "Error")))
            e2))])

Which is less readable and much more complicated to build and update.

Instead of thinking of “a syntax tree”, the programmer can think of “a piece of program”.

Location

In all syntax tree nodes, the first parameter is the source location of the node.

In expressions

When a quotation is in the context of an expression, the location parameter is “loc” in the node and in all its possible sub-nodes. Example: if we consider the quotation:

<:sig_item< value foo : int -> bool >>

This quotation, in a context of an expression, is equivalent to:

MLast.SgVal loc (Ploc.VaVal "foo")
  (MLast.TyArr loc (MLast.TyLid loc (Ploc.VaVal "int"))
     (MLast.TyLid loc (Ploc.VaVal "bool")));

The name “loc” is predefined. However, it is possible to change it, using the argument “-loc” of the Camlp5 shell commands.

Consequently, if there is no variable “loc” defined in the context of the quotation, or if it is not of the good type, a semantic error occur in the OCaml compiler (“Unbound value loc”).

Note that in the extensible grammars, the variable “loc” is bound, in all semantic actions, to the location of the rule.

If the created node has no location, the programmer can define a variable named “loc” equal to “Ploc.dummy”.

In patterns

When a quotation is in the context of a pattern, the location parameter of all nodes and possible sub-nodes is set to the wildcard (”_”). The same example above:

<:sig_item< value foo : int -> bool >>

is equivalent, in a pattern, to:

MLast.SgVal _ (Ploc.VaVal "foo")
  (MLast.TyArr _ (MLast.TyLid _ (Ploc.VaVal "int"))
     (MLast.TyLid _ (Ploc.VaVal "bool")))

However, it is possible to generate a binding of the variable “loc” on the top node by writing a “colon” before the “less” in the quotation. The same example:

<:sig_item:< value foo : int -> bool >>

is equivalent to:

MLast.SgVal loc (Ploc.VaVal "foo")
  (MLast.TyArr _ (MLast.TyLid _ (Ploc.VaVal "int"))
     (MLast.TyLid _ (Ploc.VaVal "bool")))

Antiquotations

The expressions or patterns between dollar ($) characters are called antiquotations. In opposition to quotations which has its own syntax rules, the antiquotation is an area in the syntax of the enclosing context (expression or pattern). See the chapter about quotations.

If a quotation is in the context of an expression, the antiquotation must be an expression. It could be any expression, including function calls. Examples:

value f e el = <:expr< [$e$ :: $loop False el$] >>;
value patt_list p pl = <:patt< ( $list:[p::pl]$) >>;

If a quotation is in the context of an pattern, the antiquotation is a pattern. Any pattern is possible, including the wildcard character (”_”). Examples:

fun [ <:expr< $lid:op$ $_$ $_$ >> -> op ]
match p with [ <:patt< $_$ | $_$ >> -> Some p ]

Two kinds of antiquotations

Preliminary remark

In strict mode, we remark that most constructors defined of the module “MLast” are of type “Ploc.vala”. This type is defined like this:

type vala 'a =
  [ VaAnt of string
  | VaVal of 'a ]
;

The type argument is the real type of the node. For example, a value of type “bool” in transitional mode is frequently represented by a value of type “Ploc.vala bool”.

The first case of the type “vala” corresponds to an antiquotation in the concrete syntax. The second case to a normal syntax situation, without antiquotation.

Example: in the “let” statement, the fact that it is “rec” or not is represented by a boolean. This boolean is, in the syntax tree, encapsulated with the type “Ploc.vala”. The syntax tree of the two following lines:

let x = y in z
let rec x = y in z

start with, respectively:

MLast.ExLet loc (Ploc.VaVal False)
  ... (* and so on *)

and:

MLast.ExLet loc (Ploc.VaVal True)
  ... (* and so on *)

The case “Ploc.VaAnt s” is internally used by the parsers and by the quotation kit “q_ast.cmo” to record antiquotation strings representing the expression or the patterns having this value. For example, in this “let” statement:

MLast.ExLet loc (Ploc.VaAnt s)
  ... (* and so on *)

The contents of this “s” is internally handled. For information, it contains the antiquotation string (kind included) together with representation of the location of the antiquotation in the quotation. See the next section.

Antiquoting

To antiquotate the fact that the “let” is with or without rec (a flag of type boolean), there are two ways.

direct antiquoting

The first way, hidding the type “Ploc.val”, can be written with the antiquotation kind “flag”:

<:expr< let $flag:rf$ x = y in z >>

This corresponds to the syntax tree:

MLast.ExLet loc (Ploc.VaVal rf)
   ... (* and so on *)

And, therefore, the type of the variable “rf” is simply “bool”.

general antiquoting

The second way, introducing variables of type “Ploc.vala” can be written a kind prefixed by “_”, namely here “_flag”:

<:expr< let $_flag:rf$ x = y in z >>

In that case, it corresponds to the syntax tree:

MLast.ExLet loc rf
   ... (* and so on *)

And, therefore, the type of the variable “rf” is now “Ploc.vala bool”.

Remarks

The first form of antiquotation ensures the compatibility with previous versions of Camlp5. The syntax tree is not the same, but the bound variables keep the same type.

All antiquotations kinds have these two forms: one with some name (e.g. “flag”) and one with the same name prefixed by “a” (e.g. “aflag”).

Nodes and Quotations

This section describes all nodes defined in the module “MLast” of Camlp5 and how to write them with quotations. Notice that, inside quotations, one is not restricted to these elementary cases, but any complex value can be used, resulting on possibly complex combined nodes.

The quotation forms are described here in revised syntax (like the rest of this document). In reality, it depends on which quotation kit is loaded:

  • If “q_MLast.cmo” is used, the revised syntax is mandatory: the quotations must be in that syntax without any extension.
  • If “q_ast.cmo” is used, the quotation syntax must be in the current user syntax with all extensions added to compile the file.

Last remark: in the following tables, the variables names give information of their types. The details can be found in the distributed source file “mLast.mli”.

  • e, e1, e2, e3: expr
  • p, p1, p2, p3: patt
  • t, t1, t2, e3: ctyp
  • s: string
  • b: bool
  • me, me1, me2: module_expr
  • mt, mt1, mt2: module_type
  • le: list expr
  • lp: list patt
  • lt: list ctyp
  • ls: list string
  • lse: list (string * expr)
  • lpe: list (patt * expr)
  • lpp: list (patt * patt)
  • lpoee: list (patt * option expr * expr)
  • op: option patt
  • lcstri: list class_str_item
  • lcsigi: list class_sig_item

expr

Expressions of the language.

  • access
    <:expr< $e1$ . $e2$ >> MLast.ExAcc loc e1 e2
  • antiquotation (1)
    <:expr< $anti:e$ >> MLast.ExAnt loc e
  • application
    <:expr< $e1$ $e2$ >> MLast.ExApp loc e1 e2
  • array element
    <:expr< $e1$ .( $e2$ ) >> MLast.ExAre loc e1 e2
  • array
    <:expr< [| $list:le$ |] >> MLast.ExArr loc (Ploc.VaVal le) <:expr< [| $_list:le$ |] >> MLast.ExArr loc le
  • assert
    <:expr< assert $e$ >> MLast.ExAsr loc e
  • assignment
    <:expr< $e1$ := $e2$ >> MLast.ExAss loc e1 e2
  • big array element
    <:expr< $e$ .{ $list:le$ } >> MLast.ExBae loc e (Ploc.VaVal le) <:expr< $e$ .{ $_list:le$ } >> MLast.ExBae loc e le
  • character constant
    <:expr< $chr:s$ >> MLast.ExChr loc (Ploc.VaVal s) <:expr< $_chr:s$ >> MLast.ExChr loc s
  • coercion
    <:expr< ($e$ :> $t2$) >> MLast.ExCoe loc e None t2 <:expr< ($e$ : $t1$ :> $t2$) >> MLast.ExCoe loc e (Some t1) t2
  • float constant
    <:expr< $flo:s$ >> MLast.ExFlo loc (Ploc.VaVal s) <:expr< $_flo:s$ >> MLast.ExFlo loc s
  • for (increasing)
    <:expr< for $lid:s$ = $e1$ to $e2$ do { $list:le$ } >> MLast.ExFor loc (Ploc.VaVal s) e1 e2 (Ploc.VaVal True) (Ploc.VaVal le) <:expr< for $lid:s$ = $e1$ to $e2$ do { $_list:le$ } >> MLast.ExFor loc (Ploc.VaVal s) e1 e2 (Ploc.VaVal True) le
  • for (decreasing)
    <:expr< for $lid:s$ = $e1$ downto $e2$ do { $list:le$ } >> MLast.ExFor loc (Ploc.VaVal s) e1 e2 (Ploc.VaVal False) (Ploc.VaVal le) <:expr< for $lid:s$ = $e1$ downto $e2$ do { $_list:le$ } >> MLast.ExFor loc (Ploc.VaVal s) e1 e2 (Ploc.VaVal False) le
  • for
    <:expr< for $lid:s$ = $e1$ $to:b$ $e2$ do { $list:le$ } >> MLast.ExFor loc (Ploc.VaVal s) e1 e2 (Ploc.VaVal b) (Ploc.VaVal le) <:expr< for $lid:s$ = $e1$ $to:b$ $e2$ do { $_list:le$ } >> MLast.ExFor loc (Ploc.VaVal s) e1 e2 (Ploc.VaVal b) le <:expr< for $lid:s$ = $e1$ $_to:b$ $e2$ do { $list:le$ } >> MLast.ExFor loc (Ploc.VaVal s) e1 e2 b (Ploc.VaVal le) <:expr< for $lid:s$ = $e1$ $_to:b$ $e2$ do { $_list:le$ } >> MLast.ExFor loc (Ploc.VaVal s) e1 e2 b le <:expr< for $_lid:s$ = $e1$ to $e2$ do { $list:le$ } >> MLast.ExFor loc s e1 e2 (Ploc.VaVal True) (Ploc.VaVal le) <:expr< for $_lid:s$ = $e1$ to $e2$ do { $_list:le$ } >> MLast.ExFor loc s e1 e2 (Ploc.VaVal True) le <:expr< for $_lid:s$ = $e1$ downto $e2$ do { $list:le$ } >> MLast.ExFor loc s e1 e2 (Ploc.VaVal False) (Ploc.VaVal le) <:expr< for $_lid:s$ = $e1$ downto $e2$ do { $_list:le$ } >> MLast.ExFor loc s e1 e2 (Ploc.VaVal False) le <:expr< for $_lid:s$ = $e1$ $to:b$ $e2$ do { $list:le$ } >> MLast.ExFor loc s e1 e2 (Ploc.VaVal b) (Ploc.VaVal le) <:expr< for $_lid:s$ = $e1$ $to:b$ $e2$ do { $_list:le$ } >> MLast.ExFor loc s e1 e2 (Ploc.VaVal b) le <:expr< for $_lid:s$ = $e1$ $_to:b$ $e2$ do { $list:le$ } >> MLast.ExFor loc s e1 e2 b (Ploc.VaVal le) <:expr< for $_lid:s$ = $e1$ $_to:b$ $e2$ do { $_list:le$ } >> MLast.ExFor loc s e1 e2 b le
  • function
    <:expr< fun [ $list:lpee$ ] >> MLast.ExFun loc (Ploc.VaVal lpee) <:expr< fun [ $_list:lpee$ ] >> MLast.ExFun loc lpee
  • if
    <:expr< if $e1$ then $e2$ else $e3$ >> MLast.ExIfe loc e1 e2 e3
  • integer constant
    <:expr< $int:s1$ >> MLast.ExInt loc (Ploc.VaVal s1) "" <:expr< $_int:s1$ >> MLast.ExInt loc s1 ""
  • integer 32 bits
    <:expr< $int32:s1$ >> MLast.ExInt loc (Ploc.VaVal s1) "l" <:expr< $_int32:s1$ >> MLast.ExInt loc s1 "l"
  • integer 64 bits
    <:expr< $int64:s1$ >> MLast.ExInt loc (Ploc.VaVal s1) "L" <:expr< $_int64:s1$ >> MLast.ExInt loc s1 "L"
  • native integer
    <:expr< $nativeint:s1$ >> MLast.ExInt loc (Ploc.VaVal s1) "n" <:expr< $_nativeint:s1$ >> MLast.ExInt loc s1 "n"
  • label
    <:expr< ~{$p$} >> MLast.ExLab loc p (Ploc.VaVal None) <:expr< ~{$p$ = $e$} >> MLast.ExLab loc p (Ploc.VaVal (Some e)) <:expr< ~{$p$ $opt:oe$} >> MLast.ExLab loc p (Ploc.VaVal oe) <:expr< ~{$p$ $_opt:oe$} >> MLast.ExLab loc p oe
  • lazy
    <:expr< lazy $e$ >> MLast.ExLaz loc e
  • let rec
    <:expr< let rec $list:lpe$ in $e$ >> MLast.ExLet loc (Ploc.VaVal True) (Ploc.VaVal lpe) e <:expr< let rec $_list:lpe$ in $e$ >> MLast.ExLet loc (Ploc.VaVal True) lpe e
  • let not rec
    <:expr< let $list:lpe$ in $e$ >> MLast.ExLet loc (Ploc.VaVal False) (Ploc.VaVal lpe) e <:expr< let $_list:lpe$ in $e$ >> MLast.ExLet loc (Ploc.VaVal False) lpe e
  • let
    <:expr< let $flag:b$ $list:lpe$ in $e$ >> MLast.ExLet loc (Ploc.VaVal b) (Ploc.VaVal lpe) e <:expr< let $flag:b$ $_list:lpe$ in $e$ >> MLast.ExLet loc (Ploc.VaVal b) lpe e <:expr< let $_flag:b$ $list:lpe$ in $e$ >> MLast.ExLet loc b (Ploc.VaVal lpe) e <:expr< let $_flag:b$ $_list:lpe$ in $e$ >> MLast.ExLet loc b lpe e
  • lowercase identifier
    <:expr< $lid:s$ >> MLast.ExLid loc (Ploc.VaVal s) <:expr< $_lid:s$ >> MLast.ExLid loc s
  • let module
    <:expr< let module $uid:s$ = $me$ in $e$ >> MLast.ExLmd loc (Ploc.VaVal s) me e <:expr< let module $_uid:s$ = $me$ in $e$ >> MLast.ExLmd loc s me e
  • match
    <:expr< match $e$ with [ $list:lpee$ ] >> MLast.ExMat loc e (Ploc.VaVal lpee) <:expr< match $e$ with [ $_list:lpee$ ] >> MLast.ExMat loc e lpee
  • new
    <:expr< new $list:ls$ >> MLast.ExNew loc (Ploc.VaVal ls) <:expr< new $_list:ls$ >> MLast.ExNew loc ls
  • object expression
    <:expr< object $list:lcsi$ end >> MLast.ExObj loc (Ploc.VaVal None) (Ploc.VaVal lcsi) <:expr< object $_list:lcsi$ end >> MLast.ExObj loc (Ploc.VaVal None) lcsi <:expr< object ($p$) $list:lcsi$ end >> MLast.ExObj loc (Ploc.VaVal (Some p)) (Ploc.VaVal lcsi) <:expr< object ($p$) $_list:lcsi$ end >> MLast.ExObj loc (Ploc.VaVal (Some p)) lcsi <:expr< object $opt:op$ $list:lcsi$ end >> MLast.ExObj loc (Ploc.VaVal op) (Ploc.VaVal lcsi) <:expr< object $opt:op$ $_list:lcsi$ end >> MLast.ExObj loc (Ploc.VaVal op) lcsi <:expr< object $_opt:op$ $list:lcsi$ end >> MLast.ExObj loc op (Ploc.VaVal lcsi) <:expr< object $_opt:op$ $_list:lcsi$ end >> MLast.ExObj loc op lcsi
  • option label
    <:expr< ?{$p$} >> MLast.ExOlb loc p (Ploc.VaVal None) <:expr< ?{$p$ = $e$} >> MLast.ExOlb loc p (Ploc.VaVal (Some e)) <:expr< ?{$p$ $opt:oe$} >> MLast.ExOlb loc p (Ploc.VaVal oe) <:expr< ?{$p$ $_opt:oe$} >> MLast.ExOlb loc p oe
  • override
    <:expr< {< $list:lse$ >} >> MLast.ExOvr loc (Ploc.VaVal lse) <:expr< {< $_list:lse$ >} >> MLast.ExOvr loc lse
  • module packing
    <:expr< (module $me$) >> MLast.ExPck loc me None <:expr< (module $me$ : $mt$) >> MLast.ExPck loc me (Some mt)
  • record
    <:expr< {$list:lpe$} >> MLast.ExRec loc (Ploc.VaVal lpe) None <:expr< {($e$) with $list:lpe$} >> MLast.ExRec loc (Ploc.VaVal lpe) (Some e) <:expr< {$_list:lpe$} >> MLast.ExRec loc lpe None <:expr< {($e$) with $_list:lpe$} >> MLast.ExRec loc lpe (Some e)
  • sequence
    <:expr< do { $list:le$ } >> MLast.ExSeq loc (Ploc.VaVal le) <:expr< do { $_list:le$ } >> MLast.ExSeq loc le
  • method call
    <:expr< $e$ # $s$ >> MLast.ExSnd loc e (Ploc.VaVal s) <:expr< $e$ # $_:s$ >> MLast.ExSnd loc e s
  • string element
    <:expr< $e1$ .[ $e2$ ] >> MLast.ExSte loc e1 e2
  • string
    <:expr< $str:s$ >> MLast.ExStr loc (Ploc.VaVal s) <:expr< $_str:s$ >> MLast.ExStr loc s
  • try
    <:expr< try $e$ with [ $list:lpee$ ] >> MLast.ExTry loc e (Ploc.VaVal lpee) <:expr< try $e$ with [ $_list:lpee$ ] >> MLast.ExTry loc e lpee
  • t-uple
    <:expr< ($list:le$) >> MLast.ExTup loc (Ploc.VaVal le) <:expr< ($_list:le$) >> MLast.ExTup loc le
  • type constraint
    <:expr< ($e$ : $t$) >> MLast.ExTyc loc e t
  • uppercase identifier
    <:expr< $uid:s$ >> MLast.ExUid loc (Ploc.VaVal s) <:expr< $_uid:s$ >> MLast.ExUid loc s
  • variant
    <:expr< ` $s$ >> MLast.ExVrn loc (Ploc.VaVal s) <:expr< ` $_:s$ >> MLast.ExVrn loc s
  • while
    <:expr< while $e$ do { $list:le$ } >> MLast.ExWhi loc e (Ploc.VaVal le) <:expr< while $e$ do { $_list:le$ } >> MLast.ExWhi loc e le
  • extra node (2)
    ... no representation ... MLast.ExXtr loc s oe
(1) Node used in the quotation expanders to tells at conversion to OCaml compiler syntax tree time, that all locations of the sub-tree is correcty located in the quotation. By default, in quotations, the locations of all generated nodes are the location of the whole quotation. This node allows to make an exception to this rule, since we know that the antiquotation belongs to the universe of the enclosing program. See the chapter about quotations and, in particular, its section about antiquotations.
(2) Extra node internally used by the quotation kit “q_ast.cmo” to build antiquotations of expressions.

patt

Patterns of the language.

  • access
    <:patt< $p1$ . $p2$ >> MLast.PaAcc loc p1 p2
  • alias
    <:patt< ($p1$ as $p2$) >> MLast.PaAli loc p1 p2
  • antiquotation (1)
    <:patt< $anti:p$ >> MLast.PaAnt loc p
  • wildcard
    <:patt< _ >> MLast.PaAny loc
  • application
    <:patt< $p1$ $p2$ >> MLast.PaApp loc p1 p2
  • array
    <:patt< [| $list:lp$ |] >> MLast.PaArr loc (Ploc.VaVal lp) <:patt< [| $_list:lp$ |] >> MLast.PaArr loc lp
  • character
    <:patt< $chr:s$ >> MLast.PaChr loc (Ploc.VaVal s) <:patt< $_chr:s$ >> MLast.PaChr loc s
  • float
    <:patt< $flo:s$ >> MLast.PaFlo loc (Ploc.VaVal s) <:patt< $_flo:s$ >> MLast.PaFlo loc s
  • integer constant
    <:patt< $int:s1$ >> MLast.PaInt loc (Ploc.VaVal s1) "" <:patt< $_int:s1$ >> MLast.PaInt loc s1 ""
  • integer 32 bits
    <:patt< $int32:s1$ >> MLast.PaInt loc (Ploc.VaVal s1) "l" <:patt< $_int32:s1$ >> MLast.PaInt loc s1 "l"
  • integer 64 bits
    <:patt< $int64:s1$ >> MLast.PaInt loc (Ploc.VaVal s1) "L" <:patt< $_int64:s1$ >> MLast.PaInt loc s1 "L"
  • native integer
    <:patt< $nativeint:s1$ >> MLast.PaInt loc (Ploc.VaVal s1) "n" <:patt< $_nativeint:s1$ >> MLast.PaInt loc s1 "n"
  • label
    <:patt< ~{$p1$} >> MLast.PaLab loc p1 (Ploc.VaVal None) <:patt< ~{$p1$ = $p2$} >> MLast.PaLab loc p1 (Ploc.VaVal (Some p2)) <:patt< ~{$p1$ $opt:op2$} >> MLast.PaLab loc p1 (Ploc.VaVal op2) <:patt< ~{$p1$ $_opt:op2$} >> MLast.PaLab loc p1 op2
  • lazy
    <:patt< lazy $p$ >> MLast.PaLaz loc p
  • lowercase identifier
    <:patt< $lid:s$ >> MLast.PaLid loc (Ploc.VaVal s) <:patt< $_lid:s$ >> MLast.PaLid loc s
  • new type
    <:patt< (type $lid:s$) >> MLast.PaNty loc (Ploc.VaVal s) <:patt< (type $_lid:s$) >> MLast.PaNty loc s
  • option label
    <:patt< ?{$p$} >> MLast.PaOlb loc p (Ploc.VaVal None) <:patt< ?{$p$ = $e$} >> MLast.PaOlb loc p (Ploc.VaVal (Some e)) <:patt< ?{$p$ $opt:oe$} >> MLast.PaOlb loc p (Ploc.VaVal oe) <:patt< ?{$p$ $_opt:oe$} >> MLast.PaOlb loc p oe
  • or
    <:patt< $p1$ | $p2$ >> MLast.PaOrp loc p1 p2
  • record
    <:patt< { $list:lpp$ } >> MLast.PaRec loc (Ploc.VaVal lpp) <:patt< { $_list:lpp$ } >> MLast.PaRec loc lpp
  • range
    <:patt< $p1$ .. $p2$ >> MLast.PaRng loc p1 p2
  • string
    <:patt< $str:s$ >> MLast.PaStr loc (Ploc.VaVal s) <:patt< $_str:s$ >> MLast.PaStr loc s
  • t-uple
    <:patt< ($list:lp$) >> MLast.PaTup loc (Ploc.VaVal lp) <:patt< ($_list:lp$) >> MLast.PaTup loc lp
  • type constraint
    <:patt< ($p$ : $t$) >> MLast.PaTyc loc p t
  • type pattern
    <:patt< # $list:ls$ >> MLast.PaTyp loc (Ploc.VaVal ls) <:patt< # $_list:ls$ >> MLast.PaTyp loc ls
  • uppercase identifier
    <:patt< $uid:s$ >> MLast.PaUid loc (Ploc.VaVal s) <:patt< $_uid:s$ >> MLast.PaUid loc s
  • module unpacking
    <:patt< (module $uid:s$) >> MLast.PaUnp loc (Ploc.VaVal s) None <:patt< (module $uid:s$ : $mt$) >> MLast.PaUnp loc (Ploc.VaVal s) (Some mt) <:patt< (module $_uid:s$) >> MLast.PaUnp loc s None <:patt< (module $_uid:s$ : $mt$) >> MLast.PaUnp loc s (Some mt)
  • variant
    <:patt< ` $s$ >> MLast.PaVrn loc (Ploc.VaVal s) <:patt< ` $_:s$ >> MLast.PaVrn loc s
  • extra node (2)
    ... no representation ... MLast.PaXtr loc s op
(1) Node used to specify an antiquotation area, like for the equivalent node in expressions. See above.
(2) Extra node internally used by the quotation kit “q_ast.cmo” to build antiquotations of patterns.

ctyp

Type expressions of the language.

  • access
    <:ctyp< $t1$ . $t2$ >> MLast.TyAcc loc t1 t2
  • alias
    <:ctyp< $t1$ as $t2$ >> MLast.TyAli loc t1 t2
  • wildcard
    <:ctyp< _ >> MLast.TyAny loc
  • application
    <:ctyp< $t1$ $t2$ >> MLast.TyApp loc t1 t2
  • arrow
    <:ctyp< $t1$ -> $t2$ >> MLast.TyArr loc t1 t2
  • class
    <:ctyp< # $list:ls$ >> MLast.TyCls loc (Ploc.VaVal ls) <:ctyp< # $_list:ls$ >> MLast.TyCls loc ls
  • label
    <:ctyp< ~$s$: $t$ >> MLast.TyLab loc (Ploc.VaVal s) t <:ctyp< ~$_:s$: $t$ >> MLast.TyLab loc s t
  • lowercase identifier
    <:ctyp< $lid:s$ >> MLast.TyLid loc (Ploc.VaVal s) <:ctyp< $_lid:s$ >> MLast.TyLid loc s
  • manifest
    <:ctyp< $t1$ == private $t2$ >> MLast.TyMan loc t1 (Ploc.VaVal True) t2 <:ctyp< $t1$ == $t2$ >> MLast.TyMan loc t1 (Ploc.VaVal False) t2 <:ctyp< $t1$ == $priv:b$ $t2$ >> MLast.TyMan loc t1 (Ploc.VaVal b) t2 <:ctyp< $t1$ == $_priv:b$ $t2$ >> MLast.TyMan loc t1 b t2
  • object
    <:ctyp< < $list:lst$ .. > >> MLast.TyObj loc (Ploc.VaVal lst) (Ploc.VaVal True) <:ctyp< < $list:lst$ > >> MLast.TyObj loc (Ploc.VaVal lst) (Ploc.VaVal False) <:ctyp< < $list:lst$ $flag:b$ > >> MLast.TyObj loc (Ploc.VaVal lst) (Ploc.VaVal b) <:ctyp< < $list:lst$ $_flag:b$ > >> MLast.TyObj loc (Ploc.VaVal lst) b <:ctyp< < $_list:lst$ .. > >> MLast.TyObj loc lst (Ploc.VaVal True) <:ctyp< < $_list:lst$ > >> MLast.TyObj loc lst (Ploc.VaVal False) <:ctyp< < $_list:lst$ $flag:b$ > >> MLast.TyObj loc lst (Ploc.VaVal b) <:ctyp< < $_list:lst$ $_flag:b$ > >> MLast.TyObj loc lst b
  • option label
    <:ctyp< ?$s$: $t$ >> MLast.TyOlb loc (Ploc.VaVal s) t <:ctyp< ?$_:s$: $t$ >> MLast.TyOlb loc s t
  • package
    <:ctyp< (module $mt$) >> MLast.TyPck loc mt
  • polymorph
    <:ctyp< ! $list:ls$ . $t$ >> MLast.TyPol loc (Ploc.VaVal ls) t <:ctyp< ! $_list:ls$ . $t$ >> MLast.TyPol loc ls t
  • variable
    <:ctyp< '$s$ >> MLast.TyQuo loc (Ploc.VaVal s) <:ctyp< '$_:s$ >> MLast.TyQuo loc s
  • record
    <:ctyp< { $list:llsbt$ } >> MLast.TyRec loc (Ploc.VaVal llsbt) <:ctyp< { $_list:llsbt$ } >> MLast.TyRec loc llsbt
  • sum
    <:ctyp< [ $list:llslt$ ] >> MLast.TySum loc (Ploc.VaVal llslt) <:ctyp< [ $_list:llslt$ ] >> MLast.TySum loc llslt
  • t-uple
    <:ctyp< ( $list:lt$ ) >> MLast.TyTup loc (Ploc.VaVal lt) <:ctyp< ( $_list:lt$ ) >> MLast.TyTup loc lt
  • uppercase identifier
    <:ctyp< $uid:s$ >> MLast.TyUid loc (Ploc.VaVal s) <:ctyp< $_uid:s$ >> MLast.TyUid loc s
  • variant
    <:ctyp< [ = $list:lpv$ ] >> MLast.TyVrn loc (Ploc.VaVal lpv) None <:ctyp< [ > $list:lpv$ ] >> MLast.TyVrn loc (Ploc.VaVal lpv) (Some None) <:ctyp< [ < $list:lpv$ ] >> MLast.TyVrn loc (Ploc.VaVal lpv) (Some (Some (Ploc.VaVal []))) <:ctyp< [ < $list:lpv$ > $list:ls$ ] >> MLast.TyVrn loc (Ploc.VaVal lpv) (Some (Some (Ploc.VaVal ls))) <:ctyp< [ < $list:lpv$ > $_list:ls$ ] >> MLast.TyVrn loc (Ploc.VaVal lpv) (Some (Some ls)) <:ctyp< [ = $_list:lpv$ ] >> MLast.TyVrn loc lpv None <:ctyp< [ > $_list:lpv$ ] >> MLast.TyVrn loc lpv (Some None) <:ctyp< [ < $_list:lpv$ ] >> MLast.TyVrn loc lpv (Some (Some (Ploc.VaVal []))) <:ctyp< [ < $_list:lpv$ > $list:ls$ ] >> MLast.TyVrn loc lpv (Some (Some (Ploc.VaVal ls))) <:ctyp< [ < $_list:lpv$ > $_list:ls$ ] >> MLast.TyVrn loc lpv (Some (Some ls))
  • extra node (1)
    ... no representation ... MLast.TyXtr loc s ot
(1) Extra node internally used by the quotation kit “q_ast.cmo” to build antiquotations of types.

modules…

str_item

Structure items, i.e. phrases in a “.ml” file or “struct” elements.

  • class declaration
    <:str_item< class $list:lcice$ >> MLast.StCls loc (Ploc.VaVal lcice) <:str_item< class $_list:lcice$ >> MLast.StCls loc lcice
  • class type declaration
    <:str_item< class type $list:lcict$ >> MLast.StClt loc (Ploc.VaVal lcict) <:str_item< class type $_list:lcict$ >> MLast.StClt loc lcict
  • declare
    <:str_item< declare $list:lsi$ end >> MLast.StDcl loc (Ploc.VaVal lsi) <:str_item< declare $_list:lsi$ end >> MLast.StDcl loc lsi
  • directive
    <:str_item< # $lid:s$ >> MLast.StDir loc (Ploc.VaVal s) (Ploc.VaVal None) <:str_item< # $lid:s$ $e$ >> MLast.StDir loc (Ploc.VaVal s) (Ploc.VaVal (Some e)) <:str_item< # $lid:s$ $opt:oe$ >> MLast.StDir loc (Ploc.VaVal s) (Ploc.VaVal oe) <:str_item< # $lid:s$ $_opt:oe$ >> MLast.StDir loc (Ploc.VaVal s) oe <:str_item< # $_lid:s$ >> MLast.StDir loc s (Ploc.VaVal None) <:str_item< # $_lid:s$ $e$ >> MLast.StDir loc s (Ploc.VaVal (Some e)) <:str_item< # $_lid:s$ $opt:oe$ >> MLast.StDir loc s (Ploc.VaVal oe) <:str_item< # $_lid:s$ $_opt:oe$ >> MLast.StDir loc s oe
  • exception
    <:str_item< exception $uid:s$ >> MLast.StExc loc (Ploc.VaVal s) (Ploc.VaVal []) (Ploc.VaVal []) <:str_item< exception $uid:s$ of $list:lt$ >> MLast.StExc loc (Ploc.VaVal s) (Ploc.VaVal lt) (Ploc.VaVal []) <:str_item< exception $uid:s$ = $list:ls$ >> MLast.StExc loc (Ploc.VaVal s) (Ploc.VaVal []) (Ploc.VaVal ls) <:str_item< exception $uid:s$ of $list:lt$ = $list:ls$ >> MLast.StExc loc (Ploc.VaVal s) (Ploc.VaVal lt) (Ploc.VaVal ls) <:str_item< exception $uid:s$ = $_list:ls$ >> MLast.StExc loc (Ploc.VaVal s) (Ploc.VaVal []) ls <:str_item< exception $uid:s$ of $list:lt$ = $_list:ls$ >> MLast.StExc loc (Ploc.VaVal s) (Ploc.VaVal lt) ls <:str_item< exception $uid:s$ of $_list:lt$ >> MLast.StExc loc (Ploc.VaVal s) lt (Ploc.VaVal []) <:str_item< exception $uid:s$ of $_list:lt$ = $list:ls$ >> MLast.StExc loc (Ploc.VaVal s) lt (Ploc.VaVal ls) <:str_item< exception $uid:s$ of $_list:lt$ = $_list:ls$ >> MLast.StExc loc (Ploc.VaVal s) lt ls <:str_item< exception $_uid:s$ >> MLast.StExc loc s (Ploc.VaVal []) (Ploc.VaVal []) <:str_item< exception $_uid:s$ of $list:lt$ >> MLast.StExc loc s (Ploc.VaVal lt) (Ploc.VaVal []) <:str_item< exception $_uid:s$ = $list:ls$ >> MLast.StExc loc s (Ploc.VaVal []) (Ploc.VaVal ls) <:str_item< exception $_uid:s$ of $list:lt$ = $list:ls$ >> MLast.StExc loc s (Ploc.VaVal lt) (Ploc.VaVal ls) <:str_item< exception $_uid:s$ = $_list:ls$ >> MLast.StExc loc s (Ploc.VaVal []) ls <:str_item< exception $_uid:s$ of $list:lt$ = $_list:ls$ >> MLast.StExc loc s (Ploc.VaVal lt) ls <:str_item< exception $_uid:s$ of $_list:lt$ >> MLast.StExc loc s lt (Ploc.VaVal []) <:str_item< exception $_uid:s$ of $_list:lt$ = $list:ls$ >> MLast.StExc loc s lt (Ploc.VaVal ls) <:str_item< exception $_uid:s$ of $_list:lt$ = $_list:ls$ >> MLast.StExc loc s lt ls
  • expression
    <:str_item< $exp:e$ >> MLast.StExp loc e
  • external
    <:str_item< external $s$ : $t$ = $list:ls$ >> MLast.StExt loc (Ploc.VaVal s) t (Ploc.VaVal ls) <:str_item< external $s$ : $t$ = $_list:ls$ >> MLast.StExt loc (Ploc.VaVal s) t ls <:str_item< external $_:s$ : $t$ = $list:ls$ >> MLast.StExt loc s t (Ploc.VaVal ls) <:str_item< external $_:s$ : $t$ = $_list:ls$ >> MLast.StExt loc s t ls
  • include
    <:str_item< include $me$ >> MLast.StInc loc me
  • module rec
    <:str_item< module rec $list:lsme$ >> MLast.StMod loc (Ploc.VaVal True) (Ploc.VaVal lsme) <:str_item< module rec $_list:lsme$ >> MLast.StMod loc (Ploc.VaVal True) lsme
  • module non rec
    <:str_item< module $list:lsme$ >> MLast.StMod loc (Ploc.VaVal False) (Ploc.VaVal lsme) <:str_item< module $_list:lsme$ >> MLast.StMod loc (Ploc.VaVal False) lsme
  • module
    <:str_item< module $flag:b$ $list:lsme$ >> MLast.StMod loc (Ploc.VaVal b) (Ploc.VaVal lsme) <:str_item< module $flag:b$ $_list:lsme$ >> MLast.StMod loc (Ploc.VaVal b) lsme <:str_item< module $_flag:b$ $list:lsme$ >> MLast.StMod loc b (Ploc.VaVal lsme) <:str_item< module $_flag:b$ $_list:lsme$ >> MLast.StMod loc b lsme
  • module type
    <:str_item< module type $s$ = $mt$ >> MLast.StMty loc (Ploc.VaVal s) mt <:str_item< module type $_:s$ = $mt$ >> MLast.StMty loc s mt
  • open
    <:str_item< open $list:ls$ >> MLast.StOpn loc (Ploc.VaVal ls) <:str_item< open $_list:ls$ >> MLast.StOpn loc ls
  • type declaration
    <:str_item< type $list:ltd$ >> MLast.StTyp loc (Ploc.VaVal ltd) <:str_item< type $_list:ltd$ >> MLast.StTyp loc ltd
  • … internal use … (1)
    <:str_item< # $str:s$ $list:lsil$ >> MLast.StUse loc (Ploc.VaVal s) (Ploc.VaVal lsil) <:str_item< # $str:s$ $_list:lsil$ >> MLast.StUse loc (Ploc.VaVal s) lsil <:str_item< # $_str:s$ $list:lsil$ >> MLast.StUse loc s (Ploc.VaVal lsil) <:str_item< # $_str:s$ $_list:lsil$ >> MLast.StUse loc s lsil
  • value rec
    <:str_item< value rec $list:lpe$ >> MLast.StVal loc (Ploc.VaVal True) (Ploc.VaVal lpe) <:str_item< value rec $_list:lpe$ >> MLast.StVal loc (Ploc.VaVal True) lpe
  • value non rec
    <:str_item< value $list:lpe$ >> MLast.StVal loc (Ploc.VaVal False) (Ploc.VaVal lpe) <:str_item< value $_list:lpe$ >> MLast.StVal loc (Ploc.VaVal False) lpe
  • value
    <:str_item< value $flag:b$ $list:lpe$ >> MLast.StVal loc (Ploc.VaVal b) (Ploc.VaVal lpe) <:str_item< value $flag:b$ $_list:lpe$ >> MLast.StVal loc (Ploc.VaVal b) lpe <:str_item< value $_flag:b$ $list:lpe$ >> MLast.StVal loc b (Ploc.VaVal lpe) <:str_item< value $_flag:b$ $_list:lpe$ >> MLast.StVal loc b lpe
  • extra node (2)
    ... no representation ... MLast.StXtr loc s ot
(1) Node internally used to specify a different file name applying to the whole subtree. This is generated by the directive “use” and used when converting to the OCaml syntax tree which needs the file name in its location type.
(2) Extra node internally used by the quotation kit “q_ast.cmo” to build antiquotations of structure items.

sig_item

Signature items, i.e. phrases in a “.mli” file or elements inside “sig … end”.

  • class
    <:sig_item< class $list:lcict$ >> MLast.SgCls loc (Ploc.VaVal lcict) <:sig_item< class $_list:lcict$ >> MLast.SgCls loc lcict
  • class type
    <:sig_item< class type $list:lcict$ >> MLast.SgClt loc (Ploc.VaVal lcict) <:sig_item< class type $_list:lcict$ >> MLast.SgClt loc lcict
  • declare
    <:sig_item< declare $list:lsi$ end >> MLast.SgDcl loc (Ploc.VaVal lsi) <:sig_item< declare $_list:lsi$ end >> MLast.SgDcl loc lsi
  • directive
    <:sig_item< # $lid:s$ >> MLast.SgDir loc (Ploc.VaVal s) (Ploc.VaVal None) <:sig_item< # $lid:s$ $e$ >> MLast.SgDir loc (Ploc.VaVal s) (Ploc.VaVal (Some e)) <:sig_item< # $lid:s$ $opt:oe$ >> MLast.SgDir loc (Ploc.VaVal s) (Ploc.VaVal oe) <:sig_item< # $lid:s$ $_opt:oe$ >> MLast.SgDir loc (Ploc.VaVal s) oe <:sig_item< # $_lid:s$ >> MLast.SgDir loc s (Ploc.VaVal None) <:sig_item< # $_lid:s$ $e$ >> MLast.SgDir loc s (Ploc.VaVal (Some e)) <:sig_item< # $_lid:s$ $opt:oe$ >> MLast.SgDir loc s (Ploc.VaVal oe) <:sig_item< # $_lid:s$ $_opt:oe$ >> MLast.SgDir loc s oe
  • exception
    <:sig_item< exception $s$ >> MLast.SgExc loc (Ploc.VaVal s) (Ploc.VaVal []) <:sig_item< exception $s$ of $list:lt$ >> MLast.SgExc loc (Ploc.VaVal s) (Ploc.VaVal lt) <:sig_item< exception $s$ of $_list:lt$ >> MLast.SgExc loc (Ploc.VaVal s) lt <:sig_item< exception $_:s$ >> MLast.SgExc loc s (Ploc.VaVal []) <:sig_item< exception $_:s$ of $list:lt$ >> MLast.SgExc loc s (Ploc.VaVal lt) <:sig_item< exception $_:s$ of $_list:lt$ >> MLast.SgExc loc s lt
  • external
    <:sig_item< external $s$ : $t$ = $list:ls$ >> MLast.SgExt loc (Ploc.VaVal s) t (Ploc.VaVal ls) <:sig_item< external $s$ : $t$ = $_list:ls$ >> MLast.SgExt loc (Ploc.VaVal s) t ls <:sig_item< external $_:s$ : $t$ = $list:ls$ >> MLast.SgExt loc s t (Ploc.VaVal ls) <:sig_item< external $_:s$ : $t$ = $_list:ls$ >> MLast.SgExt loc s t ls
  • include
    <:sig_item< include $mt$ >> MLast.SgInc loc mt
  • module rec
    <:sig_item< module rec $list:lsmt$ >> MLast.SgMod loc (Ploc.VaVal True) (Ploc.VaVal lsmt) <:sig_item< module rec $_list:lsmt$ >> MLast.SgMod loc (Ploc.VaVal True) lsmt
  • module non rec
    <:sig_item< module $list:lsmt$ >> MLast.SgMod loc (Ploc.VaVal False) (Ploc.VaVal lsmt) <:sig_item< module $_list:lsmt$ >> MLast.SgMod loc (Ploc.VaVal False) lsmt
  • module
    <:sig_item< module $flag:b$ $list:lsmt$ >> MLast.SgMod loc (Ploc.VaVal b) (Ploc.VaVal lsmt) <:sig_item< module $flag:b$ $_list:lsmt$ >> MLast.SgMod loc (Ploc.VaVal b) lsmt <:sig_item< module $_flag:b$ $list:lsmt$ >> MLast.SgMod loc b (Ploc.VaVal lsmt) <:sig_item< module $_flag:b$ $_list:lsmt$ >> MLast.SgMod loc b lsmt
  • module type
    <:sig_item< module type $s$ = $mt$ >> MLast.SgMty loc (Ploc.VaVal s) mt <:sig_item< module type $_:s$ = $mt$ >> MLast.SgMty loc s mt
  • open
    <:sig_item< open $list:ls$ >> MLast.SgOpn loc (Ploc.VaVal ls) <:sig_item< open $_list:ls$ >> MLast.SgOpn loc ls
  • type declaration
    <:sig_item< type $list:ltd$ >> MLast.SgTyp loc (Ploc.VaVal ltd) <:sig_item< type $_list:ltd$ >> MLast.SgTyp loc ltd
  • … internal use … (1)
    <:sig_item< # $str:s$ $list:lsil$ >> MLast.SgUse loc (Ploc.VaVal s) (Ploc.VaVal lsil) <:sig_item< # $str:s$ $_list:lsil$ >> MLast.SgUse loc (Ploc.VaVal s) lsil <:sig_item< # $_str:s$ $list:lsil$ >> MLast.SgUse loc s (Ploc.VaVal lsil) <:sig_item< # $_str:s$ $_list:lsil$ >> MLast.SgUse loc s lsil
  • value
    <:sig_item< value $s$ : $t$ >> MLast.SgVal loc (Ploc.VaVal s) t <:sig_item< value $_:s$ : $t$ >> MLast.SgVal loc s t
  • extra node (2)
    ... no representation ... MLast.SgXtr loc s ot
  1. Same remark as for “str_item” above.
(2) Extra node internally used by the quotation kit “q_ast.cmo” to build antiquotations of signature items.

module_expr

  • access
    <:module_expr< $me1$ . $me2$ >> MLast.MeAcc loc me1 me2
  • application
    <:module_expr< $me1$ $me2$ >> MLast.MeApp loc me1 me2
  • functor
    <:module_expr< functor ($s$ : $mt$) -> $me$ >> MLast.MeFun loc (Ploc.VaVal s) mt me <:module_expr< functor ($_:s$ : $mt$) -> $me$ >> MLast.MeFun loc s mt me
  • struct
    <:module_expr< struct $list:lsi$ end >> MLast.MeStr loc (Ploc.VaVal lsi) <:module_expr< struct $_list:lsi$ end >> MLast.MeStr loc lsi
  • module type constraint
    <:module_expr< ($me$ : $mt$) >> MLast.MeTyc loc me mt
  • uppercase identifier
    <:module_expr< $uid:s$ >> MLast.MeUid loc (Ploc.VaVal s) <:module_expr< $_uid:s$ >> MLast.MeUid loc s
  • module unpacking
    <:module_expr< (value $e$) >> MLast.MeUnp loc e None <:module_expr< (value $e$ : $mt$) >> MLast.MeUnp loc e (Some mt)
  • extra node (1)
    ... no representation ... MLast.MeXtr loc s ot
(1) Extra node internally used by the quotation kit “q_ast.cmo” to build antiquotations of module expressions.

module_type

  • access
    <:module_type< $mt1$ . $mt2$ >> MLast.MtAcc loc mt1 mt2
  • application
    <:module_type< $mt1$ $mt2$ >> MLast.MtApp loc mt1 mt2
  • functor
    <:module_type< functor ($s$ : $mt1$) -> $mt2$ >> MLast.MtFun loc (Ploc.VaVal s) mt1 mt2 <:module_type< functor ($_:s$ : $mt1$) -> $mt2$ >> MLast.MtFun loc s mt1 mt2
  • lowercase identifier
    <:module_type< $lid:s$ >> MLast.MtLid loc (Ploc.VaVal s) <:module_type< $_lid:s$ >> MLast.MtLid loc s
  • abstract
    <:module_type< ' $s$ >> MLast.MtQuo loc (Ploc.VaVal s) <:module_type< ' $_:s$ >> MLast.MtQuo loc s
  • signature
    <:module_type< sig $list:lsi$ end >> MLast.MtSig loc (Ploc.VaVal lsi) <:module_type< sig $_list:lsi$ end >> MLast.MtSig loc lsi
  • of module expression
    <:module_type< module type of $me$ >> MLast.MtTyo loc me
  • uppercase identifier
    <:module_type< $uid:s$ >> MLast.MtUid loc (Ploc.VaVal s) <:module_type< $_uid:s$ >> MLast.MtUid loc s
  • with construction
    <:module_type< $mt$ with $list:lwc$ >> MLast.MtWit loc mt (Ploc.VaVal lwc) <:module_type< $mt$ with $_list:lwc$ >> MLast.MtWit loc mt lwc
  • extra node (1)
    ... no representation ... MLast.MtXtr loc s ot
(1) Extra node internally used by the quotation kit “q_ast.cmo” to build antiquotations of module types.

classes…

class_expr

  • application
    <:class_expr< $ce$ $e$ >> MLast.CeApp loc ce e
  • constructor
    <:class_expr< [ $list:lt$ ] $list:ls$ >> MLast.CeCon loc (Ploc.VaVal ls) (Ploc.VaVal lt) <:class_expr< [ $_list:lt$ ] $list:ls$ >> MLast.CeCon loc (Ploc.VaVal ls) lt <:class_expr< [ $list:lt$ ] $_list:ls$ >> MLast.CeCon loc ls (Ploc.VaVal lt) <:class_expr< [ $_list:lt$ ] $_list:ls$ >> MLast.CeCon loc ls lt
  • function
    <:class_expr< fun $p$ -> $ce$ >> MLast.CeFun loc p ce
  • let rec
    <:class_expr< let rec $list:lpe$ in $ce$ >> MLast.CeLet loc (Ploc.VaVal True) (Ploc.VaVal lpe) ce <:class_expr< let rec $_list:lpe$ in $ce$ >> MLast.CeLet loc (Ploc.VaVal True) lpe ce
  • let non rec
    <:class_expr< let $list:lpe$ in $ce$ >> MLast.CeLet loc (Ploc.VaVal False) (Ploc.VaVal lpe) ce <:class_expr< let $_list:lpe$ in $ce$ >> MLast.CeLet loc (Ploc.VaVal False) lpe ce
  • let
    <:class_expr< let $flag:b$ $list:lpe$ in $ce$ >> MLast.CeLet loc (Ploc.VaVal b) (Ploc.VaVal lpe) ce <:class_expr< let $flag:b$ $_list:lpe$ in $ce$ >> MLast.CeLet loc (Ploc.VaVal b) lpe ce <:class_expr< let $_flag:b$ $list:lpe$ in $ce$ >> MLast.CeLet loc b (Ploc.VaVal lpe) ce <:class_expr< let $_flag:b$ $_list:lpe$ in $ce$ >> MLast.CeLet loc b lpe ce
  • object
    <:class_expr< object $list:lcsi$ end >> MLast.CeStr loc (Ploc.VaVal None) (Ploc.VaVal lcsi) <:class_expr< object $_list:lcsi$ end >> MLast.CeStr loc (Ploc.VaVal None) lcsi <:class_expr< object ($p$) $list:lcsi$ end >> MLast.CeStr loc (Ploc.VaVal (Some p)) (Ploc.VaVal lcsi) <:class_expr< object ($p$) $_list:lcsi$ end >> MLast.CeStr loc (Ploc.VaVal (Some p)) lcsi <:class_expr< object $opt:op$ $list:lcsi$ end >> MLast.CeStr loc (Ploc.VaVal op) (Ploc.VaVal lcsi) <:class_expr< object $opt:op$ $_list:lcsi$ end >> MLast.CeStr loc (Ploc.VaVal op) lcsi <:class_expr< object $_opt:op$ $list:lcsi$ end >> MLast.CeStr loc op (Ploc.VaVal lcsi) <:class_expr< object $_opt:op$ $_list:lcsi$ end >> MLast.CeStr loc op lcsi
  • class type constraint
    <:class_expr< ($ce$ : $ct$) >> MLast.CeTyc loc ce ct

class_type

  • access
    <:class_type< $ct1$ . $ct2$ >> MLast.CtAcc loc ct1 ct2
  • application
    <:class_type< $ct1$ $ct2$ >> MLast.CtApp loc ct1 ct2
  • constructor
    <:class_type< $ct$ [ $list:lt$ ] >> MLast.CtCon loc ct (Ploc.VaVal lt) <:class_type< $ct$ [ $_list:lt$ ] >> MLast.CtCon loc ct lt
  • arrow
    <:class_type< [ $t$ ] -> $ct$ >> MLast.CtFun loc t ct
  • identifier
    <:class_type< $id:s$ >> MLast.CtIde loc (Ploc.VaVal s) <:class_type< $_id:s$ >> MLast.CtIde loc s
  • object
    <:class_type< object $list:lcsi$ end >> MLast.CtSig loc (Ploc.VaVal None) (Ploc.VaVal lcsi) <:class_type< object $_list:lcsi$ end >> MLast.CtSig loc (Ploc.VaVal None) lcsi <:class_type< object ($t$) $list:lcsi$ end >> MLast.CtSig loc (Ploc.VaVal (Some t)) (Ploc.VaVal lcsi) <:class_type< object ($t$) $_list:lcsi$ end >> MLast.CtSig loc (Ploc.VaVal (Some t)) lcsi <:class_type< object $opt:ot$ $list:lcsi$ end >> MLast.CtSig loc (Ploc.VaVal ot) (Ploc.VaVal lcsi) <:class_type< object $opt:ot$ $_list:lcsi$ end >> MLast.CtSig loc (Ploc.VaVal ot) lcsi <:class_type< object $_opt:ot$ $list:lcsi$ end >> MLast.CtSig loc ot (Ploc.VaVal lcsi) <:class_type< object $_opt:ot$ $_list:lcsi$ end >> MLast.CtSig loc ot lcsi

class_str_item

  • type constraint
    <:class_str_item< type $t1$ = $t2$ >> MLast.CrCtr loc t1 t2
  • declaration list
    <:class_str_item< declare $list:lcsi$ end >> MLast.CrDcl loc (Ploc.VaVal lcsi) <:class_str_item< declare $_list:lcsi$ end >> MLast.CrDcl loc lcsi
  • inheritance
    <:class_str_item< inherit $ce$ >> MLast.CrInh loc ce (Ploc.VaVal None) <:class_str_item< inherit $ce$ $opt:Some s$ >> MLast.CrInh loc ce (Ploc.VaVal (Some s)) <:class_str_item< inherit $ce$ $opt:os$ >> MLast.CrInh loc ce (Ploc.VaVal os) <:class_str_item< inherit $ce$ $_opt:os$ >> MLast.CrInh loc ce os
  • initialization
    <:class_str_item< initializer $e$ >> MLast.CrIni loc e
  • method
    <:class_str_item< method! private $lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal True) (Ploc.VaVal s) (Ploc.VaVal None) e <:class_str_item< method! private $lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal True) (Ploc.VaVal s) (Ploc.VaVal (Some t)) e <:class_str_item< method! private $lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal True) (Ploc.VaVal s) (Ploc.VaVal ot) e <:class_str_item< method! private $lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal True) (Ploc.VaVal s) ot e <:class_str_item< method! private $_lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal True) s (Ploc.VaVal None) e <:class_str_item< method! private $_lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal True) s (Ploc.VaVal (Some t)) e <:class_str_item< method! private $_lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal True) s (Ploc.VaVal ot) e <:class_str_item< method! private $_lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal True) s ot e <:class_str_item< method! $lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal False) (Ploc.VaVal s) (Ploc.VaVal None) e <:class_str_item< method! $lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal False) (Ploc.VaVal s) (Ploc.VaVal (Some t)) e <:class_str_item< method! $lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal False) (Ploc.VaVal s) (Ploc.VaVal ot) e <:class_str_item< method! $lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal False) (Ploc.VaVal s) ot e <:class_str_item< method! $_lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal False) s (Ploc.VaVal None) e <:class_str_item< method! $_lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal False) s (Ploc.VaVal (Some t)) e <:class_str_item< method! $_lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal False) s (Ploc.VaVal ot) e <:class_str_item< method! $_lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal False) s ot e <:class_str_item< method! $priv:b2$ $lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal b2) (Ploc.VaVal s) (Ploc.VaVal None) e <:class_str_item< method! $priv:b2$ $lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal b2) (Ploc.VaVal s) (Ploc.VaVal (Some t)) e <:class_str_item< method! $priv:b2$ $lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal b2) (Ploc.VaVal s) (Ploc.VaVal ot) e <:class_str_item< method! $priv:b2$ $lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal b2) (Ploc.VaVal s) ot e <:class_str_item< method! $priv:b2$ $_lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal b2) s (Ploc.VaVal None) e <:class_str_item< method! $priv:b2$ $_lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal b2) s (Ploc.VaVal (Some t)) e <:class_str_item< method! $priv:b2$ $_lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal b2) s (Ploc.VaVal ot) e <:class_str_item< method! $priv:b2$ $_lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) (Ploc.VaVal b2) s ot e <:class_str_item< method! $_priv:b2$ $lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) b2 (Ploc.VaVal s) (Ploc.VaVal None) e <:class_str_item< method! $_priv:b2$ $lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) b2 (Ploc.VaVal s) (Ploc.VaVal (Some t)) e <:class_str_item< method! $_priv:b2$ $lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) b2 (Ploc.VaVal s) (Ploc.VaVal ot) e <:class_str_item< method! $_priv:b2$ $lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) b2 (Ploc.VaVal s) ot e <:class_str_item< method! $_priv:b2$ $_lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) b2 s (Ploc.VaVal None) e <:class_str_item< method! $_priv:b2$ $_lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) b2 s (Ploc.VaVal (Some t)) e <:class_str_item< method! $_priv:b2$ $_lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) b2 s (Ploc.VaVal ot) e <:class_str_item< method! $_priv:b2$ $_lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal True) b2 s ot e <:class_str_item< method private $lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal True) (Ploc.VaVal s) (Ploc.VaVal None) e <:class_str_item< method private $lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal True) (Ploc.VaVal s) (Ploc.VaVal (Some t)) e <:class_str_item< method private $lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal True) (Ploc.VaVal s) (Ploc.VaVal ot) e <:class_str_item< method private $lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal True) (Ploc.VaVal s) ot e <:class_str_item< method private $_lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal True) s (Ploc.VaVal None) e <:class_str_item< method private $_lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal True) s (Ploc.VaVal (Some t)) e <:class_str_item< method private $_lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal True) s (Ploc.VaVal ot) e <:class_str_item< method private $_lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal True) s ot e <:class_str_item< method $lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal False) (Ploc.VaVal s) (Ploc.VaVal None) e <:class_str_item< method $lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal False) (Ploc.VaVal s) (Ploc.VaVal (Some t)) e <:class_str_item< method $lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal False) (Ploc.VaVal s) (Ploc.VaVal ot) e <:class_str_item< method $lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal False) (Ploc.VaVal s) ot e <:class_str_item< method $_lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal False) s (Ploc.VaVal None) e <:class_str_item< method $_lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal False) s (Ploc.VaVal (Some t)) e <:class_str_item< method $_lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal False) s (Ploc.VaVal ot) e <:class_str_item< method $_lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal False) s ot e <:class_str_item< method $priv:b2$ $lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal b2) (Ploc.VaVal s) (Ploc.VaVal None) e <:class_str_item< method $priv:b2$ $lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal b2) (Ploc.VaVal s) (Ploc.VaVal (Some t)) e <:class_str_item< method $priv:b2$ $lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal b2) (Ploc.VaVal s) (Ploc.VaVal ot) e <:class_str_item< method $priv:b2$ $lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal b2) (Ploc.VaVal s) ot e <:class_str_item< method $priv:b2$ $_lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal b2) s (Ploc.VaVal None) e <:class_str_item< method $priv:b2$ $_lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal b2) s (Ploc.VaVal (Some t)) e <:class_str_item< method $priv:b2$ $_lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal b2) s (Ploc.VaVal ot) e <:class_str_item< method $priv:b2$ $_lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) (Ploc.VaVal b2) s ot e <:class_str_item< method $_priv:b2$ $lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) b2 (Ploc.VaVal s) (Ploc.VaVal None) e <:class_str_item< method $_priv:b2$ $lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) b2 (Ploc.VaVal s) (Ploc.VaVal (Some t)) e <:class_str_item< method $_priv:b2$ $lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) b2 (Ploc.VaVal s) (Ploc.VaVal ot) e <:class_str_item< method $_priv:b2$ $lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) b2 (Ploc.VaVal s) ot e <:class_str_item< method $_priv:b2$ $_lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) b2 s (Ploc.VaVal None) e <:class_str_item< method $_priv:b2$ $_lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) b2 s (Ploc.VaVal (Some t)) e <:class_str_item< method $_priv:b2$ $_lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) b2 s (Ploc.VaVal ot) e <:class_str_item< method $_priv:b2$ $_lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal False) b2 s ot e <:class_str_item< method $!:b1$ private $lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal True) (Ploc.VaVal s) (Ploc.VaVal None) e <:class_str_item< method $!:b1$ private $lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal True) (Ploc.VaVal s) (Ploc.VaVal (Some t)) e <:class_str_item< method $!:b1$ private $lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal True) (Ploc.VaVal s) (Ploc.VaVal ot) e <:class_str_item< method $!:b1$ private $lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal True) (Ploc.VaVal s) ot e <:class_str_item< method $!:b1$ private $_lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal True) s (Ploc.VaVal None) e <:class_str_item< method $!:b1$ private $_lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal True) s (Ploc.VaVal (Some t)) e <:class_str_item< method $!:b1$ private $_lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal True) s (Ploc.VaVal ot) e <:class_str_item< method $!:b1$ private $_lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal True) s ot e <:class_str_item< method $!:b1$ $lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal False) (Ploc.VaVal s) (Ploc.VaVal None) e <:class_str_item< method $!:b1$ $lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal False) (Ploc.VaVal s) (Ploc.VaVal (Some t)) e <:class_str_item< method $!:b1$ $lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal False) (Ploc.VaVal s) (Ploc.VaVal ot) e <:class_str_item< method $!:b1$ $lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal False) (Ploc.VaVal s) ot e <:class_str_item< method $!:b1$ $_lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal False) s (Ploc.VaVal None) e <:class_str_item< method $!:b1$ $_lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal False) s (Ploc.VaVal (Some t)) e <:class_str_item< method $!:b1$ $_lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal False) s (Ploc.VaVal ot) e <:class_str_item< method $!:b1$ $_lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal False) s ot e <:class_str_item< method $!:b1$ $priv:b2$ $lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal b2) (Ploc.VaVal s) (Ploc.VaVal None) e <:class_str_item< method $!:b1$ $priv:b2$ $lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal b2) (Ploc.VaVal s) (Ploc.VaVal (Some t)) e <:class_str_item< method $!:b1$ $priv:b2$ $lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal b2) (Ploc.VaVal s) (Ploc.VaVal ot) e <:class_str_item< method $!:b1$ $priv:b2$ $lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal b2) (Ploc.VaVal s) ot e <:class_str_item< method $!:b1$ $priv:b2$ $_lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal b2) s (Ploc.VaVal None) e <:class_str_item< method $!:b1$ $priv:b2$ $_lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal b2) s (Ploc.VaVal (Some t)) e <:class_str_item< method $!:b1$ $priv:b2$ $_lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal b2) s (Ploc.VaVal ot) e <:class_str_item< method $!:b1$ $priv:b2$ $_lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) (Ploc.VaVal b2) s ot e <:class_str_item< method $!:b1$ $_priv:b2$ $lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) b2 (Ploc.VaVal s) (Ploc.VaVal None) e <:class_str_item< method $!:b1$ $_priv:b2$ $lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) b2 (Ploc.VaVal s) (Ploc.VaVal (Some t)) e <:class_str_item< method $!:b1$ $_priv:b2$ $lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) b2 (Ploc.VaVal s) (Ploc.VaVal ot) e <:class_str_item< method $!:b1$ $_priv:b2$ $lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) b2 (Ploc.VaVal s) ot e <:class_str_item< method $!:b1$ $_priv:b2$ $_lid:s$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) b2 s (Ploc.VaVal None) e <:class_str_item< method $!:b1$ $_priv:b2$ $_lid:s$ : $t$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) b2 s (Ploc.VaVal (Some t)) e <:class_str_item< method $!:b1$ $_priv:b2$ $_lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) b2 s (Ploc.VaVal ot) e <:class_str_item< method $!:b1$ $_priv:b2$ $_lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc (Ploc.VaVal b1) b2 s ot e <:class_str_item< method $_!:b1$ private $lid:s$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal True) (Ploc.VaVal s) (Ploc.VaVal None) e <:class_str_item< method $_!:b1$ private $lid:s$ : $t$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal True) (Ploc.VaVal s) (Ploc.VaVal (Some t)) e <:class_str_item< method $_!:b1$ private $lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal True) (Ploc.VaVal s) (Ploc.VaVal ot) e <:class_str_item< method $_!:b1$ private $lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal True) (Ploc.VaVal s) ot e <:class_str_item< method $_!:b1$ private $_lid:s$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal True) s (Ploc.VaVal None) e <:class_str_item< method $_!:b1$ private $_lid:s$ : $t$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal True) s (Ploc.VaVal (Some t)) e <:class_str_item< method $_!:b1$ private $_lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal True) s (Ploc.VaVal ot) e <:class_str_item< method $_!:b1$ private $_lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal True) s ot e <:class_str_item< method $_!:b1$ $lid:s$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal False) (Ploc.VaVal s) (Ploc.VaVal None) e <:class_str_item< method $_!:b1$ $lid:s$ : $t$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal False) (Ploc.VaVal s) (Ploc.VaVal (Some t)) e <:class_str_item< method $_!:b1$ $lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal False) (Ploc.VaVal s) (Ploc.VaVal ot) e <:class_str_item< method $_!:b1$ $lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal False) (Ploc.VaVal s) ot e <:class_str_item< method $_!:b1$ $_lid:s$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal False) s (Ploc.VaVal None) e <:class_str_item< method $_!:b1$ $_lid:s$ : $t$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal False) s (Ploc.VaVal (Some t)) e <:class_str_item< method $_!:b1$ $_lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal False) s (Ploc.VaVal ot) e <:class_str_item< method $_!:b1$ $_lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal False) s ot e <:class_str_item< method $_!:b1$ $priv:b2$ $lid:s$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal b2) (Ploc.VaVal s) (Ploc.VaVal None) e <:class_str_item< method $_!:b1$ $priv:b2$ $lid:s$ : $t$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal b2) (Ploc.VaVal s) (Ploc.VaVal (Some t)) e <:class_str_item< method $_!:b1$ $priv:b2$ $lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal b2) (Ploc.VaVal s) (Ploc.VaVal ot) e <:class_str_item< method $_!:b1$ $priv:b2$ $lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal b2) (Ploc.VaVal s) ot e <:class_str_item< method $_!:b1$ $priv:b2$ $_lid:s$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal b2) s (Ploc.VaVal None) e <:class_str_item< method $_!:b1$ $priv:b2$ $_lid:s$ : $t$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal b2) s (Ploc.VaVal (Some t)) e <:class_str_item< method $_!:b1$ $priv:b2$ $_lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal b2) s (Ploc.VaVal ot) e <:class_str_item< method $_!:b1$ $priv:b2$ $_lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc b1 (Ploc.VaVal b2) s ot e <:class_str_item< method $_!:b1$ $_priv:b2$ $lid:s$ = $e$ >> MLast.CrMth loc b1 b2 (Ploc.VaVal s) (Ploc.VaVal None) e <:class_str_item< method $_!:b1$ $_priv:b2$ $lid:s$ : $t$ = $e$ >> MLast.CrMth loc b1 b2 (Ploc.VaVal s) (Ploc.VaVal (Some t)) e <:class_str_item< method $_!:b1$ $_priv:b2$ $lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc b1 b2 (Ploc.VaVal s) (Ploc.VaVal ot) e <:class_str_item< method $_!:b1$ $_priv:b2$ $lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc b1 b2 (Ploc.VaVal s) ot e <:class_str_item< method $_!:b1$ $_priv:b2$ $_lid:s$ = $e$ >> MLast.CrMth loc b1 b2 s (Ploc.VaVal None) e <:class_str_item< method $_!:b1$ $_priv:b2$ $_lid:s$ : $t$ = $e$ >> MLast.CrMth loc b1 b2 s (Ploc.VaVal (Some t)) e <:class_str_item< method $_!:b1$ $_priv:b2$ $_lid:s$ $opt:ot$ = $e$ >> MLast.CrMth loc b1 b2 s (Ploc.VaVal ot) e <:class_str_item< method $_!:b1$ $_priv:b2$ $_lid:s$ $_opt:ot$ = $e$ >> MLast.CrMth loc b1 b2 s ot e
  • value
    <:class_str_item< value! mutable $lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal True) (Ploc.VaVal True) (Ploc.VaVal s) e <:class_str_item< value! mutable $_lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal True) (Ploc.VaVal True) s e <:class_str_item< value! $lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal True) (Ploc.VaVal False) (Ploc.VaVal s) e <:class_str_item< value! $_lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal True) (Ploc.VaVal False) s e <:class_str_item< value! $flag:b2$ $lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal True) (Ploc.VaVal b2) (Ploc.VaVal s) e <:class_str_item< value! $flag:b2$ $_lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal True) (Ploc.VaVal b2) s e <:class_str_item< value! $_flag:b2$ $lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal True) b2 (Ploc.VaVal s) e <:class_str_item< value! $_flag:b2$ $_lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal True) b2 s e <:class_str_item< value mutable $lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal False) (Ploc.VaVal True) (Ploc.VaVal s) e <:class_str_item< value mutable $_lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal False) (Ploc.VaVal True) s e <:class_str_item< value $lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal False) (Ploc.VaVal False) (Ploc.VaVal s) e <:class_str_item< value $_lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal False) (Ploc.VaVal False) s e <:class_str_item< value $flag:b2$ $lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal False) (Ploc.VaVal b2) (Ploc.VaVal s) e <:class_str_item< value $flag:b2$ $_lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal False) (Ploc.VaVal b2) s e <:class_str_item< value $_flag:b2$ $lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal False) b2 (Ploc.VaVal s) e <:class_str_item< value $_flag:b2$ $_lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal False) b2 s e <:class_str_item< value $!:b1$ mutable $lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal b1) (Ploc.VaVal True) (Ploc.VaVal s) e <:class_str_item< value $!:b1$ mutable $_lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal b1) (Ploc.VaVal True) s e <:class_str_item< value $!:b1$ $lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal b1) (Ploc.VaVal False) (Ploc.VaVal s) e <:class_str_item< value $!:b1$ $_lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal b1) (Ploc.VaVal False) s e <:class_str_item< value $!:b1$ $flag:b2$ $lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal b1) (Ploc.VaVal b2) (Ploc.VaVal s) e <:class_str_item< value $!:b1$ $flag:b2$ $_lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal b1) (Ploc.VaVal b2) s e <:class_str_item< value $!:b1$ $_flag:b2$ $lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal b1) b2 (Ploc.VaVal s) e <:class_str_item< value $!:b1$ $_flag:b2$ $_lid:s$ = $e$ >> MLast.CrVal loc (Ploc.VaVal b1) b2 s e <:class_str_item< value $_!:b1$ mutable $lid:s$ = $e$ >> MLast.CrVal loc b1 (Ploc.VaVal True) (Ploc.VaVal s) e <:class_str_item< value $_!:b1$ mutable $_lid:s$ = $e$ >> MLast.CrVal loc b1 (Ploc.VaVal True) s e <:class_str_item< value $_!:b1$ $lid:s$ = $e$ >> MLast.CrVal loc b1 (Ploc.VaVal False) (Ploc.VaVal s) e <:class_str_item< value $_!:b1$ $_lid:s$ = $e$ >> MLast.CrVal loc b1 (Ploc.VaVal False) s e <:class_str_item< value $_!:b1$ $flag:b2$ $lid:s$ = $e$ >> MLast.CrVal loc b1 (Ploc.VaVal b2) (Ploc.VaVal s) e <:class_str_item< value $_!:b1$ $flag:b2$ $_lid:s$ = $e$ >> MLast.CrVal loc b1 (Ploc.VaVal b2) s e <:class_str_item< value $_!:b1$ $_flag:b2$ $lid:s$ = $e$ >> MLast.CrVal loc b1 b2 (Ploc.VaVal s) e <:class_str_item< value $_!:b1$ $_flag:b2$ $_lid:s$ = $e$ >> MLast.CrVal loc b1 b2 s e
  • virtual value
    <:class_str_item< value virtual mutable $lid:s$ : $t$ >> MLast.CrVav loc (Ploc.VaVal True) (Ploc.VaVal s) t <:class_str_item< value virtual mutable $_lid:s$ : $t$ >> MLast.CrVav loc (Ploc.VaVal True) s t <:class_str_item< value virtual $lid:s$ : $t$ >> MLast.CrVav loc (Ploc.VaVal False) (Ploc.VaVal s) t <:class_str_item< value virtual $_lid:s$ : $t$ >> MLast.CrVav loc (Ploc.VaVal False) s t <:class_str_item< value virtual $flag:b$ $lid:s$ : $t$ >> MLast.CrVav loc (Ploc.VaVal b) (Ploc.VaVal s) t <:class_str_item< value virtual $flag:b$ $_lid:s$ : $t$ >> MLast.CrVav loc (Ploc.VaVal b) s t <:class_str_item< value virtual $_flag:b$ $lid:s$ : $t$ >> MLast.CrVav loc b (Ploc.VaVal s) t <:class_str_item< value virtual $_flag:b$ $_lid:s$ : $t$ >> MLast.CrVav loc b s t
  • virtual method
    <:class_str_item< method virtual private $lid:s$ : $t$ >> MLast.CrVir loc (Ploc.VaVal True) (Ploc.VaVal s) t <:class_str_item< method virtual private $_lid:s$ : $t$ >> MLast.CrVir loc (Ploc.VaVal True) s t <:class_str_item< method virtual $lid:s$ : $t$ >> MLast.CrVir loc (Ploc.VaVal False) (Ploc.VaVal s) t <:class_str_item< method virtual $_lid:s$ : $t$ >> MLast.CrVir loc (Ploc.VaVal False) s t <:class_str_item< method virtual $flag:b$ $lid:s$ : $t$ >> MLast.CrVir loc (Ploc.VaVal b) (Ploc.VaVal s) t <:class_str_item< method virtual $flag:b$ $_lid:s$ : $t$ >> MLast.CrVir loc (Ploc.VaVal b) s t <:class_str_item< method virtual $_flag:b$ $lid:s$ : $t$ >> MLast.CrVir loc b (Ploc.VaVal s) t <:class_str_item< method virtual $_flag:b$ $_lid:s$ : $t$ >> MLast.CrVir loc b s t

class_sig_item

  • type constraint
    <:class_sig_item< type $t1$ = $t2$ >> MLast.CgCtr loc t1 t2
  • declare
    <:class_sig_item< declare $list:lcsi$ end >> MLast.CgDcl loc (Ploc.VaVal lcsi) <:class_sig_item< declare $_list:lcsi$ end >> MLast.CgDcl loc lcsi
  • inheritance
    <:class_sig_item< inherit $ct$ >> MLast.CgInh loc ct
  • method
    <:class_sig_item< method private $lid:s$ : $t$ >> MLast.CgMth loc (Ploc.VaVal True) (Ploc.VaVal s) t <:class_sig_item< method private $_lid:s$ : $t$ >> MLast.CgMth loc (Ploc.VaVal True) s t <:class_sig_item< method $lid:s$ : $t$ >> MLast.CgMth loc (Ploc.VaVal False) (Ploc.VaVal s) t <:class_sig_item< method $_lid:s$ : $t$ >> MLast.CgMth loc (Ploc.VaVal False) s t <:class_sig_item< method $flag:b$ $lid:s$ : $t$ >> MLast.CgMth loc (Ploc.VaVal b) (Ploc.VaVal s) t <:class_sig_item< method $flag:b$ $_lid:s$ : $t$ >> MLast.CgMth loc (Ploc.VaVal b) s t <:class_sig_item< method $_flag:b$ $lid:s$ : $t$ >> MLast.CgMth loc b (Ploc.VaVal s) t <:class_sig_item< method $_flag:b$ $_lid:s$ : $t$ >> MLast.CgMth loc b s t
  • value
    <:class_sig_item< value mutable $lid:s$ : $t$ >> MLast.CgVal loc (Ploc.VaVal True) (Ploc.VaVal s) t <:class_sig_item< value mutable $_lid:s$ : $t$ >> MLast.CgVal loc (Ploc.VaVal True) s t <:class_sig_item< value $lid:s$ : $t$ >> MLast.CgVal loc (Ploc.VaVal False) (Ploc.VaVal s) t <:class_sig_item< value $_lid:s$ : $t$ >> MLast.CgVal loc (Ploc.VaVal False) s t <:class_sig_item< value $flag:b$ $lid:s$ : $t$ >> MLast.CgVal loc (Ploc.VaVal b) (Ploc.VaVal s) t <:class_sig_item< value $flag:b$ $_lid:s$ : $t$ >> MLast.CgVal loc (Ploc.VaVal b) s t <:class_sig_item< value $_flag:b$ $lid:s$ : $t$ >> MLast.CgVal loc b (Ploc.VaVal s) t <:class_sig_item< value $_flag:b$ $_lid:s$ : $t$ >> MLast.CgVal loc b s t
  • virtual method
    <:class_sig_item< method virtual private $lid:s$ : $t$ >> MLast.CgVir loc (Ploc.VaVal True) (Ploc.VaVal s) t <:class_sig_item< method virtual private $_lid:s$ : $t$ >> MLast.CgVir loc (Ploc.VaVal True) s t <:class_sig_item< method virtual $lid:s$ : $t$ >> MLast.CgVir loc (Ploc.VaVal False) (Ploc.VaVal s) t <:class_sig_item< method virtual $_lid:s$ : $t$ >> MLast.CgVir loc (Ploc.VaVal False) s t <:class_sig_item< method virtual $flag:b$ $lid:s$ : $t$ >> MLast.CgVir loc (Ploc.VaVal b) (Ploc.VaVal s) t <:class_sig_item< method virtual $flag:b$ $_lid:s$ : $t$ >> MLast.CgVir loc (Ploc.VaVal b) s t <:class_sig_item< method virtual $_flag:b$ $lid:s$ : $t$ >> MLast.CgVir loc b (Ploc.VaVal s) t <:class_sig_item< method virtual $_flag:b$ $_lid:s$ : $t$ >> MLast.CgVir loc b s t

other

type_decl

What is after ‘type’ or ‘and’ in a type declaration.

The type “type_decl” is a record type corresponding to a type declaration. Its definition is:

type type_decl =
  { tdNam : (loc * Ploc.vaval string);
    tdPrm : Ploc.vala (list type_var);
    tdPrv : Ploc.vala bool;
    tdDef : ctyp;
    tdCon : Ploc.vala (list (ctyp * ctyp)) }
;

The field “tdNam” is the type identifier (together with its location in the source).

The field “tdPrm” is the list of its possible parameters.

The field “tdPrv” tells if the type is private or not.

The field “tdDef” is the definition of the type.

The field “tdCon” is the possible list of type constraints.

<:type_decl< $tp:ls$ $list:ltv$ = private $t$ $list:ltt$ >> {MLast.tdNam = Ploc.VaVal ls; MLast.tdPrm = Ploc.VaVal ltv; MLast.tdPrv = Ploc.VaVal True; MLast.tdDef = t; MLast.tdCon = Ploc.VaVal ltt} <:type_decl< $tp:ls$ $list:ltv$ = private $t$ $_list:ltt$ >> {MLast.tdNam = Ploc.VaVal ls; MLast.tdPrm = Ploc.VaVal ltv; MLast.tdPrv = Ploc.VaVal True; MLast.tdDef = t; MLast.tdCon = ltt} <:type_decl< $tp:ls$ $list:ltv$ = $t$ $list:ltt$ >> {MLast.tdNam = Ploc.VaVal ls; MLast.tdPrm = Ploc.VaVal ltv; MLast.tdPrv = Ploc.VaVal False; MLast.tdDef = t; MLast.tdCon = Ploc.VaVal ltt} <:type_decl< $tp:ls$ $list:ltv$ = $t$ $_list:ltt$ >> {MLast.tdNam = Ploc.VaVal ls; MLast.tdPrm = Ploc.VaVal ltv; MLast.tdPrv = Ploc.VaVal False; MLast.tdDef = t; MLast.tdCon = ltt} <:type_decl< $tp:ls$ $list:ltv$ = $priv:b$ $t$ $list:ltt$ >> {MLast.tdNam = Ploc.VaVal ls; MLast.tdPrm = Ploc.VaVal ltv; MLast.tdPrv = Ploc.VaVal b; MLast.tdDef = t; MLast.tdCon = Ploc.VaVal ltt} <:type_decl< $tp:ls$ $list:ltv$ = $priv:b$ $t$ $_list:ltt$ >> {MLast.tdNam = Ploc.VaVal ls; MLast.tdPrm = Ploc.VaVal ltv; MLast.tdPrv = Ploc.VaVal b; MLast.tdDef = t; MLast.tdCon = ltt} <:type_decl< $tp:ls$ $list:ltv$ = $_priv:b$ $t$ $list:ltt$ >> {MLast.tdNam = Ploc.VaVal ls; MLast.tdPrm = Ploc.VaVal ltv; MLast.tdPrv = b; MLast.tdDef = t; MLast.tdCon = Ploc.VaVal ltt} <:type_decl< $tp:ls$ $list:ltv$ = $_priv:b$ $t$ $_list:ltt$ >> {MLast.tdNam = Ploc.VaVal ls; MLast.tdPrm = Ploc.VaVal ltv; MLast.tdPrv = b; MLast.tdDef = t; MLast.tdCon = ltt} <:type_decl< $tp:ls$ $_list:ltv$ = private $t$ $list:ltt$ >> {MLast.tdNam = Ploc.VaVal ls; MLast.tdPrm = ltv; MLast.tdPrv = Ploc.VaVal True; MLast.tdDef = t; MLast.tdCon = Ploc.VaVal ltt} <:type_decl< $tp:ls$ $_list:ltv$ = private $t$ $_list:ltt$ >> {MLast.tdNam = Ploc.VaVal ls; MLast.tdPrm = ltv; MLast.tdPrv = Ploc.VaVal True; MLast.tdDef = t; MLast.tdCon = ltt} <:type_decl< $tp:ls$ $_list:ltv$ = $t$ $list:ltt$ >> {MLast.tdNam = Ploc.VaVal ls; MLast.tdPrm = ltv; MLast.tdPrv = Ploc.VaVal False; MLast.tdDef = t; MLast.tdCon = Ploc.VaVal ltt} <:type_decl< $tp:ls$ $_list:ltv$ = $t$ $_list:ltt$ >> {MLast.tdNam = Ploc.VaVal ls; MLast.tdPrm = ltv; MLast.tdPrv = Ploc.VaVal False; MLast.tdDef = t; MLast.tdCon = ltt} <:type_decl< $tp:ls$ $_list:ltv$ = $priv:b$ $t$ $list:ltt$ >> {MLast.tdNam = Ploc.VaVal ls; MLast.tdPrm = ltv; MLast.tdPrv = Ploc.VaVal b; MLast.tdDef = t; MLast.tdCon = Ploc.VaVal ltt} <:type_decl< $tp:ls$ $_list:ltv$ = $priv:b$ $t$ $_list:ltt$ >> {MLast.tdNam = Ploc.VaVal ls; MLast.tdPrm = ltv; MLast.tdPrv = Ploc.VaVal b; MLast.tdDef = t; MLast.tdCon = ltt} <:type_decl< $tp:ls$ $_list:ltv$ = $_priv:b$ $t$ $list:ltt$ >> {MLast.tdNam = Ploc.VaVal ls; MLast.tdPrm = ltv; MLast.tdPrv = b; MLast.tdDef = t; MLast.tdCon = Ploc.VaVal ltt} <:type_decl< $tp:ls$ $_list:ltv$ = $_priv:b$ $t$ $_list:ltt$ >> {MLast.tdNam = Ploc.VaVal ls; MLast.tdPrm = ltv; MLast.tdPrv = b; MLast.tdDef = t; MLast.tdCon = ltt} <:type_decl< $_tp:ls$ $list:ltv$ = private $t$ $list:ltt$ >> {MLast.tdNam = ls; MLast.tdPrm = Ploc.VaVal ltv; MLast.tdPrv = Ploc.VaVal True; MLast.tdDef = t; MLast.tdCon = Ploc.VaVal ltt} <:type_decl< $_tp:ls$ $list:ltv$ = private $t$ $_list:ltt$ >> {MLast.tdNam = ls; MLast.tdPrm = Ploc.VaVal ltv; MLast.tdPrv = Ploc.VaVal True; MLast.tdDef = t; MLast.tdCon = ltt} <:type_decl< $_tp:ls$ $list:ltv$ = $t$ $list:ltt$ >> {MLast.tdNam = ls; MLast.tdPrm = Ploc.VaVal ltv; MLast.tdPrv = Ploc.VaVal False; MLast.tdDef = t; MLast.tdCon = Ploc.VaVal ltt} <:type_decl< $_tp:ls$ $list:ltv$ = $t$ $_list:ltt$ >> {MLast.tdNam = ls; MLast.tdPrm = Ploc.VaVal ltv; MLast.tdPrv = Ploc.VaVal False; MLast.tdDef = t; MLast.tdCon = ltt} <:type_decl< $_tp:ls$ $list:ltv$ = $priv:b$ $t$ $list:ltt$ >> {MLast.tdNam = ls; MLast.tdPrm = Ploc.VaVal ltv; MLast.tdPrv = Ploc.VaVal b; MLast.tdDef = t; MLast.tdCon = Ploc.VaVal ltt} <:type_decl< $_tp:ls$ $list:ltv$ = $priv:b$ $t$ $_list:ltt$ >> {MLast.tdNam = ls; MLast.tdPrm = Ploc.VaVal ltv; MLast.tdPrv = Ploc.VaVal b; MLast.tdDef = t; MLast.tdCon = ltt} <:type_decl< $_tp:ls$ $list:ltv$ = $_priv:b$ $t$ $list:ltt$ >> {MLast.tdNam = ls; MLast.tdPrm = Ploc.VaVal ltv; MLast.tdPrv = b; MLast.tdDef = t; MLast.tdCon = Ploc.VaVal ltt} <:type_decl< $_tp:ls$ $list:ltv$ = $_priv:b$ $t$ $_list:ltt$ >> {MLast.tdNam = ls; MLast.tdPrm = Ploc.VaVal ltv; MLast.tdPrv = b; MLast.tdDef = t; MLast.tdCon = ltt} <:type_decl< $_tp:ls$ $_list:ltv$ = private $t$ $list:ltt$ >> {MLast.tdNam = ls; MLast.tdPrm = ltv; MLast.tdPrv = Ploc.VaVal True; MLast.tdDef = t; MLast.tdCon = Ploc.VaVal ltt} <:type_decl< $_tp:ls$ $_list:ltv$ = private $t$ $_list:ltt$ >> {MLast.tdNam = ls; MLast.tdPrm = ltv; MLast.tdPrv = Ploc.VaVal True; MLast.tdDef = t; MLast.tdCon = ltt} <:type_decl< $_tp:ls$ $_list:ltv$ = $t$ $list:ltt$ >> {MLast.tdNam = ls; MLast.tdPrm = ltv; MLast.tdPrv = Ploc.VaVal False; MLast.tdDef = t; MLast.tdCon = Ploc.VaVal ltt} <:type_decl< $_tp:ls$ $_list:ltv$ = $t$ $_list:ltt$ >> {MLast.tdNam = ls; MLast.tdPrm = ltv; MLast.tdPrv = Ploc.VaVal False; MLast.tdDef = t; MLast.tdCon = ltt} <:type_decl< $_tp:ls$ $_list:ltv$ = $priv:b$ $t$ $list:ltt$ >> {MLast.tdNam = ls; MLast.tdPrm = ltv; MLast.tdPrv = Ploc.VaVal b; MLast.tdDef = t; MLast.tdCon = Ploc.VaVal ltt} <:type_decl< $_tp:ls$ $_list:ltv$ = $priv:b$ $t$ $_list:ltt$ >> {MLast.tdNam = ls; MLast.tdPrm = ltv; MLast.tdPrv = Ploc.VaVal b; MLast.tdDef = t; MLast.tdCon = ltt} <:type_decl< $_tp:ls$ $_list:ltv$ = $_priv:b$ $t$ $list:ltt$ >> {MLast.tdNam = ls; MLast.tdPrm = ltv; MLast.tdPrv = b; MLast.tdDef = t; MLast.tdCon = Ploc.VaVal ltt} <:type_decl< $_tp:ls$ $_list:ltv$ = $_priv:b$ $t$ $_list:ltt$ >> {MLast.tdNam = ls; MLast.tdPrm = ltv; MLast.tdPrv = b; MLast.tdDef = t; MLast.tdCon = ltt}

with_constr

“With” possibly following a module type.

  • with module
    <:with_constr< module $list:ls$ = $me$ >> MLast.WcMod loc (Ploc.VaVal ls) me <:with_constr< module $_list:ls$ = $me$ >> MLast.WcMod loc ls me
  • with module substitution
    <:with_constr< module $list:ls$ := $me$ >> MLast.WcMos loc (Ploc.VaVal ls) me <:with_constr< module $_list:ls$ := $me$ >> MLast.WcMos loc ls me
  • with type
    <:with_constr< type $list:ls$ $list:ltv$ = private $t$ >> MLast.WcTyp loc (Ploc.VaVal ls) (Ploc.VaVal ltv) (Ploc.VaVal True) t <:with_constr< type $list:ls$ $list:ltv$ = $t$ >> MLast.WcTyp loc (Ploc.VaVal ls) (Ploc.VaVal ltv) (Ploc.VaVal False) t <:with_constr< type $list:ls$ $list:ltv$ = $flag:b$ $t$ >> MLast.WcTyp loc (Ploc.VaVal ls) (Ploc.VaVal ltv) (Ploc.VaVal b) t <:with_constr< type $list:ls$ $list:ltv$ = $_flag:b$ $t$ >> MLast.WcTyp loc (Ploc.VaVal ls) (Ploc.VaVal ltv) b t <:with_constr< type $list:ls$ $_list:ltv$ = private $t$ >> MLast.WcTyp loc (Ploc.VaVal ls) ltv (Ploc.VaVal True) t <:with_constr< type $list:ls$ $_list:ltv$ = $t$ >> MLast.WcTyp loc (Ploc.VaVal ls) ltv (Ploc.VaVal False) t <:with_constr< type $list:ls$ $_list:ltv$ = $flag:b$ $t$ >> MLast.WcTyp loc (Ploc.VaVal ls) ltv (Ploc.VaVal b) t <:with_constr< type $list:ls$ $_list:ltv$ = $_flag:b$ $t$ >> MLast.WcTyp loc (Ploc.VaVal ls) ltv b t <:with_constr< type $_list:ls$ $list:ltv$ = private $t$ >> MLast.WcTyp loc ls (Ploc.VaVal ltv) (Ploc.VaVal True) t <:with_constr< type $_list:ls$ $list:ltv$ = $t$ >> MLast.WcTyp loc ls (Ploc.VaVal ltv) (Ploc.VaVal False) t <:with_constr< type $_list:ls$ $list:ltv$ = $flag:b$ $t$ >> MLast.WcTyp loc ls (Ploc.VaVal ltv) (Ploc.VaVal b) t <:with_constr< type $_list:ls$ $list:ltv$ = $_flag:b$ $t$ >> MLast.WcTyp loc ls (Ploc.VaVal ltv) b t <:with_constr< type $_list:ls$ $_list:ltv$ = private $t$ >> MLast.WcTyp loc ls ltv (Ploc.VaVal True) t <:with_constr< type $_list:ls$ $_list:ltv$ = $t$ >> MLast.WcTyp loc ls ltv (Ploc.VaVal False) t <:with_constr< type $_list:ls$ $_list:ltv$ = $flag:b$ $t$ >> MLast.WcTyp loc ls ltv (Ploc.VaVal b) t <:with_constr< type $_list:ls$ $_list:ltv$ = $_flag:b$ $t$ >> MLast.WcTyp loc ls ltv b t
  • with type substitution
    <:with_constr< type $list:ls$ $list:ltv$ := $t$ >> MLast.WcTys loc (Ploc.VaVal ls) (Ploc.VaVal ltv) t <:with_constr< type $list:ls$ $_list:ltv$ := $t$ >> MLast.WcTys loc (Ploc.VaVal ls) ltv t <:with_constr< type $_list:ls$ $list:ltv$ := $t$ >> MLast.WcTys loc ls (Ploc.VaVal ltv) t <:with_constr< type $_list:ls$ $_list:ltv$ := $t$ >> MLast.WcTys loc ls ltv t

poly_variant

Polymorphic variants.

  • constructor
    <:poly_variant< `$s$ >> MLast.PvTag loc (Ploc.VaVal s) (Ploc.VaVal True) (Ploc.VaVal []) <:poly_variant< `$s$ of & $list:lt$ >> MLast.PvTag loc (Ploc.VaVal s) (Ploc.VaVal True) (Ploc.VaVal lt) <:poly_variant< `$s$ of & $_list:lt$ >> MLast.PvTag loc (Ploc.VaVal s) (Ploc.VaVal True) lt <:poly_variant< `$s$ of $list:lt$ >> MLast.PvTag loc (Ploc.VaVal s) (Ploc.VaVal False) (Ploc.VaVal lt) <:poly_variant< `$s$ of $_list:lt$ >> MLast.PvTag loc (Ploc.VaVal s) (Ploc.VaVal False) lt <:poly_variant< `$s$ of $flag:b$ $list:lt$ >> MLast.PvTag loc (Ploc.VaVal s) (Ploc.VaVal b) (Ploc.VaVal lt) <:poly_variant< `$s$ of $flag:b$ $_list:lt$ >> MLast.PvTag loc (Ploc.VaVal s) (Ploc.VaVal b) lt <:poly_variant< `$s$ of $_flag:b$ $list:lt$ >> MLast.PvTag loc (Ploc.VaVal s) b (Ploc.VaVal lt) <:poly_variant< `$s$ of $_flag:b$ $_list:lt$ >> MLast.PvTag loc (Ploc.VaVal s) b lt <:poly_variant< `$_:s$ >> MLast.PvTag loc s (Ploc.VaVal True) (Ploc.VaVal []) <:poly_variant< `$_:s$ of & $list:lt$ >> MLast.PvTag loc s (Ploc.VaVal True) (Ploc.VaVal lt) <:poly_variant< `$_:s$ of & $_list:lt$ >> MLast.PvTag loc s (Ploc.VaVal True) lt <:poly_variant< `$_:s$ of $list:lt$ >> MLast.PvTag loc s (Ploc.VaVal False) (Ploc.VaVal lt) <:poly_variant< `$_:s$ of $_list:lt$ >> MLast.PvTag loc s (Ploc.VaVal False) lt <:poly_variant< `$_:s$ of $flag:b$ $list:lt$ >> MLast.PvTag loc s (Ploc.VaVal b) (Ploc.VaVal lt) <:poly_variant< `$_:s$ of $flag:b$ $_list:lt$ >> MLast.PvTag loc s (Ploc.VaVal b) lt <:poly_variant< `$_:s$ of $_flag:b$ $list:lt$ >> MLast.PvTag loc s b (Ploc.VaVal lt) <:poly_variant< `$_:s$ of $_flag:b$ $_list:lt$ >> MLast.PvTag loc s b lt
  • type
    <:poly_variant< $t$ >> MLast.PvInh loc t

Nodes without quotations

Some types defined in the AST tree module “MLast” don’t have an associated quotation. They are:

  • type_var
  • class_infos

type_var

The type “type_var” is defined as:

type type_var = (Ploc.vala string * (bool * bool));

The first boolean is “True” if the type variable is prefixed by “+” (“plus” sign). The second boolean is “True” if the type variable is prefixed by “-” (“minus” sign).

class_infos

The type “class_infos” is a record type parametrized with a type variable. It is common to:

  • the “class declaration” (”class ...” as structure item), the type variable being “class_expr”,
  • the “class description” (”class ...” as signature item), the type variable being “class_type”,
  • the “class type declaration” (”class type ...”), the type variable being “class_type”.

It is defined as:

type class_infos 'a =
  { ciLoc : loc;
    ciVir : Ploc.vala bool;
    ciPrm : (loc * Ploc.vala (list type_var));
    ciNam : Ploc.vala string;
    ciExp : 'a }
;

The field “ciLoc” is the location of the whole definition.

The field “ciVir” tells whether the type is virtual or not.

The field “ciPrm” is the list of its possible parameters.

The field “ciNam” is the class identifier.

The field “ciExp” is the class definition, depending of its kind.

Syntax tree quotations in user syntax

Syntax tree quotations in user syntax

The quotation kit “q_ast.cmo” allows to use syntax tree quotations in user syntax. It fully works only in “strict” mode. In “transitional” mode, there is no way to use antiquotations, which restricts its utility.

If this kit is loaded, when a quotation of syntax tree is found, the current OCaml language parser is called. Then, the resulting tree is reified (except the antiquotations) to represent the syntax tree of the syntax tree.

Antiquotations

The OCaml langage parser used, and its possible extensions, must have been built to allow the places of the antiquotations. The symbols enclosed by the meta-symbol “V” (see the chapter about extensible grammars, section “symbols”), define where antiquotations can take place.

There is no need to specify antiquotations for the main types defined in the AST tree module (”MLast”): “expr”, “patt”, “expr”, “str_item”, “sig_item”, and so on. All syntax parts of these types are automatically antiquotable.

For example, in Camlp5 sources, the grammar rule defining the “let..in” statement is:

"let"; r = V (FLAG "rec") "flag" "opt";
 l = V (LIST1 let_binding SEP "and"); "in"; x = expr

All symbols of these rules, except the keywords, are antiquotable:

  • The “rec” flag, because enclosed by the “V” meta symbol. The two strings which follow it gives the possible antiquotation kinds: “flag” (the normal antiquotation kind) and “opt” (kept by backward compatibility, but not recommended). It is therefore possible to antiquote it as: “$flag:...$” or “$opt:...$” where the “...” is an expression or a pattern depending on the position of the enclosing quotation
  • The binding list is also antiquotable, since it is also enclosed by the “V” meta symbol. Its antiquotation kind is “list” (the default when the meta symbol parameter is a list). It is therefore possible to write “$list:...$” at the place of the binding list.
  • The expression after the “in” is also antiquotable, because it belongs to the main types defined in the module “MLast”.

In that example, the variable “r” is of type “Ploc.vala   bool”, the variable “r” of type “Ploc.vala (list   (patt * expr))” and the variable “x” of type “MLast.expr”.

… to be completed …

The Pcaml module

The Pcaml module

All about language parsing entries, language printing functions, quotation management at parsing time, extensible directives, extensible options, and generalities about Camlp5.

Language parsing

Main parsing functions

The two functions below are called when parsing an interface (.mli file) or an implementation (.ml file) to build the syntax tree; the returned list contains the phrases (signature items or structure items) and their locations; the boolean tells whether the parser has encountered a directive; in this case, since the directive may change the syntax, the parsing stops, the directive is evaluated, and this function is called again.
These functions are references, because they can be changed to use another technology than the Camlp5 extended grammars. By default, they use the grammars entries [implem] and [interf] defined below.
value parse_interf :
  ref (Stream.t char -> (list (MLast.sig_item * MLast.loc) * bool));

Function called when parsing an interface (“.mli”) file

value parse_implem :
  ref (Stream.t char -> (list (MLast.str_item * MLast.loc) * bool));

Function called when parsing an implementation (“.ml”) file .. rubric:: Grammar

name:grammar
value gram : Grammar.g;
Grammar variable of the language.

Entries

Grammar entries which return syntax trees. These are set by the parsing kit of the current syntax, through the statement EXTEND. They are usable by other possible user syntax extensions.

value expr : Grammar.Entry.e MLast.expr;
Expressions.
value patt : Grammar.Entry.e MLast.patt;
Patterns.
value ctyp : Grammar.Entry.e MLast.ctyp;
Types.
value sig_item : Grammar.Entry.e MLast.sig_item;
Signature items, i.e. items between “sig” and “end”, or inside an interface (“.mli”) file.
value str_item : Grammar.Entry.e MLast.str_item;
Structure items, i.e. items between “struct” and “end”, or inside an implementation (“.ml”) file.
value module_type : Grammar.Entry.e MLast.module_type;
Module types, e.g. signatures, functors, identifiers.
value module_expr : Grammar.Entry.e MLast.module_expr;
Module expressions, e.g. structures, functors, identifiers.
value let_binding : Grammar.Entry.e (MLast.patt *       MLast.expr);
Specific entry for the “let binding”, i.e. the association “let pattern = expression”.
value type_declaration : Grammar.Entry.e       MLast.type_decl;
Specific entry for the “type declaration”, i.e. the association “type name = type-expression”
value class_sig_item : Grammar.Entry.e       MLast.class_sig_item;
Class signature items, i.e. items of class objects types.
value class_str_item : Grammar.Entry.e       MLast.class_str_item;
Class structure items, i.e. items of class objects.
value class_type : Grammar.Entry.e MLast.class_type;
Class types, e.g. object types, class types functions, identifiers.
value class_expr : Grammar.Entry.e MLast.class_expr;
Class expressions, e.g. objects, class functions, identifiers.
value interf : Grammar.Entry.e (list (MLast.sig_item * MLast.loc) * bool);
Interface, i.e. files with extension “.mli”. The location is the top of the tree. The boolean says whether the parsing stopped because of the presence of a directive (which potentially could change the syntax).
value implem : Grammar.Entry.e (list (MLast.str_item * MLast.loc) * bool);
Implementation, i.e. files with extension “.ml”. Same remark about the location and the boolean.
value top_phrase : Grammar.Entry.e (option       MLast.str_item);
Phrases of the OCaml interactive toplevel. Return “None” in case of end of file.
value use_file : Grammar.Entry.e (list MLast.str_item *       bool);
Phrases in files included by the directive “#use”. The boolean indicates whether the parsing stopped because of a directive (as for “interf” above).

Language printing

Main printing functions

The two function below are called when printing an interface (.mli file) of an implementation (.ml file) from the syntax tree; the list is the result of the corresponding parsing function.
These functions are references, to allow using other technologies than the Camlp5 extended printers.
value print_interf :
  ref (list (MLast.sig_item * MLast.loc) -> unit);

Function called when printing an interface (“.mli”) file

value print_implem :
  ref (list (MLast.str_item * MLast.loc) -> unit);

Function called when printing an implementation (“.ml”) file By default, these functions fail. The printer kit “pr_dump.cmo” (loaded by most Camlp5 commands) sets them to functions dumping the syntax tree in binary (for the OCaml compiler). The pretty printer kits, such as “pr_r.cmo” and “pr_o.cmo” set them to functions calling the predefined printers (see next section).

Printers

Printers taking syntax trees as parameters and returning pretty printed strings. These are set by the printing kits, through the statement EXTEND_PRINTER. They are usable by other possible user printing extensions.

value pr_expr : Eprinter.t MLast.expr;
Expressions.
value pr_patt : Eprinter.t MLast.patt;
Patterns.
value pr_ctyp : Eprinter.t MLast.ctyp;
Types.
value pr_sig_item : Eprinter.t MLast.sig_item;
Signature items, i.e. items between “sig” and “end”, or inside an interface (“.mli”) file.
value pr_str_item : Eprinter.t MLast.str_item;
Structure items, i.e. items between “struct” and “end”, or inside an implementation (“.ml”) file.
value pr_module_type : Eprinter.t MLast.module_type;
Module types, e.g. signatures, functors, identifiers.
value pr_module_expr : Eprinter.t MLast.module_expr;
Module expressions, e.g. structures, functors, identifiers.
value pr_class_sig_item : Eprinter.t       MLast.class_sig_item;
Class signature items, i.e. items of class objects types.
value pr_class_str_item : Eprinter.t       MLast.class_str_item;
Class structure items, i.e. items of class objects.
value pr_class_type : Eprinter.t MLast.class_type;
Class types, e.g. object types, class types functions, identifiers.
value pr_class_expr : Eprinter.t MLast.class_expr;
Class expressions, e.g. objects, class functions, identifiers.

Quotation management

value handle_expr_quotation : MLast.loc -> (string * string)       -> MLast.expr;
Called in the semantic actions of the rules parsing a quotation in position of expression.
value handle_patt_quotation : MLast.loc -> (string * string)       -> MLast.patt;
Called in the semantic actions of the rules parsing a quotation in position of pattern.
value quotation_dump_file : ref (option string);
Pcaml.quotation_dump_file” optionally tells the compiler to dump the result of an expander (of kind “generating a string”) if this result is syntactically incorrect. If “None” (default), this result is not dumped. If “Some       fname”, the result is dumped in the file “fname”. The same effect can be done with the option “-QD” of Camlp5 commands.
value quotation_location : unit -> Ploc.t;
While expanding a quotation, returns the location of the quotation text (between the quotation quotes) in the source; raises “Failure” if not in the context of a quotation expander.

Extensible directives and options

type directive_fun = option MLast.expr -> unit;
The type of functions called to treat a directive with its syntactic parameter. Directives act by side effect.
value add_directive : string -> directive_fun -> unit;
Add a new directive.
value find_directive : string -> directive_fun;
Find the function associated with a directive. Raises “Not_found” if the directive does not exists.
value add_option : string -> Arg.spec -> string -> unit;
Add an option to the command line of the Camlp5 command.

Equalities over syntax trees

These equalities skip the locations.

value eq_expr : MLast.expr -> MLast.expr -> bool;
value eq_patt : MLast.patt -> MLast.patt -> bool;
value eq_ctyp : MLast.ctyp -> MLast.ctyp -> bool;
value eq_str_item : MLast.str_item -> MLast.str_item -> bool;
value eq_sig_item : MLast.sig_item -> MLast.sig_item -> bool;
value eq_module_expr : MLast.module_expr -> MLast.module_expr -> bool;
value eq_module_type : MLast.module_type -> MLast.module_type -> bool;
value eq_class_sig_item : MLast.class_sig_item -> MLast.class_sig_item -> bool;
value eq_class_str_item : MLast.class_str_item -> MLast.class_str_item -> bool;
value eq_class_type : MLast.class_type -> MLast.class_type -> bool;
value eq_class_expr : MLast.class_expr -> MLast.class_expr -> bool;

Generalities

value version : string;
The current version of Camlp5.
value syntax_name : ref string;
The name of the current syntax. Set by the loaded syntax kit.
value input_file : ref string;
The file currently being parsed.
value output_file : ref (option string);
The output file, stdout if None (default).
value no_constructors_arity : ref bool;
True if the current syntax does not generate constructor arity, which is the case of the normal syntax, and not of the revised one. This has an impact when converting Camlp5 syntax tree into OCaml compiler syntax tree.

Directives

Directives

Directives in normal or revised syntax are statements at top level, or, more generally, as signature items or structure items, which stops the preprocessor for evaluate things - which can change the behaviour of the preprocessor, for example to add syntax, load syntax extensions and so on. After the directive is evaluated, the parsing resumes.

Directives begin with ‘#’, followed by an identifier, and, optionnally by an expression. They are usable in source files the and generally in the ocaml toplevel too.

Four predefined directives exist: #load, #directory, #option and #use. It is also possible to add other directives. An example of that is the parsing kit pa_pragma.cmo which adds a new directive #pragma.

Predefined directives

The predefined directives are:

#load “name”

Loads an object file (ocaml cmo or cma file) in the core, evaluating it. This is typically to be used in the ocaml toplevel to add an syntax extension kit.

For example, in the toplevel, loading the quotation expander of ocaml syntax trees:

# #load "q_MLast.cmo";

# value loc = Ploc.dummy;
value loc : Ploc.t = <abstr>

# <:expr< fun x -> x >>;
- : MLast.expr =
MLast.ExFun <abstr>
  (Ploc.VaVal
     [(MLast.PaLid <abstr> (Ploc.VaVal "x"), Ploc.VaVal None,
       MLast.ExLid <abstr> (Ploc.VaVal "x"))])

In a source file, the ‘#load’ directive is equivalent to put the object file as camlp5 parameter among the ‘load options’:

$ cat myfile.ml

#load "pa_extend.cmo";
value g = Grammar.gcreate (Plexer.gmake ());
value e = Grammar.Entry.create g "e";

EXTEND e: [[ i = INT -> i ]]; END;

$ ocamlc -pp camlp5r -I +camlp5 -c myfile.ml

which is equivalent to, without using ‘#load’:

$ cat myfile2.ml

value g = Grammar.gcreate (Plexer.gmake ());
value e = Grammar.Entry.create g "e";

EXTEND e: [[ i = INT -> i ]]; END;

and compiling it like this:

$ ocamlc -pp 'camlp5r pa_extend.cmo' -I +camlp5 -c myfile2.ml

#directory “name”

Adds a new directory in the camlp5 path searching for loaded files (using the directive #load above). This is equivalent to the option ‘-I’ of the camlp5 command. See the camlp5 man page.

#use “name”

Loads a source file name. Useful in the ocaml toplevel to test a source file.

#option “option”

Adds an option as if it were added in camlp5 command line (to be used in a source file, not in the ocaml toplevel). Implemented only on options without an extra parameter.

For example, the syntax kit pa_extend.cmo adds an option named ‘-split_ext’. This can be viewed through the command:

camlp5r pa_extend.cmo -help

Thanks to the directive ‘#option’, the following command in the shell:

$ camlp5r pa_extend.cmo -split_ext file.ml

can be used only as:

$ camlp5r file.ml

providing the file starts with:

#load "pa_extend.cmo";
#option "-split_ext";

User directives

It is possible to add any extra directive. The syntax kit pragma.cmo, for example, adds a directive named ‘#pragma’.

A user syntax kit can add its directives using the function “add_directive” of the module Pcaml.

Extensions of syntax

Extensions of syntax

Camlp5 allows one to extend the syntax of the OCaml language, and even change the entire syntax.

It uses for that one of its parsing tools: the extensible grammars.

To understand the whole syntax in the examples given in this chapter, it is good to understand this parsing tool (the extensible grammars), but we shall try to give some minimal explanations to allow the reader to follow them.

A syntax extension is an OCaml object file (ending with “.cmo” or “.cma”) which is loaded inside Camlp5. The source of this file uses calls to the specific statement EXTEND applied to entries defined in the Camlp5 module “Pcaml”.

Entries

The grammar of OCaml contains several entries, corresponding to the major notions of the language, which are modifiable this way, and even erasable. They are defined in this module “Pcaml”.

Most important entries:

  • expr: the expressions.
  • patt: the patterns.
  • ctyp: the types.
  • str_item: the structure items, i.e. the items between “struct” and “end”, and the toplevel phrases in a “.ml” file.
  • sig_item: the signature items, i.e. the items between “sig” and “end”, and the toplevel phrases in a “.mli” file.
  • module_expr: the module expressions.
  • module_type: the module types.

Entries of object programming:

  • class_expr: the class expressions.
  • class_type: the class types.
  • class_str_item: the objects items.
  • class_sig_item: the objects types items.

Main entries of files and interactive toplevel parsing:

  • implem: the phrases that can be found in a “.ml” file.
  • interf: the phrases that can be found in a “.mli” file.
  • top_phrase: the phrases of the interactive toplevel.
  • use_file: the phrases that can be found in a file included by the directive “use”.

Extra useful entries also accessible:

  • let_binding: the bindings “expression = pattern” found in the “let” statement.
  • type_declaration: the bindings “name = type” found in the “type” statement.

Syntax tree quotations

A grammar rule is a list of rule symbols followed by the semantic action, i.e. the result of the rule. This result is a syntax tree, whose type is the type of the extended entry. The description of the types of the syntax tree are in the Camlp5 module “MLast”.

There is however a simpler way to make values of these syntax tree types: the system quotations (see chapters about quotations and syntax tree). With this system, it is possible to represent syntax tree in concrete syntax, between specific parentheses, namely “<<” and “>>”, or between “<:name<” and “>>”.

For example, the syntax node of the “if” statement is, normally:

MLast.ExIfe loc e1 e2 e3

where loc is the source location, and e1, e2, e3 are the expressions constituting the if statement. With quotations, it is possible to write it like this (which is stricly equivalent because this is evaluated at parse time, not execution time):

<:expr< if $e1$ then $e2$ else $e3$ >>

With quotations, it is possible to build pieces of program as complex as desired. See the chapter about syntax trees.

An example : repeat..until

A classical extension is the creation of the “repeat” statement. The “repeat” statement is like a “while” except that the loop is executed at least one time and that the test is at the end of the loop and is inverted. The equivalent of:

repeat x; y; z until c

is:

do {
  x; y; z;
  while not c do { x; y; z }
}

or, with a loop:

loop () where rec loop () = do {
  x; y; z;
  if c then () else loop ()
}

The code

This syntax extension could be written like this (see the detail of syntax in the chapter about extensible grammars and the syntax tree quotations in the chapter about them):

#load "pa_extend.cmo";
#load "q_MLast.cmo";
open Pcaml;
EXTEND
  expr:
    [ [ "repeat"; el = LIST1 expr SEP ";"; "until"; c = expr ->
          let el = el @ [<:expr< while not $c$ do { $list:el$ } >>] in
          <:expr< do { $list:el$ } >> ] ]
  ;
END;

Alternatively, with the loop version:

#load "pa_extend.cmo";
#load "q_MLast.cmo";
open Pcaml;
EXTEND
  expr:
    [ [ "repeat"; el = LIST1 expr SEP ";"; "until"; c = expr ->
          let el = el @ [<:expr< if $c$ then () else loop () >>] in
          <:expr< loop () where rec loop () = do { $list:el$ } >> ] ]
  ;
END;

The first “#load” in the code (in both files) means that a syntax extension has been used in the file, namely the “EXTEND” statement. The second “#load” means that abstract tree quotations has been used, namely the “<:expr< ... >>”.

The quotation, found in the second version:

<:expr< loop () where rec loop () = do { $list:el$ } >>

is especially interesting. Written with abstract syntax tree, it would be:

MLast.ExLet loc True
  [(MLast.PaLid loc "loop",
    MLast.ExFun loc [(MLast.PaUid loc "()", None, MLast.ExSeq loc el)])]
  (MLast.ExApp loc (MLast.ExLid loc "loop") (MLast.ExUid loc "()"));

This shows the interest of writing abstract syntax tree with quotations: it is easier to program and to understand.

Compilation

If the file “foo.ml” contains one of these versions, it is possible to compile it like this:

ocamlc -pp camlp5r -I +camlp5 -c foo.ml

Notice that the ocamlc option “-c” means that we are interested only in generating the object file “foo.cmo”, not achieving the compilation by creating an executable. Anyway the link would not work because of usage of modules specific to Camlp5.

Testing

In the OCaml toplevel

ocaml -I +camlp5 camlp5r.cma
        Objective Caml version ...

        Camlp5 Parsing version ...

# #load "foo.cmo";
# value x = ref 42;
value x : ref int = {val=42}
# repeat
    print_int x.val; print_endline ""; x.val := x.val + 3
  until x.val > 70;
42
45
48
51
54
57
60
63
66
69
- : unit = ()

In a file

The code, above, used in the toplevel, can be written in a file, say “bar.ml”:

#load "./foo.cmo";
value x = ref 42;
repeat
  print_int x.val;
  print_endline "";
  x.val := x.val + 3
until x.val > 70;

with a subtile difference: the loaded file must be “./foo.cmo” and not just “foo.cmo” because Camlp5 does not have, by default, the current directory in its path.

The file can be compiled like this:

ocamlc -pp camlp5r bar.ml

or in native code:

ocamlopt -pp camlp5r bar.ml

And it is possible to check the resulting program by typing:

camlp5r pr_r.cmo bar.ml

whose displayed result is:

#load "./foo.cmo";
value x = ref 42;
do {
  print_int x.val;
  print_endline "";
  x.val := x.val + 3;
  while not (x.val > 70) do {
    print_int x.val;
    print_endline "";
    x.val := x.val + 3
  }
};

See also the same example pretty printed in its original syntax, using the extendable programs printing.

Extensions of printing

Extensions of printing

Camlp5 provides extensions kits to pretty print programs in revised syntax and normal syntax. Some other extensions kits also allow to rebuild the parsers, or the EXTEND statements in their initial syntax. The pretty print system is itself extensible, by adding new rules. We present here how it works in the Camlp5 sources.

The pretty print system of Camlp5 uses the library modules Pretty, an original system to format output) and Extfun, another original system of extensible functions.

This chapter is designed for programmers that want to understand how the pretty printing of OCaml programs work in Camlp5, want to adapt, modify or debug it, or want to add their own pretty printing extensions.

Introduction

The files doing the pretty printings are located in Camlp5 sources in the directory “etc”. Peruse them if you are interested in creating new ones. The main ones are:

  • “etc/pr_r.ml”: pretty print in revised syntax.
  • “etc/pr_o.ml”: pretty print in normal syntax.
  • “etc/pr_rp.ml”: rebuilding parsers in their original revised syntax.
  • “etc/pr_op.ml”: rebuilding parsers in their original normal syntax.
  • “etc/pr_extend.ml”: rebuilding EXTEND in its original syntax.

We present here how this system works inside these files. First, the general principles. Second, more details of the implementation.

Principles

Using module Pretty

All functions in OCaml pretty printing take a parameter named “the printing context” (variable pc). It is a record holding :

  • The current indendation : pc.ind
  • What should be printed before, in the same line : pc.bef
  • What should be printed after, in the same line : pc.aft
  • The dangling token, useful in normal syntax to know whether parentheses are necessary : pc.dang

A typical pretty printing function calls the function horiz_vertic of the library module Pretty. This function takes two functions as parameter:

  • The way to print the data in one only line (horizontal printing)
  • The way to print the data in two or more lines (vertical printing)

Both functions catenate the strings by using the function sprintf of the library module Pretty which controls whether the printed data holds in the line or not. They generally call, recursively, other pretty printing functions with the same behaviour.

Let us see an example (fictitious) of printing an OCaml application. Let us suppose we have an application expression “e1 e2” to pretty print where e1 and e2 are sub-expressions. If both expressions and their application holds on one only line, we want to see:

e1 e2

On the other hand, if they do not hold on one only line, we want to see e2 in another line with, say, an indendation of 2 spaces:

e1
  e2

Here is a possible implementation. The function has been named expr_app and can call the function expr to print the sub-expressions e1 and e2:

value expr_app pc e1 e2 =
  horiz_vertic
    (fun () ->
       let s1 = expr {(pc) with aft = ""} e1 in
       let s2 = expr {(pc) with bef = ""} e2 in
       sprintf "%s %s" s1 s2)
    (fun () ->
       let s1 = expr {(pc) with aft = ""} e1 in
       let s2 =
         expr
           {(pc) with
              ind = pc.ind + 2;
              bef = tab (pc.ind + 2)}
           e2
       in
       sprintf "%s\n%s" s1 s2)
;

The first function is the horizontal printing. It ends with a sprintf separating the printing of e1 and e2 by a space. The possible “before part” (pc.bef) and “after part” (pc.aft) are transmitted in the calls of the sub-functions.

The second function is the vertical printing. It ends with a sprintf separating the printing of e1 and e2 by a newline. The second line starts with an indendation, using the “before part” (pc.bef) of the second call to expr.

The pretty printing library function Pretty.horiz_vertic calls the first (horizontal) function, and if it fails (either because s1 or s2 are too long or hold newlines, or because the final string produced by sprintf is too long), calls the second (vertical) function.

Notice that the parameter pc contains a field pc.bef (what should be printed before in the same line), which in both cases is transmitted to the printing of e1 (since the syntax {(pc) with aft =   ""} is a record with pc.bef kept). Same for the field pc.aft transmitted to the printing of e2.

Using EXTEND_PRINTER statement

This system is combined to the extensible printers to allow the extensibility of the pretty printing.

The code above actually looks like:

EXTEND_PRINTER
  pr_expr:
    [ [ <:expr< $e1$ $e2$ >> ->
          horiz_vertic
            (fun () ->
               let s1 = curr {(pc) with aft = ""} e1 in
               let s2 = next {(pc) with bef = ""} e2 in
               sprintf "%s %s" s1 s2)
            (fun () ->
               let s1 = curr {(pc) with aft = ""} e1 in
               let s2 =
                 next
                   {(pc) with
                      ind = pc.ind + 2;
                      bef = tab (pc.ind + 2)}
                   e2
               in
               sprintf "%s\n%s" s1 s2) ] ]
  ;
END;

The variable “pc” is implicit in the semantic actions of the syntax “EXTEND_PRINTER”, as well as two other variables: “curr” and “next”.

These parameters, “curr” and “next”, correspond to the pretty printing of, respectively, the current level and the next level. Since the application in OCaml is left associative, the first sub-expression is printed at the same (current) level and the second one is printed at the next level. We also see a call to next in the last (2nd) case of the function to treat the other cases in the next level.

Dangling else, bar, semicolon

In normal syntax, there are cases where it is necessary to enclose expressions between parentheses (or between begin and end, which is equivalent in that syntax). Three tokens may cause problems: the “else”, the vertical bar “|” and the semicolon “;”. Here are examples where the presence of these tokens constrains the previous expression to be parenthesized. In these three examples, removing the begin..end enclosers would change the meaning of the expression because the dangling token would be included in that expression:

Dangling else:

if a then begin if b then c end else d

Dangling bar:

function
  A ->
    begin match a with
      B -> c
    | D -> e
    end
| F -> g

Dangling semicolon:

if a then b
else begin
  let c = d in
  e
end;
f

The information is transmitted by the value pc.dang. In the first example above, while displaying the “then” part of the outer “if”, the sub-expression is called with the value pc.dang set to "else" to inform the last sub-sub-expression that it is going to be followed by that token. When a “if” expression should be displayed without “else” part, and that its “pc.dang” is “else”, it should be enclosed with spaces.

This problem does not exist in revised syntax. While pretty printing in revised syntax, the parameter pc.dang is not necessary and remains the empty string.

By level

As explained in the chapter about the extensible printers (with the EXTEND_PRINTER statement), printers contain levels. The global printer variable of expressions is named “pr_expr” and contain all definitions for pretty printing expressions, organized by levels, just like the parsing of expressions. The definition of “pr_expr” actually looks like this:

EXTEND_PRINTER
  pr_expr:
    [ "top"
      [ (* code for level "top" *) ]
    | "add"
      [ (* code for level "add" *) ]
    | "mul"
      [ (* code for level "mul" *) ]
    | "apply"
      [ (* code for level "apply" *) ]
    | "simple"
      [ (* code for level "add" *) ] ]
  ;
END;

The Prtools module

The Prtools module is defined inside Camlp5 for pretty printing kits. It provides variables and functions to process comments, and meta-functions to process lists (horizontally, vertically, paragraphly).

Comments

value comm_bef : int -> MLast.loc -> string;
comm_bef ind loc” get the comment from the source just before the given location “loc”. This comment may be reindented using “ind”. Returns the empty string if no comment found.
value source : ref string;
The initial source string, which must be set by the pretty printing kit. Used by [comm_bef] above.
value set_comm_min_pos : int -> unit;
Set the minimum position of the source where comments can be found, (to prevent possible duplication of comments).

Meta functions for lists

type pr_fun 'a = pr_context -> 'a -> string;
Type of printer functions.
value hlist : pr_fun 'a -> pr_fun (list 'a);
[hlist elem pc el] returns the horizontally pretty printed string of a list of elements; elements are separated with spaces. The list is displayed in one only line. If this function is called in the context of the [horiz] function of the function [horiz_vertic] of the module Printing, and if the line overflows or contains newlines, the function internally fails (the exception is catched by [horiz_vertic] for a vertical pretty print).
value hlist2 : pr_fun 'a -> pr_fun 'a -> pr_fun (list 'a);
horizontal list with a different function from 2nd element on.
value hlistl : pr_fun 'a -> pr_fun 'a -> pr_fun (list 'a);
horizontal list with a different function for the last element.
value vlist : pr_fun 'a -> pr_fun (list 'a);
[vlist elem pc el] returns the vertically pretty printed string of a list of elements; elements are separated with newlines and indentations.
value vlist2 : pr_fun 'a -> pr_fun 'a -> pr_fun (list    'a);
vertical list with different function from 2nd element on.
value vlist3 : pr_fun ('a * bool) -> pr_fun ('a * bool) -> pr_fun       (list 'a);
vertical list with different function from 2nd element on, the boolean value being True for the last element of the list.
value vlistl : pr_fun 'a -> pr_fun 'a -> pr_fun (list    'a);
vertical list with different function for the last element.
value plist : pr_fun 'a -> int -> pr_fun (list ('a *    string));
[plist elem sh pc el] returns the pretty printed string of a list of elements with separators. The elements are printed horizontally as far as possible. When an element does not fit on the line, a newline is added and the element is displayed in the next line with an indentation of [sh]. [elem] is the function to print elements, [el] a list of pairs (element * separator) (the last separator being ignored).
value plistb : pr_fun 'a -> int -> pr_fun (list ('a *       string));
[plist elem sh pc el] returns the pretty printed string of the list of elements, like with [plist] but the value of [pc.bef] corresponds to an element already printed, as it were on the list. Therefore, if the first element of [el] does not fit in the line, a newline and a tabulation is added after [pc.bef].
value plistl : pr_fun 'a -> pr_fun 'a -> int -> pr_fun (list       ('a * string));
paragraph list with a different function for the last element.
value hvlistl : pr_fun 'a -> pr_fun 'a -> pr_fun (list    'a);
applies “hlistl” if the context is horizontal; else applies “vlistl”.

Miscellaneous

value tab : int -> string;
[tab ind] is equivalent to [String.make ind ‘ ‘]
value flatten_sequence : MLast.expr -> option (list    MLast.expr);
[flatten_sequence e]. If [e] is an expression representing a sequence, return the list of expressions of the sequence. If some of these expressions are already sequences, they are flattened in the list. If that list contains expressions of the form let..in sequence, this sub-sequence is also flattened with the let..in applying only to the first expression of the sequence. If [e] is a let..in sequence, it works the same way. If [e] is not a sequence nor a let..in sequence, return None.

Example : repeat..until

This pretty prints the example repeat..until statement programmed in the chapter Syntax extensions (first version generating a “while” statement).

The code

The pattern generated by the “repeat” statement is recognized (sequence ending with a “while” whose contents is the same than the beginning of the sequence) by the function “is_repeat” and the repeat statement is pretty printed in its initial form using the function “horiz_vertic” of the Pretty module. File “pr_repeat.ml”:

#load "pa_extprint.cmo";
#load "q_MLast.cmo";

open Pcaml;
open Pretty;
open Prtools;

value eq_expr_list el1 el2 =
  if List.length el1 <> List.length el2 then False
  else List.for_all2 eq_expr el1 el2
;

value is_repeat el =
  match List.rev el with
  [ [<:expr< while not $e$ do { $list:el2$ } >> :: rel1] ->
      eq_expr_list (List.rev rel1) el2
  | _ -> False ]
;

value semi_after pr pc = pr {(pc) with aft = sprintf "%s;" pc.aft};

EXTEND_PRINTER
  pr_expr:
    [ [ <:expr< do { $list:el$ } >> when is_repeat el ->
          match List.rev el with
          [ [<:expr< while not $e$ do { $list:el$ } >> :: _] ->
              horiz_vertic
                (fun () ->
                   sprintf "%srepeat %s until %s%s" pc.bef
                     (hlistl (semi_after curr) curr
                        {(pc) with bef = ""; aft = ""} el)
                     (curr {(pc) with bef = ""; aft = ""} e)
                     pc.aft)
                (fun () ->
                   let s1 = sprintf "%srepeat" (tab pc.ind) in
                   let s2 =
                     vlistl (semi_after curr) curr
                       {(pc) with
                        ind = pc.ind + 2;
                        bef = tab (pc.ind + 2);
                        aft = ""}
                       el
                   in
                   let s3 =
                     sprintf "%suntil %s" (tab pc.ind)
                       (curr {(pc) with bef = ""} e)
                   in
                   sprintf "%s\n%s\n%s" s1 s2 s3)
          | _ -> assert False ] ] ]
  ;
END;

Compilation

ocamlc -pp camlp5r -I +camlp5 -c pr_repeat.ml

Testing

Getting the same files “foo.ml” and “bar.ml” of the repeat syntax example:

$ cat bar.ml
#load "./foo.cmo";
value x = ref 42;
repeat
  print_int x.val;
  print_endline "";
  x.val := x.val + 3
until x.val > 70;

$ camlp

Without the pretty printing kit:

$ camlp5r pr_r.cmo bar.ml
#load "./foo.cmo";
value x = ref 42;
do {
  print_int x.val;
  print_endline "";
  x.val := x.val + 3;
  while not (x.val > 70) do {
    print_int x.val;
    print_endline "";
    x.val := x.val + 3
  }
};

With the pretty printing kit:

$ camlp5r pr_r.cmo ./pr_repeat.cmo bar.ml -l 75
#load "./foo.cmo";
value x = ref 42;
repeat
  print_int x.val;
  print_endline "";
  x.val := x.val + 3
until x.val > 70;

Redefining OCaml syntax

Redefining OCaml syntax

Better than just syntax extensions, it is possible to redefine the whole syntax of the language. For example, to:

  • define your own syntax, like the revised syntax does,
  • have a version whose keywords are translated in your native language,
  • restrict the OCaml language,
  • interpret XML (or other languages) as OCaml source,
  • and so on…

Starting with an example

A way to start doing this is to take, in Camlp5 sources, one of the files “etc/pa_o.ml” or file “meta/pa_r.ml”. The first one defines the OCaml standard syntax and the second one the revised syntax.

Let’s say you want to take the normal syntax and make some readjustments. You first make a copy of “etc/pa_o.ml” naming it, e.g., “mysyntax.ml” (the example below works similarly if you take “meta/pa_r.ml” instead):

To test, you can compile it with the command:

ocamlc -pp camlp5r -I $(camlp5 -where) -c mysyntax.ml

This produces the file “mysyntax.cmo”. Now you can compile one of your files, e.g. “foo.ml”, if written in this syntax, i.e. the normal OCaml syntax if you made no changes in “mysyntax.ml”:

ocamlc -pp 'camlp5 ./mysyntax.cmo pr_dump.cmo' -c foo.ml

If there si no changes in “mysyntax.ml” from “pa_o.ml”, this is just a compilation with the normal OCaml syntax. To make changes, you can edit the file “mysyntax.ml” and recompile it. As an exercice, try to translate some keywords in your native language (or another language if it is not English).

Reading the way Camlp5 extensible grammars and syntax tree work (both used in “pa_o.ml” and “pa_r.ml”), you can make more complicated changes or change everything, if you want.

A file for an OCaml syntax

This is what you can find in the files “pa_o.ml” and “pa_r.ml”.

An OCaml syntax files uses the Camlp5 library module Pcaml. All grammar entries are defined there. The first thing is the reinitialization of the grammar (which clear all tokens and define a lexer) and all grammar entries, to be sure that no possible previous loaded grammars remain.

If using the same lexer (provided in Camlp5 library module Plexer), it is done by:

Grammar.Unsafe.gram_reinit Pcaml.gram (Plexer.gmake ())

The cleanup of all grammar entries are done by calls to the function “Grammar.Unsafe.clear_entry”. The main entries are Pcaml.interf, for compiling an interface (a “.mli” file) and Pcaml.implem, for compiling an implementation (a “.ml” file). And all other grammars entries you want to use must be cleared:

Grammar.Unsafe.clear_entry Pcaml.interf;
Grammar.Unsafe.clear_entry Pcaml.implem;
Grammar.Unsafe.clear_entry Pcaml.expr;
Grammar.Unsafe.clear_entry Pcaml.patt;
...

Actually, the camlp5 command can compile the input file with other ways than using the Camlp5 grammars. The variables “Pcaml.parse_interf” and “Pcaml.parse_implem” are references to the functions called by camlp5. By default, it is the Camlp5 grammar syntax, but to be sure it goes on using it (if a previous load changed that), the following statement are added:

Pcaml.parse_interf.val := Pcaml.Grammar.Entry.parse interf;
Pcaml.parse_implem.val := Pcaml.Grammar.Entry.parse implem;

In the files “pa_o.ml” and “pa_r.ml”, some local functions follow, which are themselves followed by a call to the big statement “EXTEND”, the main statement of the Camlp5 extensible grammars system.

Quotations

Quotations

Quotations are a syntax extension in Camlp5 to build expressions and patterns in any syntax independant from the one of OCaml. Quotations are expanded, i.e. transformed, at parse time to produce normal syntax trees, like the rest of the program. Quotations expanders are normal OCaml functions writable by any programmer.

The aim of quotations is to use concrete syntax for manipulating abstract values. That makes programs easier to write, read, modify, and understand. The drawback is that quotations are linguistically isolated from the rest of the program, in opposition to syntax extensions, which are included in the language.

Introduction

A quotation is syntactically enclosed by specific quotes formed by less (<) and greater (>) signs. Namely:

  • starting with either “<<” or “<:ident<” where “ident” is the quotation name,
  • ending with “>>

Examples:

<< \x.x x >>
<:foo< hello, world >>
<:bar< @#$%;* >>

The text between these particular parentheses can be any text. It may contain enclosing quotations and the characters “<”, “>” and “\” can be escaped by “\”. Notice that possible double-quote, parentheses, OCaml comments do not necessarily have to balanced inside them.

As far as the lexer is concerned, a quotation is just a kind of string.

Quotation expander

Quotations are treated at parse time. Each quotation name is associated with a quotation expander, a function transforming the content of the quotation into a syntax tree. There are actually two expanding functions, depending on whether the quotation is in the context of an expression or if it is in the context of a pattern.

If a quotation has no associated quotation expander, a parsing error is displayed and the compilation fails.

The quotation expander, or, rather, expanders, are functions taking the quotation string as parameter and returning a syntax tree. There is no constraint about which parsing technology is used. It can be stream parsers, extensible grammars, string scanning, ocamllex and yacc, any.

To build syntax trees, Camlp5 provides a way to easily build them see the chapter about them: it is possible to build abstract syntax through concrete syntax using, precisely… quotations.

Defining a quotation

By syntax tree

To define a quotation, it is necessary to program the quotation expanders and to, finally, end the source code with a call to:

Quotation.add name (Quotation.ExAst (f_expr, f_patt));

where “name” is the name of the quotation, and “f_expr” and “f_patt” the respective quotations expanders for expressions and patterns.

After compilation of the source file (without linking, i.e. using option “-c” of the OCaml compiler), an object file is created (ending with “.cmo”), which can be used as syntax extension kit of Camlp5.

By string

There is another way to program the expander: a single function which returns, not a syntax tree, but a string which is parsed, afterwards, by the system. This function takes a boolean as first parameter telling whether the quotation is in position of expression (True) or in position of a pattern (False).

Used that way, the source file must end with:

Quotation.add name (Quotation.ExStr f);

where “f” is that quotation expander. The advantage of this second approach is that it is simple to understand and use. The drawback is that there is no way to specify different source locations for different parts of the quotation (what may be important in semantic error messages).

Default quotation

It is possible to use some quotation without its name. Use for that the variable “Quotation.default_quotation”. For example, ending a file by:

Quotation.add "foo" (Quotation.ExAst (f_expr, f_patt));
Quotation.default.val := "foo";

allows to use the quotation “foo” without its name, i.e.:

<< ... >>

instead of:

<:foo< ... >>

If several files set the variable “Quotation.default”, the default quotation applies to the last loaded one.

Antiquotations

A quotation obeys its own rules of lexing and parsing. Its result is a syntax tree, of type “Pcaml.expr” if the quotation is in the context of an expression, or “Pcaml.patt” if the quotation is in the context of a pattern.

It can be interesting to insert portions of expressions or patterns of the enclosing context in its own quotations. For that, the syntax of the quotation must define a syntax for antiquotations areas. It can be, for example:

  • A character introducing a variable: in this case the antiquotation can just be a variable.
  • A couple of characters enclosing the antiquotations. For example, in the predefined syntax tree quotations, the antiquotations are enclosed with dollar (”$”) signs.

In quotations, the locations in the resulting syntax tree are all set to the location of the quotation itself (if this resulting tree contains locations, they are overwritten with this location). Consequently, if there are semantic (typing) errors, the OCaml compiler will underline the entire quotation.

But in antiquotations, since they can be inserted in the resulting syntax tree, it is interesting to keep their initial quotations. For that, the nodes:

<:expr< $anti:...$ >>
<:patt< $anti:...$ >>

equivalent to:

MLast.ExAnt loc ...
MLast.PaAnt loc ...

are provided (see syntax tree quotations).

Let us take an example, without this node, and with this specific node.

Let us consider an elementary quotation system whose contents is just an antiquotation. This is just a school example, since the quotations brackets are not necessary, in this case. But we are going to see how the source code is underlined in errors messages.

Example without antiquotation node

The code for this quotation is (file “qa.ml”):

#load "q_MLast.cmo";
let expr s = Grammar.Entry.parse Pcaml.expr (Stream.of_string s) in
Quotation.add "a" (Quotation.ExAst (expr, fun []));

The quotation expander “expr” just takes the string parameter (the contents of the quotation), and returns the result of the expression parser of the OCaml language.

Compilation:

ocamlc -pp camlp5r -I +camlp5 -c qa.ml

Let us test it in the toplevel with a voluntary typing error:

$ ocaml -I +camlp5 camlp5r.cma
        Objective Caml version ...

        Camlp5 Parsing version ...

# #load "qa.cmo";
# let x = "abc" and y = 25 in <:a< x ^ y >>;
Characters 28-41:
  let x = "abc" and y = 25 in <:a< x ^ y >>;
                              ^^^^^^^^^^^^^
This expression has type int but is here used with type string

We observe that the full quotation is underlined, although it concerns only the variable “y”.

Example with antiquotation node

Let us consider this second version (file “qb.ml”):

#load "q_MLast.cmo";
let expr s =
  let ast = Grammar.Entry.parse Pcaml.expr (Stream.of_string s) in
  let loc = Ploc.make 1 0 (0, String.length s) in
  <:expr< $anti:ast$ >>
in
Quotation.add "b" (Quotation.ExAst (expr, fun []));

As above, the quotation expander “expr” takes the string parameter (the contents of the quotation) and applies the expression parser of the OCaml language. Its result, instead of being returned as it is, is enclosed with the antiquotation node. (The location built is the location of the whole string.)

Compilation:

ocamlc -pp camlp5r -I +camlp5 -c qb.ml

Now the same test gives:

$ ocaml -I +camlp5 camlp5r.cma
        Objective Caml version ...

        Camlp5 Parsing version ...

# #load "qb.cmo";
# let x = "abc" and y = 25 in <:b< x ^ y >>;
Characters 37-38:
  let x = "abc" and y = 25 in <:b< x ^ y >>;
                                       ^
This expression has type int but is here used with type string

Notice that, now, only the variable “y” is underlined.

In conclusion

In the resulting tree of the quotation expander:

  • only portions of this tree, which are sons of the expr and patt antiquotation nodes, have the right location they have in the quotation (provided the quotation expander gives it the right location of the antiquation in the quotation),
  • the other nodes inherit, as location, the location of the full quotation.

Locations in quotations and antiquotations

This section explains in details the problem of locations in quotations and antiquotations. It is designed for programmers of quotation expanders.

Locations are the difficult point in quotations and antiquotations. If they are not set correctly, error messages may highlight wrong parts of the source.

The locations are controlled:

  • for syntax errors: by the exception “Ploc.Exc”, raised by the function “Ploc.raise”,
  • for typing errors, by the syntax tree nodes “<:expr< $anti:...$ >>” and “<:meta< $anti:...$ >>”.

If the quotation expander never uses them, all syntax and typing errors highlight the whole quotation.

Remark that in extensible grammars, syntax errors are automatically enclosed by the exception “Ploc.Exc”. Therefore, if the quotation is parsed by an extensible grammar entry, this exception can be raised.

In the syntax tree nodes “<:expr< $anti:...$ >>” and “<:meta< $anti:...$ >>”, the location is indicated by the implicit variable named “loc”. Their usage is therefore something like:

let loc = ...computation of the location... in
<:expr< $anti:...$ >>

In the quotation

All locations must be computed relatively to the quotation string. The quotation string is the string between “<<” or “<:name<” and “>>” (excluded), the first character of this string being at location zero.

The quotation system automatically shifts all locations with the location of the quotation: the programmer of the quotation expander does not therefore need to care about where the quotation appears in the source.

In antiquotations

In antiquotations, it is important to control how the antiquotation string is parsed. For example, if the function parsing the antiquotation string raises “Ploc.Exc”, the location of this exception must be shifted with the location of the antiquotation in the quotation.

For example, let us suppose that the source contains:

<< abc^ijk^(xyz) >>

where the antiquotation is specified between the caret (”^”) characters. The antiquotation string is “ijk”. It can be built in the quotation expander by:

<:expr< ijk >>

If used just like this, without the “<:expr< $anti:x$ >>”, in case of typing error (for example if the variable “ijk” is unbound), the OCaml error message will be:

<< abc^ijk^(xyz) >>
^^^^^^^^^^^^^^^^^^^
Unbound value ijk

To put a location to “ijk”, since its location in the quotation is “(5, 8)” (the “i” being the 5th characater of the quotation string, starting at zero), the quotation expander can build it like this:

let e = <:expr< ijk >> in
let loc = Ploc.make_unlined (5, 8) in
<:expr< $anti:e$ >>

In this case, the possible typing error message will be:

<< abc^ijk^(xyz) >>
       ^^^
Unbound value ijk

This case is simple, since the antiquotation is just an identifier, and there is no parser computing it.

If the antiquotation has to be parsed, for example if it is a complicated expression, there are two points to care about:

First, the syntax error messages. If the parser of the antiquotation raises “Ploc.Exc”, its location is relative to the antiquotation. It must therefore be shifted to correspond to a location in the quotation. If “f” is the parsing function and “sh” the shift of the antiquotation in the quotation (whose value is “5” in the example), the code must be something like:

try f () with
[ Ploc.Exc loc exc -> Ploc.raise (Ploc.shift sh loc) exc ]

Second, the typing error messages. Here, the above code with “<:expr< $anti:e$ >>” can apply to the resulting tree.

The complete code, taking the possible syntax error messages and the possible typing error messages into account, can be (where “len” is the antiquotation length):

let e =
  try f () with
  [ Ploc.Exc loc exc -> Ploc.raise (Ploc.shift sh loc) exc ]
in
let loc = Ploc.make_unlined (sh, sh + len) in
<:expr< $anti:e$ >>

Located errors

If the quotation expander raises an exception, by default, the whole quotation is underlined:

$ cat foo.ml
#load "q_MLast.cmo";
let expr s = raise (Failure "hello") in
Quotation.add "a" (Quotation.ExAst (expr, fun []));

$ ocaml -I +camlp5 camlp5r.cma
        Objective Caml version ...

        Camlp5 Parsing version ...

# #use "foo.ml";
- : unit = ()
# <:a< good morning >>;
Toplevel input:
# <:a< good morning >>;
  ^^^^^^^^^^^^^^^^^^^^
While expanding quotation "a":
Failure: hello

To specify a location of the exception, use the function “Ploc.raise” instead of “raise”. In this example, let us suppose that we want only the characters 5 to 7 are underlined. This can be done like this:

$ cat foo.ml
#load "q_MLast.cmo";
let expr s = Ploc.raise (Ploc.make 1 0 (5, 7)) (Failure "hello") in
Quotation.add "a" (Quotation.ExAst (expr, fun []));

$ ledit ocaml -I +camlp5 camlp5r.cma
        Objective Caml version ...

        Camlp5 Parsing version ...

# #use "foo.ml";
- : unit = ()
# <:a< good morning >>;
Toplevel input:
# <:a< good morning >>;
           ^^
While expanding quotation "a":
Failure: hello

The Quotation module

type expander =
  [ ExStr of bool -> string -> string
  | ExAst of (string -> MLast.expr * string -> MLast.patt) ]
;

Is the type for quotation expander kinds:

  • Quotation.ExStr exp” corresponds to an expander “exp” returning a string which is parsed by the system to create a syntax tree. Its boolean parameter tells whether the quotation is in position of an expression (True) or in position of a pattern (False). Quotations expanders created this way may work for some particular OCaml syntax, and not for another one (e.g. may work when used with revised syntax and not when used with normal syntax, and conversely).
  • Quotation.ExAst (expr_exp, patt_exp)” corresponds to expanders returning syntax trees, therefore not necessitating to be parsed afterwards. The function “expr_exp” is called when the quotation is in position of an expression, and “patt_exp” when the quotation is in position of a pattern. Quotation expanders created this way are independent from the enclosing syntax.
value add : string -> expander -> unit;
Quotation.add name exp” adds the quotation “name” associated with the expander “exp”.
value find : string -> expander;
Quotation.find name” returns the quotation expander of the given name.
value default : ref string;
The name of the default quotation : it is possible to use this quotation between “<<” and “>>” without having to specify its name.
value translate : ref (string -> string);
Function translating quotation names; default = identity. Used in the predefined quotation “q_phony.cmo”. See below.

Some other useful functions for quotations are defined in the module “Pcaml”. See the chapter “The Pcaml module”, section “Quotation management”.

Predefined quotations

q_MLast.cmo

This extension kit add quotations of OCaml syntax tree, allowing to use concrete syntax to represent abstract syntax. See the chapter Syntax tree.

q_ast.cmo

As with the previous quotation, this extension kit add quotations of OCaml syntax tree, but in the current user syntax with all extensions, the previous one being restricted to revised syntax without extension. See the chapters Syntax tree and Syntax tree quotations in user syntax.

q_phony.cmo

This extension kit is designed for pretty printing and must be loaded after a language pretty printing kit (in normal or in revised syntax). It prevents the expansions of quotations, transforming them into variables. The pretty printing then keeps the initial (source) form.

The macros (extension “pa_macro.cmo”) are also displayed in their initial form, instead of expanded.

A full example: lambda terms

This example allows to represent lambda terms by a concrete syntax and to be able to combine them using antiquotations.

A lambda term is defined like this:

type term =
  [ Lam of string and term
  | App of term and term
  | Var of string ]
;

Examples:

value fst = Lam "x" (Lam "y" (Var "x"));
value snd = Lam "x" (Lam "y" (Var "y"));
value delta = Lam "x" (App (Var "x") (Var "x"));
value omega = App delta delta;
value comb_s =
  Lam "x"
    (Lamb "y"
       (Lamb "z"
          (App (App (Var "x") (Var "y")) (App (Var "x") (Var "z")))));

Since combinations of lambda term may be complicated, The idea is to represent them by quotations in concrete syntax. We want to be able to write the examples above like this:

value fst = << \x.\y.x >>;
value snd = << \x.\y.y >>;
value delta = << \x.x x >>
value omega = << ^delta ^delta >>;
value comb_s = << \x.\y.\z.(x y)(x z) >>;

which is a classic representation of lambda terms.

Notice, in the definition of “omega”, the use of the caret (”^”) sign to specify antiquotations. Notice also the simplicity of the representation of the expression defining “comb_s”.

Here is the code of the quotation expander, term.ml. The expander uses the extensible grammars. It has its own lexer (using the stream lexers) because the lexer of OCaml programs (”Plexer.gmake ()”), cannot recognize the backslashes alone.

Lexer

(* lexer *)

#load "pa_lexer.cmo";

value rec ident =
  lexer
  [ [ 'a'-'z' | 'A'-'Z' | '0'-'9' | '-' | '_' | '\128'-'\255' ]
    ident!
  | ]
;

value empty _ = parser [: _ = Stream.empty :] -> [];

value rec next_tok =
  lexer
  [ "\\" -> ("BSLASH", "")
  | "^" -> ("CARET", "")
  | 'a'-'z' ident! -> ("IDENT", $buf)
  | "(" -> ("", "(")
  | ")" -> ("", ")")
  | "." -> ("", ".")
  | empty -> ("EOS", "")
  | -> raise (Stream.Error "lexing error: bad character") ]
;

value rec skip_spaces =
  lexer
  [ " "/ skip_spaces!
  | "\n"/ skip_spaces!
  | "\r"/ skip_spaces! | ]
;

value record_loc loct i (bp, ep) = do {
  if i >= Array.length loct.val then do {
    let newt =
      Array.init (2 * Array.length loct.val + 1)
        (fun i ->
           if i < Array.length loct.val then loct.val.(i)
           else Ploc.dummy)
    in
    loct.val := newt;
  }
  else ();
  loct.val.(i) := Ploc.make_unlined (bp, ep)
};

value lex_func cs =
  let loct = ref [| |] in
  let ts =
    Stream.from
      (fun i -> do {
         ignore (skip_spaces $empty cs : Plexing.Lexbuf.t);
         let bp = Stream.count cs in
         let r = next_tok $empty cs in
         let ep = Stream.count cs in
         record_loc loct i (bp, ep);
         Some r
       })
  in
  let locf i =
    if i < Array.length loct.val then loct.val.(i) else Ploc.dummy
  in
  (ts, locf)
;

value lam_lex =
  {Plexing.tok_func = lex_func;
   Plexing.tok_using _ = (); Plexing.tok_removing _ = ();
   Plexing.tok_match = Plexing.default_match;
   Plexing.tok_text = Plexing.lexer_text;
   Plexing.tok_comm = None}
;

Parser

(* parser *)

#load "pa_extend.cmo";
#load "q_MLast.cmo";

value g = Grammar.gcreate lam_lex;
value expr_term_eos = Grammar.Entry.create g "term";
value patt_term_eos = Grammar.Entry.create g "term";

EXTEND
  GLOBAL: expr_term_eos patt_term_eos;
  expr_term_eos:
    [ [ x = expr_term; EOS -> x ] ]
  ;
  expr_term:
    [ [ BSLASH; i = IDENT; "."; t = SELF -> <:expr< Lam $str:i$ $t$ >> ]
    | [ x = SELF; y = SELF -> <:expr< App $x$ $y$ >> ]
    | [ i = IDENT -> <:expr< Var $str:i$ >>
      | CARET; r = expr_antiquot -> r
      | "("; t = SELF; ")" -> t ] ]
  ;
  expr_antiquot:
    [ [ i = IDENT ->
         let r =
           let loc = Ploc.make_unlined (0, String.length i) in
           <:expr< $lid:i$ >>
         in
         <:expr< $anti:r$ >> ] ]
  ;
  patt_term_eos:
    [ [ x = patt_term; EOS -> x ] ]
  ;
  patt_term:
    [ [ BSLASH; i = IDENT; "."; t = SELF -> <:patt< Lam $str:i$ $t$ >> ]
    | [ x = SELF; y = SELF -> <:patt< App $x$ $y$ >> ]
    | [ i = IDENT -> <:patt< Var $str:i$ >>
      | CARET; r = patt_antiquot -> r
      | "("; t = SELF; ")" -> t ] ]
  ;
  patt_antiquot:
    [ [ i = IDENT ->
         let r =
           let loc = Ploc.make_unlined (0, String.length i) in
           <:patt< $lid:i$ >>
         in
         <:patt< $anti:r$ >> ] ]
  ;
END;

value expand_expr s =
  Grammar.Entry.parse expr_term_eos (Stream.of_string s)
;
value expand_patt s =
  Grammar.Entry.parse patt_term_eos (Stream.of_string s)
;

Quotation.add "term" (Quotation.ExAst (expand_expr, expand_patt));
Quotation.default.val := "term";

Compilation and test

Compilation:

ocamlc -pp camlp5r -I +camlp5 -c term.ml

Example, in the toplevel, including a semantic error, correctly underlined, thanks to the antiquotation nodes:

$ ocaml -I +camlp5 camlp5r.cma
        Objective Caml version ...

        Camlp5 Parsing version ...

# #load "term.cmo";
# type term =
   [ Lam of string and term
   | App of term and term
   |   Var of string ]
  ;
type term =
  [ Lam of string and term | App of term and term | Var of string ]
# value comb_s = << \x.\y.\z.(x y)(x z) >>;
value comb_s : term =
  Lam "x"
   (Lam "y"
     (Lam "z" (App (App (Var "x") (Var "y")) (App (Var "x") (Var "z")))))
# value omega = << ^delta ^delta >>;
Characters 18-23:
  value omega = << ^delta ^delta >>;
                    ^^^^^
Unbound value delta
# value delta = << \x.x x >>;
value delta : term = Lam "x" (App (Var "x") (Var "x"))
# value omega = << ^delta ^delta >>;
value omega : term =
  App (Lam "x" (App (Var "x") (Var "x")))
    (Lam "x" (App (Var "x") (Var "x")))

The revised syntax

The revised syntax

The revised syntax is an alternative syntax of OCaml. It is close to the normal syntax. We present here only the differences between the two syntaxes.

Notice that there is a simple way to know how the normal syntax is written in revised syntax: write the code in a file “foo.ml” in normal syntax and type, in a shell:

camlp5o pr_r.cmo pr_rp.cmo foo.ml

And, conversely, how a file “bar.ml” written in revised syntax is displayed in normal syntax:

camlp5r pr_o.cmo pr_op.cmo bar.ml

Even simpler, without creating a file:

camlp5o pr_r.cmo pr_op.cmo -impl -
... type in normal syntax ...
... type control-D ...
camlp5r pr_o.cmo pr_rp.cmo -impl -
... type in revised syntax ...
... type control-D ...

Lexing

  • The character quote (') can be written without backslash: ======== ======= OCaml Revised ======== ======= '\'' ''' ======== =======

Modules, Structure and Signature items

  • Structure and signature items always end with a single semicolon which is required.

  • In structures, the declaration of a value is introduced by the keyword “value”, instead of “let”: ========================= ======================== OCaml Revised ========================= ======================== let x = 42;; value x = 42; let x = 42 in x + 7;; let x = 42 in x + 7; ========================= ========================

  • In signatures, the declaration of a value is also introduced by the keyword “value”, instead of “val”: ================= ================== OCaml Revised ================= ================== val x : int;; value x : int; ================= ==================

  • In signatures, abstract module types are represented by a quote and an (any) identifier: ==================== ======================== OCaml Revised ==================== ======================== module type MT;; module type MT = 'a; ==================== ========================

  • Functor application uses currying. Parentheses are not required for the parameters: =================================== ============================== OCaml Revised =================================== ============================== type t = Set.Make(M).t;; type t = (Set.Make M).t; module M = Mod.Make (M1) (M2);; module M = Mod.Make M1 M2; =================================== ==============================

  • It is possible to group several declarations together either in an interface or in an implementation by enclosing them between “declare” and “end” (this is useful when using syntax extensions to generate several declarations from one). Example in an interface:

    declare
      type foo = [ Foo of int | Bar ];
      value f : foo -> int;
    end;
    

Expressions and Patterns

Imperative constructions

  • The sequence is introduced by the keyword “do” followed by “{” and terminated by “}”; it is possible to put a semicolon after the last expression: ================== ========================= OCaml Revised ================== ========================= e1; e2; e3; e4 do { e1; e2; e3; e4 } ================== =========================
  • The “do” after the “while” loop and the “for” loop are followed by a “{” and the loop end with “}”; it is possible to put a semicolon after the last expression: ========================= ========================= OCaml Revised ========================= ========================= while e1 do while e1 do { ``  e1; e2; e3`` ``  e1; e2; e3`` done } ```` ```` for i = e1 to e2 do  `` ``for i = e1 to e2 do { ``  e1; e2; e3`` ``  e1; e2; e3`` done } ========================= =========================

Tuples and Lists

  • Parentheses are required in tuples: ===================== ======================= OCaml Revised ===================== ======================= 1, "hello", World (1, "hello", World) ===================== =======================
  • Lists are always enclosed with brackets. A list is a left bracket, followed by a list of elements separated with semicolons, optionally followed by colon-colon and an element, and ended by a right bracket. Warning: the colon-colon is not an infix but is just part of the syntactic construction. ==================== ================== OCaml Revised ==================== ================== x :: y [x :: y] [x; y; z] [x; y; z] x :: y :: z :: t [x; y; z :: t] ==================== ==================

Records

  • In record update, parentheses are required around the initial expression: ====================== ======================== OCaml Revised ====================== ======================== {e with field = a} {(e) with field = a} ====================== ========================
  • It is allowable to use function binding syntax in record field definitions: ======================== ================= OCaml Revised ======================== ================= {field = fun a -> e} {field a = e} ======================== =================

Irrefutable patterns

An irrefutable pattern is a pattern which is syntactically visible and never fails. They are used in some syntactic constructions. It is either:

  • A variable,
  • The wildcard “_”,
  • The constructor “()”,
  • A tuple with irrefutable patterns,
  • A record with irrefutable patterns,
  • A type constraint with an irrefutable pattern.

Notice that this definition is only syntactic: a constructor belonging to a type having only one constructor is not considered as an irrefutable pattern (except “()”).

Constructions with matching

  • The keyword “function” no longer exists. Only “fun” is used.

  • The pattern matching, in constructions with “fun”, “match” and “try” is closed with brackets: an open bracket “[” before the first case, and a close bracket “]” after the last case: ================ ================ OCaml Revised ================ ================ match e with match e with ``  p1 -> e1`` [ p1 -> e1 | p2 -> e2 | p2 -> e2 ] ================ ================

    If there is only one case and if the pattern is irrefutable, the brackets are not required. These examples work identically in OCaml and in revised syntax: ========================= ========================= OCaml Revised ========================= ========================= fun x -> x fun x -> x fun {foo=(y, _)} -> y fun {foo=(y, _)} -> y ========================= =========================

  • It is possible to write the empty function which always raises the exception “Match_failure” when a parameter is applied. It is also possible to write and empty “match”, raising “Match_failure” after having evaluated its expression and the empty “try”, equivalent to its expression without try:

    fun []
    match e with []
    try e with []
    
  • The patterns after “let” and “value” must be irrefutable. The following OCaml expression:

    let f (x::y) = ...
    

    must be written:

    let f = fun [ [x::y] -> ...
    
  • It is possible to use a construction “where”, equivalent to “let”, but usable only when where is only one binding. The expression:

    e1 where p = e
    

    is equivalent to:

    let p = e in e1
    

Mutables and Assignment

  • The statement “<-” is written “:=”: ============ ============ OCaml Revised ============ ============ x.f <- y x.f := y ============ ============
  • The “ref” type is declared as a record type with one field named “val”, instead of “contents”. The operator “!” does not exist any more, and references are assigned like the other mutables: =============== ====================== OCaml Revised =============== ====================== x := !x + y x.val := x.val + y =============== ======================

Miscellaneous

  • The “else” is required in the “if” statement: =============== ======================= OCaml Revised =============== ======================= if a then b if a then b else () =============== =======================

  • The boolean operations “or” and “and” can only be written with “||” and “&&”: =============== =============== OCaml Revised =============== =============== a or b & c a || b && c a || b && c a || b && c =============== ===============

  • No more “begin end” construction. One must use parentheses.

  • The operators as values are written with an backslash: ========= ======== OCaml Revised ========= ======== (+) \+ (mod) \mod ========= ========

  • Nested “as” patterns require parenthesis: ============================== =============================== OCaml Revised ============================== =============================== function Some a as b, c -> fun [ ((Some a as b), c) -> ``  …`` ``  …`` ============================== ===============================

    But they are not required before the right arrow: =========================== ======================== OCaml Revised =========================== ======================== function Some a as b -> fun [ Some a as b -> ``  …`` ``  …`` =========================== ========================

  • The operators with special characters are not automatically infix. To define infixes, use syntax extensions.

Types and Constructors

  • The type constructors are before their type parameters, which are curryfied: ============================== ================================ OCaml Revised ============================== ================================ int list list int ('a, bool) Hashtbl.t Hashtbl.t 'a bool type 'a foo = 'a list list type foo 'a = list (list 'a) ============================== ================================

  • The abstract types are represented by an unbound type variable: ================= ===================== OCaml Revised ================= ===================== type 'a foo;; type foo 'a = 'b; type bar;; type bar = 'a; ================= =====================

  • Parentheses are required in tuples of types: ============== ================ OCaml Revised ============== ================ int * bool (int * bool) ============== ================

  • In declarations of a concrete type, brackets must enclose the constructor declarations: ========================= ============================ OCaml Revised ========================= ============================ type t = A of i | B;; type t = [ A of i | B ]; ========================= ============================

  • It is possible to make the empty type, without constructor:

    type foo = [];
    
  • There is a syntax difference between data constructors with several parameters and data constructors with one parameter of type tuple: The declaration of a data constructor with several parameters is done by separating the types with “and”. In expressions and patterns, these constructor parameters must be curryfied: =========================== ================================ OCaml Revised =========================== ================================ type t = C of t1 * t2;; type t = [ C of t1 and t2 ]; C (x, y);; C x y; =========================== ================================

    The declaration of a data constructor with one parameter of type tuple is done by using a tuple type. In expressions and patterns, the parameter must not to be curryfied, since it is alone. In that case the syntax of constructor parameters is the same between the two syntaxes: ============================= ================================ OCaml Revised ============================= ================================ type t = D of (t1 * t2);; type t = [ D of (t1 * t2) ]; D (x, y);; D (x, y); ============================= ================================

  • The bool constructors start with an uppercase letter. The identifiers “true” and “false” are not keywords: ================= ================= OCaml Revised ================= ================= true && false True && False ================= =================

  • In record types, the keyword “mutable” must appear after the colon: =============================== ============================== OCaml Revised =============================== ============================== type t = {mutable x : t1};; type t = {x : mutable t1}; =============================== ==============================

  • Manifest types are with “==”: =========================== ============================ OCaml Revised =========================== ============================ type 'a t = 'a option = type t 'a = option 'a == ``    None`` ``  [ None`` ``  | Some of ‘a`` ``  | Some of ‘a ]`` =========================== ============================

  • Polymorphic types start with “!”: ========================== ============================ OCaml Revised ========================== ============================ type t = type t = ``  { f : ‘a . ‘a list }`` ``  { f : ! ‘a . list ‘a }`` ========================== ============================

Streams and Parsers

  • The streams and the stream patterns are bracketed with “[:” and “:]” instead of “[<” and “>]”.

  • The stream component “terminal” is written with a back-quote instead of a quote: ======================= ============================== OCaml Revised ======================= ============================== [< '1; '2; s; '3 >] [: `1; `2; s; `3 :] ======================= ==============================

  • The cases of parsers are bracketed with “[” and “]”, as with “fun”, “match” and “try”. If there is one case, the brackets are not required: ========================== ================================ OCaml Revised ========================== ================================ parser parser ``  [< ‘Foo >] -> e`` [ [: `Foo :] -> e | [< p = f >] -> f;; | [: p = f :] -> f ]; parser [< 'x >] -> x;; parser [: `x :] -> x; ========================== ================================

  • It is possible to write the empty parser raising the exception “Stream.Failure” whatever parameter is applied, and the empty stream matching always raising “Stream.Failure”:

    parser []
    match e with parser []
    
  • In normal syntax, the error indicator starts with a double question mark, in revised syntax with a simple question mark: ================================ ====================================== OCaml Revised ================================ ====================================== parser parser ``  [< ‘1; ‘2 ?? “error” >] ->`` :literal:`  [: 1; `2 ? “error” :] -> ``    …`` ``    …`` ================================ ======================================

  • In normal syntax, the component optimization starts with “?!”, in revised syntax with “!”: ======================== ============================== OCaml Revised ======================== ============================== parser parser ``  [< ‘1; ‘2 ?! >] ->`` :literal:`  [: 1; `2 ! :] -> ``    …`` ``    …`` ======================== ==============================

Classes and Objects

  • Object items end with a single semicolon which is required.
  • Class type parameters follow the class identifier: ============================== ============================== OCaml Revised ============================== ============================== class ['a, 'b] point = ... class point ['a, 'b] = ... class c = [int] color;; class c = color [int]; ============================== ==============================
  • In the type of class with parameters, the type of the parameters are between brackets. Example in signature: ============================ ============================= OCaml Revised ============================ ============================= class c : int -> point;; class c : [int] -> point; ============================ =============================
  • The keywords “virtual” and “private” must be in this order: ============================== ============================== OCaml Revised ============================== ============================== method virtual private m : method virtual private m : ``  …`` ``  …`` method private virtual m : method virtual private m : ``  …`` ``  …`` ============================== ==============================
  • Object variables are introduced with “value” instead of “val”: ======================== =========================== OCaml Revised ======================== =========================== object val x = 3 end object value x = 3; end ======================== ===========================
  • Type constraints in objects are introduced with “type” instead of “constraint”: ================================== ============================= OCaml Revised ================================== ============================= object constraint 'a = int end object type 'a = int; end ================================== =============================

Labels and Variants

  • Labels in types must start with “~”: ============================= =============================== OCaml Revised ============================= =============================== val x : num:int -> bool;; value x : ~num:int -> bool; ============================= ===============================

Scheme

Scheme and Lisp syntaxes

It is possible to write OCaml programs with Scheme or Lisp syntax. They are close to one another, both using parentheses to identify and separate statements.

Common

The syntax extension kits are named “pa_scheme.cmo” and “pa_lisp.cmo”. The sources (same names ending with “.ml” in the Camlp5 sources), are written in their own syntax. They are boostrapped thanks to the versions being written in revised syntax and to the Camlp5 pretty printing system.

In the OCaml toplevel, it is possible to use them by loading “camlp5r.cma” first, then “pa_lisp.cmo” or “pa_scheme.cmo” after:

ocaml -I +camlp5 camlp5r.cma pa_scheme.cmo
        Objective Caml version ...

        Camlp5 Parsing version ...

# (let ((x 3)) (* 3 x))
- : int = 9
# (values 3 4 5)
- : (int * int * int) = (3, 4, 5)

ocaml -I +camlp5 camlp5r.cma pa_lisp.cmo
        Objective Caml version ...

        Camlp5 Parsing version ...

# (let ((x 3)) (* 3 x))
- : int = 9
# (, 3 4 5)
- : (int * int * int) = (3, 4, 5)

The grammar of Scheme and Lisp are relatively simple, just reading s-expressions. The syntax tree nodes are created in the semantic actions. Because of this, these grammars are hardly extensible.

However, the syntax extension EXTEND (”pa_extend.cmo” in extensible grammars) works for them: only the semantic actions need be written with the Scheme or Lisp syntax. Stream parsers are also implemented.

Warning: these syntaxes are incomplete, but can be completed, if Camlp5 users are insterested.

Scheme syntax

Some examples are given to show the principles:

OCaml Scheme
let x = 42;; (define x 42)
let f x = 0;; (define (f x) 0)
let rec f x y = 0;; (definerec (f x y) 0)
let x = 42 and y = 27 in x + y;; (let ((x 42) (y 27)) (+ x y))
let x = 42 in let y = 27 in x + y;; (let* ((x 42) (y 27)) (+ x y))
module M = struct … end;; (module M (struct …))
type ‘a t = A of ‘a * int | B (type (t ‘a) (sum (A ‘a int) (B)))
fun x y -> x (lambda (x y) x)
x; y; z (begin x y z)
f x y (f x y)
[1; 2; 3] [1 2 3]
x :: y :: z :: t [x y z :: t]
a, b, c (values a b c)
match x with ‘A’..’Z’ -> “x” (match x ((range ‘A’ ‘Z’) “x”)))
{x = y; z = t} {(x y) (z t)}

Lisp syntax

The same examples:

OCaml Lisp
let x = 42;; (value x 42)
let f x = 0;; (value f (lambda x 0))
let rec f x y = 0;; (value rec f (lambda (x y) 0))
let x = 42 and y = 27 in x + y;; (let ((x 42) (y 27)) (+ x y))
let x = 42 in let y = 27 in x + y;; (let* ((x 42) (y 27)) (+ x y))
module M = struct … end;; (module M (struct …))
type ‘a t = A of ‘a * int | B (type (t ‘a) (sum (A ‘a int) (B)))
fun x y -> x (lambda (x y) x)
x; y; z (progn x y z)
f x y (f x y)
[1; 2; 3] (list 1 2 3)
x :: y :: z :: t (list x y z :: t)
a, b, c (, a b c)
match x with ‘A’..’Z’ -> “x” (match x ((range ‘A’ ‘Z’) “x”)))
{x = y; z = t} ({} (x y) (z t))

Macros

Macros

Camlp5 provides a system of macros, added by the parsing kit “pa_macro.cmo”. Macros are values evaluated at parsing time.

When loaded, the parsing kit extends the syntax of the language and adds command options.

Added syntax

The parsing kit “pa_macro.cmo” extends the structure items (= toplevel phrases), the expressions and the patterns by the following grammar rules:

          str-item ::= str-macro-def
          sig-item ::= sig-macro-def
              expr ::= macro-expr
              patt ::= macro-patt
         cons-decl ::= macro-cons-decl
      rec-lab-decl ::= macro-rec-lab-decl
       match-assoc ::= macro-match-assoc
     str_macro-def ::= "DEFINE" uident
                     | "DEFINE" uident "=" expr
                     | "DEFINE" uident params "=" expr
                     | "UNDEF" uident
                     | "IFDEF" dexpr "THEN" st-or-mac else-str "END"
                     | "IFNDEF" dexpr "THEN" st-or-mac else-str "END"
          else-str ::= "ELSIFDEF" dexpr "THEN" st-or-mac else-str
                     | "ELSIFNDEF" dexpr "THEN" st-or-mac else-str
                     | "ELSE" st-or-mac else-str
                     | <nothing>
     sig_macro-def ::= "DEFINE" uident
                     | "DEFINE" uident params "=" type
                     | "UNDEF" uident
                     | "IFDEF" dexpr "THEN" sg-or-mac else-sig "END"
                     | "IFNDEF" dexpr "THEN" sg-or-mac else-sig "END"
          else-sig ::= "ELSIFDEF" dexpr "THEN" sg-or-mac else-sig
                     | "ELSIFNDEF" dexpr "THEN" sg-or-mac else-sig
                     | "ELSE" st-or-mac else-sig
                     | <nothing>
        macro-expr ::= "IFDEF" dexpr "THEN" expr else-expr "END"
                     | "IFNDEF" dexpr "THEN" expr else-expr "END"
                     | "__FILE__"
                     | "__LOCATION__"
         else-expr ::= "ELSIFDEF" dexpr "THEN" expr else-expr
                     | "ELSIFNDEF" dexpr "THEN" expr else-expr
                     | "ELSE" expr else-expr
        macro-patt ::= "IFDEF" dexpr "THEN" patt "ELSE" patt "END"
                     | "IFNDEF" dexpr "THEN" patt "ELSE" patt "END"
   macro-cons-decl ::= "IFDEF" dexpr "THEN" cons-decl "END"
                     | "IFDEF" dexpr "THEN" cons-decl
                       "ELSE" cons-decl "END"
                     | "IFNDEF" dexpr "THEN" cons-decl "END"
                     | "IFNDEF" dexpr "THEN" cons-decl
                       "ELSE" cons-decl "END"
macro-rec-lab-decl ::= "IFDEF" dexpr "THEN" rec-lab-decl "END"
                     | "IFDEF" dexpr "THEN" rec-lab-decl
                       "ELSE" rec-lab-decl "END"
                     | "IFNDEF" dexpr "THEN" rec-lab-decl "END"
                     | "IFNDEF" dexpr "THEN" rec-lab-decl
                       "ELSE" rec-lab-decl "END"
 macro-match-assoc ::= "IFDEF" dexpr "THEN" match_assoc "END"
                     | "IFDEF" dexpr "THEN" match-assoc
                       "ELSE" match-assoc "END"
                     | "IFNDEF" dexpr "THEN" match-assoc "END"
                     | "IFNDEF" dexpr "THEN" match-assoc
                       "ELSE" match-assoc "END"
         st-or-mac ::= str_macro-def
                     | structure
         sg-or-mac ::= sig-macro-def
                     | signature
            params ::= ident params
                     | ident
             dexpr ::= dexpr "OR" dexpr
                     | dexpr "AND" dexpr
                     | "NOT" dexpr
                     | uident
                     | "(" dexpr ")"
            uident ::= 'A'-'Z' ident
             ident ::= ident-char*
        ident-char ::= ('a'-'a' | 'A'-'Z' | '0'-'9' | '_' | ''' |
                        misc-byte)
         misc-byte ::= '\128'-'\255'

When a macro has been defined, as name e.g. “NAME”, the expressions and patterns are extended this way:

      expr ::= "NAME"
             | "NAME" "(" expr-params ")"
      patt ::= "NAME"
             | "NAME" "(" patt-params ")"
expr-params := expr "," expr-params
             | expr
patt-params := patt "," patt-params
             | patt

Notice that the identifiers “DEFINE”, “UNDEF”, “IFDEF”, “IFNDEF”, “ELSE”, “END”, “OR”, “AND” and “NOT” are new keywords (they cannot be used as identifiers of constructors or modules.

However, the identifiers “__FILE__” and “__LOCATION__” and the new defined macro names are not new identifiers.

Added command options

The parsing kit “pa_macro.cmo” also add two options usable in all Camlp5 commands:

-D uident
Define the uident in question like would have been a DEFINE (without parameter) in the code.
-U uident
Undefine the uident in question like would have been a UNDEF in the code.
-defined
Print the defined macros and exit.

Semantics

The statement “DEFINE” defines a new macro with optional parameters and an optional value. The macro name must start with an uppercase letter.

The test of a macro can be done either:

  • in structure items
  • in signature items
  • in expressions
  • in patterns
  • in a constructor declaration
  • in a match case

using the statement “IFDEF”. Its non-existence can be tested by “IFNDEF”. In expressions and patterns, the “ELSE” part is required, not in structure items.

The expression behind the “IFDEF” or the “IFNDEF” statement may use the operators “OR”, “AND” and “NOT” and contain parentheses.

Notice that in an “IFDEF” where the value is True (resp. False), the “ELSE” (resp “THEN”) part does not need to be semantically correct (well typed), since it does not appear in the resulting syntax tree. Same for “IFNDEF” and for possible macros parameters which are not used in the associated expression.

If a macro is defined twice, its first version is lost.

The statement “UNDEF” removes a macro definition.

When associated with a value, the “DEFINE” macro acts like a variable (or like a function call if it has parameters), except that the parameters are evaluated at parse time and can also be used also in pattern positions. Notice that this is a way to define constants by name in patterns. For example:

DEFINE WW1 = 1914;
DEFINE WW2 = 1939;
value war_or_year =
  fun
  [ WW1 -> "world war I"
  | WW2 -> "world war II"
  | _ -> "not a war" ]
;

In the definition of a macro, if the expression contains an evaluation, the evaluation is not done by Camlp5 but just transmitted as code. In this case, it does not work in pattern position. Example in the toplevel:

# DEFINE PLUS(x, y) = x + y;
# PLUS(3, 4);
- : int = 7
#   fun [ PLUS(3, 4) -> () ];
Toplevel input:
#   fun [ PLUS(3, 4) -> () ];
          ^^^^^^^^^^
Failure: this is not a constructor, it cannot be applied in a pattern

On the other hand, if the expression does not contain evaluation, this is possible:

# DEFINE FOO(x, y) = (x, Some y);
# FOO(True, "bar");
- : (bool * option string) = (True, Some "bar")
# fun [ FOO(_, "hello") -> 0 | _ -> 1 ];
- : ('a * option string) -> int = <fun>

The macro “__FILE__” is replaced by the current compiled source file name. In the OCaml toplevel, its value is the empty string.

The macro “__LOCATION__” is replaced by the the current location (two integers in number of characters from the beginning of the file, starting at zero) of the macro itself.

In signatures, the macro definitions can return types which can be used in type definitions.

In constructor declarations and in match cases, it is possible to conditionally define some cases by “IFDEF” or “IFNDEF”. For example:

type t =
  [ A of int
  | IFNDEF FOO THEN
      B of string
    END
  | C of bool ]
;

match x with
[ A i -> j
| IFNDEF FOO THEN
    B s -> toto
  END
| C b -> e ];

Predefined macros

The macro “CAMLP5” is always predefined.

The macro “OCAML_oversion” is predefined, where “oversion” is the OCaml version the Camlp5 program has been compiled with, where all characters but numbers are replaced by underscores. For example, if using OCaml 3.09.3, the macro “OCAML_3_09_3” is defined.

Moreover, for some Camlp5 versions (and all the versions which follows them), the macro “CAMLP5_version” is defined where “version” is the Camlp5 version where all characters but numbers are replaced by underscores. For example, in version 4.02, the macro “CAMLP5_4_02” had been defined and this macro have appeared in all versions of Camlp5 since 4.02.

To see which macros are predefined, type:

camlp5r pa_macro.cmo -defined

Pragma directive

Pragma directive

The directive “#pragma” allows to evaluate expressions at parse time, useful, for example, to test syntax extensions by the statement EXTEND without having to compile it in a separate file.

To use it, add the syntax extension “pa_pragma.cmo” in the Camlp5 command line. It adds the ability to use this directive.

As an example, let’s add syntax for the statement ‘repeat’ and use it immediately:

#pragma
  EXTEND
    GLOBAL: expr;
    expr: LEVEL "top"
      [ [ "repeat"; e1 = sequence; "until"; e2 = SELF ->
            <:expr< do { $e1$; while not $e2$ do { $e1$ } } >> ] ]
    ;
    sequence:
      [ [ el = LIST1 expr_semi -> <:expr< do { $list:el$ } >> ] ]
    ;
    expr_semi:
      [ [ e = expr; ";" -> e ] ]
    ;
  END;

let i = ref 1 in
repeat print_int i.val; print_endline ""; incr i; until i.val = 10;

The compilation of this example (naming it “foo.ml”) can be done with the command:

ocamlc -pp "camlp5r q_MLast.cmo pa_extend.cmo pa_pragma.cmo" -I +camlp5 foo.ml

Notice that it is still experimental and probably incomplete, for the moment.

Extensible functions

Extensible functions

Extensible functions allows the definition of pattern matching functions which are extensible by adding new cases that are inserted automatically at the proper place by comparing the patterns. The pattern cases are ordered according to syntax trees representing them, “when” statements being inserted before the cases without “when”.

Notice that extensible functions are functional: when extending a function, a new function is returned.

The extensible functions are used in the pretty printing system of Camlp5.

Syntax

The syntax of the extensible functions, when loading “pa_extfun.cmo”, is the following:

         expression ::= extensible-function
extensible-function ::= "extfun" expression "with" "[" match-cases "]"
        match-cases ::= match-case "|" match-cases
         match-case ::= pattern "->" expression
                      | pattern "when" expression "->" expression

It is an extension of the same syntax as the “match” and “try” constructions.

Semantics

The statement “extend” defined by the syntax takes an extensible function and return another extensible function with the new match cases inserted at the proper place within the initial extensible function.

Extensible functions are of type “Extfun.t a b”, which is an abstract type, where “a” and “b” are respectively the type of the patterns and the type of the expressions. It corresponds to a function of type “a -> b”.

The function “Extfun.apply” takes an extensible function as parameter and returns a function which can be applied like a normal function.

The value “Extfun.empty” is an empty extensible function, of type “Extfun.t 'a 'b”. When applied with “Extfun.apply” and a parameter, it raises the exception “Extfun.Failure” whatever the parameter.

For debugging, it is possible to use the function “Extfun.print” which displays the match cases of the extensible functions. (Only the patterns are displayed in clear text, the associated expressions are not.)

The match cases are inserted according to the following rules:

  • The match cases are inserted in the order they are defined in the syntax “extend
  • A match case pattern with “when” is inserted before a match case pattern without “when”.
  • Two match cases patterns both with “when” or both without “when” are inserted according to the alphabetic order of some internal syntax tree of the patterns where bound variables names are not taken into account.
  • If two match cases patterns without “when” have the same patterns internal syntax tree, the initial one is silently removed.
  • If two match cases patterns with “when” have the same patterns internal syntax tree, the new one is inserted before the old one.

Deprecated Information

Low-level Shell usage

The main shell commands are:

  • camlp5o : to treat files written in normal OCaml syntax,
  • camlp5r : to treat files written in a new syntax named The revised syntax .

These commands can be given as parameters of the option -pp of the OCaml compiler. Examples:

ocamlc -pp camlp5o foo.ml
ocamlc -pp camlp5r bar.ml

This way, the parsing is done by Camlp5. In case of syntax errors, the parsing fails with an error message and the compilation is aborted. Otherwise, the OCaml compiler continues with the syntax tree provided by Camlp5.

In the toplevel, it is possible to preprocess the input phrases by loading one of the files “camlp5o.cma” or “camlp5r.cma”. The common usage is:

ocaml -I +camlp5 camlp5o.cma
ocaml -I +camlp5 camlp5r.cma

It is possible that, in your installation, the Camlp5 library is not in the OCaml directory. In this case, the commands must be:

ocaml -I `camlp5 -where` camlp5o.cma
ocaml -I `camlp5 -where` camlp5r.cma

In general, in this documentation, when a command requires:

-I +camlp5

it can be replaced by:

-I `camlp5 -where`

or, by:

-I <directory>

where “directory” is the directory path where the Camlp5 library files are installed.

Parsing and Printing kits

Parsing and printing extensions are OCaml object files, i.e. files with the extension “.cmo” or “.cma”. They are the result of the compilation of OCaml source files containing what is necessary to do the parsing or printing. These object files are named parsing and printing kits.

These files cannot be linked to produce executables because they generally call functions and use variables defined only in Camlp5 core, typically belonging to the module “Pcaml”. The kits are designed to be loaded by the Camlp5 commands, either through their command arguments or through directives in the source files.

It is therefore important to compile the kits with the option “-c” of the OCaml compiler (i.e. just compilation, not producing an executable) and with the option “-I   +camlp5” (or “-I `camlp5 -where`”) to inform the compiler to find module interfaces in installed Camlp5 library.

In the OCaml toplevel, it is possible to use a kit by simply loading it with the directive “#load”.

Extending syntax

A syntax extension is a Camlp5 parsing kit. There are two ways to use a syntax extension:

  • Either by giving this object file as parameter to the Camlp5

command. For example:

ocamlc -pp "camlp5o ./myext.cmo" foo.ml
  • Or by adding the directive “#load” in the source file:
#load "./myext.cmo";;

and then compile it simply like this:

ocamlc -pp camlp5o foo.ml

Several syntax extensions can be used for a single file. The way to create one’s own syntax extensions is explained in this document.

Pretty printing

As for syntax extensions, the pretty printing is defined or extended through Camlp5 printing kits. Some pretty printing kits are provided by Camlp5, the main ones being:

  • pr_o.cmo: to pretty print in normal syntax,
  • pr_r.cmo: to pretty print in revised syntax.

Examples: if we have a file, foo.ml, written in normal syntax and and another one, bar.ml, written in revised syntax, here are the commands to pretty print them in their own syntax:

camlp5o pr_o.cmo foo.ml
camlp5r pr_r.cmo bar.ml

And how to convert them into the other syntax:

camlp5o pr_r.cmo foo.ml
camlp5r pr_o.cmo foo.ml

The way to create one’s own pretty printing extensions is explained in this document.

Note: the revised syntax

The revised syntax is a specific syntax whose aim is to resolve some problems and inconsistencies of the normal OCaml syntax. A chapter will explain the differences between the normal and the revised syntax.

All examples of this documentation are written in that revised syntax. Even if you don’t know it, it is not difficult to understand. The same examples can be written in normal syntax. In case of problems, refer to the chapter describing it.

Future work

Future work

Just some ideas… things to implement or to think about…

extensible lexers

Extensible lexers would be an interesting extension. And possibly lexers using regular expressions (extensible if possible, otherwise as a different module).

utf-8

The Camlp5 lexer for OCaml programs (module Plexer) allows utf-8 characters. Since utf-8 seems to have some success among unicode formats, perhaps a reflection to add greek characters and/or real utf-8 arrows in the syntax (in particular in types) would be interesting.

Appendix

Commands and Files

Commands and Files

The main command of Camlp5 is “camlp5”. It is an OCaml program in bytecode (compiled with ocamlc, not ocamlopt), able to dynamically load OCaml object files (ending with “.cmo” and “.cma”).

We repeat (as explained in :Introduction) that it should almost never be necessary to use the camlp5 command directly: instead, specifying -syntax and various preprocessor packages to ocamlfind should be sufficient.

Most other Camlp5 commands derive from that one: they are the command “camlp5” with some implicitely applied parameters.

Two other commands are provided: “mkcamlp5” and “mkcamlp5.opt”. They allow to create camlp5 executables with already loaded kits.

All commands have an option “-help” which display all possible command parameters and options. Notice that some parameters (the parsing and pretting kits) may add new options. For example, the command:

camlp5 pr_r.cmo -help

prints more lines than just:

camlp5 -help

The first parameter (“load options”) allows to specify parsing and printing kits (“.cmo” and “.cma” files) which are loaded inside the “camlp5” core before any action. Other options may follow.

Parsing and Printing Kits

Parsing kits

language parsing kits

pa_r.cmo
Revised syntax (without parsers).
pa_rp.cmo
Add revised syntax parsers.
pa_o.cmo

Normal syntax (without parsers). Option added:

-no_quot
don’t parse quotations, allowing to use, e.g. “<:>” as token.
pa_op.cmo
Add normal syntax parsers.
pa_oop.cmo
Add normal syntax parsers without code optimization.
pa_lexer.cmo
Add stream lexers.

extensible grammars

pa_extend.cmo

Add the EXTEND statement. Options added:

-split_ext
split EXTEND by functions to turn around a PowerPC problem.
-quotify
generate code for quotations (internally used to synchronize q_MLast and pa_r)
-meta_action
undocumented (internally used for compiled version)
pa_extfold.cmo
Add the specific symbols FOLD0 and FOLD1 to the EXTEND statement.

extensible functions and printers

pa_extfun.cmo
Add the extensible function (“extfun” statement).
pa_extprint.cmo
Add the EXTEND_PRINTER statement.

functional parsers

pa_fstream.cmo
Add the functional parsers (“fparser” statement) and the backtracking parsers (“bparser” statement).

other languages

pa_lisp.cmo
Lisp syntax.
pa_scheme.cmo
Scheme syntax.
pa_sml.cmo
SML syntax.

other parsing kits

pa_lefteval.cmo
Add guarantee of left evaluation in functions calls.
pa_macro.cmo

Add macros. Options added:

-D <string>
define for IFDEF statement
-U <string>
undefine for IFDEF statement
-defined
print the defined macros and exit
pa_pragma.cmo
Add pragma directive: evaluations at parse time

Printing kits

language printing kits

pr_r.cmo

Display in revised syntax. Added options:

-flag <str>
Change pretty printing behaviour according to “<str>”: A/a enable/disable all flags C/c enable/disable comments in phrases D/d enable/disable allowing expanding ‘declare’ E/e enable/disable equilibrate cases L/l enable/disable allowing printing ‘let..in’ horizontally S/s enable/disable printing sequences beginners at end of lines default setting is “aS”.
-wflag <str>
Change displaying ‘where’ statements instead of ‘let’: A/a enable/disable all flags I/i enable/disable ‘where’ after ‘in’ L/l enable/disable ‘where’ after ‘let..=’ M/m enable/disable ‘where’ after ‘match’ and ‘try’ P/p enable/disable ‘where’ after left parenthesis R/r enable/disable ‘where’ after ‘record_field..=’ S/s enable/disable ‘where’ in sequences T/t enable/disable ‘where’ after ‘then’ or ‘else’ V/v enable/disable ‘where’ after ‘value..=’ W/w enable/disable ‘where’ after ‘->’ default setting is “Ars”.
-l <length>
Maximum line length for pretty printing (default 78)
-sep_src
Read source file for text between phrases (default).
-sep <string>
Use this string between phrases instead of reading source.
pr_ro.cmo
Add display objects, labels and variants in revised syntax.
pr_rp.cmo
Add display parsers with their (revised) syntax.
pr_o.cmo

Display in normal syntax. Added options:

-flag <str>
Change pretty printing behaviour according to <str>: A/a enable/disable all flags C/c enable/disable comments in phrases E/e enable/disable equilibrate cases L/l enable/disable allowing printing ‘let..in’ horizontally M/m enable/disable printing double semicolons default setting is “Am”.
-l <length>
Maximum line length for pretty printing (default 78)
-sep_src
Read source file for text between phrases (default).
-sep <string>
Use this string between phrases instead of reading source.
pr_op.cmo
Add displaying parsers with their (normal) syntax.

extensible parsers

pr_extend.cmo

Add the displaying of EXTEND statements in their initial syntax.Option added:

-no_slist
Don’t reconstruct SLIST, SOPT, SFLAG

extensible functions and printers

pr_extfun.cmo
Add displaying extensible functions (“extfun” statement) in their initial syntax.
pr_extprint.cmo
Add displaying extensible printers (“EXTEND_PRINTER” statement) in their initial syntax.

other language

pr_scheme.cmo

Display in Scheme syntax. Option added:

-l <length>
Maximum line length for pretty printing (default 78)
-sep <string>
Use this string between phrases instead of reading source.
pr_schemep.cmo
Add display parsers with their (Scheme) syntax.

other printing kits

pr_depend.cmo

Display dependencies. Option added:

-I dir
Add “dir” to the list of search directories.
pr_dump.cmo
Dump the syntax tree in binary (for the OCaml compiler)
pr_null.cmo
No output.

Quotations expanders

q_MLast.cmo
Syntax tree quotations. Define the quotations named: “expr”, “patt”, “ctyp”, “str_item”, “sig_item”, “module_type”, “module_expr”, “class_type”, “class_expr”, “class_sig_item”, “class_str_item”, “with_constr” and “poly_variant”.
q_phony.cmo
Transform quotations into phony variables to be able to pretty print the quotations in their initial form (not suitable for compilation)

Commands

camlp5r
Shortcut for “camlp5 pa_r.cmo pa_rp.cmo pr_dump.cmo
camlp5r.opt
Same as previous, but in native code instead of bytecode, therefore faster. But not extensible: it is not possible to add other parsing or printing kits neither in command arguments nor with the “load” directive inside sources. Suitable for compiling sources not using other syntax extensions.
camlp5o
Shortcut for “camlp5 pa_o.cmo pa_op.cmo pr_dump.cmo
camlp5o.opt
Same as previous, and like “camlp5r.opt”, faster and not extensible. Moreover, this has been produced by compilation of Camlp5 grammars, resulting in a still faster executable.
camlp5sch
Shortcut for “camlp5 pa_scheme.cmo pr_dump.cmo
mkcamlp5
creates camlp5 executables with almost the same options than ocamlmktop. The interfaces to be visible must be explicitly added in the command line as “.cmi” files. mkcamlp5 is a wrapper around ocamlfind, and typically arguments to ocamlfind are passed-thru. So for instance, to add the OCaml module “str”: “mkcamlp5 -package camlp5,str str.cmi -o camlp5str
mkcamlp5.opt
creates camlp5 executables like mkcamlp5, except that it is in native code, therefore faster, but not extensible; the added kits must be ocamlfind packages (or cmx or cmxa files)

Environment variable

When running a program using extensible grammars (in particular, the camlp5 commands), the environment variable “CAMLP5PARAM” is consulted. It sets the grammar parsing algoritm parameters.

This variable must be a sequence of parameter specifications. A parameter specification is a letter optionally followed by an = and a value, with any separator. There are four possible parameters:

b
Set the full backtrack algorithm as default.
f
Set the limited backtrack algorithm as default.
t
Trace symbols (terminals and non-terminals) while parsing with backtracking.
y
In backtracking, trace the advance in the input stream (number of unfrozen tokens) and the possible stalling (number of tokens tests).
l=value
Set the maximum stalling value.

OCaml toplevel files

These object files can be loaded in the OCaml toplevel to make Camlp5 parse the input. It is possible to load them either by putting them as parameters of the toplevel, or by using the directive “load”. The option “-I +camlp5” (or “-I   `camlp5 -where`”) must be added to the “ocaml” command (the OCaml toplevel).

camlp5r.cma
Read phrases and display results in revised syntax
camlp5o.cma
Read phrases and display results in normal syntax
camlp5sch.cma
Read phrases in Scheme syntax

Library files

The Camlp5 library is named “gramlib.cma” and its native code version is “gramlib.cmxa”. They contain the modules:

This is a pure library : when linking with it, the Camlp5 program is not included.

Library

Library

All modules defined in “gramlib.cma”, but not including all Camlp5 modules used by the Camlp5 commands and kits.

Ploc module

Building and combining locations. This module also contains some pervasive types and functions.

type t = 'abstract;
Location type.

located exceptions

exception Exc of location and exn;
Ploc.Exc loc e” is an encapsulation of the exception “e” with the input location “loc”. To be used to specify a location for an error. This exception must not be raised by the OCaml function “raise”, but rather by “Ploc.raise” (see below), to prevent the risk of several encapsulations of “Ploc.Exc”.
value raise : t -> exn -> 'a;
Ploc.raise loc e”, if “e” is already the exception “Ploc.Exc”, re-raise it (ignoring the new location “loc”), else raise the exception “Ploc.Exc loc e”.

making locations

value make_loc : string -> int -> int -> (int * int) -> string -> t;
Ploc.make_loc fname line_nb bol_pos (bp, ep) comm” creates a location starting at line number “line_nb”, where the position of the beginning of the line is “bol_pos” and between the positions “bp” (included) and “ep” excluded. And “comm” is the comment before the location. The positions are in number of characters since the begin of the stream.
value make_unlined : (int * int) -> t;
Ploc.make_unlined” is like “Ploc.make” except that the line number is not provided (to be used e.g. when the line number is unknown).
value dummy : t;
Ploc.dummy” is a dummy location, used in situations when location has no meaning.

getting location info

value file_name : t -> string;
Ploc.file_name loc” returns the file name of the location.
value first_pos : t -> int;
Ploc.first_pos loc” returns the initial position of the location in number of characters since the beginning of the stream.
value last_pos : t -> int;
Ploc.last_pos loc” returns the final position plus one of the location in number of characters since the beginning of the stream.
value line_nb : t -> int;
Ploc.line_nb loc” returns the line number of the location or “-1” if the location does not contain a line number (i.e. built with “Ploc.make_unlined” above).
value bol_pos : t -> int;
Ploc.bol_pos loc” returns the position of the beginning of the line of the location in number of characters since the beginning of the stream, or “0” if the location does not contain a line number (i.e. built the with “Ploc.make_unlined” above).
value comment : t -> string;
Ploc.comment loc” returns the comment before the location.

combining locations

value encl : t -> t -> t;
Ploc.encl loc1 loc2” returns the location starting at the smallest start and ending at the greatest end of the locations “loc1” and “loc2”. In other words, it is the location enclosing “loc1” and “loc2”.
value shift : int -> t -> t;
Ploc.shift sh loc” returns the location “loc” shifted with “sh” characters. The line number is not recomputed.
value sub : t -> int -> int -> t;
Ploc.sub loc sh len” is the location “loc” shifted with “sh” characters and with length “len”. The previous ending position of the location is lost.
value after : t -> int -> int -> t;
Ploc.after loc sh len” is the location just after loc (starting at the end position of “loc”) shifted with “sh” characters and of length “len”.
value with_comment : t -> string -> t;
Change the comment part of the given location

miscellaneous

value name : ref string;
Ploc.name.val” is the name of the location variable used in grammars and in the predefined quotations for OCaml syntax trees. Default: “``”loc”``”.
value get : string -> t -> (int * int * int * int * int);
Ploc.get fname loc” returns in order: 1/ the line number of the begin of the location, 2/ its column, 3/ the line number of the first character not in the location, 4/ its column and 5/ the length of the location. The parameter “fname” is the file where the location occurs.
value from_file : string -> t -> (string * int * int * int);
Ploc.from_file fname loc” reads the file “fname” up to the location “loc” and returns the real input file, the line number and the characters location in the line; the real input file can be different from “fname” because of possibility of line directives typically generated by /lib/cpp.

pervasives

type vala 'a =
  [ VaAnt of string
  | VaVal of 'a ]
;

Encloser of many abstract syntax tree notes types, in “strict” mode. This allow the system of antiquotations of abstract syntax tree quotations to work when using the quotation kit “q_ast.cmo”.

value call_with : ref 'a -> 'a -> ('b -> 'c) -> 'b -> 'c;
Ploc.call_with r v f a” sets the reference “r” to the value “v”, then calls “f a”, and resets “r” to its initial value. If “f a” raises an exception, its initial value is also reset and the exception is reraised. The result is the result of “f     a”.

Plexing module

Lexing for Camlp5 grammars.

This module defines the Camlp5 lexer type to be used in extensible grammars (see module “Grammar”). It also provides some useful functions to create lexers.

type pattern = (string * string);

Type for values used by the generated code of the EXTEND statement to represent terminals in entry rules.

  • The first string is the constructor name (must start with an uppercase character). When empty, the second string should be a keyword.
  • The second string is the constructor parameter. Empty if it has no parameter (corresponding to the ‘wildcard’ pattern).
  • The way tokens patterns are interpreted to parse tokens is done by the lexer, function “tok_match” below.
exception Error of string;
A lexing error exception to be used by lexers.

lexer type

type lexer 'te =
  { tok_func : lexer_func 'te;
    tok_using : pattern -> unit;
    tok_removing : pattern -> unit;
    tok_match : mutable pattern -> 'te -> string;
    tok_text : pattern -> string;
    tok_comm : mutable option (list Ploc.t) }

The type for lexers compatible with Camlp5 grammars. The parameter type “'te” is the type of the tokens.

  • The field “tok_func” is the main lexer function. See “lexer_func” type below.
  • The field “tok_using” is a function called by the “EXTEND” statement to warn the lexer that a rule uses this pattern (given as parameter). This allow the lexer 1/ to check that the pattern constructor is really among its possible constructors 2/ to enter the keywords in its tables.
  • The field “tok_removing” is a function possibly called by the “DELETE_RULE” statement to warn the lexer that this pattern (given as parameter) is no longer used in the grammar (the grammar system maintains a number of usages of all patterns and calls this function when this number falls to zero). If it is a keyword, this allows the lexer to remove it in its tables.
  • The field “tok_match” is a function called by the Camlp5 grammar system to ask the lexer how the input tokens should be matched against the patterns. Warning: for efficiency, this function must be written as a function taking patterns as parameters and, for each pattern value, returning a function matching a token, not as a function with two parameters.
  • The field “tok_text” is a function called by the grammar system to get the name of the tokens for the error messages, in case of syntax error, or for the displaying of the rules of an entry.
  • The field “tok_comm” is a mutable place where the lexer can put the locations of the comments, if its initial value is not “None”. If it is “None”, nothing has to be done by the lexer.
and lexer_func 'te = Stream.t char -> (Stream.t 'te *       location_function)
The type of a lexer function (field “tok_func” of the type “lexer”). The character stream is the input stream to be lexed. The result is a pair of a token stream and a location function (see below) for this tokens stream.
and location_function = int -> Ploc.t;
The type of a function giving the location of a token in the source from the token number in the stream (starting from zero).
value lexer_text : pattern -> string;
A simple “tok_text” function.
value default_match : pattern -> (string * string) ->       string;
A simple “tok_match” function, appling to the token type “(string * string)”.

lexers from parsers or ocamllex

The functions below create lexer functions either from a “char    stream” parser or for an “ocamllex” function. With the returned function “f”, it is possible to get a simple lexer (of the type “Plexing.lexer” above):

{Plexing.tok_func = f;
 Plexing.tok_using = (fun _ -> ());
 Plexing.tok_removing = (fun _ -> ());
 Plexing.tok_match = Plexing.default_match;
 Plexing.tok_text = Plexing.lexer_text}

Note that a better “tok_using” function would check the used tokens and raise “Plexing.Error” for incorrect ones. The other functions “tok_removing”, “tok_match” and “tok_text” may have other implementations as well.

value lexer_func_of_parser :
  ((Stream.t char * ref int * ref int) -> ('te * Ploc.t)) -> lexer_func 'te;

A lexer function from a lexer written as a char stream parser returning the next token and its location. The two references with the char stream contain the current line number and the position of the beginning of the current line.

value lexer_func_of_ocamllex : (Lexing.lexbuf -> 'te) -> lexer_func 'te;

A lexer function from a lexer created by “ocamllex”. .. rubric:: function to build a stream and a location function

name:function-to-build-a-stream-and-a-location-function
value make_stream_and_location :
  (unit -> ('te * Ploc.t)) -> (Stream.t 'te * location_function);

useful functions and values

value eval_char : string -> char; value eval_string : Ploc.t -> string -> string;

Convert a char or a string token, where the backslashes are not been interpreted into a real char or string; raise “Failure” if a bad backslash sequence is found; “Plexing.eval_char (Char.escaped c)” returns “c” and “Plexing.eval_string (String.escaped s)” returns s.

value restore_lexing_info : ref (option (int * int)); value line_nb : ref (ref int); value bol_pos : ref (ref int);

Special variables used to reinitialize line numbers and position of beginning of line with their correct current values when a parser is called several times with the same character stream. Necessary for directives (e.g. #load or #use) which interrupt the parsing. Without usage of these variables, locations after the directives can be wrong.

backward compatibilities

Deprecated since version 4.08.

type location = Ploc.t; value make_loc : (int * int) -> location; value dummy_loc : location; .. rubric:: Plexer module

name:plexer-module

This module contains a lexer used for OCaml syntax (revised and normal).

lexer

value gmake : unit -> Plexing.lexer (string * string);
gmake ()” returns a lexer compatible with the extensible grammars. The returned tokens follow the normal syntax and the revised syntax lexing rules.

The token type is “(string * string)” just like the pattern type.

The meaning of the tokens are:

  • ("", s) is the keyword s,
  • ("LIDENT", s) is the ident s starting with a lowercase letter,
  • ("UIDENT", s) is the ident s starting with an uppercase letter,
  • ("INT", s) is an integer constant whose string source is s,
  • ("INT_l", s) is an 32 bits integer constant whose string source is s,
  • ("INT_L", s) is an 64 bits integer constant whose string source is s,
  • ("INT_n", s) is an native integer constant whose string source is s,
  • ("FLOAT", s) is a float constant whose string source is s,
  • ("STRING", s) is the string constant s,
  • ("CHAR", s) is the character constant s,
  • ("TILDEIDENT", s) is the tilde character “~” followed by the ident s,
  • ("TILDEIDENTCOLON", s) is the tilde character “~” followed by the ident s and a colon “:”,
  • ("QUESTIONIDENT", s) is the question mark “?” followed by the ident s,
  • ("QUESTIONIDENTCOLON", s) is the question mark “?” followed by the ident s and a colon “:”,
  • ("QUOTATION", "t:s") is a quotation “t” holding the string s,
  • ("ANTIQUOT", "t:s") is an antiquotation “t” holding the string s,
  • ("EOI", "") is the end of input.

The associated token patterns in the EXTEND statement hold the same names as the first string (constructor name) of the tokens expressions above.

Warning: the string associated with the “STRING” constructor is the string found in the source without any interpretation. In particular, the backslashes are not interpreted. For example, if the input is "\n" the string is *not* a string with one element containing the “newline” character, but a string of two elements: the backslash and the "n" letter.

Same thing for the string associated with the “CHAR” constructor.

The functions “Plexing.eval_string” and “Plexing.eval_char” allow to convert them into the real corresponding string or char value.

flags

value dollar_for_antiquotation : ref bool;
When True (default), the next call to “Plexer.gmake ()” returns a lexer where the dollar sign is used for antiquotations. If False, there is no antiquotations and the dollar sign can be used as normal token.
value specific_space_dot : ref bool;
When “False” (default), the next call to “Plexer.gmake ()” returns a lexer where there is no difference between dots which have spaces before and dots which don’t have spaces before. If “True”, dots which have spaces before return the keyword " ." (space dot) and the ones which don’t have spaces before return the keyword "." (dot alone).
value no_quotations : ref bool;
When “True”, all lexers built by “Plexer.gmake       ()” do not lex the quotation syntax. Default is “False” (quotations are lexed).
value utf8_lexing : ref bool;
When “True”, all lexers built by “Plexer.gmake        ()]” use utf-8 encoding to specify letters and punctuation marks. Default is False (all characters between ‘128’ and ‘255’ are considered as letters).

Gramext module

This module is not intended to be used by the casual programmer.

It shows, in clear, the implementations of grammars and entries types, the normal access being through the “Grammar” module where these types are abstract. It can be useful for programmers interested in scanning the contents of grammars and entries, for example to make analyses on them.

grammar type

type grammar 'te =
  { gtokens : Hashtbl.t Plexing.pattern (ref int);
    glexer : mutable Plexing.lexer 'te }
;

The visible type of grammars, i.e. the implementation of the abstract type “Grammar.g”. It is also the implementation of an internal grammar type used in the Grammar functorial interface. The type parameter “'te” is the type of the tokens, which is “(string * string)” for grammars built with “Grammar.gcreate”, and any type for grammars built with the functorial interface. The field “gtokens” records the count of usages of each token pattern, allowing to call the lexer function “tok_removing” (see the Plexing module) when this count reaches zero. The field “lexer” is the lexer. .. rubric:: entry type

name:entry-type
type g_entry 'te =
  { egram : grammar 'te;
    ename : string;
    elocal : bool;
    estart : mutable int -> Stream.t 'te -> Obj.t;
    econtinue : mutable int -> int -> Obj.t -> Stream.t 'te -> Obj.t;
    edesc : mutable g_desc 'te }

The visible type for grammar entries, i.e. the implementation of the abstract type “Grammar.Entry.e” and the type of entries in the Grammar functorial interface. Notice that these entry types have a type parameter which does not appear in the “g_entry” type (the “'te” parameter is, as for grammars above, the type of the tokens). This is due to the specific typing system of the EXTEND statement which sometimes must hide real types, the OCaml normal type system not being able to type Camlp5 grammars. Meaning of the fields:

  • egram : the associated grammar
  • ename : the entry name
  • elocal : True if the entry is local (local entries are written with a star character “*” by Grammar.Entry.print)
  • estart and econtinue are parsers of the entry used in the grammar machinery
  • edesc : the entry description (see below)
and g_desc 'te =
  [ Dlevels of list (g_level 'te)
  | Dparser of Stream.t 'te -> Obj.t ]

The entry description.

  • The constructor “Dlevels” is for entries built by “Grammar.Entry.create” and extendable by the EXTEND statement.
  • The constructor “Dparser” is for entries built by “Grammar.Entry.of_parser”.
and g_level 'te =
  { assoc : g_assoc;
    lname : option string;
    lsuffix : g_tree 'te;
    lprefix : g_tree 'te }
and g_assoc = [ NonA | RightA | LeftA ]

Description of an entry level.

  • assoc : the level associativity
  • lname : the level name, if any
  • lsuffix : the tree composed of the rules starting with “SELF
  • lprefix : the tree composed of the rules not starting with “SELF
and g_symbol 'te =
  [ Smeta of string and list (g_symbol 'te) and Obj.t
  | Snterm of g_entry 'te
  | Snterml of g_entry 'te and string
  | Slist0 of g_symbol 'te
  | Slist0sep of g_symbol 'te and g_symbol 'te
  | Slist1 of g_symbol 'te
  | Slist1sep of g_symbol 'te and g_symbol 'te
  | Sopt of g_symbol 'te
  | Sflag of g_symbol 'te
  | Sself
  | Snext
  | Stoken of Plexing.pattern
  | Stree of g_tree 'te ]

Description of a rule symbol.

  • The constructor “Smeta” is used by the extensions FOLD0 and FOLD1
  • The constructor “Snterm” is the representation of a non-terminal (a call to another entry)
  • The constructor “Snterml” is the representation of a non-terminal at some given level
  • The constructor “Slist0” is the representation of the symbol LIST0
  • The constructor “Slist0sep” is the representation of the symbol LIST0 followed by SEP
  • The constructor “Slist1” is the representation of the symbol LIST1
  • The constructor “Slist1sep” is the representation of the symbol LIST1 followed by SEP
  • The constructor “Sopt” is the representation of the symbol OPT
  • The constructor “Sflag” is the representation of the symbol FLAG
  • The constructor “Sself” is the representation of the symbol SELF
  • The constructor “Snext” is the representation of the symbol NEXT
  • The constructor “Stoken” is the representation of a token pattern
  • The constructor “Stree” is the representation of a anonymous rule list (between brackets).
and g_action = Obj.t

The semantic action, represented by a type “Obj.t” due to the specific typing of the EXTEND statement (the semantic action being able to be any function type, depending on the rule).

and g_tree 'te =
  [ Node of g_node 'te
  | LocAct of g_action and list g_action
  | DeadEnd ]
and g_node 'te =
  { node : g_symbol 'te; son : g_tree 'te; brother : g_tree 'te }
;

The types of tree and tree nodes, representing a list of factorized rules in an entry level.

  • The constructor “Node” is a representation of a symbol (field “node”), the rest of the rule tree (field “son”), and the following node, if this node fails (field “brother”)
  • The constructor “LocAct” is the representation of an action, which is a function having all pattern variables of the rule as parameters and returning the rule semantic action. The list of actions in the constructor correspond to possible previous actions when it happens that rules are masked by other rules.
  • The constructor “DeadEnd” is a representation of a nodes where the tree fails or is in syntax error.
type position =
  [ First
  | Last
  | Before of string
  | After of string
  | Level of string ]
;

The type of position where an entry extension takes place.

  • First : corresponds to FIRST
  • Last : corresponds to LAST
  • Before s : corresponds to BEFORE “s”
  • After s : corresponds to AFTER “s”
  • Level s : corresponds to LEVEL “s”

The module contains other definitions but for internal use.

Grammar module

Extensible grammars.

This module implements the Camlp5 extensible grammars system. Grammars entries can be extended using the EXTEND statement, added by loading the Camlp5 “pa_extend.cmo” file.

main types and values

type g = 'abstract;
The type of grammars, holding entries.
value gcreate : Plexing.lexer (string * string) -> g;
Create a new grammar, without keywords, using the given lexer.
value tokens : g -> string -> list (string * int);

Given a grammar and a token pattern constructor, returns the list of the corresponding values currently used in all entries of this grammar. The integer is the number of times this pattern value is used. Examples:

  • The call: Grammar.tokens g "" returns the keywords list.
  • The call: Grammar.tokens g "IDENT" returns the list of all usages of the pattern “IDENT” in the EXTEND statements.
value glexer : g -> Plexing.lexer token;
Return the lexer used by the grammar

type parsable = 'abstract; value parsable : g -> Stream.t char -> parsable;

Type and value allowing to keep the same token stream between several calls of entries of the same grammar, to prevent loss of tokens. To be used with Entry.parse_parsable below
module Entry =
  sig
    type e 'a = 'x;
    value create : g -> string -> e 'a;
    value parse : e 'a -> Stream.t char -> 'a;
    value parse_all : e 'a -> Stream.t char -> list 'a;
    value parse_token_stream : e 'a -> Stream.t token -> 'a;
    value parse_parsable : e 'a -> parsable -> 'a;
    value name : e 'a -> string;
    value of_parser : g -> string -> (Stream.t token -> 'a) -> e 'a;
    value print : e 'a -> unit;
    value find : e 'a -> string -> e Obj.t;
    external obj : e 'a -> Gramext.g_entry token = "%identity";
  end;

Module to handle entries.

  • Grammar.Entry.e : type for entries returning values of type “'a”.
  • Grammar.Entry.create g n : creates a new entry named “n” in the grammar “g”.
  • Grammar.Entry.parse e : returns the stream parser of the entry “e”.
  • Grammar.Entry.parse_all e : returns the stream parser returning all possible values while parsing with the entry “e”: may return more than one value when the parsing algorithm is “Grammar.Backtracking”.
  • Grammar.Entry.parse_token_stream e : returns the token stream parser of the entry “e”.
  • Grammar.Entry.parse_parsable e : returns the parsable parser of the entry “e”.
  • Grammar.Entry.name e : returns the name of the entry “e”.
  • Grammar.Entry.of_parser g n p : makes an entry from a token stream parser.
  • Grammar.Entry.print e : displays the entry “e” using “Format”.
  • Grammar.Entry.find e s : finds the entry named s in the rules of “e”.
  • Grammar.Entry.obj e : converts an entry into a “Gramext.g_entry” allowing to see what it holds.
value of_entry : Entry.e 'a -> g;
Return the grammar associated with an entry.

printing grammar entries

The function “Grammar.Entry.print” displays the current contents of an entry. Interesting for debugging, to look at the result of a syntax extension, to see the names of the levels.

Try, for example, in the OCaml toplevel:

Grammar.Entry.print Format.std_formatter Pcaml.expr

The display does not include the patterns nor the semantic actions, whose sources are not recorded in the grammar entries data.

Moreover, the local entries (not specified in the GLOBAL indicator of the EXTEND statement) are indicated with a star (”*”) to inform that they are not directly accessible.

clearing grammars and entries

module Unsafe :
  sig
    value gram_reinit : g -> Plexing.lexer token -> unit;
    value clear_entry : Entry.e 'a -> unit;
  end;

Module for clearing grammars and entries. To be manipulated with care, because: 1) reinitializing a grammar destroys all tokens and there may be problems with the associated lexer if there are keywords; 2) clearing an entry does not destroy the tokens used only by itself.

  • Grammar.Unsafe.reinit_gram g lex removes the tokens of the grammar and sets “lex” as a new lexer for “g”. Warning: the lexer itself is not reinitialized.
  • Grammar.Unsafe.clear_entry e removes all rules of the entry “e”.

scan entries

value print_entry : Format.formatter -> Gramext.g_entry 'te -> unit;
General printer for all kinds of entries (obj entries).
value iter_entry :
  (Gramext.g_entry 'te -> unit) -> Gramext.g_entry 'te -> unit;

Grammar.iter_entry f e” applies “f” to the entry “e” and transitively all entries called by “e”. The order in which the entries are passed to “f” is the order they appear in each entry. Each entry is passed only once.

value fold_entry : (Gramext.g_entry 'te -> 'a -> 'a) -> Gramext.g_entry 'te -> 'a -> 'a;
Grammar.fold_entry f e init” computes “(f eN .. (f     e2 (f e1 init)))”, where “e1 .. eN” are “e” and transitively all entries called by “e”. The order in which the entries are passed to “f” is the order they appear in each entry. Each entry is passed only once.

parsing algorithm

type parse_algorithm = Gramext.parse_algorithm ==
  [ Imperative | Backtracking | DefaultAlgorithm ]
;

Type of algorithm used in grammar entries.

  • Imperative: use imperative streams
  • Functional: use functional streams with limited backtracking
  • Backtracking: use functional streams with full backtracking
  • DefaultAlgorithm: use the general default algorithm set by the function “set_default_algorithm” below, or through the environment variable CAMLP5PARAM.

The default, when a grammar is created, is DefaultAlgorithm.

value set_algorithm : g -> parse_algorithm -> unit;
Set the parsing algorithm for all entries of a given grammar.
value set_default_algorithm : parse_algorithm -> unit;
Set the default parsing algorithm for all grammars. If the environment variable CAMLP5PARAM contains “b”, the default is Backtracking; if it contains ‘f’, the default is Functional; if it contains ‘p’, the default is Predictive.
value default_algorithm : unit -> parse_algorithm;
Return the current default algorithm.
value backtrack_stalling_limit : ref int;
Limitation of backtracking to prevent stalling in case of syntax error. In backtracking algorithm, when there is a syntax error, the parsing continues trying to find another solution. It some grammars, it can be very long before checking all possibilities. This number limits the number of tokens tests after a backtrack. (The number of tokens tests is reset to zero when the token stream overtakes the last reached token.) The default is 10000. If set to 0, there is no limit. Can be set by the environment variable CAMLP5PARAM by “l=value”.
value backtrack_parse : ref bool;
Deprecated since 2017-06-06; rather use “set_default_algorithm Backtracking”.

functorial interface

Alternative for grammar use. Grammars are not Ocaml values: there is no type for them. Modules generated preserve the rule “an entry cannot call an entry of another grammar” by normal OCaml typing.

module type GLexerType =
  sig
    type te = 'x;
    value lexer : Plexing.lexer te;
  end;

The input signature for the functor “Grammar.GMake”: “te” is the type of the tokens.

module type S =
  sig
    type te = 'x;
    type parsable = 'x;
    value parsable : Stream.t char -> parsable;
    value tokens : string -> list (string * int);
    value glexer : Plexing.lexer te;
    value set_algorithm : parse_algorithm -> unit;
    module Entry :
      sig
        type e 'a = 'y;
        value create : string -> e 'a;
        value parse : e 'a -> parsable -> 'a;
        value parse_token_stream : e 'a -> Stream.t te -> 'a;
        value name : e 'a -> string;
        value of_parser : string -> (Stream.t te -> 'a) -> e 'a;
        value print : e 'a -> unit;
        external obj : e 'a -> Gramext.g_entry te = "%identity";
      end;
    module Unsafe :
      sig
        value gram_reinit : Plexing.lexer te -> unit;
        value clear_entry : Entry.e 'a -> unit;
      end;
  end;

Signature type of the functor “Grammar.GMake”. The types and functions are almost the same than in generic interface, but:

  • Grammars are not values. Functions holding a grammar as parameter do not have this parameter yet.
  • The type “parsable” is used in function “parse” instead of the char stream, avoiding the possible loss of tokens.
  • The type of tokens (expressions and patterns) can be any type (instead of (string * string)); the module parameter must specify a way to show them as (string * string).

module GMake (L : GLexerType) : S with type te = L.te;

grammar flags

value skip_item : 'a -> 'a;
Grammar.skip_item x can be called in a semantic action of a grammar rule to ask the grammar to skip that item if it is called in a list (LIST0 or LIST1). The function returns the item itself (for typing reasons) but its value is ignored. This function is used to allow IFDEF and IFNDEF for cases of constructor declarations and pattern matchings.
value error_verbose : ref bool;
Flag for displaying more information in case of parsing error; default = “False”.
value warning_verbose : ref bool;
Flag for displaying warnings while extension; default = “True”.
value strict_parsing : ref bool;
Flag to apply strict parsing, without trying to recover errors; default = “False”.

Diff module

Differences between two arrays. Used in Camlp5 sources, but can be used for other applications, independantly from Camlp5 stuff.

value f : array 'a -> array 'a -> (array bool * array bool);

Diff.f a1 a2 returns a pair of boolean arrays (d1, d2).

  • d1 has the same size as a1.
  • d2 has the same size as a2.
  • d1.(i) is True if a1.(i) has no corresponding value in a2.
  • d2.(i) is True if a2.(i) has no corresponding value in a1.
  • d1 and d2 have the same number of values equal to False.

Can be used, e.g., to write the diff program (comparison of two files), the input arrays being the array of lines of each file.

Can be used also to compare two strings (they must have been exploded into arrays of chars), or two DNA strings, and so on.

Extfold module

Module internally used to make the symbols FOLD0 and FOLD1 work in the EXTEND statement + extension “pa_extfold.cmo”.

Extfun module

Extensible functions.

This module implements pattern matching extensible functions which work with the parsing kit “pa_extfun.cmo”, the syntax of an extensible function being:

extfun e with [ pattern_matching ]

See chapter : Extensible functions.

type t 'a 'b = 'x;
The type of the extensible functions of type 'a ->       'b.
value empty : t 'a 'b;
Empty extensible function.
value apply : t 'a 'b -> 'a -> 'b;
Apply an extensible function.
exception Failure;
Match failure while applying an extensible function.
value print : t 'a 'b -> unit;
Print patterns in the order they are recorded in the data structure.

Eprinter module

This module allows creation of printers, apply them and clear them. It is also internally used by the “EXTEND_PRINTER” statement.

type t 'a = 'abstract;
Printer type, to print values of type “'a”.
type pr_context = Pprintf.pr_context;
Printing context.
value make : string -> t 'a;
Builds a printer. The string parameter is used in error messages. The printer is created empty and can be extended with the “EXTEND_PRINTER” statement.
value apply : t 'a -> pr_context -> 'a -> string;
Applies a printer, returning the printed string of the parameter.
value apply_level : t 'a -> string -> pr_context -> 'a ->       string;
Applies a printer at some specific level. Raises “Failure” if the given level does not exist.
value clear : t 'a -> unit;
Clears a printer, removing all its levels and rules.
value print : t 'a -> unit;
Print printer patterns, in the order they are recorded, for debugging purposes.

Some other types and functions exist, for internal use.

Fstream module

This module implement functional streams and parsers together with backtracking parsers.

To be used with syntax “pa_fstream.cmo”. The syntax is:

  • stream: “fstream [: ... :]
  • functional parser: “fparser [ [: ... :] -> ... | ... ]
  • backtracking parser: “bparser [ [: ... :] -> ... | ... ]

Functional parsers are of type:

Fstream.t 'a -> option ('b * Fstream.t 'a)

Backtracking parsers are of type:

Fstream.t 'a -> option ('b * Fstream.t 'a * Fstream.kont 'a 'b)

Functional parsers use limited backtrack, i.e if a rule fails, the next rule is tested with the initial stream; limited because in the case of a rule with two consecutive symbols “a” and “b”, if “b” fails, the rule fails: there is no try with the next rule of “a”.

Backtracking parsers have full backtrack. If a rule fails, the next case of the previous rule is tested.

Functional streams

type t 'a = 'x;
The type of ‘a functional streams.
value from : (int -> option 'a) -> t 'a;
Fstream.from f” returns a stream built from the function “f”. To create a new stream element, the function “f” is called with the current stream count. The user function “f” must return either “Some <value>” for a value or “None” to specify the end of the stream.
value of_list : list 'a -> t 'a;
Return the stream holding the elements of the list in the same order.
value of_string : string -> t char;
Return the stream of the characters of the string parameter.
value of_channel : in_channel -> t char;
Return the stream of the characters read from the input channel.
value iter : ('a -> unit) -> t 'a -> unit;
Fstream.iter f s” scans the whole stream s, applying function “f” in turn to each stream element encountered.
value next : t 'a -> option ('a * t 'a);
Return “Some (a, s)” where “a” is the first element of the stream and s the remaining stream, or “None” if the stream is empty.
value empty : t 'a -> option (unit * t 'a);
Return “Some ((), s)” if the stream is empty where s is itself, else “None”.
value count : t 'a -> int;
Return the current count of the stream elements, i.e. the number of the stream elements discarded.
value count_unfrozen : t 'a -> int;
Return the number of unfrozen elements in the beginning of the stream; useful to determine the position of a parsing error (longest path).

Backtracking parsers

type kont 'a 'b = [ K of unit -> option ('b * t 'a * kont 'a 'b) ];
The type of continuation of a backtracking parser.
type bp 'a 'b = t 'a -> option ('b * t 'a * kont 'a 'b);
The type of a backtracking parser.
value bcontinue : kont 'a 'b -> option ('b * t 'a * kont 'a 'b);
bcontinue k” return the next solution of a backtracking parser.
value bparse_all : bp 'a 'b -> t 'a -> list 'b;
bparse_all p strm” return the list of all solutions of a backtracking parser applied to a functional stream.

Pprintf module

Definitions for pprintf statement.

This module contains types and functions for the “pprintf” statement used by the syntax extension “pa_pprintf.cmo”.

type pr_context = { ind : int; bef : string; aft : string;       dang : string };

Printing context.

  • ind” : the current indendation
  • bef” : what should be printed before, in the same line
  • aft” : what should be printed after, in the same line
  • dang” : the dangling token to know whether parentheses are necessary
value empty_pc : pr_context;
Empty printer context, equal to {ind = 0; bef = ""; aft =       ""; dang = ""}
value sprint_break :
  int -> int -> pr_context -> (pr_context -> string) ->
    (pr_context -> string) -> string;

sprint_break nspaces offset pc f g” concat the two strings returned by “f” and “g”, either in one line, if it holds without overflowing (see module “Pretty”), with “nspaces” spaces betwen them, or in two lines with “offset” spaces added in the indentation for the second line. This function don’t need to be called directly. It is generated by the “pprintf” statement according to its parameters when the format contains breaks, like “@;” and “````”.

value sprint_break_all :
  bool -> pr_context -> (pr_context -> string) ->
    list (int * int * pr_context -> string) -> string;

sprint_break_all force_newlines pc f fl” concat all strings returned by the list with separators “f-fl”, the separators being the number of spaces and the offset like in the function “sprint_break”. The function works as “all or nothing”, i.e. if the resulting string does not hold on the line, all strings are printed in different lines (even if sub-parts could hold in single lines). If the parameter “force_newline” is “True”, all strings are printed in different lines, no horizontal printing is tested. This function don’t need to be called directly. It is generated by the “pprintf” statement according to its parameters when the format contains parenthesized parts with “break all” like “@[<a>” and “@]”, or “@[<b>” and “@]”. .. rubric:: Pretty module

name:pretty-module

Pretty printing on strings. Basic functions.

value horiz_vertic : (unit -> 'a) -> (unit -> 'a) -> 'a;
horiz_vertic h v” first calls “h” to print the data horizontally, i.e. without newlines. If the displaying contains newlines or if its size exceeds the maximum line length (see variable “line_length” below), then the function “h” stops and the function “v” is called which can print using several lines.
value sprintf : format 'a unit string -> 'a;
sprintf fmt ...” formats some string like “Printf.sprintf” does, except that, if it is called in the context of the *first* function of “horiz_vertic” above, it checks whether the resulting string has chances to fit in the line. If not, i.e. if it contains newlines or if its length is greater than “max_line_length.val”, the function gives up (raising some internal exception). Otherwise the built string is returned. “sprintf” behaves like “Printf.sprintf” if it is called in the context of the *second* function of “horiz_vertic” or without context at all.
value line_length : ref int;
line_length” is the maximum length (in characters) of the line. Default = 78. Can be set to any other value before printing.
value horizontally : unit -> bool;
horizontally ()” returns the fact that the context is an horizontal print.

Deprecated modules Stdpp and Token

The modules “Stdpp” and “Token” have been deprecated since version 5.00. The module “Stdpp” was renamed “Ploc” and most of its variables and types were also renamed. The module “Token” was renamed “Plexing

Backward compatibility is assured. See the files “stdpp.mli” and “token.mli” in the Camlp5 distribution to convert from old to new names, if any. After several versions or years, the modules “Stdpp” and “Token” will disappear from Camlp5.

Camlp5 sources

Camlp5 sources

Information for developpers of the Camlp5 program.

Kernel

The sources are composed of:

  • the OCaml stuff, copied from the OCaml compiler
  • the kernel composed of the directories:
    • odyl : the dynamic loading system
    • lib : the library
    • main : the main program camlp5
    • meta : the parsers for revised syntax, ast quotations, EXTEND statement, etc/
  • the rest: directories etc, compile, ocpp

Some other directories contain configuration files, tools, documentation and manual pages.

The kernel is sufficient to make the core system work: it is possible to compile and bootstrap only it. All sources being in revised syntax, the first compilation of Camlp5 is done by a version of this kernel in pure OCaml syntax, located in the directory ocaml_src.

These sources in pure OCaml syntax are not modified by hand. When changes are made to the kernel, and a check is done that it correctly compiles and bootstraps, the kernel in pure OCaml syntax is rebuilt using Camlp5 pretty print. This is done by the command “make bootstrap_sources”.

Compatibility

This distribution of Camlp5 is compatible with several versions of OCaml. The definition of OCaml syntax trees may change from OCaml version to version, which can be a problem. Since OCaml does not install the sources nor the compiled versions of its syntax tree, a copy of the necessary source files, borrowed from the source of the OCaml compiler is in the directory ‘ocaml_stuff’, in subdirectories with the OCaml version number.

If the present distribution of Camlp5 is not compatible with the version of OCaml you have (the command ‘configure’ tells you), it is possible to add it. For that, you need the sources to your specific OCaml distribution. If you have them then a ‘configure’ telling you that camlp5 is not compatible, do:

make steal OCAML_SRC=<path-to-OCaml-sources>

This creates a new directory in ‘ocaml_stuff’ with sources of the syntax tree of your OCaml compiler.

If you want to check that the sources of the syntax tree of OCaml are up-to-date (e.g. if this is the current OCaml developpement), do:

make compare_stolen OCAML_SRC=<path-to-OCaml-sources>

The compatibility is also done with the file ‘lib/versdep.ml’, which is a module containing miscellaneous features depending to the version of OCaml.

In the directory ‘ocaml_src’ which contains the pure OCaml sources of the Camlp5 core (see chapter TREE STRUCTURE below), there are as many versions of this files as versions of OCaml. They are named ‘version.ml’ in the directory ‘lib/versdep’. If you are adding a new version of OCaml, you need this file. As a first step, make a copy from a close version:

cd ocaml_src/lib/versdep
cp <close_version>.ml <version>.ml

Then, you can rerun “configure” and do “make core”. If the file ‘ocaml_src/lib/versdep.ml’ has compilation problems, fix them ‘make core’ again. When it compiles, copy it into the subdirectory ‘versdep’ as ‘<version>.ml’, overwriting the version you copied from the close version.

Later, the same file ‘lib/versdep.ml’ in Camlp5 syntax may have similar compilation problems. There is only a single version of this file, thanks to IFDEF constructs used here or there.

While compiling with some specific version of OCaml, this file is compiled with ‘OCAML_vers’ defined where ‘vers’ is the version number form the beginning to the first space or charcter ‘+’ with all dots converted into underscores. For example, if your OCaml version is 7.04.2+dev35, you can see in the compilation process of versdep.ml that OCAML_7_04_2 is defined, and you can add statements defined by the syntax extension ‘pa_macro.cmo’, for example IFDEF OCAML_7_04_2. Add statements like that in ‘lib/versdep.ml’ to make it compile successfully.

Tree structure

The directory ‘ocaml_src’ contains images in pure OCaml syntax of the directories odyl lib main and meta. This allows the creation of a core version of Camlp5 with only the OCaml compiler installed.

You can decompose the building of the Camlp5 core into:

  1. make library_cold just makes the directory ‘ocaml_src/lib’ and copy the cmo and cmi files into the directory ‘boot’
  2. make compile_cold makes the other directories of ocaml_src
  3. make promote_cold copies the executables “camlp5”, “camlp5r” and the syntax extensions (cmo files) into the directory ‘boot’

From this point, the core Camlp5 is in directory ‘boot’. The real sources in the top directories odyl, lib, main and meta, which are written in revised syntax with some syntax extensions (grammars, quotations) can be compiled. To achieve their compilation, you can do:

make core

Or to compile everything do:

make all

or just:

make

Notice that doing “make core” or “make all” from scratch (after a make clean), automatically starts by making the core files from their pure OCaml versions.

Fast compilation from scratch

./configure
make clean core compare
make coreboot
make all opt opt.opt

Testing changes

  1. do your changes
  2. do:
make core compare

if it says that the bootstrap is ok, you can do:

make all
make opt
make opt.opt

otherwise, to make sure everything is ok, first do:

make coreboot

sometimes two bootstraps (‘make coreboot’ twice) are necessary, in particular if you change things in the directory ‘lib’. It is even possible that three bootstraps are necessary.

If things go wrong, it is possible to return to the previous version by typing:

make restore clean_hot

then you can change what is necessary and continue by typing:

make core

and test the bootstrap again:

make coreboot

After several bootstraps (by ‘make coreboot’ or ‘make bootstrap’), many versions are pushed in the directory ‘boot’ (you can type ‘find boot -type d -print’ to see that). If your system correctly bootstraps, you can clean that by typing:

make cleanboot

which keeps only two versions. (The command ‘make clean’ also removes these stack of versions.)

Before committing your changes

Make sure that the cold start with pure OCaml sources work. For that, do:

make compare_sources | less

This shows you the changes that would be done in the OCaml pure sources of the directory ocaml_src.

To make the new versions, do:

make new_sources
make promote_sources

Notice that these pure OCaml sources are not supposed to be modified by hand, but only created by the above commands. Although their source is pretty printed they are usually not easy to read, particularly for expanded grammars (of the statement ‘EXTEND’).

If these sources do not compile, due to changes in the OCaml compiler, it is possible however to edit them. In this case, similar changes may need to be performed in the normal sources in revised syntax.

After doing ‘make new_sources’ above, and before doing ‘make promote_sources’ below, it is possible to do ‘make untouch_sources’ which changes the dates of the newly created files with the dates of the old files if they are not modified. This way, the “svn commit” will not need to compare these files, which may be important if your network is not fast.

The ‘make new_sources’ builds a directory named ‘ocaml_src.new’. If this directory still exists, due to a previous ‘make new_sources’, the command fails. In this case, just delete it (rm -rf ocaml_src.new) without problem: this directory is not part of the distribution, it is just temporary.

The ‘make clean_sources’ deletes old versions of ocaml_src, keeping only the last and the before last ones.

The command:

make bootstrap_sources

is a shortcut for:

make new_sources
make untouch_sources
make promote_sources
make clean_sources

If there are changes in the specific file ‘lib/versdep.ml’, do also:

make compare_all_versdep

and possibly:

make bootstrap_all_versdep

because this file, in ‘ocaml_src/lib/versdep’ directory has different versions according to the OCaml version.

After having rebuilt the pure OCaml sources, check that they work by rebuilding everything from scratch, starting with “configure”.

If you change the main parser

If you change the main parser ‘meta/pa_r.ml’, you should check that the quotations expanders of syntax tree ‘meta/q_MLast.ml’ match the new version. For that, do:

cd meta
make compare_q_MLast

If no differences are displayed, it means that ‘q_MLast.ml’ is ok, relatively to ‘pa_r.ml’.

Otherwise, if the displayed differences seem reasonable, update the version by typing:

make bootstrap_q_MLast

Then returning to the top directory, do ‘make core compare’ and possibly ‘make coreboot’ (one of several times) to check the correctness of the file.

And don’t forget, if you want to commit, to re-create the pure OCaml sources like indicated above.

Adding new nodes to the syntax tree

If new nodes are necessary in the syntax tree, for example because the OCaml language added itself new nodes, the steps are the following (with the example of adding the “lazy” pattern node).

  • Add the node in the file ‘main/mLast.mli’. Please respect the design of the nodes by looking at the other nodes. Example:

    PaLaz of loc and patt
    
  • Try to compile (do ‘make’ in the main directory). You are going to have some errors in files telling you that nodes are missing in some pattern matchings. Add them, according to the new nodes of OCaml or looking at other nodes.

  • Once the compilation is done, try a ‘make bootstrap’ to be sure everything is OK.

  • When it is, add the possible concrete syntax in the revised syntax, i.e. in ‘meta/pa_r.ml’. Since the quotation is not yet implemented, put it in syntax without quotation. Example:

    "lazy"; p = SELF -> MLast.PaLaz loc p
    
  • Do ‘make bootstrap’ again.

  • Go to the directory ‘meta’ and type:

    make compare_q_MLast
    

    This command try to compare what should be the AST quotation if it perfectly matched the syntax. If this comparison seems reasonable, change the file ‘q_MLast.ml’ by typing:

    make bootstrap_q_MLast
    
  • Do a ‘make bootstrap’ again to check everything is OK. If it is change the line of ‘meta/pa_r.ml’. In the example, from:

    "lazy"; p = SELF -> MLast.PaLaz loc p
    

    to:

    "lazy"; p = SELF -> <:patt< lazy $p$ >>
    
  • The new syntax should work now in revised syntax. You can complete the compilation, do a ‘make install’ and check with the toplevel that it works. Complete with the rest like said above.

  • You can then complete the other syntaxes (the ‘normal’ syntax, for example in ‘etc/pa_o.ml’) and the pretty printers.

Switching between transitional and strict mode

If Camlp5 is compiled in some mode, it is possible to change its mode in two boostrapping steps. Type:

make MODE=T coreboot

to switch to transitional mode, or:

make MODE=S coreboot

to switch to strict mode.

After two (necessary) bootstraps, the kernel is compiled in the new mode. Complete the compilation by:

make MODE=T all opt opt.opt

or:

make MODE=S all opt opt.opt

according to the new mode you want to use.

Another solution is, of course, recompile everything from scratch:

make clean
./configure -transitional
make world.opt

or:

make clean
./configure -strict
make world.opt

Bootstrapping

Camlp5 is bootstrapped in numerous ways.

Camlp5 executable bootstrapping

The file ‘main/camlp5r’ is rebuilt each time a bootstrapping command is used (like ‘make coreboot’ or ‘make bootstrap’). This bootstrapping command starts with copying it in the directory ‘boot’. The file ‘boot/camlp5r’ is used to recompile the sources, creating another file ‘main/camlp5r’. When both files are the same (byte by byte), the Camlp5 executable is bootstrapped.

Sometimes, in particular when changes are done in the library (directory ‘lib’), it is necessary to bootstrap twice before having the message ‘Fixpoint reached, bootstrap succeeded’.

The command ‘make compare’ tells you whether the Camlp5 executable is currently bootstrapped or not.

Source bootstrapping

The compilation of Camlp5 starts with the compilation of files of the directory ocaml_src written in pure OCaml. This creates the files ‘camlp5’ and ‘camlp5r’ in the directory boot. This is called the ‘cold start’.

Once done, the sources of Camlp5 can be compiled using revised syntax and several syntax extensions, like the statement ‘EXTEND’, for example, and the quotations of syntax trees.

The core files of Camlp5 are in the directories lib, main, meta, odyl. There are the same directories in the directory ocaml_src where all files are equivalent.

When changes are done in the core files, and when the printer kit in normal syntax ‘etc/pr_o.cmo’ has been created, the files of the directory ocaml_src can be rebuilt using the command ‘make bootstrap_sources’. This updates the files in ocaml_src to exactly reflect the ones in the core, but in pure OCaml syntax.

Bootstrap: the1 files in ocaml_src creates the first Camlp5 executable. The Camlp5 executable can rebuild the files in ocaml_src.

Source file q_MLast.ml bootstrapping

The source file meta/q_MLast.ml (quotation of syntax trees) can be recreated using the file meta/pa_r.ml (revised syntax). When changes are done in the file meta/pa_r.ml, a good usage is to go to the directory ‘meta’ and type:

make compare_q_MLast

This shows the possible changes that will be applied to meta/q_MLast.ml. If they seem to be reasonable, do:

make bootstrap_q_MLast

This changes the source file meta/q_MLast.ml. After this command, a new ‘make compare_q_MLast’ indicates no differences.

After that, a new ‘make bootstrap’ in the top directory ensures that everythings works.

Bootstrap: the file meta/pa_r.ml uses the quotation expander meta/q_MLast.cmo. The source file meta/q_MLast.ml is recreated by meta/pa_r.ml.

Source file q_ast.ml bootstrapping

The source file meta/q_ast.ml contains another version of the quotation expander of syntax trees which follows the current syntax used (in normal syntax if the current syntax is used). This works only in strict mode.

This file depends on the definition of the syntax tree main/mLast.mli. When changes are done in this file, it is possible to see what changes are impacted in meta/q_ast.ml. For this, go to the directory ‘meta’ and type:

make compare_q_ast

This shows the possible changes that will be applied to meta/q_ast.ml. If they seem to be reasonable, do:

make bootstrap_q_ast

This changes the source file meta/q_ast.ml. After this command, a new ‘make compare_q_ast’ indicates no differences.

After that, a new ‘make bootstrap’ in the top directory ensures that everythings works.

Lisp and Scheme syntax bootstrapping

The Lisp syntax is written in Lisp syntax in the directory etc. It is the file ‘etc/pa_lisp.ml’. To compile this file, there is another file, named ‘etc/pa_lispr.ml’ written in revised syntax.

When changes are done in etc/pa_lisp.ml, the file etc/pa_lispr.ml must be rebuilt. First, go to the directory ‘etc’ and type:

make compare_lisp

If changes seem to be reasonable, do:

make boostrap_lisp

This rebuilds ‘etc/pa_lispr.ml’. A new ‘make’ in the directory ‘etc’ will recompile it and recompile the Lisp version ‘etc/pa_lisp.ml’.

Bootstrap: etc/pa_lispr.ml allows to compile etc/pa_lisp.ml, and changes is etc/pa_lisp.ml are reported in the source file etc/pa_lispr.ml through ‘make bootstrap_lisp’.

Same for the Scheme syntax: the files are etc/pa_scheme.ml and ‘etc/pa_scheme.ml’.

EXTEND statement bootstrapping

The EXTEND statement of Camlp5 is a syntax extension. The file ‘meta/pa_extend.ml’ contains the statement for the adding of this syntax extension, therefore something like:

EXTEND
  expr:
    [ [ "EXTEND" .....

To be compiled, the file ‘meta/pa_extend.ml’ needs ‘pa_extend.cmo’. This is actually its previous version in the directory ‘boot’. When checking for a correct bootstrapping of Camlp5 (with the command ‘make compare’, for example), a test is done to verify that the binary files ‘meta/pa_extend.cmo’ and ‘boot/pa_extend.cmo’ are the same.

Notice that there is also a file ‘ocaml_src/meta/pa_extend.ml’ in pure OCaml syntax, but, although this file is pretty printed, is is hardly editable, because the expansion of the ‘EXTEND’ statement is a very long expression rather difficult to understand. But this file need not to be changes, since the command ‘make bootstrap_sources’ (see above) rebuilts it.

About

About Camlp5

Version Home page

Author
Daniel de Rauglaudre, INRIA
History
The ideas behind Camlp5 were expressed in the 1990s by Michel Mauny. In 1996, Daniel de Rauglaudre implemented the first version named Camlp4 (the four “p” standing for “Pre-Processor-Pretty-Printer”). In 2002, Camlp4 was maintained by Michel Mauny, and later extended by Nicolas Pouillard, with different basic ideas, introducing some incompatibilities. In 2006, Daniel de Rauglaudre restarted this work, renaming it Camlp5.
.. container:: trailer