About parsing

A thread of 9 exercises in the eight chapter of the book is devoted to parsing well balanced sequences of parentheses. This thread ends with the proof of correctness (i.e. soundness and completeness) of a parsing function. This file is a full development which can be considered as a solution to all these exercises.

Defining well-parenthesized expressions, take 1

We consider the following type of characters:

Inductive par : Set := open | close.

We represent character strings using the type list par. An expression is well-parenthesized when:

it is the empty list,
it is a well-parenthesized expression between parentheses,
it is the concatenation of two well-parenthesized expressions.

Define the inductive property wp:list par -> Prop that corresponds to this informal definition. You can use the function app given in the module List to concatenate two lists. Prove the following two properties:

wp_oc : wp (cons open (cons close nil))

wp_o_head_c :
     forall l1 l2:list par,
       wp l1 -> wp l2 -> wp (cons open (app l1 (cons close l2)))

wp_o_tail_c :
     forall l1 l2:list par, wp l1 -> wp l2 ->
       wp (app l1 (cons open (app l2 (cons close nil))))

Generating well-parenthesized expressions, take 1

We consider a type of binary trees without labels and a function that maps any tree to a list of characters. Show that this function always builds a well-parenthesized expression:

Inductive bin : Set := L : bin | N : bin->bin->bin.

Fixpoint bin_to_string (t:bin) : list par :=
  match t with
  | L => nil
  | N u v =>
    cons open 
      (app (bin_to_string u)(cons close (bin_to_string v)))
  end.

Generating well-parenthesized expressions, take 2

Prove that the following function also returns a well-parenthesized expression:

Fixpoint bin_to_string' (t:bin) : list par :=
  match t with
  | L => nil
  | N u v =>
    app (bin_to_string' u)
        (cons open (app (bin_to_string' v)(cons close nil)))
  end.

Defining well-parenthesized expressions, take 2

Here is a second definition of well-parenthesized expressions. Prove that it is equivalent to the previous one:

Inductive wp' : list par -> Prop :=
| wp'_nil : wp' nil
| wp'_cons : forall l1 l2:list par, wp' l1 -> wp' l2 -> 
                  wp' (cons open (app l1 (cons close l2))).

Defining well-parenthesized expressions, take 3

Here is a third definition. Prove that it is equivalent to the previous ones:

Inductive wp'' : list par -> Prop :=
| wp''_nil : wp'' nil
| wp''_cons :
  forall l1 l2:list par, wp'' l1 -> wp'' l2 ->
      wp'' (app l1 (cons open (app l2 (cons close nil)))).

Recognizing well-parenthesized expressions (completeness)

Here is a function that recognizes well-parenthesized expressions by counting the opening parentheses that are not yet closed:

Fixpoint recognize (n:nat)(l:list par){struct l} : bool :=
  match l with
    nil => match n with O => true | _ => false end
  | cons open l' => recognize (S n) l'
  | cons close l' =>
    match n with O => false | S n' => recognize n' l' end
  end.

Prove the following theorem:

recognize_complete_aux :
  forall l:list par, wp l ->
  forall (n:nat)(l':list par), 
  recognize n (app l l') = recognize n l'.

Conclude with the following main theorem:

recognize_complete :
  forall l:list par, wp l -> recognize 0 l = true.

Recognizing well-parenthesized expressions (soundness)

This exercise is rather hard. Prove that the recognize function only accepts well-parenthesized expressions, More precisely

recognize_sound : forall l:list par, recognize 0 l = true -> wp l.

hint:

we suggest proving that if recognize n l is true then the string app ln l is well-parenthesized, where ln is the string made of n opening parentheses. Several lemmas about list concatenation are needed.

Parsing well-parenthesized expressions

We consider the following parsing function:

Fixpoint parse (s:list bin)(t:bin)(l:list par){struct l}
 : option bin :=
  match l with
  | nil => match s with nil => Some t | _ => None end
  | cons open l' => parse (cons t s) L l'
  | cons close l' =>
    match s with
    | cons t' s' => parse s' (N t' t) l'
    | _ => None
    end
  end.

Prove that this parser is correct and complete:

parse_complete :
 forall l:list par, wp l -> parse nil L l <> None.

parse_invert:
  forall (l:list par)(t:bin),
      parse nil L l = Some t -> bin_to_string' t = l.

parse_sound:
 forall (l:list par)(t:bin), parse nil L l = Some t -> wp l.

Inductive presentation of parsing

The following inductive definition gives the description of a parsing function for well-parenthesized expressions. Intuitively, ``parse_rel l1 l2 t'' reads as ``parsing the string l1 leaves l2 as suffix and builds the tree t.''

Inductive parse_rel : list par -> list par -> bin -> Prop :=
  | parse_node :
    forall (l1 l2 l3:list par)(t1 t2:bin),
      parse_rel l1 (cons close l2) t1 -> parse_rel l2 l3 t2 -> 
      parse_rel (cons open l1) l3 (N t1 t2)
  | parse_leaf_nil : parse_rel nil nil L
  | parse_leaf_close :
      forall l:list par, parse_rel (cons close l)(cons close l) L.

Prove the following lemmas:

parse_rel_sound_aux :
    forall (l1 l2:list par)(t:bin),
      parse_rel l1 l2 t -> l1 = app (bin_to_string t) l2.
parse_rel_sound :
    forall l:list par, (exists t:bin, parse_rel l nil t)-> wp l.

Solution

Look at this file