Library additions.Addition_Chains

Addition Chains

Pierre Casteran, LaBRI, University of Bordeaux

From additions Require Export Monoid_instances Pow.
From Coq Require Import Relations RelationClasses Lia List.
From Param Require Import Param.

From additions Require Import More_on_positive.
Generalizable Variables A op one Aeq.
Infix "==" := Monoid_def.equiv (at level 70) : type_scope.
Open Scope nat_scope.
Open Scope M_scope.

Generalizable All Variables.

Computations composed of multiplications over type A

(in Continuation Passing Style)

Inductive computation {A:Type} : Type :=
| Return (a : A)
| Mult (x y : A) (k : A → computation).

Monadic Notation for computations

Notation "z '<---' x 'times' y ';' e2 " :=
  (Mult x y (fun z ⇒ e2))
    (right associativity, at level 60).

Example comp128 : computation :=
  x <--- 2 times 2;
  y <--- x times 2;
  z <--- y times y ;
  t <--- 2 times z ;
  Return t.

Definition

An addition chain (in short a chain) is a function that maps any type and any value of type into a computation on .

Definition chain := ∀ A:Type, A → @computation A.

The chain associated with the empty computation (raising to the first power)

Definition C1 : chain := (@Return).

Example C3 : chain :=
  fun (A:Type) (x:A) ⇒
     x2 <--- x times x ;
     x3 <--- x2 times x ;
     Return x3.

Chain associated with the 7-th power

Example C7 : chain :=
fun (A:Type) (x:A) ⇒
x2 <--- x times x ;
x3 <--- x2 times x ;
x6 <--- x3 times x3 ;
x7 <--- x6 times x ;
Return x7.

Our Favorite example

The chain below is intented to compute 87-th power in any EMonoid.

Example C87 : chain :=
fun A (x : A)=>
  x2 <--- x times x ;
  x3 <--- x2 times x ;
  x6 <--- x3 times x3 ;
  x7 <--- x6 times x ;
  x10 <--- x7 times x3 ;
  x20 <--- x10 times x10 ;
  x40 <--- x20 times x20 ;
  x80 <--- x40 times x40 ;
  x87 <--- x80 times x7 ;
  Return x87.

Chain length (number of mutiplications)

Fixpoint computation_length {A} (a:A) (m : @computation A) : nat :=
match m with
| Mult _ _ k ⇒ S (computation_length a (k a))
| _ ⇒ 0%nat
end.

Definition chain_length (c:chain) := computation_length tt (c _ tt).

Execution of chains

Chains are designed for effectively computing powers.

First, we define recursively the evaluation of a computation, then the execution of a chain.

Fixpoint computation_execute {A:Type} (op: Mult_op A)
         (c : computation) :=
  match c with
    | Return x ⇒ x
    | Mult x y k ⇒ computation_execute op (k (x × y))
  end.

Definition computation_eval `{M:@EMonoid A E_op E_one E_eq}
           (c : computation) : A :=
  computation_execute E_op c.

Definition chain_execute (c:chain) {A} op (a:A) :=
  computation_execute op (c A a).

Definition chain_apply
           (c:chain) `{M:@EMonoid A E_op E_one E_eq} a : A :=
   computation_eval (c A a).

Lemma computation_eval_rw `{M:@EMonoid A E_op E_one E_eq} c :
         computation_eval c = computation_execute E_op c.

Chain correctness

In this section, we define formally the correctness of a given chain with respect to some given exponent, and more generally the correctness of a chain generator, i.e. a function that, given any positive integer , returns a chain that is correct with respect to .

Definition chain_correct_nat (n:nat) (c: chain) := n ≠ 0 ∧
∀ `(M:@EMonoid A E_op E_one E_eq) (x:A),
   chain_apply c x == x ^ n.

Definition chain_correct (p:positive) (c: chain) :=
  chain_correct_nat (Pos.to_nat p) c.

Definition optimal (p:positive) (c : chain) :=
  ∀ c', chain_correct p c' →
             (chain_length c ≤ chain_length c')%nat.

A slow tactic for proving a chain's correctness

Ltac slow_chain_correct_tac :=
  match goal with
      [ |- chain_correct ?p ?c ] ⇒
      let A := fresh "A" in
      let op := fresh "op" in
      let one := fresh "one" in
      let eqv := fresh "eqv" in
      let M := fresh "M" in
      let x := fresh "x"
      in split;[discriminate |
                 unfold c, chain_apply, computation_eval; simpl;
                 intros A op one eq M x; monoid_simpl M; reflexivity]
  end.

Example C7_ok : chain_correct 7 C7.

The following proof takes a very long time. Happily, C87's correctness will be proved more efficiently, using reflection or parametricity.

Remove the comment if you can wait ...

Correctness Proof by Reflection

See chap 16 of Coq'Art

Binary trees of multiplications over A

Inductive Monoid_Exp (A:Type) : Type :=
Mul_node (t t' : Monoid_Exp A) | One_node | A_node (a:A).

Arguments Mul_node {A} _ _.
Arguments One_node {A} .
Arguments A_node {A} _ .

Linearization functions

Fixpoint flatten_aux {A:Type}(t fin : Monoid_Exp A)
  : Monoid_Exp A :=
  match t with
    Mul_node t t' ⇒ flatten_aux t (flatten_aux t' fin)
  | One_node ⇒ fin
  | x ⇒ Mul_node x fin
  end.

Fixpoint flatten {A:Type} (t: Monoid_Exp A)
  : Monoid_Exp A :=
  match t with
  | Mul_node t t' ⇒ flatten_aux t (flatten t')
  | One_node ⇒ One_node
  | X ⇒ Mul_node X One_node
  end.

Interpretation function

Function eval {A:Type} {op one eqv}
         (M: @EMonoid A op one eqv)
         (t: Monoid_Exp A) : A :=
  match t with
    Mul_node t1 t2 ⇒ (eval M t1 × eval M t2)%M
  | One_node ⇒ one
  | A_node a ⇒ a
end.

Lemma flatten_aux_valid `(M: @EMonoid A op one eqv):
∀ t t', (eval M t × eval M t')%M ==
             (eval M (flatten_aux t t')).

Lemma flatten_valid `(M: @EMonoid A op one eqv):
  ∀ t , eval M t == eval M (flatten t).

Lemma flatten_valid_2 `(M: @EMonoid A op one eqv):
∀ t t' , eval M (flatten t) == eval M (flatten t') →
              eval M t == eval M t'.

"Quote" tactic

Ltac model A op one v :=
match v with
| (?x × ?y)%M ⇒ let r1 := model A op one x
                  with r2 :=(model A op one y)
                  in constr:(@Mul_node A r1 r2)
| one ⇒ constr:(@One_node A)
| ?x ⇒ constr:(@A_node A x)
end.

Ltac monoid_eq_A A op one E_eq M :=
match goal with
| [ |- E_eq ?X ?Y ] ⇒
  let tX := model A op one X with
      tY := model A op one Y in
      (change (E_eq (eval M tX) (eval M tY)))
end.

Ltac reflection_correct_tac :=
match goal with
[ |- chain_correct ?n ?c ] ⇒
split; [try discriminate |
         let A := fresh "A"
         in let op := fresh "op"
         in let one := fresh "one"
         in let E_eq := fresh "eq"
         in let M := fresh "M"
         in let x := fresh "x"
         in (try unfold c); unfold chain_apply;
           simpl; red; intros A op one E_eq M x;
           unfold computation_eval;simpl;
           monoid_eq_A A op one E_eq M;
           apply flatten_valid_2;try reflexivity
        ]
end.

Example C87_ok : chain_correct 87 C87.
Example C7_ok' : chain_correct 7 C7.

Correctness and parametricity

In this section, we show some tools for proving automatically the correctness of a given chain, and try to avoid spending time while proceeding to a lot of setoid rewritings

First, we notice that any chain is able to compute its associated exponent:

Definition the_exponent_nat (c:chain) : nat :=
chain_apply c (M:=Natplus) 1%nat.

Definition the_exponent (c:chain) : positive :=
chain_execute c Pos.add 1%positive.

Roughly, if cA is a computation on A and cB a computation on B, cA and cB are related through (computation_R A B R) if, during their execution, the corresponding variables of type A and B are always bound to related (w.r.t. R ) values.

Print computation_R.

We say that a chain c is parametric if c A a and c B b are equivalent with respect to any relation that contains the pair (a,b).

Definition parametric (c:chain) :=
∀ A B (R: A → B → Type) (a:A) (b:B),
R a b → computation_R A B R (c A a) (c B b).

The following statement cannot be proven in Coq (AFAIK)

Definition any_chain_parametric : Type :=
∀ c:chain, parametric c.

Goal any_chain_parametric.

Nevertheless, if we don't want to assume any_chain_parametric, we can use the following tactic for proving a given that a given chain c is parametric.

Ltac parametric_tac :=
match goal with [ |- parametric ?c] ⇒
red ; intros;
repeat (right;[assumption | assumption | ]); left; assumption
end.

Example P87 : parametric C87.

computation of compared to computation of

The following section is dedicated to prove that, if a chain is parametric, it computes powers of the form , where is obtained by applying the function the_exponent_nat to .

Basically, we proceed by an induction on the hypothesis equiv gamma_A gamma_nat l where gamma is a computation on A, gamma_nat a computation on nat (with respect to the additive monoid on nat), and l is a list of pairs of the form .

Section Refinement_proof.
Variable A: Type.
Variable E_op : Mult_op A.
Variable E_one : A.
Variable E_eq : Equiv A.
Context (M : EMonoid E_op E_one E_eq).

Let us characterise the lists of pairs of the form , for a given and .

Definition power_R (a:A) :=
  fun (x:A)(n:nat) ⇒ n ≠ 0 ∧ x == a ^ n.

Remark power_R_Mult : ∀ a x y i j,
    power_R a x i → power_R a y j →
    power_R a (x × y) (i +j)%nat.

Remark power_R_1 : ∀ a, power_R a a 1%nat.

Lemma power_R_is_a_refinement (a:A) :
  ∀(gamma : @computation A)
        (gamma_nat : @computation nat),
    computation_R _ _ (power_R a) gamma gamma_nat →
     power_R a (computation_eval gamma)
             (computation_eval (M:= Natplus) gamma_nat).

Lemma param_correctness_aux :
  ∀ (c:chain)(a:A),
    computation_R A nat (power_R a ) (c A a) (c nat 1%nat) →
     power_R a (computation_eval (c A a)) (the_exponent_nat c) .

End Refinement_proof.

Correctness Theorem

From our study of the computation\_R predicate, we infer that any parametric chain is correct with respect to the number the_exponent_nat c

Lemma exponent_nat_neq_0 : ∀ c: chain, parametric c →
                                             the_exponent_nat c ≠ 0.
Lemma exponent_pos2nat : ∀ c: chain,
    parametric c →
    the_exponent_nat c = Pos.to_nat (the_exponent c).

Lemma exponent_pos_of_nat :
  ∀ c: chain, parametric c →
                    the_exponent c = Pos.of_nat (the_exponent_nat c).

Lemma param_correctness_nat (c: chain) :
  parametric c →
  chain_correct_nat (the_exponent_nat c) c.

Lemma param_correctness :
  ∀ (p:positive) (c:chain),
    p = the_exponent c → parametric c →
    chain_correct p c.

Remark

If we admit that any chain is parametric, we infer the correctness of every chain.

Lemma param_correctness_for_free :
any_chain_parametric →
∀ p (c : chain) ,
p = the_exponent c → chain_correct p c.

Back to 87

We present two new-proofs of consistency of C87. The first one uses the parametric_tac tactic, the other one using the "free refinement" method

Ltac param_chain_correct :=
  match goal with
    [|- chain_correct ?p ?c ] ⇒
    let H := fresh "H" in
    assert (p = the_exponent c) by reflexivity;
    apply param_correctness;[trivial | parametric_tac]
  end.

Lemma C87_ok' : chain_correct 87 C87.
Lemma L87'' : any_chain_parametric → chain_correct 87 C87.

Correct by construction chains

Definition chain_generator := positive → chain.

Definition correct_generator (gen : positive → chain) :=
∀ p, chain_correct p (gen p).

Computing using a chain generator

Definition cpower_pos (g : chain_generator) p
           `{M:@EMonoid A E_op E_one E_eq} a :=
  chain_apply (g p) (M:=M) a.

Definition cpower (g : chain_generator) n
           `{M:@EMonoid A E_op E_one E_eq} a :=
  match n with 0%N ⇒ E_one
             | Npos p ⇒ cpower_pos g p a
  end.

Definition

A chain generator is optimal if it returns chains whose length are less or equal than the chains returned by any correct generator.

Definition optimal_generator (g : positive → chain ) :=
∀ p:positive, optimal p (g p).

Some chain generators

We reinterpret the naïve and binary exponentiation algorithms in the framework of addition chains.

Instead of directly computing for some base and exponent , we build chains that describe the computations associated with both methods. Not surprisingly, this chain generation will be described in terms of recursive functions, once the underlying monoid is fixed.

Chains associated to the well-known binary exponentiation algorithm

computation of

As for the "classical" binary exponentiation algorithm, we define an auxiliary computation generator for computing the product of an accumulator with an arbitrary power of some value

Fixpoint axp_scheme {A} p : A → A → @computation A :=
  match p with
  | xH ⇒ (fun a x ⇒ y <--- a times x ; Return y)
  | xO q ⇒ (fun a x ⇒ x2 <--- x times x ; axp_scheme q a x2)
  | xI q ⇒ (fun a x ⇒ ax <--- a times x ;
            x2 <--- x times x ;
            axp_scheme q ax x2)
  end.

Fixpoint bin_pow_scheme {A} (p:positive) : A → @computation A:=
  match p with | xH ⇒ fun x ⇒ Return x
             | xI q ⇒ fun x ⇒ x2 <--- x times x ; axp_scheme q x x2
             | xO q ⇒ fun x ⇒ x2 <--- x times x ; bin_pow_scheme q x2
  end.

Definition binary_chain (p:positive) : chain :=
  fun A ⇒ bin_pow_scheme p.

Correctness of the binary method

Section binary_power_proof.

Variables (A: Type)
         (E_op : Mult_op A)
         (E_one : A)
         (E_eq: Equiv A).

Context (M : EMonoid E_op E_one E_eq).

Existing Instance Eop_proper.

As for linear chains, we first establish the correctness of the auxiliary function axp_scheme.

Lemma axp_correct : ∀ p a x,
    computation_eval (axp_scheme p a x) ==
    a × x ^ (Pos.to_nat p ).

Lemma binary_correct :
  ∀ p x,
    computation_eval (bin_pow_scheme p (A:=A) x) ==
    x ^ (Pos.to_nat p ).

End binary_power_proof.

Lemma binary_generator_correct : correct_generator binary_chain.

The binary method is not optimal

We use the function chain_length and proofs of correctness for showing that binary_chain, although correct, is not an optimal chain generator.

Our proof is structured as a proof by contradiction, under the hypothesis that binary_chain is optimal.

Section non_optimality_proof.

Hypothesis binary_opt : optimal_generator binary_chain.

Lemma binary_generator_not_optimal : False.
End non_optimality_proof.

Open Scope nat_scope.
Open Scope M_scope.

Section S1.
Fixpoint clog2 (p:positive) : nat :=
match p with xH ⇒ 1%nat
            | xO xH ⇒ 2%nat
            | xO p ⇒ S (clog2 p)
            | xI p ⇒ S (clog2 p)
  end.

Fixpoint exp2 (n:nat) : positive :=
  match n with O ⇒ 1
             | S p ⇒ 2 × exp2 p
end.
Lemma exp2_Plus : ∀ n p, exp2 (n + p) = (exp2 n × exp2 p)%positive.

Lemma axp_scheme_length1 :
  ∀ p, (computation_length tt (axp_scheme p tt tt ) ≤ 2 × clog2 p)%nat.

Lemma bin_pow_scheme_length1 :
  ∀ p, (computation_length tt (bin_pow_scheme p tt ) ≤ 2 × clog2 p)%nat.

Lemma binary_chain_length : ∀ p,
                              (chain_length (binary_chain p) ≤ 2 × clog2 p)%nat.

Lemma optimal_upper_bound: ∀ p c, optimal p c →
   (chain_length c ≤ 2 × clog2 p)%nat.

End S1.

How to prove chain correctness in general ?

In previous sections, we showed proofs of correctness of two chain generators. By definition, every chain returned by these generators is correct w.r.t. the given exponent.

The problem with C87 is quite different. This chain has been given by hand, and we had to prove its correctness.

Although the proof script for lemma L87 is quite short, the resulting proof is quite clumsy, consisting in a long chain of equivalences using associativity of the multiplication.

The reader can easily be convinced of this fact, just by the command Print L87.

In a similar way, it may happen that a correct chain generator is hard to certify in . In this case, one may chose to prove the correctness of various chains returned by the generator.

In the rest of this section, we investigate some ways of proving the correctness of given chains.

Section S2.

Difficult and unfinished (conjecture)

We would like to prove that for any parametric chain c for p, c's length is greater or equal than log2(p).

Variables A B : Type.
Variables (a:A)(b:B).
Let R_true := fun (x:A)(y:B) ⇒ True.
Lemma L :
  ∀(gammaA : @computation A)
        (gammaB : @computation B),
    computation_R _ _ R_true gammaA gammaB →
     @computation_length A a gammaA = @computation_length B b gammaB.

Lemma L2 : ∀ c : chain, parametric c →
    computation_length a (c A a) = computation_length b (c B b).

End S2.

Extraction Language Scheme.
Recursive Extraction chain_apply.