/**********************************************************************/
/*              klint - Static Analyzer for KL1 Programs              */
/*                                                                    */
/*                            Version 2.1                             */
/*                             March 1999                             */
/*                                                                    */
/*                           Kazunori Ueda                            */
/*           Department of Information and Computer Science           */
/*                         Waseda University                          */
/*                    ueda@ueda.info.waseda.ac.jp                     */
/*                                                                    */
/*                 Copyright (C) 1999  Kazunori Ueda                  */
/*                                                                    */
/**********************************************************************/

1. OVERVIEW

Klint v2.1 is a constraint-based static analyzer of KL1 programs, which
performs 

  - mode analysis, 
  - linearity analysis, and
  - type analysis.

The key feature of klint v2.1 is mode analysis.  It analyzes the mode
of a KL1 program using the ideas described in [1].  Well-modedness
roughly means that the communication protocols used in a program are
cooperative.  It is highly recommended to write your KL1 programs in a
well-moded manner, because it enables the klint system to detect many
program errors statically [3] and enables more sophisticated
compiler optimization.

Klint v2.1 also analyzes the linearity and the type of the program.
Linearity analysis distinguishes data that are guaranteed to be
referenced from only one reader occurrence of a variable from data
possibly shared by two or more occurrences of variables [6].
Linearity analysis provides basic information for compile-time garbage
collection.

Types in klint v2.1 express what categories of function symbols
(integers, floating-point numbers, [], other constants, strings,
vectors, cons, other function symbols) may occur at each path (=
position in a data structure that a goal may possibly have as its
argument).

Klint v2.1 does not require programmers to provide mode, linearity, or
type declarations; instead, klint v2.1 will infer the
mode/linearity/type of a given program.

Klint v2.1 consists of three parts: a constraint generator, a
mode/linearity constraint solver, and a type constraint solver.  The
constraint generator generates the mode/linearity/type constraints
syntactically imposed by a given KL1 program.  The mode/linearity
constraint solver first tries to solve the set S of mode constraints
by forming a mode graph [1] to see if S is consistent (satisfiable).
The mode graph thus formed is regarded as expressing the 'principal'
(i.e., most general) mode of the program.  If S is [in]consistent, the
program is said to be [non-]well-moded, respectively.  When the
program turns out to be well-moded, the solver goes on to perform
linearity analysis and forms the linearity graph of the program.
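
The consistency check can be pictured with a small Python sketch.  It
is a toy model under loud assumptions, not klint's algorithm: paths are
plain strings, only "same mode" and "opposite mode" constraints plus
fixed in/out constraints are handled, and the names ModeGraph, relate,
constrain and mode_of are made up for this illustration.

```python
INVERSE = {"in": "out", "out": "in"}


class ModeGraph:
    """Toy constraint store: union-find whose links carry an inversion bit."""

    def __init__(self):
        self.parent = {}  # node -> parent node
        self.flip = {}    # node -> 1 if its mode is the inverse of its parent's
        self.value = {}   # root -> "in" or "out" once the class is constrained

    def find(self, x):
        """Return (root, parity); parity 1 means x is inverted w.r.t. root."""
        self.parent.setdefault(x, x)
        self.flip.setdefault(x, 0)
        if self.parent[x] == x:
            return x, 0
        root, parity = self.find(self.parent[x])
        self.parent[x] = root          # path compression
        self.flip[x] ^= parity         # flip is now relative to the root
        return root, self.flip[x]

    def _set_root(self, root, mode):
        """Constrain a root; return False if it already has the other mode."""
        return self.value.setdefault(root, mode) == mode

    def constrain(self, x, mode):
        """Require path x to have the given mode ("in" or "out")."""
        root, parity = self.find(x)
        return self._set_root(root, INVERSE[mode] if parity else mode)

    def relate(self, a, b, inverted=False):
        """Require a and b to have the same (or, if inverted, opposite) mode."""
        ra, pa = self.find(a)
        rb, pb = self.find(b)
        need = pa ^ pb ^ int(inverted)  # parity of ra relative to rb
        if ra == rb:
            return need == 0            # already related: must agree
        old = self.value.pop(ra, None)
        self.parent[ra], self.flip[ra] = rb, need
        if old is None:
            return True
        return self._set_root(rb, INVERSE[old] if need else old)

    def mode_of(self, x):
        """Return "in", "out", or None when x is still unconstrained."""
        root, parity = self.find(x)
        mode = self.value.get(root)
        if mode is None:
            return None
        return INVERSE[mode] if parity else mode


# Hypothetical constraints in the spirit of a three-argument stream merger.
g = ModeGraph()
consistent = all([
    g.relate("merge/3,1", "merge/3,2"),
    g.relate("merge/3,3", "merge/3,1", inverted=True),
    g.constrain("merge/3,1", "in"),
])
```

In this model a False result from relate or constrain corresponds to an
inconsistent constraint set, that is, to a non-well-moded program.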

The type constraint solver forms the type graph of a program.  Unlike
mode analysis, type analysis will not report a type error because
klint v2.1 does not assume that function symbols belonging to different
categories cannot occur at the same path in a program.

An introductory article on mode analysis can be found in [2].

2. USAGE

Klint is extremely easy to use.

INPUT:

  % klint [+mltg12] [file ...]

The input to klint v2.1 is a KL1 program that may consist of more than
one module, which can be split into several files.  When no input file
is specified, klint will read a program from standard input (stdin).

Possible options are:

  +m  perform mode analysis.
  +l  perform mode and linearity analysis.
  +t  perform type analysis.

  When none of +m, +l and +t are specified, all analyses are performed.

  +g  output the result in a graph format.  By default, the result is
      output in a declaration format (see below).

  +1  output mode/linearity/type constraints rather than the result of 
      analysis.

  +2  read mode/linearity/type constraints (obtained by using the +1
      option) instead of a KL1 program and analyze their
      satisfiability.

  Options +1 and +2 are mutually exclusive.  Options +m, +l and +t are 
  independent of constraint generation and have an effect only on
  constraint satisfaction.
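
The option rules above can be restated as a short Python sketch.  The
function name selected_analyses and the convention of passing the
option letters as one string are assumptions made for illustration;
the sketch only encodes the rules stated in this section and does not
invoke klint.

```python
def selected_analyses(option_letters):
    """Return the analyses requested by a string of option letters.

    option_letters holds the letters following '+', e.g. "ml" or "tg".
    """
    letters = set(option_letters)
    unknown = letters - set("mltg12")
    if unknown:
        raise ValueError("unknown option(s): " + "".join(sorted(unknown)))
    if {"1", "2"} <= letters:
        raise ValueError("+1 and +2 are mutually exclusive")

    analyses = set()
    if "m" in letters:
        analyses.add("mode")
    if "l" in letters:
        # +l performs mode analysis as well as linearity analysis.
        analyses |= {"mode", "linearity"}
    if "t" in letters:
        analyses.add("type")
    # When none of +m, +l and +t is given, all analyses are performed.
    return analyses or {"mode", "linearity", "type"}
```

For example, selected_analyses("l") yields mode and linearity, while
selected_analyses("g") yields all three analyses because +g only
affects the output format.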


OUTPUT:

When the given program is well-moded, klint v2.1 outputs the mode,
linearity, and type information of the program to stdout (standard
output).  The default format is a declaration form (see below), but the
+g option switches it to a graph form reflecting the internal graph
structures.

When the program is non-well-moded, klint v2.1 reports which mode
constraint finally caused a mode error, and then outputs the current
mode graph.  It then outputs a type graph, skipping linearity
analysis.  Non-well-modedness means that the set of mode constraints
is inconsistent.

Another analyzer, kima, finds minimal inconsistent subsets of mode
constraints and tries to correct errors in variables using the ideas
described in [4][5].  This release of klint includes the latest version
of kima (version 2).

The mode/linearity/type constraints generated by klint v2.1 can be
viewed by using the option +1.  Another option, +2, tells klint that the
input is a set of mode/linearity/type constraints rather than a KL1
program.
So

  % klint +1 xxx.kl1 | klint +2

will produce the same output as

  % klint xxx.kl1 .


3. EXAMPLES

Suppose the file merge.kl1 contains a program

  :- module m.

  merge([],   Y,    Z ) :- true| Z=Y.
  merge(X,    [],   Z ) :- true| Z=X.
  merge([A|X],Y,    Z0) :- true| Z0=[A|Z], merge(X,Y,Z).
  merge(X,    [A|Y],Z0) :- true| Z0=[A|Z], merge(X,Y,Z).

Then klint v2.1 will produce:

  > klint merge.kl1
  %%% Mode %%%
  :- mode m:merge(1,1,-1).
  :- modedef 1 = (+,[[2|1]]).
  :- modedef 2 = (?,[]).

  %%% Linearity %%%
  :- lin m:merge(1,1,1).
  :- lindef 1 = (?,[[2|1]]).
  :- lindef 2 = (?,[]).

  %%% Type %%%
  :- type m:merge(1,1,1).
  :- typedef 1 = ([nil,cons],[[2|1]]).
  :- typedef 2 = ([],[]).

The above mode information can be interpreted as follows: the first and
the second arguments of 'merge' have the mode defined by the mode
definition ":- modedef 1 = (+,[[2|1]])."  This mode definition says (1)
that the principal function symbol is input (+) and (2) that when the
symbol is a list constructor, its 'car' has the mode defined by ":-
modedef 2 = ..." and its 'cdr' has the same mode as the whole list.
Mode definition 2 defines a mode which is totally undefined.  Hence mode
definition 1 represents a mode of a list whose constructors are input
and whose elements have an undefined (but identical) mode.
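
For readers who want to post-process the declaration format, the
following Python sketch splits each definition into its two parts: the
entry for the principal function symbol and the list describing its
arguments.  The regular expression is inferred only from the sample
output in this manual and is an assumption, not a specification of
klint's output grammar; the name parse_defs is hypothetical.

```python
import re

# Matches lines such as ":- modedef 1 = (+,[[2|1]])." (shape assumed
# from the examples in this manual).
DEF_RE = re.compile(r"^:- (mode|lin|type)def (\d+) = \((.+),\[(.*)\]\)\.$")


def parse_defs(text):
    """Map (kind, number) to (principal entry, argument description)."""
    defs = {}
    for line in text.splitlines():
        match = DEF_RE.match(line.strip())
        if match:
            kind, number, principal, arguments = match.groups()
            defs[(kind, int(number))] = (principal, arguments)
    return defs


SAMPLE = """
:- mode m:merge(1,1,-1).
:- modedef 1 = (+,[[2|1]]).
:- modedef 2 = (?,[]).
:- typedef 1 = ([nil,cons],[[2|1]]).
:- typedef 2 = ([],[]).
"""

defs = parse_defs(SAMPLE)
```

Here defs[("mode", 1)] is the pair ("+", "[2|1]"): the principal symbol
is input, and a list cell has a car of mode 2 and a cdr of mode 1.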

The linearity information shows that the 'merge' program itself will not
cause any sharing of data.  In general, '?' means that the analyzed
program will not cause the sharing of data at that path.  Thus, as long
as the elements of the two input streams are non-shared (indicated by
lindef 2), the elements of output streams are guaranteed to be
non-shared.  The program will not cause the sharing of the skeletons of
input or output streams, either (indicated by lindef 1).  As long as the
skeletons of the input streams are not shared, they can be recycled
locally to form the skeleton of the output stream.  Whether the
skeletons and the elements of the input streams can become shared can be
known only by analyzing the whole program.  For instance, when the input
skeletons are non-shared but the list elements can be shared, the value
of linearity definition (lindef) 2 becomes '**' (shared) rather than '?' 
(unconstrained).

The meaning of the type information shown above should now be clear; for 
instance, typedef 1 represents a list whose elements are of the type
represented by typedef 2.

The command-line option +g will change the output format to the textual
representation of mode/linearity/type graphs:

  > klint +g merge.kl1
  %%% Mode %%%
  node(0): (unconstrained)
   <(m:merge)/3,1> ---> node(13)
   <(m:merge)/3,2> ---> node(13)
   <(m:merge)/3,3> --*> node(13)
  node(13): in
   <cons,1> ---> node(15)
   <cons,2> ---> node(13)
  node(15): (unconstrained)

  %%% Linearity %%%
  node(0): (unconstrained)
   <(m:merge)/3,1> ---> node(13)
   <(m:merge)/3,2> ---> node(13)
   <(m:merge)/3,3> ---> node(13)
  node(13): (unconstrained)
   <cons,1> ---> node(15)
   <cons,2> ---> node(13)
  node(15): (unconstrained)

  %%% Type %%%
  node(0): []
   <(m:merge)/3,1> ---> node(13)
   <(m:merge)/3,2> ---> node(13)
   <(m:merge)/3,3> ---> node(13)
  node(13): [nil,cons]
   <cons,1> ---> node(15)
   <cons,2> ---> node(13)
  node(15): []

Here's how to read the above mode graph: The arcs from the root node
(0) say that the first and the second arguments have exactly the same
mode, which is represented by node 13, and that the third argument has
an exactly opposite mode (note the asterisk in the arrow, which means
mode inversion [1]).  Node 13 says that the cdr of a list occurring as
an argument of 'merge' will have the same mode as the whole list.
The graphical representation of this graph can be found in [3].
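
The reading rule can also be stated as a short Python sketch: follow
the arcs along a path and invert the label of the final node once for
every '--*>' arc crossed.  The dictionary below transcribes the mode
graph printed above; the data layout and the name mode_at are
assumptions made for illustration, as is treating the inverse of an
unconstrained node as unconstrained.

```python
# node -> (label, {arc: (target node, inverted?)})
MODE_GRAPH = {
    0: ("unconstrained", {
        ("(m:merge)/3", 1): (13, False),
        ("(m:merge)/3", 2): (13, False),
        ("(m:merge)/3", 3): (13, True),   # the '--*>' arc
    }),
    13: ("in", {
        ("cons", 1): (15, False),
        ("cons", 2): (13, False),
    }),
    15: ("unconstrained", {}),
}

# Assumption: inverting an unconstrained mode leaves it unconstrained.
INVERSE = {"in": "out", "out": "in", "unconstrained": "unconstrained"}


def mode_at(graph, path, root=0):
    """Return the mode at a path, given as a list of (symbol, position) arcs."""
    node, inverted = root, False
    for step in path:
        node, flip = graph[node][1][step]
        inverted ^= flip                  # each inverted arc toggles polarity
    label = graph[node][0]
    return INVERSE[label] if inverted else label


first_arg = mode_at(MODE_GRAPH, [("(m:merge)/3", 1)])
third_arg = mode_at(MODE_GRAPH, [("(m:merge)/3", 3)])
third_tail = mode_at(MODE_GRAPH, [("(m:merge)/3", 3), ("cons", 2)])
```

With this graph the first argument is "in", while the third argument
and the cdr of the list it carries are both "out".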

Here is an example where sharing is detected:

  :- module main.

  main :- true | mults(1000).

  mults(Max) :- true |
      mults2(Max,Ys), outterms(Ys,Os), stdinout:outstream(Os).  

  mults2(N,Ys) :- true |
      timeslist(2,N,Ys,Ys2),
      timeslist(3,N,Ys,Ys3),
      timeslist(5,N,Ys,Ys5),
      merge(Ys3,Ys5,Ys35), merge(Ys2,Ys35,Ys235),
      Ys=[1|Ys235].

  timeslist(U,N,[A|_ ],Ys ) :- U*A>=N | Ys=[].
  timeslist(U,N,[A|Xs],Ys0) :- U*A< N |
      W:=U*A, Ys0=[W|Ys], timeslist(U,N,Xs,Ys).

  merge([],    Ys,    Zs ) :- true | Zs=Ys.
  merge(Xs,    [],    Zs ) :- true | Zs=Xs.
  merge([A|Xs],[B|Ys],Zs0) :- A < B | Zs0=[A|Zs], merge(Xs,[B|Ys],Zs).
  merge([A|Xs],[B|Ys],Zs0) :- A > B | Zs0=[B|Zs], merge([A|Xs],Ys,Zs).
  merge([A|Xs],[B|Ys],Zs0) :- A=:=B | Zs0=[A|Zs], merge(Xs,Ys,Zs).

  outterms([],      Os0) :- true | Os0=[].
  outterms([X|Xs1], Os0) :- true |
       Os0=[putt(X),nl|Os1], outterms(Xs1,Os1).

The mode and linearity information of this program for Hamming's problem
(to produce a list of natural numbers having only 2, 3 and 5 as
prime factors) is as follows:

  > klint +l hamm.kl1
  %%% Mode %%%
  :- mode main:mults(++).
  :- mode main:mults2(++,-1).
  :- mode main:outterms(1,-3).
  :- mode main:timeslist(7,++,++,-1).
  :- mode main:merge(1,1,-1).
  :- mode stdinout:outstream(3).
  :- modedef 1 = (+,[[2|1]]).
  :- modedef 2 = (+,[]).
  :- modedef 3 = (+,[[4|5]]).
  :- modedef 4 = (+,[putt(2)]).
  :- modedef 5 = (+,[[6|3]]).
  :- modedef 6 = (+,[]).
  :- modedef 7 = (+,[]).

  %%% Linearity %%%
  :- lin main:mults(1).
  :- lin main:mults2(2,**).
  :- lin main:outterms(**,3).
  :- lin main:timeslist(**,**,**,**).
  :- lin main:merge(**,**,**).
  :- lin stdinout:outstream(3).
  :- lindef 1 = (?,[]).
  :- lindef 2 = (?,[]).
  :- lindef 3 = (?,[[4|5]]).
  :- lindef 4 = (?,[putt(**)]).
  :- lindef 5 = (?,[[?|3]]).
  :- lindepend 1 => 2.

The symbol '++' in mode information means that all the paths below
that argument position are input paths.  Other mode symbols not shown
in the examples are '-' meaning output and '--' meaning that all the
paths below that position are output paths.

The linearity information says (among other things) (1) that whether
the first argument of 'mults2' is non-shared depends on whether the
first argument of 'mults' is non-shared (lindepend declaration), (2)
that the arguments marked '**', such as the second argument of
'mults2', are all shared, (3) that the skeletons of the lists conveyed
by the second argument of 'outterms' and the first argument of
'outstream' are non-shared, and (4) that the argument of 'putt'
occurring in the list produced by 'outterms' is shared.

As long as the program is executed by itself, it is guaranteed that the
first argument of 'mults' is non-shared.  However, when 'mults' is
called from other modules, its argument may possibly be shared, in which
case the first argument of 'mults2' also becomes shared.  This is what
the implication form of the lindepend declaration represents.


4. INSTALLATION

Klint v2.1 is written entirely in KL1 and can be compiled using KLIC.
Just say

  % make

to make klint.

For your information, the klint v2.1 distribution contains the following
files:

  (1) Makefile -- make file
  (2) Readme-E -- This file
      Readme-J -- Japanese version of this file
  (3) klint-main4.kl1, read_program4.kl1, normalize5.kl1, unify.kl1,
      builtin_DB6.kl1, numberbuiltin3.kl1, findpath4.kl1,
      constraintsB.kl1, type.kl1, stdinout.kl1, commandline.kl1,
      klint25.kl1, graphF.kl1, tgraph3.kl1, decode2.kl1, tdecode.kl1,
      reduce6.kl1, sort.kl1, outgraph.kl1, outdecl.kl1, tc.kl1,
      outconstraints.kl1
               -- source files of klint v2.1
  (4) examples/merge.kl1, examples/hamm.kl1
               -- sample KL1 programs used in this manual
  (5) kima-v2/ -- directory of kima v2, a diagnoser for KL1
                  programs.  See kima-v2/Readme and kima-v2/INSTALL for
                  more details.


5. FEATURES NOT YET SUPPORTED

(1) Macro expansion.  We expect that a future release of KLIC will
feature a KL1 counterpart of cc's -E option.

(2) 'inline' expansion feature.

(3) Generic object creation of the form 'generic:new(...)' and generic
    method calls of the form 'generic:METHOD'.

All built-in predicates of KLIC version 3 are supported except for the
following: 

  - 'new_vector' and 'new_string' with the initial value specified by
    lists (using them with size specification is supported),
  - 'arg' and 'vector_element' called from clause guards, and
  - the built-in body predicate 'unbound'.

Built-in predicates are treated as polymorphic predicates; that is,
different calls of built-in predicates can have different
modes/linearities/types.  Stratification-based polymorphism discussed in
[4] is planned to be supported in a future release.

Mode/linearity/type declaration, provided by programmers as assertions,
is an important feature and will be supported in the next release.


6. RESTRICTIONS

(1) Klint v2.1 may strengthen mode constraints using the assumption
that nonlinear variables (i.e., variables with three or more
occurrences) always have simple, one-way dataflow rather than
bi-directional dataflow such as messages with reply boxes.  However,
our observation is that nonlinear variables have almost always been
used for one-way communication.

(2) The satisfiability of non-binary constraints that could not be
reduced to unary/binary constraints is not checked.  Instead, the system
reports what non-binary constraints remained unreduced.
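
To see which variables restriction (1) applies to, the following Python
sketch counts variable occurrences in a single clause and reports those
occurring three or more times.  It is a rough textual check, not part
of klint: the name nonlinear_variables is hypothetical, and the regular
expression assumes that variables start with an uppercase letter or an
underscore and that the clause contains no quoted atoms or strings.

```python
import re
from collections import Counter

# Assumed shape of a KL1 variable token for this rough check.
VARIABLE = re.compile(r"\b[A-Z_][A-Za-z0-9_]*\b")


def nonlinear_variables(clause):
    """Return {variable: count} for variables occurring three or more times."""
    counts = Counter(v for v in VARIABLE.findall(clause) if v != "_")
    return {name: n for name, n in counts.items() if n >= 3}


# The mults2 clause from the Hamming example in Section 3.
CLAUSE = """
mults2(N,Ys) :- true |
    timeslist(2,N,Ys,Ys2),
    timeslist(3,N,Ys,Ys3),
    timeslist(5,N,Ys,Ys5),
    merge(Ys3,Ys5,Ys35), merge(Ys2,Ys35,Ys235),
    Ys=[1|Ys235].
"""

shared = nonlinear_variables(CLAUSE)
```

For this clause the check reports N (4 occurrences) and Ys (5
occurrences); both are used for one-way communication, in line with
the observation above.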



REFERENCES

[1] Ueda, K. and Morita, M., Moded Flat GHC and Its Message-Oriented
Implementation Technique.  New Generation Computing, Vol.13, No.1
(1994), pp.3-43.

[2] Ueda, K., I/O Mode Analysis in Concurrent Logic Programming.  In
Theory and Practice of Parallel Programming, LNCS 907, Springer, 1995,
pp.356-368.

[3] Ueda, K., Experiences with Strong Moding in Concurrent
Logic/Constraint Programming.  In Proc. Int. Workshop on Parallel
Symbolic Languages and Systems (PSLS'95), T. Ito, R.H. Halstead, Jr.,
and C. Queinnec (eds.), LNCS 1068, Springer, 1996, pp.134-153.

[4] Cho, K. and Ueda, K., Diagnosing Non-Well-Moded Concurrent Logic
Programs.  To be presented at 1996 Joint International Conference and
Symposium on Logic Programming (JICSLP'96), Bonn, Germany, September
1996.

[5] Ajiro, Y., Ueda, K. and Cho, K., Error-correcting Source Code.  In
Proc. Fourth Int. Conf. on Principles and Practice of Constraint
Programming (CP'98), LNCS 1520, Springer-Verlag, Berlin, 1998, pp.40-54.

[6] Ueda, K., Linearity Analysis of Concurrent Logic Programs.  IPSJ
SIGPRO meeting, August 1998.  (in Japanese)
