"""

Charlemagne Manual

Web site:	http://charlemagne.sourceforge.net/
Project page:	http://sourceforge.net/projects/charlemagne/
Author:	Bob Green
Email:	bob underscore green at speakeasy dot net

This document has an ambitious title. Feel free to email me with questions, requests, etc.

Note

The most recent version of this manual is available on-line at http://charlemagne.sourceforge.net/doc/manual.html. However, this version may sometimes contain documentation on unreleased features. The manual that matches the release can be found in the release itself.

Contents

Charlemagne Manual

Overview

Charlemagne is a genetic programming application which aims to be highly configurable and applicable to a broad range of problems. It is written in Python and Lisp and to some degree is extensible in both languages. It features built-in input-output mapping support, but also provides the ability to define complex fitness calculators in Lisp or Python.

This document in no way tries to cover genetic programming itself. However, you can explore this application by following the instructions below even without knowing alot about genetic programming.

License

Charlemagne is distributed under the GNU General Public License (GPL). Please see the file LICENSE.txt included with the distribution, point your browser to http://www.gnu.org/copyleft/gpl.html, type "charlemagne LICENSE" from the command line, or from an interactive charlemagne session type help(license).

Requirements

Linux/Unix/Cygwin
Python >= 2.2 (http://python.org)
CLISP (http://clisp.cons.org)

Currently, support is limited to Unix-like operating systems (and emulators), and only Linux and Cygwin have been tested. Win32 support is almost there, in fact, the only thing missing at this point is a batch file equivalent to

There is one technical issue preventing Windows support right now, but hopefully this will be corrected in the future.

Download

The official releases are available for download at:

http://sourceforge.net/project/showfiles.php?group_id=36462

However, there sometimes will be bugfixes, enhancements, and major developments available only via CVS. To get the absolute latests version:

$ cvs -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/charlemagne 
  login  
$ cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/charlemagne 
  co charlemagne

When prompted for a password for anonymous, simply press the Enter key.

You can install this development version the same as you would a release, as described below.

Installation

See the file README.txt included with the distribution. But briefly, as root type:

$ python setup.py install

Quick Start

For the impatient, you can try running some of the examples. Many are trivial. First, make a working copy of the examples directory tree. From your working directory:

$ cp -r /usr/local/share/charlemagne/examples .

Change into the example0 directory:

$ cd examples/example0

And run the example:

$ ./example0

Then you could try:

$ cd ../example1
$ ./example1

There are some more difficult problems in example2:

$ cd ../example2

You can run any of:

$ ./cos2x
$ ./sin2x-easy
$ ./sin2x-hard
$ ./tan2x

The problems sin2x-hard and tan2x are non-trivial.

If you look at the scripts you'll notice all of the examples are just calls to charlemagne with command line parameters. You can try tweaking these if you want. For a description of available command-line options, try:

$ charlemagne --help

Interactive Session

Getting Started

Starting charlemagne in the simplest way possible results in an interactive session:

$ charlemagne
Charlemagne-2.0.0 (June 10, 2003)
http://charlemagne.sourceforge.net
Copyright © 2003 Robert Green.
Licensed under the GNU General Public License (GPL).
>>>

This is actually a Python session with some additional user references available for controlling the charlemagne application. The core objects are automatically instantiated for you. The important user reference variables are:

cfg - charlemagne.configurator.TextConfigurator object

pop - charlemagne.population.ConsolePopulation object

lsp - charlemagne.interpreter.CLISPInterpreter object

All public methods of these classes are available for interactive manipulation. These instructions cover the most important functionality.

cfg - The Configurator

In any interactive session, we are provided a reference to a Configurator object through the variable cfg.

To see the current contents of the configurator:

>>> cfg.show()

You'll see that some of the parameters are configured and others are set to None. The configurator comes initialized with some default values, but there are some parameters which must be set before we can move on. Let's configure a few parameters:

>>> cfg.configure("name")
Run Name [None]: tutorial

Once again, let's look at the contents of the configurator:

>>> cfg.show()

We can see the name parameter has been modified. We could have performed the same task without resorting to the interactive parameter prompt:

>>> cfg.configure("name", "tutorial")

The remaining mandatory parameters that are not yet set are input, output, and vocabulary. We'll do that now:

>>> cfg.configure("input", map(lambda x: [x*1.0], range(10)))

If you are familiar with the Python map function and the lambda construct, the above will look familiar to you. If not, this was just a handy way for defining the input set as shown below:

>>> cfg.show("input")
[[0.0], [1.0], [2.0], [3.0], [4.0], [5.0], [6.0], [7.0], [8.0], [9.0]]

This is a set of one dimensional input vectors. In general, the input vectors can be of any dimensionality.

We define the output set in similar fashion:

>>> import math
>>> cfg.configure("output", map(lambda x: math.sin(2*x), range(10)))

This sets the output set to be a list of values corresponding to the input set, specifically each output is the sin of its corresponding input:

>>> cfg.show("output")
[0.0, 0.8414709848078965, 0.90929742682568171, 0.14112000805986721, 
-0.7568024953079282, -0.95892427466313845, -0.27941549819892586, 
0.65698659871878906, 0.98935824662338179, 0.41211848524175659]

The final property which we will set is the vocabulary. This is the list of building blocks the programs can be constructed from. The vocabulary is divided into three parameters: "terminals" - elements which take no parameters, "one-args" function which take exactly one argument, and "two-args" functions which take exactly two arguments. We will set the vocabulary parameters as follows:

>>> cfg.configure("terminals")
Terminals [None]: INPUT1,2.0
>>>cfg.configure("one-args")
One Argument Functions [None]: cos,tan
>>>cfg.configure("two-args")
Two Argument Functions [None]: +,*,-,%

Above, INPUT1 refers to 1st element of the input vector (in this case there is only one element of each vector, but in the case of a higher dimensionality input set, you enter additional terminals in the form INPUT2, INPUT3, etc.) We also provide the additional terminal element 2.0, because it seems likely that having a 2.0 readily available may help in this case.

We also provide the one argument functions sin and cos, as well as providing the four standard arithmetic operators as the two argument functions. (% is a safe division here, not modulo. Using the real operator will result in divide by zero errors.)

Currently there is no mechanism for allowing functions which take more than two arguments or functions which take an arbitrary number of arguments, although this is a planned enhancement.

Note: Although you don't need to do this now, to be prompted for all available parameters, we can run configure with no parameters:

>>> cfg.configure()

Pressing enter at any parameter prompt leaves the value of that parameter unchanged. (The current value is display in square brackets to the left of the colon in the prompt.)

We are almost done with the configurator, but we still need to apply the parameters to the environment. This is done with the command:

>>> cfg.apply()

That's all we need to do with the configurator for now, but to see full class documentation for the instantiated Configurator, you can type:

>>> cfg.help()

Typing 'q' gets you out of the help system and back to the session prompt.

pop - The Population

With all the necessary environment parameters set, we can create the initial generation using the pop reference variable:

>>> pop.populate()

After a moment, you should see some statistics printed out. These are the statistics for your generation 0 population.

You can look at the lisp source of all the programs in the population by executing:

>>> pop.show()

This is kind of silly, but it shows that there really is a population of random programs. Also for kicks, note:

>>> len(pop)
500

This shows you the number of programs in the population. Also try:

>>> pop[0].show()

This shows you the lisp of the first program in the population.

Now, let's try getting to the next generation:

>>> pop.next()

Here, you'll see a progress bar work across the screen as the population is bred. When complete, the stats for the new generation will be calculated and printed. The variable pop now contains the next generation of programs. Calling pop.next() repeatedly will step the population along from generation to generation. If you want this process to continue until a solution is found:

>>> pop.breed()

Since the problem we've defined is non-trivial, the truth is that the parameters might take some more tweaking and perhaps a long time before we find a solution. If you do take the time to step through a few generations, you should notice the Average Adjusted Fitness of the population creeping upwards in the statistics report.

Once you get impatient, type CTRL-C to stop execution. You exit the session the same way as any python session, CTRL-D.

lsp - The Interpreter

When you enter interactive mode, a CLISP session is started in the background. While its possible to use the system without ever explicitly referencing this session, there are some cases where it is useful.

First, its interesting to note that we have a full lisp interpreter at our fingertips from within our session:

>>> lsp.evaluate("(+ 2 2)")
'4'

Also notice that this is a persistent lisp session. We can define variables that will remain defined for the remainder of the session:

>>> lsp.evaluate("(defvar test-var (+ 2 2))")
'TEST-VAR'
>>> lsp.evaluate("test-var")
'4'

We can also define functions in this way:

>>> lsp.evaluate("(defun sin2x (x) (sin (* 2 x)))")
'SIN2X'
>>> lsp.evaluate("(sin2x 0.25)")
'0.47942555'

Interactive Lisp Sessions

Interactive lisp sessions are also possible:

>>> lsp.interactive()
CLISP> (+ 2 2)
4
CLISP> exit
>>>

This feature is in an alpha state. In particular it is very fragile at the moment concerning syntax errors in expressions. The interpreter will get out of sync after a bad query. You can restart the interpreter to "clear" it by executing:

>>> lsp.restart()

Extending The Fitness Environment

The lisp session we are manipulating is the same one that the programs from the population are evaluated in. This means we can extend the environment that the population of programs "lives" in. For instance, we can define functions and then add them to the vocabulary:

>>> lsp.evaluate("(defun sin2x (x) (sin (* 2 x)))")
'SIN2X'
>>> cfg.configure("one-args")
One Argument Functions [None]: cos,tan,sin2x
>>> cfg.apply()

We could also define terminals in this fashion:

>>> lsp.evaluate("(defvar EARTHS-GRAVITY 9.7982)")
'EARTHS-GRAVITY'
>>> cfg.configure("terminals")
Terminals [None]: INPUT1,EARTHS-GRAVITY
>>> cfg.apply()

Command-Line

When you're repeatedly setting up the environment for a particulate problem it may seem tedious to have to type in commands in interactive mode. For this reason, we support scriptable command-line parameters. Any parameter which is available in the configurator can also be initialized via a corresponding command-line switch. We can start an interactive session in such a way that some parameters are already set:

$ charlemagne --name test --crossover 0.9 --replicate 0.1 --mutate 0

The application starts as normal. To confirm that the specified parameters have been set, try:

>>> cfg.show()

Press CTRL-D at the prompt to exit.

Typing the parameters out on the command-line each time is no easier than typing them out in the configurator. The power of the command-line switches is in scripting them. To illustrate, let's copy the examples that are distributed with the package to our working directory:

$ cp -r /usr/local/share/charlemagne/examples .

Change into the example0 directory:

$ cd examples/example0

Now we have some data and vocabulary files available for a trivial problem. Create a file 'tutorial' in your working directory as follows:

#!/bin/bash    
charlemagne --name tutorial-example --inputs-file inputs.asc 
--outputs-file outputs.asc --vocabulary-file vocab.asc

Here we are using predefined data files to specify the inputs, outputs and vocabulary.

Make the script executable and run it:

$ chmod +x tutorial
$ ./tutorial

And apply the contents of the configurator to the environment:

>>> cfg.apply()

If we want to enter the application with the configurator having already been applied, we can use the APPLY scripting command.

Modify the tutorial script as follows:

#!/bin/bash
charlemagne APPLY --name tutorial-example --inputs-file 
inputs.asc --outputs-file outputs.asc --vocabulary-file vocab.asc

This command effectively causes cfg.apply() to be called automatically on start up. The complete list of scripting commands is as follows:

MANUAL - activate the online manual, exit when done

CONFIGURE - interactive configure, all parameters

APPLY - apply

POPULATE - apply, populate

BREED - apply, populate, breed

BATCH - apply, populate, breed, exit

At most, one of these commands should be specified, and it must be the first parameter on the command line. For POPULATE, BREED, and BATCH to work, you must also specify all the mandatory, non-defaulted parameters on the command-line i.e. the name, input, output, and vocabulary.

Parameters

To get a brief rundown of all available parameters:

$ charlemagne --help

Below is some additional detail on the paramters.

Name (-n, --name)

The name of the run. This is used for logging runs by name. This parameter is mandatory.

Input (--input)

The list of input vectors, specified Python in the style of Python list. The vectors can be of any dimensionality, but they must be consistently so.

Example:

--input [[1.0,1.0],[1.0,-1.0],[-1.0,1.0],[-1.0,-1.0]]

Terminals (--terminals)

The terminal (atomic) elements of the vocabulary. The special keywords INPUT1, INPUT2, INPUT3, etc. are used to represent the elements of the input vector begin evaluated. The keyword CONSTANT-SYNTHESIS evaluates to a random constant in each place it is used.

Example:

--terminals INPUT1,0,1.0,CONSTANT-SYNTHESIS

One Argument Functions (--one-args)

The functional elements of the vocabulary which take exactly one argument.

Example:

--one-args sin,cos,tan

Two Argument Functions (--two-args)

The functional elements of the vocabulary which take exactly two arguments.

Example:

--two-args +,-,*,%

Note % is a special safe divide function which does not generate divide by zero errors.

Inputs File (-I, --inputs-file)

Use data from specified file as the test inputs. The specified inputs file must be an ASCII file with one input vector per line, with each term sepeated by commas. Example inputs file:

1,1
-1,1
1,-1
-1,-1

Use this instead of specifying inputs directly with the Input parameter.

Output (--output)

The list of output values, specified in the style of a Python list. Each output corresponds to the input vector in the same position in the input list

Example:

--output [2.0,0,0,-2.0]

The Output parameter is for use with the (default) OUTPUT deviance-calculation method.

Outputs File (-O, --outputs-file)

Use data from specified file correct test output. The outputs file should be an ASCII file with one output per line. The Outputs File parameter is mandatory unless you use --generate-outputs to generate the correct output values, or you use a --deviance-calculation method other than the default OUTPUT method. Example outputs file:

Use this instead of specifying the outputs directly with the Output parameter.

Generate Outputs (--generate-outputs)

Generate output values from the input values using the specified method. If you generate outputs with this option using either method, neither --outputs or --outputs-file should be specified.

Available methods are:

LISP-EXPRESSION=<expr>
PYTHON-CLASS=<module>.<classname>

LISP-EXPRESSION output generation requires that you specify a full lisp expression which uses lisp variables of the form INPUT1 INPUT2, etc. to operate on the input data.

PYTHON-CLASS output generation requires that you create an charlemagne.outputgenerator.OutputGenerator subclass which implements the generate() method to return an output list.

Vocabulary File (-V, --vocabulary-file)

Use vocabulary defined in specified file. The vocabulary file must be an ASCII file of the following form: It should be exactly three lines long with the first line being a command seperated list of available terminals values. The second line should be a comma seperated list of available functions which take exactly one argument. The third and final line should be a comma seperated list of available functions which take exactly two arguments. Example vocabulary file:

INPUT1,1.0,PI
sin,cos,tan
-,+

Population Size (-S, --population-size)

Use a population of the specified size.

Maximum Initial Depth (-d, --initial-depth)

Start with random programs no deeper than the specified depth.

Maximum Depth (-D, --maximum-depth)

Restrain programs to the maximum depth specified. This prevent crossovers and mutations from endless growing the average depth of the population.

Standard Crossover Probability (-C, --crossover)

Perform the standard crossover operation with the specified probability.

Context Sensitive Crossover Probability (-P, --cs-crossover)

Perform the context sensitive crossover operation with the specified probability.

Replicate Probability (-R, --replicate)

Perform the replicate operation with the specified probability.

Mutate Probability (-M, --mutate)

Perform the mutate operation with the specified probability.

Selection Method (--selection)

Use the specified selection method.

Available methods are:

FITNESS-PROPORTIONATE=<degree>
TOURNAMENT=<size>.

FITNESS-PROPORTIONATE selection prefers fit to unfit programs and degree is a measure of how tightly bound this selection is. A degree of 1.0 is very greedy toward fit programs, numbers less than 1 are proportionately less greedy.

TOURNAMENT selection randomly selects programs from the population and holds a tournament of the specified size. The best program in the tournament is selected.

Precision (--precision)

Use the specified precision in determining hits.

Force Best (--force-best)

Force the best program to replicate into the new generation the specified number of times.

Fitness Environment File (--fitness-environment)

Evaluate fitness-function in the lisp environment created by the specified lisp file.

Deviance Calculation (--deviance-calculation)

Calculate deviance using the specified method.

Available methods are:

OUTPUT
LISP-FUNCTION=<name>
PYTHON-CLASS=<module>.<classname>

The OUTPUT method uses the difference between the actual output and the output list to directly calculate the deviance. This is the default method and is useful if you are searching for a mapping or approximation function which maps a set of inputs to a set of outputs.

The LISP-FUNCTION method uses the output of the specified lisp function as the deviance. The function should take exactly one argument which is the lisp expression of the program to be calculated for.

The PYTHON-CLASS method requires that you must create a charlemagne.deviancecalculator.InputDevianceCalculator subclass which implements calculate() to return the deviance a program.

Both the LISP-FUNCTION and PYTHON-CLASS methods are suitable for complex domains where the fitness of an individual is a more involved measure than just its ability to map input to output. With these methods it is possible to measure deviance in terms of complex multi-step simulations. If you're using either of these two methods, you do not specify --output or --outputs-file, as the output of the program being tested could be something much more intangible. For example, if you were searching the space of checkers playing programs, you could write a lisp function checkers-deviance, or define a Python class CheckersDevianceCalculator, which interfaced with a checkers simulation and returns some measure of how well a particular program played the game. When defining deviance calculators, a deviance of zero is the perfect program with the higher the deviance is, the worse that program is.

LISP-FUNCTION Deviance Calculation Example

Here is an example of an interactive session defining a complex deviance calculating function:

>>> lsp.interactive()
CLISP> (defvar SPEED 0.05)
SPEED
CLISP> (defvar TURNS 50.0)
TURNS
CLISP> (defun simulator-helper (p turns x y)
  ...>     (if (eq turns 0)
  ...>         (sqrt (+ (* x x) (* y y)))
  ...>         (progn
  ...>             (setq INPUT1 x)
  ...>             (setq INPUT2 y)
  ...>             (let ((theta (degrees-to-radians (eval p))))
  ...>                 (simulator-helper
  ...>                     p
  ...>                     (- turns 1)
  ...>                     (+ x (* SPEED (cos theta)))
  ...>                     (+ y (* SPEED (sin theta))))))))
'SYRUP-DEVIANCE-HELPER'
CLISP> (defun simulator (p)
  ...>        (/ (deviance-helper TURNS 50 INPUT1 INPUT2) TURNS))
'SYRUP-DEVIANCE'
CLISP> exit
>>> cfg.configure("deviance-calculation", LISP-FUNCTION=simulator)
>>> cfg.configure("input", [[1.0,1.0], [0.5,0.5], [-1.0,-1.0], 
    [-0.5,-0.5], [0.25,-0.25],[-1.0,0.75], [0.25,-0.5], [1.0,-1.0]])
>>> cfg.apply()

The simulator function above measures the fitness of programs based on their ability to navigate to the origin from their starting location of (INPUT1,INPUT2) are the initialize coordinates, and the output of a program is the direction in degrees which they choose to move. The SPEED and TURNS variables define the speed in which the programs move each turn and the number of turns to be simulated, respectively.

Development

The source code for the releases is available as a .tar.gz or a .src.rpm. The unreleased development code is accessable via CVS. You can browse the CVS repository at:

http://cvs.sourceforge.net/cgi-bin/viewcvs.cgi/charlemagne/

Or to check out the code anonymously:

$ cvs -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/charlemagne 
  login  
$ cvs -z3 -d:pserver:anonymous@cvs.sourceforge.net:/cvsroot/charlemagne 
  co charlemagne

When prompted for a password for anonymous, simply press the Enter key.

"""