M. Buliga version: 21.06.2019

- The mol notation for graphs in chemlambda, hapax, interaction combinators and other small graph rewrite systems
- How are understood the rewrites in chemlambda [1], [2], or interaction combinators [3] versus how are understood the rewrites in hapax [4] (and other small graph rewrite systems [5])

- we have a set "N" of nodes (non empty)
- and a set "E" of edges.
- to any edge "e" from "E" we associate two nodes, say "a" and "b". These nodes are the extremities of the edge.

Also, remark that in this definition the extremities of an edge are given in no particular order. We say this is a graph with un-oriented or undirected edges.

These graphs don't have enough structure for our needs. That is why we shall add some structure:

- we ask that each node of the graph has a type. We first define a family of node types, think about a node type like a prototype of a node.

In chemlambda the node type is defined by one of the names "A", "L", "FI", "FO", "FOE", "Arrow", "FRIN", "FROUT", "T".

In interaction combinators the node type is "γ", "δ", "ε". - Each node type defines the valence of a node (i.e. the number of ports) and an order of the ports of the node. Therefore, compared to the general case of a graph, we do have now a particular order of the ports of a node

[type of node] FS [list of ports]

The [type of node] is, in chemlambda "A", "L", "FI", "FO", "FOE", "Arrow", "FRIN", "FROUT", "T", in interaction combinators is "γ", "δ", "ε".

The [list of ports] is an array (a list) with FS as field separator, where the element "i" of the list is the name of the edge connected at the edge port "i".

Let's revisit the definitions we have until now, by adding to them a bit of details, in order to pass from the mathematical notation to a notation we can use with computers.

A graph is:

- a finite set of nodes, enumerated by numbering them in an arbitrary order, from 1 to Nodes_number
- a finite set of edges tags (i.e. names). It is understood that each edge is referred by it's tag, so any two different edges have different tags.
- each node has a node type and connects to the edges via node ports.

For example, in chemlambda we use the LS as "newline" and FS as " " and the mol notation

A a b c

L c b a

describes a graph which has 2 nodes and 3 edges. In JSON notation we would write:

graphMol = [["A", "a", "b", "c"],

["L", "c", "b", "a"]

]

In javascript we could define this as

graphMol = [{type:"A", ports:["a", "b", "c"]},

{type:"L", ports:["c", "b", "a"]}

]

This describes a graph with 2 nodes and 3 edges, where:

- node 1 is of type "A". The port 1 is connected to the edge "a", port 2 is connected to the edge "b", port "3" is connected to the edge "c"
- node 2 is of type "L". The port 1 is connected to the edge "c", port 2 is connected to the edge "b", port "3" is connected to the edge "a"

A b a c

L c b a

because even if this second graph has also 2 nodes, of types "A" and "L", also 3 edges, named "a", "b", "c", there is a difference between the way edges are connected to ports. While for the first graph:

- edge "a" has the extremities: node 1, connected at port 1, and node 2, connected at port 3
- edge "b" has the extremities: node 1, connected at port 2, and node 2, connected at port 2
- edge "c" has the extremities: node 1, connected at port 3, and node 2, connected at port 1

- edge "a" has the extremities: node 1, connected at port 2, and node 2, connected at port 3
- edge "b" has the extremities: node 1, connected at port 1, and node 2, connected at port 2
- edge "c" has the extremities: node 1, connected at port 3, and node 2, connected at port 1

Similarly, in interaction combinators the following mol notation

δ b a c

δ a b c

describes a graph with 2 nodes and 3 edges, etc.

As you see, the mol notation of such a graph does introduce a supplimentary structure, namely the ordering of the nodes of the graph. For the purposes of graph rewriting, this ordering is irrelevant. Otherwise we would not have a graph-rewriting formalism, we would have a term-rewriting formalism.

There are two conditions:

- the [list of ports] should have length compatible with the [type of node]
- each edge name should appear exactly twice in the mol notation. Because each edge has two extremities.

However, we may work with mol files, which are mol notations for graph which may have free edges. In a mol file an edge name appears at most twice. Why? because this way a sub-array of a mol notation array is a mol file. This is important because when we shall discuss about graph rewrites, we shall need to define patterns, which are sets of nodes in a graph with certain properties. We shall denote patterns as mol files.

Also, we shall accept graphs which have some edge extremities "free", that is dangling without being connected to a node port. The mol notation for such a graph has the property that each edge name appears at most twice. All in all a mol file describes the graphs we need, as well as their subgraphs.

- each edge has a source [node port] and a target [node port]
- which implies that each node port has a type itself:
- "in" if the node port is a target of an edge
- "out" if the node port is a source of an edge

This supplimentary structure, which is not present in usual interaction combinators, forces us to modify the type of a node, namely, a node type specifies, as previously, the name of the type ("A", "L", "FI", etc.), the number of ports, but also the type "in" or "out" of each port.

For example, the nodes of type "A" and "FI" have 3 ports, with the types ["in", "in", "out"]. The nodes "L", "FO", FOE" have 3 ports, with the types ["in", "out", "out"]. In the hapax repository this is specified in the hapax-nodes.js.

In general, in hapax we shall strive to impose for 3-valent node types, the following: always the 1st port has type "in" and the 3rd port has type "out", while the middle port has either "in" or "out" type.

The second modification we have, due to the oriented edges, is that an admissible mol file for a graph with oriented edges has the supplimentary property:

- each edge name appears at most twice, and if it appears twice then it appears once in a node port of type "in" and once in a node port of type "out"

δ b a c

γ a b c

is a valid mol file of an interaction combinator graph, but the list

A b a c

L a b c

is not a valid mol file for chemlambda, because the edge "c" appears twice in a port of type "out" and the edge "a" appears twice in a port of type "in".

In chemlambda (in general in hapax) there is an unique way to add nodes to an oriented graph which has some edges with free extremities. We simply define two new node types:

- "FRIN" (i.e. "free in") which has only 1 port, of type "out"
- "FROUT" (i.e. "free out") which has only 1 port, of type "in"

FRIN e

which becomes the source of the edge "e".

In the same way, if the edge "e" appears only once, in a node port of type "out", it means that the edge has a free target, therefore we add a node

FROUT e

which becomes the target of the edge "e".

(We could do the same in the un-oriented case, by using only one new node type, say "FREE" with only one port.)

First we explain the way we do rewrites in chemlambda v2 and interaction combinators. Then we explain how we do the chemlambda rewrites in hapax, because we do them in a different way.

- the rewrite has a LEFT pattern and a RIGHT pattern, both are mol files
- the rewrite replaces the LEFT pattern with the RIGHT pattern

LEFT = [["L", "a", "b", "c"], ["A", "c", "d", "e"]] becomes RIGHT = [["Arrow", "a", "e"], ["Arrow", "d", "b"]]

And here is an interaction combinators rewrite, where we add the node types "Arrow", as in chemlambda (and we'll have to add the COMB rewrites as well, but this is for later):

LEFT = [["γ", "a", "b", "c"], ["γ", "d", "e", "c"]] becomes RIGHT = [["Arrow", "a", "e"], ["Arrow", "b", "d"]]

In chemlambda and in interaction combinators the LEFT and RIGHT patterns, as mol files, have certain properties:

- in the LEFT and RIGHT mol files the edges names which appear only once are the same. That is because in graph terms the LEFT and RIGHT patterns represents sub-graphs and the edges which appear only once in the patterns are those which connect the patterns with the rest of the graph.
- the patterns are simple, made of two nodes and there is an active edge, in the examples given is the edge "c", which connects them. This simplifies very much the search for LEFT patterns in the graph.

- they are not conservative, in the sense that some nodes and edges are replaced with others, in a way which does not conserve the number of (typed) nodes, nor the edges (number or name)
- in the case of rewrites which increase the number of edges and nodes, we need a mechanism of invention of new edge names, so that there is no clash with the edge names already present elsewhere in the graph (or mol file)

In hapax the LEFT and RIGHT patterns are mol files of connected graphs, with a specified edge name. See hapax-mol.js for a list of paterns used in chemlambda, as done in hapax.

Any rewrite needs a pattern and a "token", and outputs a pattern. The tokens are small graphs (patterns). In hapax these are specified in hapax-chem.js. For example the "A-L" rewrite, as done in hapax, is:

{kind:"A-L", needs:"Arrow-Arrow", gives:["Arrow", "Arrow", "L-A"], pisource:[4,0,3,2,1], pitarget:[3,4,0,2,1]}

which means:

LEFT pattern "A-L" + token "Arrow-Arrow" becomes Pattern "Arrow" + Pattern "Arrow" + token "L-A"

by using the following permutations of sources and targets: pisource:[4,0,3,2,1], pitarget:[3,4,0,2,1]

Let's explain the same in mol notation. Instead of the LEFT pattern from chemlambda v2, we have a pattern

{named:"A-L",

edge:"a",

mol:[["L", "c", "b", "a"],

["A", "a", "d", "e"]]

}

and a token

{named:"Arrow-Arrow",

edge:"a",

mol:[["Arrow", "b", "a"],

["Arrow", "a", "b"]]

}

which corresponds (i.e. recognizes the pattern in the graph) to the mol file:

L c b a

A a d e

Arrow b' a'

Arrow a' b'

Now, according to the type "in" or "out" of the ports, we see that we have 5 sources, colored red, and five targets, colored yellow:

L c b a

A a d e

Arrow b' a'

Arrow a' b'

This is transformed into another pattern, by permuting the sources and the targets. We get:

L b' a b'

A a' a a'

Arrow c e

Arrow d b

which is, as the rewrite claims, recognized as two patterns "Arrow"

{named:"Arrow",

edge:"a",

mol:[["Arrow", "a", "b"]]

}

and a token "L-A"

{named:"L-A",

edge:"a",

mol:[["A", "c", "a", "c"],

["L", "b", "a", "b"]]

}

The permutation of sources is described by the array pisources and the permutation of targets is described by the array pitarget. In order to understand how this is done, first recall that (in javascript in particular) arrays elements are numbered starting with 0, therefore the array pisource = [4,0,3,2,1], for example, represents a permutation in the sense that the element 0 goes to 4, 1 goes to 0, 2 goes to 3, etc.

Then, in order to apply the permutations pisource and pitarget correctly, we need to be sure in what order the pattern recognition function sees the sources and targets. The problem is that we could easily pick an order by hand of the sources and targets, but how does the program picks them? We need to know this only once, when we define the rewrites. For this we apply the pattern recognition program to the patterns themselves. (At some point in the future this has to be automatized). The page [9] tells us exactly this, namely that for the pattern recognition program the order of the sources for this rewrite is:

a, b, e, a', b'

and the order of the targets for this rewrite is:

a, c, d, a', b'

You can check by hand that pisource and pitarget permutation do indeed what they are expected to. For this see that, for example a which is source 0, appears in position 1 in pisource, therefore it will go in the place of source 1, which is b, and so on.

The hapax way of doing rewrites is more general and more flexible than the usual way. It can be applied to interaction combinators easily (one needs a way to add the sources-targets supplimentary information, in a way which respects the original formalism), but more interestingly, it can be applied to a variety of other graph rewrite systems which appear in science or mathematics. Even more, it can be used for random systems of graph rewrites, which are in vast numbers. This justifies the "hapax" name, which means "only once". See first descriptions of this idea in [10], [11].

[2] chemlambda v2 page

[3] Y. Lafont, Interaction Combinators

[4] hapax repository

[5] small graph rewrite systems

[6] Molecular computers

[7] hapax-mol.js

[8] hapax-chem.js

[9] hapax-explore.html

[10] 14400 alternatives to the beta rewrite

[11] What is the purpose of the project hapax?