Software

 

Download

The C++ sources of CANTATA can be downloaded here.

A precompiled Windows executable is available here.

A precompiled Mac OS X universal binary (PowerPC/Intel) is available here.

 

Compiling

Windows and Mac OS X users can skip this step by downloading the precompiled binary above and extracting it to the desired location.

We recommend compiling the source code with the GCC compiler, or with MinGW on Windows. On Mac OS, use the GCC compiler delivered with the Xcode Tools. The code also compiles with Microsoft Visual C++.

For older versions of GCC (prior to 4.2) or for Visual C++, you may additionally require the Boost libraries.

If using GCC/MinGW, the code can be compiled using the included Makefile:

 

Usage — A short tutorial

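The following examples outline the usage of the program based on the mammalian cell cycle network [1]. The files used in this tutorial can be downloaded here. The ZIP archive contains three files:

cellcycle.txt: the original ("true") network model
cellcycle_truncated.txt: a perturbed draft of this network
cellcycle_rules.txt: the rule set describing the expected attractors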

In the following, we assume that these three files have been extracted to a directory, and that cantata is in your search path.

First, we check whether the network models match the expectations encoded in the rule file. We start with the original ("true") network model. Open a shell/command prompt, change to the directory where the network files are located, and type:

cantata --validate -n cellcycle.txt -r cellcycle_rules.txt

Here, --validate tells the program to validate a model specified in parameter -n according to a list of rules specified in parameter -r.

Input network:
CycD = CycD
Rb = ((!CycA & !CycB & !CycD & !CycE) | (p27 & !CycB & !CycD))
E2F = ((!Rb & !CycA & !CycB) | (p27 & !Rb & !CycB))
CycE = (E2F & !Rb)
CycA = ((E2F & !Rb & !Cdc20 & !(Cdh1 & UbcH10)) | (CycA & !Rb & !Cdc20 & !(Cdh1 & UbcH10)))
p27 = ((!CycD & !CycE & !CycA & !CycB) | (p27 & !(CycE & CycA) & !CycB & !CycD))
Cdc20 = CycB
Cdh1 = ((!CycA & !CycB) | Cdc20 | (p27 & !CycB))
UbcH10 = (!Cdh1 | (Cdh1 & UbcH10 & (Cdc20 | CycA | CycB)))
CycB = (!Cdc20 & !Cdh1)

Input network file:           cellcycle.txt
Rule file:                    cellcycle_rules.txt
Random seed:                  1315472969
Max. number of start states:  100000
Max. number of transitions:   1000

Violations of the rule set:

Rule 1:
(no violations)

Rule 2:
(no violations)

Finished

As expected, the true network obeys both rules, i.e. it has the two specified attractors.
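To make the two attractors concrete, the update rules printed above can also be simulated by hand. The following Python sketch is not part of CANTATA. It hard-codes the functions of cellcycle.txt as shown in the output above and assumes synchronous updates (all genes are updated at the same time), which this tutorial does not state explicitly. The names step and attractor_from are chosen for illustration only.

```python
from itertools import product

# Gene order as printed in the network listing above.
GENES = ["CycD", "Rb", "E2F", "CycE", "CycA", "p27", "Cdc20", "Cdh1", "UbcH10", "CycB"]


def step(state):
    """Apply all update rules once (assumption: synchronous update)."""
    CycD, Rb, E2F, CycE, CycA, p27, Cdc20, Cdh1, UbcH10, CycB = map(bool, state)
    nxt = (
        CycD,
        (not CycA and not CycB and not CycD and not CycE)
        or (p27 and not CycB and not CycD),
        (not Rb and not CycA and not CycB) or (p27 and not Rb and not CycB),
        E2F and not Rb,
        (E2F and not Rb and not Cdc20 and not (Cdh1 and UbcH10))
        or (CycA and not Rb and not Cdc20 and not (Cdh1 and UbcH10)),
        (not CycD and not CycE and not CycA and not CycB)
        or (p27 and not (CycE and CycA) and not CycB and not CycD),
        CycB,
        (not CycA and not CycB) or Cdc20 or (p27 and not CycB),
        not Cdh1 or (Cdh1 and UbcH10 and (Cdc20 or CycA or CycB)),
        not Cdc20 and not Cdh1,
    )
    return tuple(int(v) for v in nxt)


def attractor_from(state):
    """Follow the trajectory from `state` until a state repeats; return the cycle."""
    position = {}
    path = []
    while state not in position:
        position[state] = len(path)
        path.append(state)
        state = step(state)
    cycle = path[position[state]:]
    # Rotate the cycle so that equal attractors compare equal.
    start = cycle.index(min(cycle))
    return tuple(cycle[start:] + cycle[:start])


# Exhaustive search over all 2^10 start states.
attractors = {attractor_from(s) for s in product((0, 1), repeat=len(GENES))}
for attractor in sorted(attractors, key=len):
    print(len(attractor), "state(s):")
    for s in attractor:
        print("  ", *s)
```

Unlike this exhaustive search, CANTATA bounds the number of start states and transitions, as the header of its output shows.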

We now validate the perturbed network draft:

cantata --validate -n cellcycle_truncated.txt -r cellcycle_rules.txt -c 5

Input network:
CycD = CycD
Rb = ((!CycA & !CycB & !CycD & !CycE) | (p27 & !CycB & !CycD))
E2F = ((!Rb & !CycA & !CycB) | (p27 & !Rb & !CycB))
CycE = (E2F & !Rb)
CycA = ((E2F & !Rb & !Cdc20 & !(Cdh1 & UbcH10)) | (CycA & !Rb & !Cdc20 & !(Cdh1 & UbcH10)))
p27 = ((!CycD & !CycE & !CycA & !CycB) | (p27 & !(CycE & CycA) & !CycB & !CycD))
Cdc20 = CycB
Cdh1 = ((!CycA & !CycB) | Cdc20 | (p27 & !CycB))
UbcH10 = (!Cdh1 | (Cdh1 & UbcH10 & (Cdc20 | CycA | CycB)))
CycB = !Cdc20

Input network file:           cellcycle_truncated.txt
Rule file:                    cellcycle_rules.txt
Random seed:                  1315484034
Max. number of start states:  100000
Max. number of transitions:   1000

Violations of the rule set:

Rule 1:
Attractor matching 1 (using alternative 1):
Attractor           => Specifications
0 0 0 0 0 0 1 1 1 0 => 0 1 0 0 0 1 0 1 0 0 (State spec. 1)
0 1 1 0 0 1 0 1 1 0 => 0 1 0 0 0 1 0 1 0 0 (State spec. 1)
0 1 0 0 0 1 0 1 0 1 => 0 1 0 0 0 1 0 1 0 0 (State spec. 1)
0 0 0 0 0 0 1 0 0 1 => 0 1 0 0 0 1 0 1 0 0 (State spec. 1)
Violations caused by this matching:
State specification 1: Rb != 1
State specification 1: E2F != 0
State specification 1: p27 != 1
State specification 1: Cdc20 != 0
State specification 1: Cdh1 != 1
State specification 1: UbcH10 != 0
State specification 1: CycB != 0
Start states yielding this matching:
0 0 0 0 0 0 0 0 0 0
0 0 0 0 0 0 0 0 0 1
0 0 0 0 0 0 0 0 1 0
0 0 0 0 0 0 0 0 1 1
0 0 0 0 0 0 0 1 0 0
(... further 507 states ...)

Rule 2:
Attractor matching 1 (using alternative 1):
Attractor           => Specifications
1 0 0 0 0 0 1 1 1 0 => 1 0 0 0 0 0 1 1 1 0 (State spec. 4)
1 0 1 0 0 0 0 1 1 0 => 1 0 1 0 0 0 0 1 1 0 (State spec. 5)
1 0 1 1 0 0 0 1 0 1 => 1 0 1 1 0 0 0 1 0 0 (State spec. 6)
                    => 1 0 1 1 1 0 0 1 0 0 (State spec. 7)
                    => 1 0 0 1 1 0 0 0 0 0 (State spec. 1)
                    => 1 0 0 0 1 0 0 0 1 1 (State spec. 2)
1 0 0 1 1 0 1 0 0 1 => 1 0 0 0 1 0 1 0 1 1 (State spec. 3)
Violations caused by this matching:
State specification 3: CycE != 0
State specification 3: UbcH10 != 1
State specification 6: CycB != 0
Start states yielding this matching:
1 0 0 0 0 0 0 0 0 0
1 0 0 0 0 0 0 0 0 1
1 0 0 0 0 0 0 0 1 0
1 0 0 0 0 0 0 0 1 1
1 0 0 0 0 0 0 1 0 0
(... further 507 states ...)

Finished

Here, the program first prints the attractors of the network and their optimal matchings with the rules. We can see that the perturbed network draft has two 4-state attractors that are matched with the 1-state attractor and the 7-state attractor of the original network. Below each matching, the violations are listed; in this example, there are many. For each violation, CANTATA prints the number of the state specification and the violating gene. Furthermore, it prints the start states that lead to the matching and cause the violations. By specifying -c 5, we tell the program to print only the first 5 of these start states.
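The violation lines can be reproduced by comparing an attractor state with the state specification it is matched to, gene by gene. The following Python sketch illustrates this for one line of the Rule 2 matching. It is not CANTATA's implementation: it assumes that the columns of the printed states follow the order in which the genes appear in the network listing, and it ignores how the matching itself is found. The function name violations is hypothetical.

```python
# Assumed column order: the order of the genes in the network listing.
GENES = ["CycD", "Rb", "E2F", "CycE", "CycA", "p27", "Cdc20", "Cdh1", "UbcH10", "CycB"]


def violations(attractor_state, specification):
    """List the genes whose value in the attractor state differs from the specification."""
    return [
        f"{gene} != {wanted}"
        for gene, actual, wanted in zip(GENES, attractor_state, specification)
        if actual != wanted
    ]


# Last line of the Rule 2 matching: attractor state vs. state specification 3.
attractor_state = (1, 0, 0, 1, 1, 0, 1, 0, 0, 1)
state_spec_3 = (1, 0, 0, 0, 1, 0, 1, 0, 1, 1)

print(violations(attractor_state, state_spec_3))
# prints ['CycE != 0', 'UbcH10 != 1']
```

These are the two violations reported for state specification 3 in the output above.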

Now, let us try to reconstruct the true model from the disrupted network draft using the CANTATA optimization algorithm. This algorithm is started using the main option --optimize:

cantata --optimize -n cellcycle_truncated.txt -r cellcycle_rules.txt -o result.txt -ni 1000

The parameter -o result.txt tells the program to write the results to a file result.txt. We set the number of iterations to 1000 (which is the default value) using -ni 1000. When the optimization process is complete, the file result.txt contains a header summarizing the algorithm's configuration, followed by a list of candidate networks with their three objective values, e.g.

Input network file:           cellcycle_truncated.txt
Rule file:                    cellcycle_rules.txt
Random seed:                  1316096208
Population size:              100
Number of offspring:          200
Fract. of injected nets:      0.1
Neg. every i-th offspring:    50
Number of generations:        1000
Number of restarts:           1
Initial mutations:            1
Epsilon:                      0.0005
Weights of topology scores:   0.25/0.25/0.5
Max. number of start states:  200
Max. number of transitions:   100

Best candidate networks:

CycD = CycD
Rb = ((!CycA & !CycB & !CycD & !CycE) | (p27 & !CycB & !CycD))
E2F = ((!Rb & !CycA & !CycB) | (p27 & !Rb & !CycB))
CycE = (E2F & !Rb)
CycA = ((E2F & !Rb & !Cdc20 & !(Cdh1 & UbcH10)) | (CycA & !Rb & !Cdc20 & !(Cdh1 & UbcH10)))
p27 = ((!CycD & !CycE & !CycA & !CycB) | (p27 & !(CycE | CycA) & !CycB & !CycD))
Cdc20 = CycB
Cdh1 = ((!CycA & !CycB) | Cdc20 | (p27 & !CycB))
UbcH10 = (!Cdh1 | (Cdh1 & UbcH10 & (Cdc20 | CycA | CycB)))
CycB = (!Cdc20 & !Cdh1)
Fitness: 0 0.24914 0.098 Run: 1 Generation: 419

CycD = CycD
Rb = ((!CycA & !CycB & !CycD & !CycE) | (!CycB & !CycD & p27))
E2F = ((!Rb & !CycA & !CycB) | (p27 & !Rb & !CycB))
CycE = (E2F & !Rb)
CycA = ((!Rb & !Cdc20 & !(Cdh1 & UbcH10) & E2F) | (CycA & !Rb & !Cdc20 & !(Cdh1 & UbcH10)))
p27 = ((!CycD & !CycE & !CycA & !CycB) | (p27 & !(CycE & CycA) & !CycB & !CycD))
Cdc20 = CycB
Cdh1 = ((!CycA & !CycB) | Cdc20 | (p27 & !CycB))
UbcH10 = (!Cdh1 | (Cdh1 & UbcH10 & (Cdc20 | CycA | CycB)))
CycB = (!Cdc20 & !Cdh1)
Fitness: 0 0.24914 0.108 Run: 1 Generation: 641

Finished

 

In the example printed here, the first fitness value of both resulting candidate network models is 0, which indicates that the networks match the rules perfectly. In this case, the second resulting network is equivalent to the true network, which means that the deleted dependency of CycB on Cdh1 was reconstructed and no further changes were applied. The first candidate also recovers this dependency, but changes an & to a | in the function for p27. Depending on the random initialization of the algorithm, you might get a different result when running the example.
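Whether two differently written functions are equivalent can be checked by comparing their truth tables. The following Python sketch does this for two functions from the candidates above: the reordered Rb function of the second candidate, and the p27 function of the first candidate. It is an illustration only and not part of CANTATA; the helper differing_inputs and the function names are hypothetical.

```python
from itertools import product


def rb_true(CycD, CycE, CycA, CycB, p27):
    return (not CycA and not CycB and not CycD and not CycE) or (
        p27 and not CycB and not CycD)


def rb_candidate2(CycD, CycE, CycA, CycB, p27):
    # Same literals as rb_true, written in a different order.
    return (not CycA and not CycB and not CycD and not CycE) or (
        not CycB and not CycD and p27)


def p27_true(CycD, CycE, CycA, CycB, p27):
    return (not CycD and not CycE and not CycA and not CycB) or (
        p27 and not (CycE and CycA) and not CycB and not CycD)


def p27_candidate1(CycD, CycE, CycA, CycB, p27):
    # Candidate 1 uses (CycE | CycA) instead of (CycE & CycA).
    return (not CycD and not CycE and not CycA and not CycB) or (
        p27 and not (CycE or CycA) and not CycB and not CycD)


def differing_inputs(f, g, n_inputs):
    """Return all input combinations on which f and g disagree."""
    return [
        bits
        for bits in product((0, 1), repeat=n_inputs)
        if bool(f(*bits)) != bool(g(*bits))
    ]


# Inputs are ordered as (CycD, CycE, CycA, CycB, p27).
print(differing_inputs(rb_true, rb_candidate2, 5))
# prints []
print(differing_inputs(p27_true, p27_candidate1, 5))
# prints [(0, 0, 1, 0, 1), (0, 1, 0, 0, 1)]
```

The Rb functions agree on all inputs, whereas the two p27 functions disagree on exactly the two inputs where p27 is active, CycD and CycB are inactive, and exactly one of CycE and CycA is active.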

The result file cannot be read directly by BoolNet, as it contains multiple candidate network models and additional annotation (the header and objective scores). To analyze the candidate networks in BoolNet, have CANTATA write them to separate network files. For example,

cantata --optimize -n cellcycle_truncated.txt -r cellcycle_rules.txt -o result.txt -on candidate_%d.txt -me 0

writes all candidates that match the rules perfectly to files candidate_0.txt, candidate_1.txt, candidate_2.txt, ..., i.e. the %d marker is replaced by a running number. The parameter -me sets a threshold for the first objective, i.e. only candidates whose score in the first objective is less than or equal to this value are written to files. As we set this threshold to 0 (which is the default value), only candidate network models that match the rule set perfectly are written.
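The combined effect of the %d marker and the -me threshold can be pictured with a short Python sketch. This is not CANTATA's code: it only mimics the behavior described above, assumes that the running number counts the written candidates starting at 0, and uses a hypothetical third candidate with a non-zero first objective to show the filtering. The first two fitness triples are taken from the example result file.

```python
def output_files(fitness_values, pattern, max_error):
    """Pair each accepted candidate's fitness with the file name it would be written to."""
    accepted = [f for f in fitness_values if f[0] <= max_error]
    # Assumption: the %d marker is replaced by a running number starting at 0.
    return [(pattern % index, fitness) for index, fitness in enumerate(accepted)]


fitness_values = [
    (0.0, 0.24914, 0.098),  # first candidate in the example result file
    (0.0, 0.24914, 0.108),  # second candidate in the example result file
    (0.1, 0.2, 0.05),       # hypothetical candidate that violates the rules
]

for name, fitness in output_files(fitness_values, "candidate_%d.txt", max_error=0):
    print(name, fitness)
```

With a threshold of 0, only the first two candidates are kept, and they map to candidate_0.txt and candidate_1.txt.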

This tutorial covers only some of the options available in CANTATA. A full description of the command line options and the file formats is available here.

 

References

[1] Fauré, A., Naldi, A., Chaouiya, C., and Thieffry, D. (2006). Dynamical analysis of a generic Boolean model for the control of the mammalian cell cycle. Bioinformatics, 22(14), e124-e131.