Change into the directory where you unpacked swat and use the following command line to run swat:
java -jar swat.jar protein.fasta nucleotide.fasta
Important: The protein sequence file and the nucleotide sequence file have to be in FASTA format, and their names must end with ".fasta". On the command line, the protein sequence file has to be followed by the nucleotide sequence file.
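The two input requirements above can be checked before swat is started. The following is a minimal sketch, not part of swat: the helper name looks_like_fasta is hypothetical, and it only tests the file extension and whether the first non-empty line is a FASTA header starting with ">".

```python
from pathlib import Path


def looks_like_fasta(path):
    """Return True if the name ends with .fasta and the first non-empty line starts with '>'.

    Hypothetical pre-flight check; swat itself may validate its input differently.
    """
    p = Path(path)
    if p.suffix != ".fasta":
        return False
    with p.open() as handle:
        for line in handle:
            if line.strip():
                # The first non-empty line of a FASTA file is the header line.
                return line.startswith(">")
    return False  # an empty file has no header
```

Calling looks_like_fasta on both files before running swat catches a wrong extension or a missing header line early.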
Parameters and defaults:
To control the alignment, the following parameters can be set. If a parameter is not explicitly specified, its default is used.
|parameter||description||range||default|
|g=||penalty for starting a gap||-10000 to -1||-10|
|e=||penalty for extending a gap||-10000 to -1||-1|
|s=||penalty for a single frameshift||-10000 to -1||-20|
|d=||penalty for a double frameshift||-10000 to -1||-40|
|p=||penalty for a stop codon mismatch||-10000 to -1||-4|
|m=||scoring matrix (path to matrix-file)||-||BLOSUM62|
|ng||no gaps are allowed in the alignment||true / false||false|
|na||no affine gap scoring is used||true / false||false|
|nf||no frame shift mutations are allowed||true / false||false|
|w||use the worst case calculation for wild bases||true / false||false|
|o=||file in which the found mutations are saved in JSON format||-||-|
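The parameter table can be turned into a small helper that assembles the command line and rejects penalties outside the documented range. This is a sketch under assumptions: build_swat_args is a hypothetical name, swat.jar is assumed to be in the current directory, and switches such as ng or w are passed as bare tokens, as in the examples on this page.

```python
PENALTY_KEYS = ("g", "e", "s", "d", "p")  # numeric penalties, documented range -10000 to -1
FLAG_KEYS = ("ng", "na", "nf", "w")       # switches, given as bare tokens


def build_swat_args(protein, nucleotide, matrix=None, output=None, flags=(), **penalties):
    """Build the argument list for a swat run (hypothetical helper, not part of swat)."""
    unknown = set(penalties) - set(PENALTY_KEYS)
    if unknown:
        raise ValueError(f"unknown parameter(s): {sorted(unknown)}")
    args = ["java", "-jar", "swat.jar", protein, nucleotide]
    for key in PENALTY_KEYS:
        if key in penalties:
            value = penalties[key]
            if not -10000 <= value <= -1:
                raise ValueError(f"{key}={value} is outside the range -10000 to -1")
            args.append(f"{key}={value}")
    if matrix is not None:
        args.append(f"m={matrix}")
    for flag in flags:
        if flag not in FLAG_KEYS:
            raise ValueError(f"unknown switch: {flag}")
        args.append(flag)
    if output is not None:
        args.append(f"o={output}")
    return args
```

The returned list can be handed to subprocess.run from the directory that contains swat.jar.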
java -jar swat.jar prot.fasta nucl.fasta g=-12 e=-2 s=-15 d=-30 m=matrices\PAM250 w
= align prot.fasta and nucl.fasta with a gap open penalty of -12 and a gap extend penalty of -2. Single and double frameshifts are scored with -15 and -30, respectively. The PAM250 matrix in the folder "matrices" is used as the scoring matrix, and the worst case calculation is used for the wild bases.
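To illustrate how a gap open and a gap extend penalty combine under affine gap scoring, here is a small sketch. It assumes one common convention, in which the first gap position costs the open penalty and every further position costs the extend penalty. Whether swat uses exactly this convention is not stated here; some implementations charge the extend penalty for every gap position, including the first.

```python
def affine_gap_score(length, gap_open=-10, gap_extend=-1):
    """Score of one contiguous gap of the given length.

    Assumed convention: gap_open for the first position,
    gap_extend for each additional position.
    """
    if length < 1:
        raise ValueError("gap length must be at least 1")
    return gap_open + (length - 1) * gap_extend


# A gap of three positions with g=-12 and e=-2 under this convention:
print(affine_gap_score(3, gap_open=-12, gap_extend=-2))  # -16
```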
java -jar swat.jar prot.fasta nucl.fasta ng nf o=mutas.json
= perform an alignment without gaps or frameshifts and save the mutations in the file mutas.json
The command-line output contains the identifiers of the aligned sequences, the lengths of the sequences, a list of the parameters used, the alignment score (Smith-Waterman score), the start and end positions of the local alignment, and the length of the alignment.
A detailed alignment is also displayed.
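For readers unfamiliar with the Smith-Waterman score, the sketch below shows the textbook local alignment recurrence on two plain strings. It is deliberately simplified and is not swat's algorithm: it uses illustrative match, mismatch and linear gap scores (all assumptions) instead of a scoring matrix, affine gaps and frameshift handling.

```python
def smith_waterman(a, b, match=2, mismatch=-1, gap=-2):
    """Return (best score, end position in a, end position in b), positions 1-based.

    Textbook Smith-Waterman with a linear gap penalty; a score of 0
    with positions (0, 0) means no positive-scoring local alignment exists.
    """
    rows, cols = len(a) + 1, len(b) + 1
    h = [[0] * cols for _ in range(rows)]
    best = (0, 0, 0)
    for i in range(1, rows):
        for j in range(1, cols):
            diag = h[i - 1][j - 1] + (match if a[i - 1] == b[j - 1] else mismatch)
            # Local alignment: a cell never drops below zero.
            h[i][j] = max(0, diag, h[i - 1][j] + gap, h[i][j - 1] + gap)
            if h[i][j] > best[0]:
                best = (h[i][j], i, j)
    return best
```

The maximum cell of the matrix gives the alignment score and the end positions; the start positions follow from tracing back from that cell until a zero is reached.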
Explanation of the markers in the detailed output:
|||  =  exact match of an amino acid and a nucleotide triplet
+++  =  positive match of an amino acid and a nucleotide triplet
***  =  mismatch / replacement of an amino acid and a nucleotide triplet
III  =  insertion of a codon in the nucleotide sequence
DDD  =  deletion of a codon in the nucleotide sequence
|-x  =  frameshift deletion at position x
-xy  =  double frameshift deletion at positions x and y
||i  =  frameshift insertion at position 3 of a codon
|i|  =  frameshift insertion at position 2 of a codon
i||  =  frameshift insertion at position 1 of a codon
|ii  =  frameshift insertion at positions 2 and 3 of a codon
i|i  =  frameshift insertion at positions 1 and 3 of a codon
ii|  =  frameshift insertion at positions 1 and 2 of a codon
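When post-processing the detailed output, the three-character markers listed above can be translated with a simple lookup. The sketch below is not part of swat. The name describe_marker is hypothetical, and it assumes that x and y in the deletion markers are printed as single digits.

```python
import re

# Fixed markers, taken from the list above.
MARKERS = {
    "|||": "exact match",
    "+++": "positive match",
    "***": "mismatch / replacement",
    "III": "codon insertion",
    "DDD": "codon deletion",
    "||i": "frameshift insertion at position 3",
    "|i|": "frameshift insertion at position 2",
    "i||": "frameshift insertion at position 1",
    "|ii": "frameshift insertion at positions 2 and 3",
    "i|i": "frameshift insertion at positions 1 and 3",
    "ii|": "frameshift insertion at positions 1 and 2",
}


def describe_marker(marker):
    """Translate one three-character marker into a short description."""
    if marker in MARKERS:
        return MARKERS[marker]
    single = re.fullmatch(r"\|-(\d)", marker)  # assumed form of "|-x"
    if single:
        return f"frameshift deletion at position {single.group(1)}"
    double = re.fullmatch(r"-(\d)(\d)", marker)  # assumed form of "-xy"
    if double:
        return f"double frameshift deletion at positions {double.group(1)} and {double.group(2)}"
    return "unknown marker"
```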
Finally, information about the number of identical matches, positive matches, gaps, mutations and wild bases is displayed.