2. Running Swat

Change into the directory where you unpacked swat and run it with the following command line:

java -jar swat.jar protein.fasta nucleotide.fasta

Important: The protein sequence file and the nucleotide sequence file must be in FASTA format, and their names must end with ".fasta". On the command line, the protein file must come first, followed by the nucleotide file.
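The two input requirements above can be checked before swat is started. The following is a minimal sketch in Python, not part of swat: the helper names (has_fasta_suffix, looks_like_fasta, check_input) are hypothetical, and it assumes that the ".fasta" suffix is matched case-sensitively and that a FASTA file starts with a ">" header line followed by sequence lines.

```python
from pathlib import Path


def has_fasta_suffix(filename):
    """Return True if the file name ends with the literal suffix '.fasta'."""
    return filename.endswith(".fasta")


def looks_like_fasta(text):
    """Return True if the first non-blank line is a '>' header and the
    next non-blank line is a sequence line (not another header)."""
    lines = [line.strip() for line in text.splitlines() if line.strip()]
    return (
        len(lines) >= 2
        and lines[0].startswith(">")
        and not lines[1].startswith(">")
    )


def check_input(path):
    """Collect human-readable problems for one input file (empty list = ok)."""
    problems = []
    if not has_fasta_suffix(path):
        problems.append(f"{path}: name does not end with .fasta")
    p = Path(path)
    if not p.is_file():
        problems.append(f"{path}: file not found")
    elif not looks_like_fasta(p.read_text()):
        problems.append(f"{path}: does not look like FASTA")
    return problems
```

Calling check_input("protein.fasta") and check_input("nucleotide.fasta") returns an empty list for each file that passes these basic checks. The sketch does not verify that the first file really contains a protein sequence and the second a nucleotide sequence.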

 

Parameters and defaults:

The following parameters control the alignment. Any parameter that is not explicitly specified uses its default.

 

Parameter  Meaning                                                  Range          Default
g=         penalty for starting a gap                               -10000 to -1   -10
e=         penalty for extending a gap                              -10000 to -1   -1
s=         penalty for a single frameshift                          -10000 to -1   -20
d=         penalty for a double frameshift                          -10000 to -1   -40
p=         penalty for a stop codon mismatch                        -10000 to -1   -4
m=         scoring matrix (path to matrix file)                     -              BLOSUM62
ng         no gaps are allowed in the alignment                     true / false   false
na         no affine gap scoring is used                            true / false   false
nf         no frameshift mutations are allowed                      true / false   false
w          use the worst-case calculation for wild bases            true / false   false
o=         file in which to save the found mutations (JSON format)  -              -
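When swat is called from a script, the documented ranges can be checked before the command line is assembled. The sketch below is an illustration only: build_swat_command is a hypothetical helper, not part of swat. It assumes that the penalties are integers, that the switches ng, na, nf and w are passed as bare words as in the examples of this section, and that the argument order used in those examples is acceptable.

```python
# Documented penalty parameters and their defaults (all share the range -10000 to -1).
PENALTY_DEFAULTS = {"g": -10, "e": -1, "s": -20, "d": -40, "p": -4}
# Documented on/off switches; each one is off unless it is given.
FLAGS = ("ng", "na", "nf", "w")


def build_swat_command(protein, nucleotide, penalties=None, flags=(),
                       matrix=None, output=None):
    """Return the argument list for 'java -jar swat.jar ...' after validation."""
    cmd = ["java", "-jar", "swat.jar", protein, nucleotide]
    for name, value in (penalties or {}).items():
        if name not in PENALTY_DEFAULTS:
            raise ValueError(f"unknown penalty parameter: {name}")
        if not -10000 <= value <= -1:
            raise ValueError(f"{name}={value} is outside -10000 to -1")
        cmd.append(f"{name}={value}")
    if matrix is not None:
        cmd.append(f"m={matrix}")      # path to the scoring matrix file
    for flag in flags:
        if flag not in FLAGS:
            raise ValueError(f"unknown flag: {flag}")
        cmd.append(flag)
    if output is not None:
        cmd.append(f"o={output}")      # file for the mutations in JSON format
    return cmd
```

The returned list can be passed to subprocess.run from the Python standard library. Parameters that are left out are not added to the command line, so swat falls back to its defaults.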

 

Examples:

java -jar swat.jar prot.fasta nucl.fasta g=-12 e=-2 s=-15 d=-30 m=matrices\PAM250 w

= Align prot.fasta and nucl.fasta with a gap open penalty of -12 and a gap extension penalty of -2. Single and double frameshifts are scored with -15 and -30, respectively. The PAM250 matrix from the folder "matrices" is used as the scoring matrix, and the worst-case calculation is used for wild bases.

java -jar swat.jar prot.fasta nucl.fasta ng nf o=mutas.json

= Perform an alignment without gaps or frameshifts and save the mutations in the file mutas.json.

 

Output:

The command-line output contains the identifiers of the aligned sequences, the lengths of the sequences, a list of the parameters used, the alignment score (Smith-Waterman score), the start and end positions of the local alignment, and the length of the alignment.

A detailed alignment is also displayed.


[Figure: swat alignment protocol]

Explanation of the markers in the detailed output:

|||  = exact match of an amino acid and a nucleotide triplet
+++  = positive match of an amino acid and a nucleotide triplet
***  = mismatch / replacement of an amino acid and a nucleotide triplet
III  = insertion of a codon in the nucleotide sequence
DDD  = deletion of a codon in the nucleotide sequence
|-x  = frameshift deletion at position x
-xy  = double frameshift deletion at positions x and y
||i  = frameshift insertion at position 3 of a codon
|i|  = frameshift insertion at position 2 of a codon
i||  = frameshift insertion at position 1 of a codon
|ii  = frameshift insertion at positions 2 and 3 of a codon
i|i  = frameshift insertion at positions 1 and 3 of a codon
ii|  = frameshift insertion at positions 1 and 2 of a codon
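For scripts that post-process the detailed alignment, the legend above can be kept as a lookup table. The following Python sketch is an assumption-laden illustration, not part of swat: the names MARKERS, describe_marker and summarize_markers are hypothetical, the markers are assumed to have already been extracted as three-character strings, and the two frameshift deletion markers are left out because their x and y parts vary.

```python
from collections import Counter

# Fixed three-character markers from the legend above.
MARKERS = {
    "|||": "exact match",
    "+++": "positive match",
    "***": "mismatch / replacement",
    "III": "codon insertion",
    "DDD": "codon deletion",
    "||i": "frameshift insertion at position 3",
    "|i|": "frameshift insertion at position 2",
    "i||": "frameshift insertion at position 1",
    "|ii": "frameshift insertion at positions 2 and 3",
    "i|i": "frameshift insertion at positions 1 and 3",
    "ii|": "frameshift insertion at positions 1 and 2",
}


def describe_marker(marker):
    """Return the legend text for a marker, or None if it is not a fixed marker."""
    return MARKERS.get(marker)


def summarize_markers(markers):
    """Count how often each known marker meaning occurs in a list of markers."""
    return Counter(MARKERS[m] for m in markers if m in MARKERS)
```

For example, summarize_markers(["|||", "|||", "+++"]) counts two exact matches and one positive match. Markers that are not in the table, including the frameshift deletion markers, are skipped by the summary.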

Finally, information about the number of identical matches, positive matches, gaps, mutations and wild bases is displayed.