Package Bio :: Package Saf :: Module saf_format

Module saf_format

Martel-based parser to read SAF formatted files.

This is a huge regular expression for SAF, built using
the 'regular expressions on steroids' capabilities of Martel.

http://www.embl-heidelberg.de/predictprotein/Dexa/optin_safDes.html


Notes:
Just so I remember -- the new end of line syntax is:
  New regexp syntax - \R
     \R    means "\n|\r\n?"
     [\R]  means "[\n\r]"

This keeps end-of-line handling consistent across platforms.
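
As a rough illustration of the \R note above, the same end-of-line pattern can be written with Python's standard re module. This is only a sketch: the names EOL and split_lines are hypothetical and are not part of Martel or of this module.

```python
import re

# Assumed plain-`re` spelling of Martel's \R: "\n", "\r\n", or a lone "\r".
EOL = r"(?:\n|\r\n?)"


def split_lines(text):
    """Split text on Unix, Windows, or old Mac line endings."""
    return re.split(EOL, text)


if __name__ == "__main__":
    # One string mixing all three line-ending conventions.
    print(split_lines("a\nb\r\nc\rd"))
```

The alternation tries "\n" first and then "\r" with an optional "\n", so a Windows "\r\n" pair is consumed as a single line ending instead of two.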

Variables
  digits = '0123456789'
  valid_sequence_characters = 'abcdefghijklmnopqrstuvwxyzABCDEFG...
  white_space = '\t '
  valid_residue_characters = '0123456789\t .'
  residue_number_line = Group("residue_number_line", Rep1(Any(va...
  comment_line = Group("comment_line", Str("#")+ ToEol())
  ignored_line = Group("ignored_line", Alt(comment_line, residue...
  candidate_line = Group("candidate_line", Assert(Str("#"), 1)+ ...
  saf_record = Group("saf_record", candidate_line+ Rep(Alt(candi...
Variables Details

valid_sequence_characters

Value:
'abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-. \t'

residue_number_line

Value:
Group("residue_number_line",
      Rep1(Any(valid_residue_characters)) + AnyEol())

ignored_line

Value:
Group("ignored_line", Alt(comment_line, residue_number_line))

candidate_line

Value:
Group("candidate_line",
      Assert(Str("#"), 1) +
      Assert(Any(valid_residue_characters), 1) +
      ToSep(sep=' ') +
      Rep(Any(valid_sequence_characters)) +
      ToEol())

saf_record

Value:
Group("saf_record",
      candidate_line +
      Rep(Alt(candidate_line, ignored_line)) +
      Opt(Str("#")))
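
To make the line-level grammar above easier to follow, here is a minimal sketch that approximates it with the standard re module instead of Martel. It rests on several assumptions: Assert(..., 1) is read as a negative lookahead, ToSep(sep=' ') is read as "everything up to the first space", and the names COMMENT, RESIDUE, CANDIDATE and classify are hypothetical helpers, not part of this module.

```python
import re

# Character sets copied from the module variables listed above.
digits = "0123456789"
valid_sequence_characters = (
    "abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ-. \t"
)
valid_residue_characters = digits + "\t ."

_RES = "[" + re.escape(valid_residue_characters) + "]"
_SEQ = "[" + re.escape(valid_sequence_characters) + "]"

# comment_line: "#" followed by the rest of the line.
COMMENT = re.compile(r"#.*")
# residue_number_line: one or more residue-numbering characters only.
RESIDUE = re.compile(_RES + "+")
# candidate_line: must not start with "#" or a residue-numbering character,
# then a name up to the first space, then sequence characters, then the rest.
CANDIDATE = re.compile(
    r"(?!#)(?!" + _RES + r")"
    r"(?P<name>[^ ]*) (?P<seq>" + _SEQ + r"*)(?P<rest>.*)"
)


def classify(line):
    """Return the grammar group a line falls into, or None if none fits."""
    line = line.rstrip("\r\n")  # drop the end-of-line characters first
    if COMMENT.fullmatch(line):
        return "comment_line"
    if RESIDUE.fullmatch(line):
        return "residue_number_line"
    if CANDIDATE.fullmatch(line):
        return "candidate_line"
    return None


if __name__ == "__main__":
    # Made-up sample lines, for illustration only.
    for sample in ["# SAF example\n", "        10        20\n", "seq_1 ACDEFGHIKL\n"]:
        print(classify(sample), repr(sample))
```

In this reading, comment and residue-number lines are what ignored_line skips, and only candidate lines carry a sequence name and residues. The sketch classifies single lines only and does not try to reproduce saf_record or Martel's event-based parsing.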