"""Index a file based on information in a SeqRecord object.

This indexer tries to make it simple to index a file of records (e.g. a
GenBank file full of entries) so that individual records can be
readily retrieved.

The indexing in this file takes place by converting the elements in the
file into SeqRecord objects, and then indexing by some item in these
SeqRecords. This method is slower, but very flexible.

We have two default functions to index by the id and name elements of a
SeqRecord (i.e. LOCUS and accession number from GenBank). There is also
a base class which you can derive from to create your own indexer,
which lets you index by anything you like using Python code.
"""
from Bio.builders.SeqRecord.sequence import BuildSeqRecord


"""Base class for indexing using SeqRecord information.

This is the class you should derive from to index using some type of
information in a SeqRecord. This is an abstract base class, so it needs
to be subclassed to be useful.
"""


raise NotImplementedError("Please implement in derived classes")

raise NotImplementedError("Please implement in derived classes")

raise NotImplementedError("Please implement in derived classes")

"""Index a file based on .id and .name attributes of a SeqRecord.

A simple-minded indexing scheme which should work for simple cases. The
easiest way to use this is through the create_*db functions of this
module.
"""


return ["name", "aliases"]


id_info = {"id": [seq_record.id],
           "name": [seq_record.name],
           "aliases": []}
return id_info

"""Indexer to index based on values returned by a function.

This class is passed a function which will return id, name and alias
information from a SeqRecord object. It needs to return either one item,
which is an id from the title, or three items, which are (in order) the id,
a list of names, and a list of aliases.

This indexer allows indexing to be completely flexible based on passed
functions.
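
As a rough illustration of the return convention described here, the sketch
below normalizes either form into a dictionary with "id", "name" and
"aliases" keys. The names normalize_ids and first_word_id are hypothetical
and are not part of this module, and the empty name and alias lists used
when only an id is returned are an assumption.

```python
from types import SimpleNamespace


def normalize_ids(items):
    # Hypothetical helper, not part of this module: accept a bare id,
    # a one-item sequence, or an (id, names, aliases) triple.
    if not isinstance(items, (list, tuple)):
        items = [items]
    if len(items) == 1:
        # Assumption: a lone id comes with no names and no aliases.
        seq_id, names, aliases = items[0], [], []
    elif len(items) == 3:
        seq_id, names, aliases = items
    else:
        raise ValueError("Expected 1 or 3 items, got %d" % len(items))
    return {"id": [seq_id], "name": list(names), "aliases": list(aliases)}


def first_word_id(seq_record):
    # Hypothetical index function: use the first word of the description
    # as the id and return it as a single item.
    return seq_record.description.split()[0]


# Stand-in for a SeqRecord; only the attribute used here is provided.
record = SimpleNamespace(description="AB000001 example entry")
id_info = normalize_ids(first_word_id(record))
```

A function returning three items would pass through the same helper
unchanged, for example normalize_ids(("AB000001", ["locus1"], [])).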
"""


return ["name", "aliases"]


"""A SAX builder-style class to make a parsed SeqRecord available.

This class does a lot of trickery to make things fit in the SAX
framework and still have the flexibility to use a built SeqRecord
object.

You shouldn't really need to use this class unless you are doing
something really fancy-pants; otherwise, just use the
BaseSeqRecordIndexer interfaces.
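
The general idea can be sketched with stand-in classes: a builder exposes the
finished record through its document attribute when a record ends, and a
subclass then swaps that attribute for the result of a callback. StubBuilder,
IdInfoBuilder, feed and ids_from_record are hypothetical names used only for
illustration; this is not the real BuildSeqRecord API.

```python
from types import SimpleNamespace


class StubBuilder:
    # Hypothetical stand-in for a SAX-style builder; the real
    # BuildSeqRecord does much more than this.
    def __init__(self):
        self.document = None
        self._pending = None

    def feed(self, record):
        # Pretend a record has been parsed from the input.
        self._pending = record

    def end_record(self, tag):
        # Expose the finished record through the document attribute.
        self.document = self._pending


class IdInfoBuilder(StubBuilder):
    def __init__(self, get_ids_callback):
        StubBuilder.__init__(self)
        self._ids_callback = get_ids_callback

    def end_record(self, tag):
        # Let the parent build the record, then swap it for its id info.
        StubBuilder.end_record(self, tag)
        self.document = self._ids_callback(self.document)


def ids_from_record(seq_record):
    # Callback mapping id names to the valid ids for a record.
    return {"id": [seq_record.id], "name": [seq_record.name], "aliases": []}


builder = IdInfoBuilder(ids_from_record)
builder.feed(SimpleNamespace(id="AB000001.1", name="AB000001"))
builder.end_record("record")
```

After end_record runs, builder.document holds the id dictionary rather than
the record itself, which is the substitution this class relies on.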
"""
"""Initialize with a callback function that gets id info from a SeqRecord.

get_ids_callback should be a callable that takes a
SeqRecord object and returns a dictionary mapping id names to
the valid ids for these names.
"""
BuildSeqRecord.__init__(self)
self._ids_callback = get_ids_callback

"""Override the builder function to muck with the document attribute.
"""

BuildSeqRecord.end_record(self, tag)

self.document = self._ids_callback(self.document)