Module NLMMedlineXML

This module provides code to work with NCBI's XML format for Medline.

Functions:
choose_format  Pick the right data format to use to index an XML file.
index          Index a Medline XML file.
index_many     Index multiple Medline XML files.

Classes
  Citation
Holds information about a Medline citation.
  CitationParser
Parses a citation into a Record object.
  _IndexerHandler
Handles the results from the nlmmedline_format.
  _SavedDataHandle
Functions
  choose_format(data)
Look at some data and choose the right format to parse it. Returns: module
  index(handle, index_fn=...)
Index a Medline XML file. Returns: list of (PMID, MedlineID, start, end)
  index_many(files_or_paths, index_fn, nprocs=...)
Index multiple Medline XML files.
Variables
  StringTypes = (<type 'str'>, <type 'unicode'>)
  __warningregistry__ = {('Bio.Medline.NLMMedlineXML was depreca...
  xml_support = 1
Function Details

choose_format(data)

Look at some data and choose the right format to parse it. data should be the first 1000 characters or so of the file. The returned module has two attributes: citation_format, a Martel format that parses one citation, and format, which parses the whole file.

Returns: module

index(handle, index_fn=...)

Index a Medline XML file. Returns the locations of the records as offsets from the beginning of the handle. index_fn is a callback function with parameters (PMID, MedlineID, start, end); it is called as soon as each record is indexed.

Returns: list of (PMID, MedlineID, start, end)
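The callback contract described above can be illustrated with a minimal sketch. Because this module is deprecated, the sketch does not import it. Instead it simulates the calls an indexer might make, using a made-up XML string and a simple tag search in place of the real parser. The names data, offsets, and fetch are hypothetical, and treating end as an exclusive offset is an assumption that the documentation does not confirm.

```python
import io

# Hypothetical stand-in for a Medline XML file; the content is invented.
data = (
    "<MedlineCitationSet>"
    "<MedlineCitation><PMID>111</PMID></MedlineCitation>"
    "<MedlineCitation><PMID>222</PMID></MedlineCitation>"
    "</MedlineCitationSet>"
)
handle = io.StringIO(data)

# Maps PMID -> (start, end), filled in by the callback.
offsets = {}


def index_fn(pmid, medline_id, start, end):
    """Callback with the documented parameters; stores offsets by PMID."""
    offsets[pmid] = (start, end)


# Simulate the indexer: find each citation and report it through index_fn.
# This tag search is only a placeholder for the module's real parser.
tag_open, tag_close = "<MedlineCitation>", "</MedlineCitation>"
pos = 0
while True:
    start = data.find(tag_open, pos)
    if start == -1:
        break
    end = data.find(tag_close, start) + len(tag_close)
    pmid = data[start:end].split("<PMID>")[1].split("</PMID>")[0]
    index_fn(pmid, None, start, end)  # no MedlineID in this toy data
    pos = end


def fetch(handle, pmid):
    """Read one record back using its stored offsets (end assumed exclusive)."""
    start, end = offsets[pmid]
    handle.seek(start)
    return handle.read(end - start)


record_222 = fetch(handle, "222")
```

Storing only offsets keeps the index small: a record is read from the handle again only when it is requested.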

index_many(files_or_paths, index_fn, nprocs=...)

Index multiple Medline XML files. files_or_paths can be a single file, a path, a list of files, or a list of paths.

index_fn is a callback function that should take the following parameters: index_fn(file, event, data)

where file is the file being indexed, event is one of "START", "RECORD", "END", and data is extra data dependent upon the event. "START" and "END" events are passed to indicate when a file is being indexed. "RECORD" is passed whenever a new record has been indexed. When a "RECORD" event is passed, then data is set to a tuple of (pmid, medline_id, start, end). Otherwise it is None. start and end indicate the location of the record as offsets from the beginning of the file.
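A callback for the event protocol described above might look like the following sketch. It does not call index_many itself, since the module is deprecated. The event stream is simulated with invented file names and offsets, and the names records_by_file, in_progress, and simulated_events are hypothetical.

```python
from collections import defaultdict

# Records seen so far, grouped by the file they came from.
records_by_file = defaultdict(list)
# Files that have sent "START" but not yet "END".
in_progress = set()


def index_fn(file, event, data):
    """Handle one indexing event, following the documented (file, event, data) form."""
    if event == "START":
        in_progress.add(file)
    elif event == "RECORD":
        # For "RECORD" events, data is (pmid, medline_id, start, end).
        pmid, medline_id, start, end = data
        records_by_file[file].append((pmid, start, end))
    elif event == "END":
        in_progress.discard(file)
    else:
        raise ValueError("unexpected event: %r" % (event,))


# Simulated events; the file names, PMIDs, and offsets are made up.
simulated_events = [
    ("a.xml", "START", None),
    ("a.xml", "RECORD", ("111", None, 20, 71)),
    ("a.xml", "RECORD", ("222", None, 71, 122)),
    ("a.xml", "END", None),
    ("b.xml", "START", None),
    ("b.xml", "RECORD", ("333", None, 20, 71)),
    ("b.xml", "END", None),
]
for file, event, data in simulated_events:
    index_fn(file, event, data)
```

Keying the results by file matters here because start and end are offsets within each individual file, so a PMID alone is not enough to locate a record.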


Variables Details

__warningregistry__

Value:
{('Bio.Medline.NLMMedlineXML was deprecated, as it does not seem to be\
 able to parse recent Medline XML files. If you want to continue to us\
e this module, please get in contact with the Biopython developers at \
biopython-dev@biopython.org to avoid permanent removal of this module \
from Biopython',
  <type 'exceptions.DeprecationWarning'>,
  18): 1}