Package nltk_lite :: Package tag :: Module unigram :: Class Unigram

Class Unigram

     object --+            
              |            
yaml.YAMLObject --+        
                  |        
               TagI --+    
                      |    
      SequentialBackoff --+
                          |
                         Unigram
Known Subclasses:
contrib.marshal.MarshalUnigram

A unigram stochastic tagger. Before tag.Unigram can be used, it must be trained on a tagged corpus. From this training data it determines the most likely tag for each word type, and it then assigns that tag to every occurrence of the word. If tag.Unigram encounters a word for which it has no data, it assigns the tag None.
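
The behaviour described above can be sketched in plain Python. This is a minimal illustration under stated assumptions, not the nltk_lite implementation: the class name SketchUnigramTagger is hypothetical, tagged tokens are assumed to be (text, tag) tuples, and the cutoff and backoff parameters are left out because their semantics are not described on this page.

```python
from collections import Counter, defaultdict


class SketchUnigramTagger:
    """Hypothetical stand-in that mimics the documented behaviour."""

    def __init__(self):
        # Maps each word type seen in training to its most frequent tag.
        self._best_tag = {}

    def train(self, tagged_corpus):
        # Count how often each tag occurs with each word type.
        counts = defaultdict(Counter)
        for sentence in tagged_corpus:
            for text, tag in sentence:
                counts[text][tag] += 1
        # Keep only the most frequent tag per word type. Ties go to the
        # tag seen first, a choice made for this sketch only.
        self._best_tag = {
            text: tag_counts.most_common(1)[0][0]
            for text, tag_counts in counts.items()
        }

    def tag_one(self, token):
        # Words with no training data receive the tag None.
        return self._best_tag.get(token)

    def tag(self, tokens):
        return [(token, self.tag_one(token)) for token in tokens]


# Small made-up corpus: "run" is tagged VB twice and NN once.
example_corpus = [
    [("the", "DT"), ("run", "NN")],
    [("they", "PRP"), ("run", "VB")],
    [("we", "PRP"), ("run", "VB")],
]
sketch = SketchUnigramTagger()
sketch.train(example_corpus)
tagged = sketch.tag(["they", "run", "fast"])
```

Here "run" receives the tag it carried most often in training, while "fast", which never appeared in the corpus, receives None.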

Nested Classes

Inherited from yaml.YAMLObject: __metaclass__, yaml_dumper, yaml_loader

Instance Methods
 
__init__(self, cutoff=1, backoff=None)
Construct a new unigram stochastic tagger.
 
train(self, tagged_corpus, verbose=True)
Train tag.Unigram using the given training data.
 
tag_one(self, token, history=None)
 
size(self)
 
__repr__(self)
repr(x)

Inherited from SequentialBackoff: tag, tag_sents

Inherited from SequentialBackoff (private): _backoff_tag_one

Inherited from object: __delattr__, __getattribute__, __hash__, __new__, __reduce__, __reduce_ex__, __setattr__, __str__

Class Methods

Inherited from yaml.YAMLObject: from_yaml, to_yaml

Class Variables
  yaml_tag = '!tag.Unigram'

Inherited from yaml.YAMLObject: yaml_flow_style

Properties

Inherited from object: __class__

Method Details

__init__(self, cutoff=1, backoff=None)
(Constructor)

Construct a new unigram stochastic tagger. The new tagger should be trained, using the train() method, before it is used to tag data.

Overrides: object.__init__

train(self, tagged_corpus, verbose=True)

Train tag.Unigram using the given training data.

Parameters:
  • tagged_corpus (list or iter(list)) - A tagged corpus. Each item should be a list of tagged tokens, where each tagged token consists of text and a tag.
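
The two accepted shapes of tagged_corpus, a list of sentences or an iterator over them, might look like the following. This is a sketch only: it assumes each tagged token is a (text, tag) tuple, and the helpers stream_sentences and check_tagged_corpus are hypothetical names introduced for illustration, not part of nltk_lite.

```python
# Assumption: each tagged token is a (text, tag) tuple.
sentences = [
    [("The", "DT"), ("dog", "NN"), ("barked", "VBD")],
    [("A", "DT"), ("cat", "NN"), ("slept", "VBD")],
]


def stream_sentences(source):
    """Yield tagged sentences one at a time (the iter(list) form)."""
    for sentence in source:
        yield sentence


def check_tagged_corpus(tagged_corpus):
    """Return the number of tokens, raising ValueError on a malformed item."""
    total = 0
    for sentence in tagged_corpus:
        if not isinstance(sentence, list):
            raise ValueError("each corpus item should be a list of tagged tokens")
        for token in sentence:
            if not (isinstance(token, tuple) and len(token) == 2):
                raise ValueError("each tagged token should pair text with a tag")
            total += 1
    return total


# Both the in-memory list and the lazy iterator pass the same check.
list_total = check_tagged_corpus(sentences)
stream_total = check_tagged_corpus(stream_sentences(sentences))
```

The iterator form is useful when the corpus is too large to hold in memory, since sentences are consumed one at a time.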

__repr__(self)
(Representation operator)

repr(x)

Overrides: object.__repr__
(inherited documentation)