This module provides tools for parsing and manipulating the contents of a Shoebox text without reference to its metadata.
| Class | Description |
|---|---|
| Word | This class defines a word object, which consists of a fixed number of attributes: a wordform, a gloss, a part of speech, and a list of morphemes. |
| Morpheme | This class defines a morpheme object, which consists of a fixed number of attributes: a surface form, an underlying form, a gloss, and a part of speech. |
| Line | This class defines a line of interlinear glossing. |
| Paragraph | This class defines a unit of analysis above the line and below the text. |
| Text | This class defines an interlinearized text, which consists of a collection of Paragraph objects. |
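The attribute lists given for Word and Morpheme can be pictured as two small record types. The following is only an illustrative sketch, not the module's actual class definitions: the field names (`form`, `underlying`, `gloss`, `pos`, `morphemes`) and the sample values are assumptions chosen for the example.

```python
from dataclasses import dataclass, field
from typing import List


@dataclass
class Morpheme:
    """Hypothetical stand-in for a morpheme record (field names are assumed)."""
    form: str        # surface form
    underlying: str  # underlying form
    gloss: str
    pos: str         # part of speech


@dataclass
class Word:
    """Hypothetical stand-in for a word record (field names are assumed)."""
    form: str        # wordform
    gloss: str
    pos: str         # part of speech
    morphemes: List[Morpheme] = field(default_factory=list)


# Illustrative values only, loosely based on the Dutch word "goede".
goede = Word(
    form="goede",
    gloss="good",
    pos="ADJECTIVE",
    morphemes=[
        Morpheme("goed", "goed", "good", "ADJECTIVE"),
        Morpheme("-e", "-e", "-ADJ", "-SUFF"),
    ],
)
print([m.underlying for m in goede.morphemes])
```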
This function finds the indices of the leftmost boundaries of the units in a line of aligned text. Given the field \um, it finds the indices identifying the leftmost word boundaries, as follows:

        0    5  8   12              <- indices
        |    |  |   |
    \sf dit  is een goede           <- surface form
    \um dit  is een goed -e         <- underlying morphemes
    \mg this is a   good -ADJ       <- morpheme gloss
    \gc DEM  V  ART ADJECTIVE -SUFF <- grammatical categories
    \ft This is a good explanation. <- free translation

The function walks through the line character by character:

    c   flag.before  flag.after  index?
    --  -----------  ----------  ------
    0   1            0           yes
    1   0            1           no
    2   1            0           no
    3   0            1           no
    4   1            0           no
    5   1            0           yes
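The boundary search described above can be sketched as a single pass over the line. This is a minimal sketch under stated assumptions, not the module's implementation: the name `find_word_boundaries` is hypothetical, the field marker is assumed to have been stripped already, and a unit that starts with a hyphen (such as `-e`) is assumed to be a suffix that does not open a new word. Prefixes and other affix conventions are not handled.

```python
def find_word_boundaries(line):
    """Return the index of the first character of each word in an aligned line.

    Assumption: a word starts at a non-space character that follows a space
    (or the start of the line) and is not a hyphen marking a suffix.
    """
    indices = []
    prev_is_space = True  # treat the start of the line as following a space
    for i, ch in enumerate(line):
        is_space = ch.isspace()
        if prev_is_space and not is_space and ch != "-":
            indices.append(i)
        prev_is_space = is_space
    return indices


# The \um line from the example, without its field marker.
um = "dit  is een goed -e"
print(find_word_boundaries(um))  # [0, 5, 8, 12]
```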
Given a string and a list of indices, this function returns a list of the substrings defined by those indices. For example, given the arguments str='antidisestablishmentarianism' and indices=[4, 7, 16, 20, 25], this function returns the list ['anti', 'dis', 'establish', 'ment', 'arian', 'ism'].
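One way to reproduce this behaviour is to pair each cut point with the next one and slice between them. The sketch below is an assumption about how such a helper could look; the name `slice_by_indices` is hypothetical and is not taken from the module.

```python
def slice_by_indices(text, indices):
    """Split text at the given cut points and return the pieces in order."""
    starts = [0] + list(indices)          # each piece begins at 0 or at a cut point
    ends = list(indices) + [len(text)]    # and ends at the next cut point or the end
    return [text[start:end] for start, end in zip(starts, ends)]


pieces = slice_by_indices("antidisestablishmentarianism", [4, 7, 16, 20, 25])
print(pieces)  # ['anti', 'dis', 'establish', 'ment', 'arian', 'ism']
```

In this sketch, a list of indices that begins with 0 yields an empty string as its first piece, so boundary lists that include the start of the line may need the leading 0 dropped first.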
| Generated by Epydoc 3.0beta1 on Wed May 16 22:47:18 2007 | http://epydoc.sourceforge.net |