Module ycoe
Reads tokens from the York-Toronto-Helsinki Parsed Corpus of
Old English Prose (YCOE), a 1.5-million-word syntactically
annotated corpus of Old English prose texts. The corpus is
distributed by the Oxford Text Archive: http://www.ota.ahds.ac.uk/
The YCOE corpus is divided into 100 files, each representing
an Old English prose text. The tags used within each text comply
with the YCOE standard: http://www-users.york.ac.uk/~lang22/YCOE/YcoeHome.htm
Output of the reader is as follows:
Raw:
['+D+atte',
'on',
'o+dre',
'wisan',
'sint',
'to',
'manianne',
'+da',
'unge+dyldegan',
',',
'&',
'on',
'o+dre',
'+da',
'ge+dyldegan',
'.']
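The raw tokens above keep the corpus's ASCII escapes (for example `+D`, `+a`, `+d`) rather than Old English letters. The following is a minimal sketch of turning them into Unicode for display. The mapping table and the helper name `decode_ycoe` are assumptions made for illustration and are not part of this module; check the mapping against the YCOE documentation before relying on it.

```python
# Assumed mapping from YCOE ASCII escapes to Old English letters.
# This table is an illustration, not something defined by the ycoe module.
YCOE_ESCAPES = {
    '+a': 'æ', '+A': 'Æ',
    '+d': 'ð', '+D': 'Ð',
    '+t': 'þ', '+T': 'Þ',
}

def decode_ycoe(token):
    """Replace each known '+x' escape in a token with its Unicode letter."""
    out = []
    i = 0
    while i < len(token):
        pair = token[i:i + 2]
        if pair in YCOE_ESCAPES:
            out.append(YCOE_ESCAPES[pair])
            i += 2          # skip both characters of the escape
        else:
            out.append(token[i])
            i += 1
    return ''.join(out)

# Tokens copied from the raw output shown above.
raw_tokens = ['+D+atte', 'on', 'o+dre', 'wisan', 'sint', 'to', 'manianne',
              '+da', 'unge+dyldegan']
decoded = [decode_ycoe(t) for t in raw_tokens]
print(' '.join(decoded))
```

Unknown escapes are passed through unchanged, so a token that uses a sequence missing from the table stays readable as plain ASCII.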
Tagged:
[('+D+atte', 'C'),
('on', 'P'),
('o+dre', 'ADJ'),
('wisan', 'N'),
('sint', 'BEPI'),
('to', 'TO'),
('manianne', 'VB^D'),
('+da', 'D^N'),
('unge+dyldegan', 'ADJ^N'),
(',', ','),
('&', 'CONJ'),
('on', 'P'),
('o+dre', 'ADJ'),
('+da', 'D^N'),
('ge+dyldegan', 'ADJ^N'),
('.', '.')]
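Each tagged item is a `(word, tag)` pair, and several tags carry a `^` suffix (`VB^D`, `D^N`, `ADJ^N`). As a rough sketch, the pairs can be post-processed with plain Python to recover the raw tokens or to separate the suffix from the base tag. The helper `split_tag` is hypothetical, and reading the suffix as a case marker is an assumption about the YCOE tag set rather than something stated on this page.

```python
# A few pairs copied from the tagged output shown above.
tagged = [('+D+atte', 'C'), ('on', 'P'), ('o+dre', 'ADJ'), ('wisan', 'N'),
          ('sint', 'BEPI'), ('to', 'TO'), ('manianne', 'VB^D'),
          ('+da', 'D^N'), ('unge+dyldegan', 'ADJ^N'), (',', ',')]

def split_tag(tag):
    """Split 'D^N' into ('D', 'N'); tags without '^' get None as the suffix."""
    base, sep, suffix = tag.partition('^')
    return (base, suffix if sep else None)

# Dropping the tags gives back the raw token sequence.
words = [word for word, tag in tagged]

# (word, base tag, suffix) triples; the suffix is assumed to mark case.
split = [(word,) + split_tag(tag) for word, tag in tagged]
print(split[7])
```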
Bracket Parse:
(CP-THT: (C: '+D+atte') (IP-SUB: (IP-SUB-0: (PP: (P: 'on') (NP: (ADJ: 'o+dre') (N: 'wisan')))
(BEPI: 'sint') (IP-INF: (TO: 'to') (VB^D: 'manianne') (NP: '*-1')) (NP-NOM-1: (D^N: '+da')
(ADJ^N: 'unge+dyldegan'))) (,: ',') (CONJP: (CONJ: '&') (IPX-SUB-CON=0: (PP: (P: 'on')
(NP: (ADJ: 'o+dre'))) (NP-NOM: (D^N: '+da') (ADJ^N: 'ge+dyldegan'))))) (.: '.')),
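The bracket parse contains a leaf, `'*-1'`, that has no counterpart in the raw token list. The sketch below rebuilds one subtree of the parse above as nested tuples and walks its leaves to make that difference visible. The tuple layout and the `is_trace` rule (leaves starting with `*` are treated as empty elements) are assumptions for illustration; the reader's own tree class is not used here.

```python
# Hand-built stand-in for the IP-SUB-0 subtree shown above.
# Layout: (label, child, child, ...) where a child is a tuple or a leaf string.
tree = ('IP-SUB-0',
        ('PP', ('P', 'on'), ('NP', ('ADJ', 'o+dre'), ('N', 'wisan'))),
        ('BEPI', 'sint'),
        ('IP-INF', ('TO', 'to'), ('VB^D', 'manianne'), ('NP', '*-1')),
        ('NP-NOM-1', ('D^N', '+da'), ('ADJ^N', 'unge+dyldegan')))

def leaves(node):
    """Yield the leaf strings of a nested-tuple tree, left to right."""
    for child in node[1:]:
        if isinstance(child, tuple):
            yield from leaves(child)
        else:
            yield child

def is_trace(leaf):
    # Assumption: leaves beginning with '*' are empty elements, not words.
    return leaf.startswith('*')

all_leaves = list(leaves(tree))
words = [leaf for leaf in all_leaves if not is_trace(leaf)]
print(all_leaves)
print(words)
```

With the trace filtered out, the remaining leaves line up with the corresponding stretch of the raw output.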
Chunk Parse:
[(S:
('C', '+D+atte')
(PP: ('P', 'on') ('ADJ', 'o+dre') ('N', 'wisan'))
('BEPI', 'sint') ('TO', 'to') ('VB^D', 'manianne')
(NP: ('NP', '*-1')) ('D^N', '+da') ('ADJ^N', 'unge+dyldegan') (',', ',') ('CONJ', '&')
(PP: ('P', 'on') ('ADJ', 'o+dre')) ('D^N', '+da') ('ADJ^N', 'ge+dyldegan') ('.', '.'))]
raw(files=['coprefcura.o2', 'cosolsat2', 'coprefsolilo', 'comarvel.o23',...)
tagged(files=['coprefcura.o2', 'cosolsat2', 'coprefsolilo', 'comarvel.o23',...)
chunked(files=['coprefcura.o2', 'cosolsat2', 'coprefsolilo', 'comarvel.o23',...,
        chunk_types=('NP'),
        top_node='S',
        partial_match=True,
        collapse_partials=True,
        cascade=True)
bracket_parse(files=['coprefcura.o2', 'cosolsat2', 'coprefsolilo', 'comarvel.o23',...)
_chunk_parse(files,
        chunk_types,
        top_node,
        partial_match,
        collapse_partials,
        cascade)
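One detail of the `chunked` signature is worth checking before passing your own value: the default is written `chunk_types=('NP')`, which Python evaluates as the plain string `'NP'`, not a one-element tuple. How `chunked` treats a string versus a tuple is not stated on this page, so the snippet below only demonstrates the language-level difference and makes no claim about the function's behaviour.

```python
default_as_written = ('NP')      # parentheses only group; this is the str 'NP'
one_element_tuple = ('NP',)      # the trailing comma is what makes a tuple

print(type(default_as_written).__name__)
print(type(one_element_tuple).__name__)

# Membership tests differ: a str checks substrings, a tuple checks elements.
print('N' in default_as_written)
print('N' in one_element_tuple)
```

If several chunk types are needed, a real tuple such as `('NP', 'PP')` avoids the ambiguity, assuming the function accepts a sequence of labels.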
item_name = {'coadrian.o34': 'Adrian and Ritheus', 'coaelhom.o...
items = ['coprefcura.o2', 'cosolsat2', 'coprefsolilo', 'comarv...
Reads files from a given list, and converts them via the
conversion_function.
item_name
- Value:
{'coadrian.o34': 'Adrian and Ritheus',
'coaelhom.o3': '\xc6lfric, Supplemental Homilies',
'coaelive.o3': '\xc6lfrics Lives of Saints',
'coalcuin': 'Alcuin De virtutibus et vitiis',
'coalex.o23': 'Alexanders Letter to Aristotle',
'coapollo.o3': 'Apollonius of Tyre',
'coaugust': 'Augustine',
'cobede.o2': 'Bedes History of the English Church',
...
items
Reads files from a given list, and converts them via the
conversion_function. Can return raw or tagged read files.
- Value:
['coprefcura.o2',
'cosolsat2',
'coprefsolilo',
'comarvel.o23',
'cochdrul',
'coalex.o23',
'colawwllad.o4',
'cocathom1.o3',
...
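Since `items` holds file identifiers and `item_name` maps identifiers to readable titles, the two can be combined to label files. The sketch below uses only entries visible in the truncated values above; the helper `title_for` is hypothetical, and falling back to the identifier is a choice made here because the page does not show whether every item has a title.

```python
# Subsets copied from the values shown above; both are truncated on this page.
item_name = {
    'coadrian.o34': 'Adrian and Ritheus',
    'coalcuin': 'Alcuin De virtutibus et vitiis',
    'coalex.o23': 'Alexanders Letter to Aristotle',
    'coapollo.o3': 'Apollonius of Tyre',
}
items = ['coprefcura.o2', 'cosolsat2', 'coprefsolilo', 'comarvel.o23',
         'cochdrul', 'coalex.o23']

def title_for(file_id):
    """Return the readable title, falling back to the identifier itself."""
    return item_name.get(file_id, file_id)

titles = [title_for(f) for f in items]
for file_id, title in zip(items, titles):
    print(file_id, '->', title)
```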