
Syntax: grammars, datasets, tree search

Linguistics 165, Professor Roger Levy


27 February 2015
1. Resources for today's lecture:
NLTK book chapter on syntax: http://www.nltk.org/book/ch08.html
Penn Treebank hand-annotated parsed dataset: http://www.cis.upenn.edu/~treebank/. This is installed in /home/linux/ieng6/ln165w/public/stanford-tregex.
Stanford Tregex/Tsurgeon tree-search and tree-surgery software: http://nlp.stanford.edu/software/tregex.shtml. The READMEs and the javadoc page http://nlp.stanford.edu/nlp/javadoc/javanlp/edu/stanford/nlp/trees/tregex/TregexPattern.html are highly useful!
2. Goals for today's lecture:
Use NLTK to write fragments of an English syntactic grammar, illustrating problems of coverage and ambiguity
Browse the Penn Treebank a bit, to get a sense of the diversity and richness of natural language syntax
Learn how to do tree searches with Tregex
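
To make the first and third goals concrete, here is a minimal sketch in Python using NLTK. The toy grammar, the example sentences, and the helper names `parses` and `has_vp_attached_pp` are all assumptions made up for illustration; they are not taken from the lecture. The sketch shows ambiguity (one sentence, several trees), coverage gaps (sentences the grammar cannot parse), and a hand-written tree filter that roughly mimics what the Tregex pattern `VP < PP` (a VP immediately dominating a PP) would match. It does not use Tregex itself.

```python
import nltk

# Toy grammar fragment (hypothetical, for illustration only).
# NP -> NP PP and VP -> VP PP together create PP-attachment ambiguity.
grammar = nltk.CFG.fromstring("""
S -> NP VP
NP -> Det N | NP PP | 'I'
VP -> V NP | VP PP
PP -> P NP
Det -> 'the' | 'a'
N -> 'man' | 'telescope' | 'park'
V -> 'saw'
P -> 'with' | 'in'
""")

parser = nltk.ChartParser(grammar)


def parses(sentence):
    """Return all parse trees for a whitespace-tokenized sentence.

    Returns an empty list when the grammar does not cover the sentence,
    either because a word is unknown (NLTK raises ValueError) or because
    no rule sequence derives the word string.
    """
    tokens = sentence.split()
    try:
        return list(parser.parse(tokens))
    except ValueError:
        return []


def has_vp_attached_pp(tree):
    """True if some VP node has a PP as an immediate child.

    This is a hand-rolled stand-in for the Tregex pattern 'VP < PP'.
    """
    for sub in tree.subtrees():
        if sub.label() == "VP":
            for child in sub:
                if isinstance(child, nltk.Tree) and child.label() == "PP":
                    return True
    return False


ambiguous = parses("I saw the man with a telescope")
for tree in ambiguous:
    print(tree)
    print("PP attached to VP:", has_vp_attached_pp(tree))

# Coverage gaps: an unknown word, and known words in an uncovered construction.
print(len(parses("I saw the dog")))
print(len(parses("the man saw")))
```

Under this grammar the first sentence should receive two trees, one attaching the PP to the VP and one attaching it to the object NP, while the last two sentences receive none: "dog" is not in the lexicon, and the grammar has no intransitive VP rule.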

Linguistics 165, Syntax: grammars, datasets, tree search lecture notes, Roger Levy, Winter 2015
