NLP Qual Reading List
Choose at least 11 of 17 areas, with at least 4 depth areas.
Contents
- NLP Qual Reading List
- Important Early Systems
- Language Models
- Lexical Semantics
- Parsing
- Formal Language Theory
- Natural Language Understanding
- Machine Translation
- Natural Language Generation and Summarization
- Sequence Models for NLP
- Discourse
- Information Retrieval
- Text Clustering
- Text Categorization
- Information Extraction
- Dialogue
- Speech Recognition
- Speech Synthesis
Important Early Systems
Core:
R. F. Simmons, Natural language question-answering systems: 1969. Communications of the ACM. Volume 13, 1970.
- Schank and Abelson. 1977. Scripts, Plans, Goals, and Understanding. Chapters 1-3.
Language Models
Core:
Joshua Goodman. 2001. A Bit of Progress in Language Modeling. Computer Speech and Language, 403-434.
Lexical Semantics
Core:
- Chapter 20 "Computational Lexical Semantics" in Jurafsky and Martin 2nd edition.
McCarthy, D., Koeling, R., Weeds, J. and Carroll, J. (2004) Finding predominant word senses in untagged text. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics. Barcelona, Spain. pp. 280-287. http://acl.ldc.upenn.edu/P/P04/P04-1036.pdf
Depth:
Budanitsky, Alexander and Graeme Hirst. 2006. Evaluating WordNet-based Measures of Lexical Semantic Relatedness. Computational Linguistics.
Marti Hearst. COLING 1992. Automatic Acquisition of Hyponyms from Large Text Corpora http://www.cs.mu.oz.au/acl/C/C92/C92-2082.pdf
Automatic Labeling of Semantic Roles, Daniel Gildea and Daniel Jurafsky. In Proceedings of the 38th Annual Conference of the Association for Computational Linguistics (ACL-00), pp. 512-520, Hong Kong, October 2000. (Please read the 8-page conference version, although the 45-page extended version might also be useful.)
Parsing
Core:
Michael Collins. 2003. Head-Driven Statistical Models for Natural Language Parsing. In Computational Linguistics. http://acl.ldc.upenn.edu/J/J03/J03-4003.pdf
Depth:
Stuart M. Shieber. Using restriction to extend parsing algorithms for complex-feature-based formalisms. In 23rd Annual Meeting of the Association for Computational Linguistics, Chicago, 1985. http://acl.ldc.upenn.edu/P/P85/P85-1018.pdf
Jason Eisner. 2000. Bilexical Grammars and their Cubic-Time Parsing Algorithms, ch. 3 of H. Bunt and A. Nijholt (eds), Advances in Probabilistic and Other Parsing Technologies. http://www.cs.jhu.edu/%7Ejason/papers/eisner.iwptbook00.pdf
Dan Klein and Chris Manning, "Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency," Proceedings of the 42nd Annual Meeting of the ACL, 2004 http://acl.ldc.upenn.edu/P/P04/P04-1061.pdf
Formal Language Theory
Core:
- Chapter 3 "Words and Transducers" in Jurafsky and Martin 2nd edition.
Depth:
- Mehryar Mohri. Weighted Finite-State Transducer Algorithms: An Overview. In Carlos Martin-Vide, Victor Mitrana, and Gheorghe Paun, editors, Formal Languages and Applications. volume 148, VIII, 620 p. Springer, Berlin, 2004.
- Joshi, Vijay-Shanker and Weir, 1991. The convergence of mildly context-sensitive grammar formalisms. In Sells, P., Shieber, S. and Wasow, T. (eds.), Foundational Issues in Natural Language Processing. MIT Press, pp. 31-82. [Green library; P98 .F65 1991]
Kaplan and Kay. 1994. Regular Models of Phonological Rule Systems. Computational Linguistics 20. http://acl.ldc.upenn.edu/J/J94/J94-3001.pdf
Natural Language Understanding
Core:
P. Blackburn and J. Bos. Computational Semantics. Theoria 18 No.46: 27-45, 2003. http://www.cogsci.ed.ac.uk/%7Ejbos/pubs/theoria.pdf
Depth:
Interpretation as Abduction. Jerry R. Hobbs, Mark Stickel, Paul Martin and Douglas Edwards. In Proceedings of the 26th Annual Meeting of the ACL, 1988. http://acl.ldc.upenn.edu/P/P88/P88-1012.pdf
Dekang Lin and Patrick Pantel. 2001. Discovery of Inference Rules for Question Answering. Natural Language Engineering 7(4):343-360. http://www.cs.ualberta.ca/%7Elindek/papers/jnle01.pdf
Machine Translation
Core:
Brown, Della Pietra x2, Mercer. 1993. The Mathematics of Statistical Machine Translation: Parameter Estimation. CL 19. http://acl.ldc.upenn.edu/J/J93/J93-2003.pdf
P. Koehn, F.J. Och, and D. Marcu (2003). Statistical Phrase-Based Translation. In Proceedings of the Joint Conference on Human Language Technologies and the Annual Meeting of the North American Chapter of the Association for Computational Linguistics (HLT/NAACL). http://www.inf.ed.ac.uk/publications/online/0731.pdf
Depth:
Och, 2003; Minimum Error Rate Training for Statistical Machine Translation. http://acl.ldc.upenn.edu/acl2003/main/pdfs/Och.pdf
Franz Josef Och, Hermann Ney. "The alignment template approach to statistical machine translation." Accepted for publication in Computational Linguistics, 2004. http://www.mitpressjournals.org/doi/abs/10.1162/0891201042544884
D. Chiang (2005). A Hierarchical Phrase-Based Model for Statistical Machine Translation. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05) http://www.isi.edu/%7Echiang/papers/chiang-acl05.pdf
Natural Language Generation and Summarization
Core:
E. Krahmer and M. Theune. Efficient context-sensitive generation of referring expressions. In: K. van Deemter and R. Kibble (eds.), Information Sharing: Givenness and Newness in Language Processing, CSLI Publications, 223-264, 2002. http://fdlwww.uvt.nl/%7Ekrahmer/Pubs%5Cbook.ps (seems broken?)
Depth:
Catching the drift: Probabilistic content models, with applications to generation and summarization. Regina Barzilay and Lillian Lee. Proceedings of HLT-NAACL, pp. 113-120, 2004. http://acl.ldc.upenn.edu/N/N04/N04-1015.pdf
Sanda Harabagiu, Andrew Hickl, and Finley Lacatusu. Satisfying Information Needs with Multi-Document Summaries. 2007. http://gricean.com/papers/harabagiuHicklLacatusu2007.pdf
Sequence Models for NLP
Core:
An Introduction to Conditional Random Fields for Relational Learning. Charles Sutton and Andrew McCallum. In Introduction to Statistical Relational Learning. Edited by Lise Getoor and Ben Taskar. MIT Press. 2006. http://www.cs.umass.edu/%7Emccallum/papers/crf-tutorial.pdf
Depth:
Collins, 2002; Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms. http://people.csail.mit.edu/mcollins/papers/tagperc.ps
References:
Klein & Manning ACL tutorial on Maxent models, conditional estimation, and optimization. http://www.cs.berkeley.edu/~klein/papers/maxent-tutorial-slides-6.pdf
Discourse
Core:
- Chapter 21 "Computational Discourse" in Jurafsky and Martin 2nd edition.
Depth:
- Mann, W. C. and S. A. Thompson. 1988. Rhetorical Structure Theory: Toward a Functional Theory of Text Organization. Text 8(3) (243-281).
Marcu and Echihabi. An Unsupervised Approach to Recognizing Discourse Relations. ACL 2002. http://acl.ldc.upenn.edu/P/P02/P02-1047.pdf
Information Retrieval
Core:
"Inverted files for text search engines", J. Zobel and A. Moffat, ACM Computing Surveys, 38(2):1-56, 2006. http://portal.acm.org/citation.cfm?doid=1132956.1132959
Depth:
Concept Based Query Expansion. Qiu and Frei. 1993. http://citeseer.ist.psu.edu/qiu93concept.html
The Anatomy of a Large-Scale Hypertextual Web Search Engine. Sergey Brin, Lawrence Page. Computer Networks and ISDN Systems, 1998. http://citeseer.ist.psu.edu/brin98anatomy.html
Chapter 9 "Relevance Feedback and Query Expansion", Manning, Raghavan, and Schuetze. http://nlp.stanford.edu/IR-book/pdf/chapter09-queryexpansion.pdf
Text Clustering
Core:
Chapter 16 "Flat Clustering", In Manning, Raghavan, and Schuetze. http://nlp.stanford.edu/IR-book/pdf/chapter16-flatclust.pdf
Chapter 17 "Hierarchical Clustering", In Manning, Raghavan, and Schuetze. http://nlp.stanford.edu/IR-book/pdf/chapter17-hclust.pdf
Depth:
Chapter 18 "Dimensionality Reduction and Latent Semantic Indexing", In Manning, Raghavan, and Schuetze. http://nlp.stanford.edu/IR-book/pdf/chapter18-lsi.pdf
Latent Dirichlet Allocation. Blei, Ng, and Jordan. http://citeseer.ist.psu.edu/blei03latent.html
References: Bonus paper for the bibliography. Hierarchical Dirichlet Processes. Teh, Jordan, Beal, Blei. www.gatsby.ucl.ac.uk/~ywteh/research/npbayes/jasa2006.pdf
Text Categorization
Core:
Chapter 13 "Text Classification and Naive Bayes", In Manning, Raghavan, and Schuetze. http://nlp.stanford.edu/IR-book/pdf/chapter13-nbayes.pdf
Depth:
Machine Learning in Automated Text Categorization. Sebastiani. http://citeseer.ist.psu.edu/518620.html
F. Li and Y. Yang. A loss function analysis for classification methods in text categorization. The Twentieth International Conference on Machine Learning (ICML'03), pp. 472-479, 2003. http://citeseer.ist.psu.edu/618984.html
Information Extraction
Core:
Douglas E. Appelt and David Israel. Introduction to Information Extraction Technology. http://www.ai.sri.com/~appelt/ie-tutorial/IJCAI99.pdf
Depth:
Raymond J. Mooney and Razvan Bunescu. Mining Knowledge from Text Using Information Extraction. http://www.acm.org/sigs/sigkdd/explorations/issues/7-1-2005-06/2-Mooney.pdf
Snowball: Extracting Relations from Large Plain-Text Collections. Agichtein and Gravano. http://citeseer.ist.psu.edu/271085.html.
Unsupervised Named Entity Extraction from the Web: An Experimental Study. Etzioni et al. http://www.cs.washington.edu/research/knowitall/papers/KnowItAll_AIJ.pdf
Dialogue
Core:
Chapter 24 "Dialog and Conversational Agents" in Jurafsky and Martin 2nd edition. http://www.cs.colorado.edu/~martin/SLP/Updates/23.pdf
James Allen, Nathanael Chambers, George Ferguson, Lucian Galescu, Hyuckchul Jung, Mary Swift, William Taysom, PLOW: A Collaborative Task Learning Agent, Proceedings of the AAAI Conference on Artificial Intelligence: Special Track on Integrated Intelligence, 2007.
Speech Recognition
Core:
- Chapter 9 "Automatic Speech Recognition" in Jurafsky and Martin 2nd edition.
- Chapter 10 "Speech Recognition: Advanced Topics" in Jurafsky and Martin 2nd edition.
Depth:
- Yves Normandin. Maximum mutual information estimation of hidden Markov models. In: Chin-Hui Lee, F. K. Soong, and K. K. Paliwal, "Automatic Speech and Speaker Recognition", Kluwer 1996, pp. 57-82.
- Chapter 13 "Large-Vocabulary Search Algorithms" in Xuedong Huang, Alex Acero, and Hsiao-Wuen Hon, "Spoken Language Processing", Prentice Hall 2001.
References: The following reading is not required but may be useful to clarify any confusions in Chapter 13 of Huang et al.: Richard Schwartz, L. Nguyen, and J. Makhoul. Multiple-Pass Search Strategies. In: Chin-Hui Lee, F. K. Soong, and K. K. Paliwal, "Automatic Speech and Speaker Recognition", Kluwer 1996, pp. 429-456.
Speech Synthesis
Core:
- Chapter 8 "Speech Synthesis" in Jurafsky and Martin 2nd edition.
Depth:
Chapter 15 "Hidden Markov Model Synthesis" in Taylor "Text-to-Speech Synthesis" 2008. http://mi.eng.cam.ac.uk/~pat40/ttsbook_draft_2.pdf
Chapter 16 "Unit Selection Synthesis" in Taylor "Text-to-Speech Synthesis" 2008. http://mi.eng.cam.ac.uk/~pat40/ttsbook_draft_2.pdf
