NLP Qual Reading List

Choose at least 11 of 17 areas, with at least 4 depth areas.

Important Early Systems

Core:

  1. R. F. Simmons, Natural language question-answering systems: 1969. Communications of the ACM. Volume 13, 1970.

  2. Schank and Abelson. 1977. Scripts, Plans, Goals, and Understanding. Chapters 1-3.

Language Models

Core:

  1. Joshua Goodman. 2001. A Bit of Progress in Language Modeling. Computer Speech and Language, 403-434.

Lexical Semantics

Core:

  1. Chapter 20 "Computational Lexical Semantics" in Jurafsky and Martin 2nd edition.
  2. McCarthy, D., Koeling, R., Weeds, J. and Carroll, J. (2004) Finding predominant word senses in untagged text. In Proceedings of the 42nd Annual Meeting of the Association for Computational Linguistics. Barcelona, Spain. pp. 280-287. http://acl.ldc.upenn.edu/P/P04/P04-1036.pdf

Depth:

  1. Budanitsky, Alexander and Graeme Hirst. 2006. Evaluating WordNet-based Measures of Lexical Semantic Relatedness. Computational Linguistics.

  2. Marti Hearst. COLING 1992. Automatic Acquisition of Hyponyms from Large Text Corpora http://www.cs.mu.oz.au/acl/C/C92/C92-2082.pdf

  3. Automatic Labeling of Semantic Roles, Daniel Gildea and Daniel Jurafsky. In Proceedings of the 38th Annual Conference of the Association for Computational Linguistics (ACL-00), pp. 512-520, Hong Kong, October 2000. (Please read the 8-page conference version, although the 45-page extended version might also be useful.)

Parsing

Core:

  1. Michael Collins. 2003. Head-Driven Statistical Models for Natural Language Parsing. In Computational Linguistics. http://acl.ldc.upenn.edu/J/J03/J03-4003.pdf

Depth:

  1. Stuart M. Shieber. Using restriction to extend parsing algorithms for complex-feature-based formalisms. In 23rd Annual Meeting of the Association for Computational Linguistics, Chicago, 1985. http://acl.ldc.upenn.edu/P/P85/P85-1018.pdf

  2. Jason Eisner. 2000. Bilexical Grammars and their Cubic-Time Parsing Algorithms, ch. 3 of H. Bunt and A. Nijholt (eds), Advances in Probabilistic and Other Parsing Technologies. http://www.cs.jhu.edu/%7Ejason/papers/eisner.iwptbook00.pdf

  3. Dan Klein and Chris Manning, "Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency," Proceedings of the 42nd Annual Meeting of the ACL, 2004 http://acl.ldc.upenn.edu/P/P04/P04-1061.pdf

Formal Language Theory

Core:

  1. Chapter 3 "Words and Transducers" in Jurafsky and Martin 2nd edition.

Depth:

  1. Mehryar Mohri. Weighted Finite-State Transducer Algorithms: An Overview. In Carlos Martin-Vide, Victor Mitrana, and Gheorghe Paun, editors, Formal Languages and Applications. volume 148, VIII, 620 p. Springer, Berlin, 2004.
  2. Joshi, Vijay-Shanker and Weir, 1991. The convergence of mildly context-sensitive grammar formalisms. In Sells, P., Shieber, S. and Wasow, T. (eds.), Foundational Issues in Natural Language Processing. MIT Press pp. 31-82 [Green library; P98 .F65 1991]
  3. Kaplan and Kay. 1994. Regular Models of Phonological Rule Systems. Computational Linguistics 20. http://acl.ldc.upenn.edu/J/J94/J94-3001.pdf

Natural Language Understanding

Core:

  1. P. Blackburn and J. Bos. Computational Semantics. Theoria 18 No.46: 27-45, 2003. http://www.cogsci.ed.ac.uk/%7Ejbos/pubs/theoria.pdf

Depth:

  1. Interpretation as Abduction. Jerry R. Hobbs, Mark Stickel, Paul Martin and Douglas Edwards. In Proceedings of the 26th Annual Meeting of the ACL, 1988. http://acl.ldc.upenn.edu/P/P88/P88-1012.pdf

  2. Dekang Lin and Patrick Pantel. 2001. Discovery of Inference Rules for Question Answering. Natural Language Engineering 7(4):343-360. http://www.cs.ualberta.ca/%7Elindek/papers/jnle01.pdf

Machine Translation

Core:

  1. Brown, Della Pietra, Della Pietra, and Mercer. 1993. The Mathematics of Statistical Machine Translation: Parameter Estimation. CL 19. http://acl.ldc.upenn.edu/J/J93/J93-2003.pdf

  2. P. Koehn, F.J. Och, and D. Marcu (2003). Statistical phrase-based translation. In Proceedings of the Joint Conference on Human Language Technologies and the Annual Meeting of the North American Chapter of the Association of Computational Linguistics (HLT/NAACL). http://www.inf.ed.ac.uk/publications/online/0731.pdf

Depth:

  1. Och, 2003; Minimum Error Rate Training for Statistical Machine Translation. http://acl.ldc.upenn.edu/acl2003/main/pdfs/Och.pdf

  2. Franz Josef Och, Hermann Ney. "The alignment template approach to statistical machine translation." Accepted for publication in Computational Linguistics, 2004. http://www.mitpressjournals.org/doi/abs/10.1162/0891201042544884

  3. D. Chiang (2005). A Hierarchical Phrase-Based Model for Statistical Machine Translation. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05) http://www.isi.edu/%7Echiang/papers/chiang-acl05.pdf

Natural Language Generation and Summarization

Core:

  1. E. Krahmer and M. Theune. Efficient context-sensitive generation of referring expressions. In: K. van Deemter and R. Kibble (eds.), Information Sharing: Givenness and Newness in Language Processing, CSLI Publications, 223-264, 2002. http://fdlwww.uvt.nl/%7Ekrahmer/Pubs%5Cbook.ps (seems broken?)

Depth:

  1. Catching the drift: Probabilistic content models, with applications to generation and summarization. Regina Barzilay and Lillian Lee. Proceedings of HLT-NAACL, pp. 113-120, 2004. http://acl.ldc.upenn.edu/N/N04/N04-1015.pdf

  2. Sanda Harabagiu, Andrew Hickl, and Finley Lacatusu. Satisfying Information Needs with Multi-Document Summaries. 2007. http://gricean.com/papers/harabagiuHicklLacatusu2007.pdf

Sequence Models for NLP

Core:

  1. An Introduction to Conditional Random Fields for Relational Learning. Charles Sutton and Andrew McCallum. In Introduction to Statistical Relational Learning. Edited by Lise Getoor and Ben Taskar. MIT Press. 2006. http://www.cs.umass.edu/%7Emccallum/papers/crf-tutorial.pdf

Depth:

  1. Collins, 2002; Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms. http://people.csail.mit.edu/mcollins/papers/tagperc.ps

References:

  1. Klein & Manning ACL tutorial on Maxent models, conditional estimation, and optimization. http://www.cs.berkeley.edu/~klein/papers/maxent-tutorial-slides-6.pdf

Discourse

Core:

  1. Chapter 21 "Computational Discourse" in Jurafsky and Martin 2nd edition.

Depth:

  1. Mann, W. C. and S. A. Thompson. 1988. Rhetorical Structure Theory: Toward a Functional Theory of Text Organization. Text 8(3):243-281.
  2. Marcu and Echihabi. An Unsupervised Approach to Recognizing Discourse Relations. ACL 2002. http://acl.ldc.upenn.edu/P/P02/P02-1047.pdf

Information Retrieval

Core:

  1. "Inverted files for text search engines", J. Zobel and A. Moffat, ACM Computing Surveys, 38(2):1-56, 2006. http://portal.acm.org/citation.cfm?doid=1132956.1132959

Depth:

  1. Concept Based Query Expansion. Qiu and Frei. 1993. http://citeseer.ist.psu.edu/qiu93concept.html

  2. The Anatomy of a Large-Scale Hypertextual Web Search Engine. Sergey Brin, Lawrence Page. 1998. Computer Networks and ISDN Systems. http://citeseer.ist.psu.edu/brin98anatomy.html

  3. Chapter 9 "Relevance Feedback and Query Expansion", Manning, Raghavan, and Schuetze. http://nlp.stanford.edu/IR-book/pdf/chapter09-queryexpansion.pdf

Text Clustering

Core:

  1. Chapter 16 "Flat Clustering", In Manning, Raghavan, and Schuetze. http://nlp.stanford.edu/IR-book/pdf/chapter16-flatclust.pdf

  2. Chapter 17 "Hierarchical Clustering", In Manning, Raghavan, and Schuetze. http://nlp.stanford.edu/IR-book/pdf/chapter17-hclust.pdf

Depth:

  1. Chapter 18 "Dimensionality Reduction and Latent Semantic Indexing", In Manning, Raghavan, and Schuetze. http://nlp.stanford.edu/IR-book/pdf/chapter18-lsi.pdf

  2. Latent Dirichlet Allocation. Blei, Ng, and Jordan. http://citeseer.ist.psu.edu/blei03latent.html

References: Bonus paper for the bibliography. Hierarchical Dirichlet Processes. Teh, Jordan, Beal, Blei. http://www.gatsby.ucl.ac.uk/~ywteh/research/npbayes/jasa2006.pdf

Text Categorization

Core:

  1. Chapter 13 "Text Classification and Naive Bayes", In Manning, Raghavan, and Schuetze. http://nlp.stanford.edu/IR-book/pdf/chapter13-nbayes.pdf

Depth:

  1. Machine Learning in Automated Text Categorization. Sebastiani. http://citeseer.ist.psu.edu/518620.html

  2. F. Li and Y. Yang. A loss function analysis for classification methods in text categorization. The Twentieth International Conference on Machine Learning (ICML'03), pp. 472-479, 2003. http://citeseer.ist.psu.edu/618984.html

Information Extraction

Core:

  1. Douglas E. Appelt and David Israel. Introduction to Information Extraction Technology. http://www.ai.sri.com/~appelt/ie-tutorial/IJCAI99.pdf

Depth:

  1. Raymond J. Mooney and Razvan Bunescu. Mining Knowledge from Text Using Information Extraction. http://www.acm.org/sigs/sigkdd/explorations/issues/7-1-2005-06/2-Mooney.pdf

  2. Snowball: Extracting Relations from Large Plain-Text Collections. Agichtein and Gravano. http://citeseer.ist.psu.edu/271085.html

  3. Unsupervised Named Entity Extraction from the Web: An Experimental Study. Etzioni et al. http://www.cs.washington.edu/research/knowitall/papers/KnowItAll_AIJ.pdf

Dialogue

Core:

  1. Chapter 24 "Dialog and Conversational Agents" in Jurafsky and Martin 2nd edition. http://www.cs.colorado.edu/~martin/SLP/Updates/23.pdf

  2. James Allen, Nathanael Chambers, George Ferguson, Lucian Galescu, Hyuckchul Jung, Mary Swift, William Taysom, PLOW: A Collaborative Task Learning Agent, Proceedings of the AAAI Conference on Artificial Intelligence: Special Track on Integrated Intelligence, 2007.

Speech Recognition

Core:

  1. Chapter 9 "Automatic Speech Recognition" in Jurafsky and Martin 2nd edition.
  2. Chapter 10 "Speech Recognition: Advanced Topics" in Jurafsky and Martin 2nd edition.

Depth:

  1. Yves Normandin. Maximum mutual information estimation of hidden Markov models. In: Chin-Hui Lee, F. K. Soong, and K. K. Paliwal, "Automatic Speech and Speaker Recognition", Kluwer 1996, pp. 57-82.
  2. Chapter 13 "Large-Vocabulary Search Algorithms" in Xuedong Huang, Alex Acero, and Hsiao-Wuen Hon, "Spoken Language Processing", Prentice Hall 2001.

References: The following reading is not required but may be useful to clarify any confusions in Chapter 13 of Huang et al.: Richard Schwartz, L. Nguyen, and J. Makhoul. Multiple-Pass Search Strategies. In: Chin-Hui Lee, F. K. Soong, and K. K. Paliwal, "Automatic Speech and Speaker Recognition", Kluwer 1996, pp. 429-456.

Speech Synthesis

Core:

  1. Chapter 8 "Speech Synthesis" in Jurafsky and Martin 2nd edition.

Depth:

  1. Chapter 15 "Hidden Markov Model Synthesis" in Taylor "Text-to-Speech Synthesis" 2008. http://mi.eng.cam.ac.uk/~pat40/ttsbook_draft_2.pdf

  2. Chapter 16 "Unit Selection Synthesis" in Taylor "Text-to-Speech Synthesis" 2008. http://mi.eng.cam.ac.uk/~pat40/ttsbook_draft_2.pdf

NLPQual (last edited 2008-07-17 00:21:32 by StephanStiller)