NLP Qual Reading List
Qualifying Examination in Artificial Intelligence Spring 2011
The purpose of the AI Qualification Exam is to ensure that the successful candidate has in-depth knowledge of one or more substantial subareas of AI (usually the area or areas in which the student intends to do his or her doctoral research). Subareas include things like vision, robotics, probabilistic reasoning, computational logic, machine learning, multi-agent systems, and natural language processing (though there is no requirement that the subarea be one of these possibilities). (The breadth expected is one or more such "top-level" divisions of AI - or a range of work equivalent in scope. It can't be "Markov Decision Processes for Robot Control".)
To pass this exam, the candidate must first assemble a qualification exam committee of three AI experts, of which at least two are active members of the Stanford Academic Council and affiliated with SAIL. The committee should be chaired by the student's research advisor.
Together with this committee, the candidate should agree upon the area or areas of AI that will be the focus of the exam. The candidate should then prepare a reading list of material in the selected areas, and the candidate should write a one-page explanation of the rationale behind this list, giving the area or areas covered. This explanation, along with the reading list, should be circulated to the committee and to the chair of the AI Qualifying Exam Committee for approval. The Chair of the AI Qualification Exam and the student's individual committee will then either approve the list or suggest further improvements to the candidate.
The AI Qualification Exam itself is an oral exam administered in one sitting. The committee may request the student to first present the content of the papers in the reading list (or a subset thereof, which are chosen as the focus) to the committee. Following the presentation (if any), the committee should quiz the student about the topics covered by the reading list. A successful candidate must exhibit in-depth knowledge in the scientific areas covered by the reading list and must respond insightfully to the questions asked by the committee. The exam usually lasts for 1.5 hours.
In evaluating the student's performance, the committee will consider three potential outcomes of the exam: Pass, Conditional Pass, and Fail. In the case of a conditional pass, the committee might place certain requirements on the student, such as taking or TAing classes. A failing candidate can retake this exam but has to begin the process from the very beginning. You may take the qual at most twice during your program. If you fail twice, you are out. Normal progress guidelines say you should pass in your *second* year. However, if a student takes the qual in the second year and fails, we have generally allowed the student to try again in year 3. But that is it.
Specific Instructions for NLP Qual
Your committee must be chosen, and the list of areas and the one-page explanation of the rationale behind the list must be sent to your committee no later than 21 days before the date of the exam.
Textbook Material
You are responsible for basic undergraduate textbook knowledge in Natural Language Processing. This means a reasonable knowledge of all material in the Jurafsky and Martin (2nd edition) textbook *except* the speech chapters (i.e. all chapters except chapters 7, 8, 9, 10, 11). We obviously don't expect you to memorize every detail and every algorithm, and our goal is not to trip you up on random details, but you should have the knowledge you would get by taking or TAing a course using this book.
Daniel Jurafsky and James H. Martin. 2009. Speech and Language Processing. Pearson.
In addition, for some areas, we've listed a specific textbook chapter in either Jurafsky and Martin or Manning, Raghavan, and Schütze, indicating that we expect you to know this textbook material particularly well if you choose this area:
Christopher D. Manning, Prabhakar Raghavan, and Hinrich Schütze. Introduction to Information Retrieval, Cambridge University Press. 2008.
Research Material
Choose at least 10 of 17 areas with at least 4 depth areas. One of your 10 areas must be Important Early Systems. If the list below does not specify any depth reading for your chosen depth area, you must arrange with your committee an appropriate depth reading.
Contents
- NLP Qual Reading List
- Qualifying Examination in Artificial Intelligence Spring 2011
- Specific Instructions for NLP Qual
- Textbook Material
- Research Material
- Important Early Systems
- Language Models
- Lexical Semantics
- Parsing
- Formal Language Theory
- Natural Language Understanding
- Machine Translation
- Natural Language Generation and Summarization
- Sequence Models for NLP
- Discourse
- Sentiment Analysis
- Information Retrieval
- Text Clustering
- Text Categorization
- Information Extraction
- Dialogue
- Speech Recognition
- Other Areas that have been used in the past in special cases
Important Early Systems
Core:
- Jerry Hobbs. 1978. Resolving pronoun references. Lingua, 44:311–338.
- Schank and Abelson. 1976. Scripts, Plans, Goals, and Understanding. Chapters 1-3.
R. F. Simmons, Natural language question-answering systems: 1969. Communications of the ACM. Volume 13, 1970.
Depth:
- Aravind K. Joshi and Phil Hopely. 1999. A Parser from Antiquity. In "Extended Finite State Models of Language", edited by Andras Kornai. Cambridge University Press, 6–15.
- Terry Winograd. 1972. Understanding Natural Language. Academic Press, New York. Sections 1-2.
Language Models
Core:
- Chapter 4 "Language Modeling" in Jurafsky and Martin 2nd edition.
Joshua Goodman. 2001. A Bit of Progress in Language Modeling. Computer Speech and Language, 403-434.
Lexical Semantics
Core:
- Chapter 20 "Computational Lexical Semantics" in Jurafsky and Martin 2nd edition.
Peter Turney and Patrick Pantel. 2010. From Frequency to Meaning: Vector Space Models of Semantics. Journal of Artificial Intelligence Research (JAIR), 37(1):141-188. AI Access Foundation.
Depth:
Budanitsky, Alexander and Graeme Hirst. 2006. Evaluating WordNet-based Measures of Lexical Semantic Relatedness. Computational Linguistics.
Marti Hearst. COLING 1992. Automatic Acquisition of Hyponyms from Large Text Corpora http://www.cs.mu.oz.au/acl/C/C92/C92-2082.pdf
Automatic Labeling of Semantic Roles, Daniel Gildea and Daniel Jurafsky. In Proceedings of the 38th Annual Conference of the Association for Computational Linguistics (ACL-00), pp. 512-520, Hong Kong, October 2000. (Please read the 8-page conference version, although the 45-page extended version might also be useful.)
Parsing
Core:
- Chapter 14 "Statistical Parsing" in Jurafsky and Martin
Michael Collins. 2003. Head-Driven Statistical Models for Natural Language Parsing. In Computational Linguistics. http://acl.ldc.upenn.edu/J/J03/J03-4003.pdf
Depth:
Stuart M. Shieber. Using restriction to extend parsing algorithms for complex-feature-based formalisms. In 23rd Annual Meeting of the Association for Computational Linguistics, Chicago, 1985. http://acl.ldc.upenn.edu/P/P85/P85-1018.pdf
Jason Eisner. 2000. Bilexical Grammars and their Cubic-Time Parsing Algorithms, ch. 3 of H. Bunt and A. Nijholt (eds), Advances in Probabilistic and Other Parsing Technologies. http://www.cs.jhu.edu/%7Ejason/papers/eisner.iwptbook00.pdf
Dan Klein and Chris Manning, "Corpus-Based Induction of Syntactic Structure: Models of Dependency and Constituency," Proceedings of the 42nd Annual Meeting of the ACL, 2004 http://acl.ldc.upenn.edu/P/P04/P04-1061.pdf
Formal Language Theory
Core:
- Chapter 3 "Words and Transducers" in Jurafsky and Martin 2nd edition.
- Mehryar Mohri. Weighted Finite-State Transducer Algorithms: An Overview. In Carlos Martin-Vide, Victor Mitrana, and Gheorghe Paun, editors, Formal Languages and Applications. volume 148, VIII, 620 p. Springer, Berlin, 2004.
Depth:
- Joshi, Vijay-Shanker and Weir, 1991. The convergence of mildly context-sensitive grammar formalisms. In Sells, P., Shieber, S. and Wasow, T. (eds.), Foundational Issues in Natural Language Processing. MIT Press pp. 31–82 [Green library; P98 .F65 1991]
Kaplan and Kay. 1994. Regular Models of Phonological Rule Systems. Computational Linguistics 20. http://acl.ldc.upenn.edu/J/J94/J94-3001.pdf
Natural Language Understanding
Core:
P. Blackburn and J. Bos. Computational Semantics. Theoria 18 No.46: 27-45, 2003. http://www.cogsci.ed.ac.uk/%7Ejbos/pubs/theoria.pdf
Depth:
Interpretation as Abduction. Jerry R. Hobbs, Mark Stickel, Paul Martin and Douglas Edwards. In Proceedings of the 26th Annual Meeting of the ACL, 1988. http://acl.ldc.upenn.edu/P/P88/P88-1012.pdf
Dekang Lin and Patrick Pantel. 2001. Discovery of Inference Rules for Question Answering. Natural Language Engineering 7(4):343-360. http://www.cs.ualberta.ca/%7Elindek/papers/jnle01.pdf
Bill MacCartney and Christopher D. Manning. Modeling semantic containment and exclusion in natural language inference. The 22nd International Conference on Computational Linguistics (Coling-08), Manchester, UK, August 2008. http://nlp.stanford.edu/~wcmac/papers/natlog-coling08.pdf
Machine Translation
Core:
Brown, Della Pietra x2, Mercer. 1993. The Mathematics of Statistical Machine Translation: Parameter Estimation. CL 19. http://acl.ldc.upenn.edu/J/J93/J93-2003.pdf
P. Koehn, F.J. Och, and D. Marcu (2003). Statistical phrase based translation. In Proceedings of the Joint Conference on Human Language Technologies and the Annual Meeting of the North American Chapter of the Association for Computational Linguistics (HLT/NAACL). http://www.inf.ed.ac.uk/publications/online/0731.pdf
- Chapter 25 "Machine Translation" in Jurafsky and Martin.
Depth:
Och, 2003; Minimum Error Rate Training for Statistical Machine Translation. http://acl.ldc.upenn.edu/acl2003/main/pdfs/Och.pdf
Franz Josef Och, Hermann Ney. "The alignment template approach to statistical machine translation." Accepted for publication in Computational Linguistics, 2004. http://www.mitpressjournals.org/doi/abs/10.1162/0891201042544884
D. Chiang (2005). A Hierarchical Phrase-Based Model for Statistical Machine Translation. In Proceedings of the 43rd Annual Meeting of the Association for Computational Linguistics (ACL'05) http://www.isi.edu/%7Echiang/papers/chiang-acl05.pdf
Natural Language Generation and Summarization
Core:
- Chapter 23: Question Answering and Summarization in Jurafsky and Martin
Christina Sauper and Regina Barzilay. 2009. Automatically Generating Wikipedia Articles: A Structure-Aware Approach. Proceedings of ACL, 2009.
Depth:
Catching the drift: Probabilistic content models, with applications to generation and summarization. Regina Barzilay and Lillian Lee. Proceedings of HLT-NAACL, pp. 113–120, 2004. http://acl.ldc.upenn.edu/N/N04/N04-1015.pdf
Sanda Harabagiu, Andrew Hickl, and Finley Lacatusu. Satisfying Information Needs with Multi-Document Summaries. 2007. http://gricean.com/papers/harabagiuHicklLacatusu2007.pdf
Neil McIntyre and Mirella Lapata. 2009. Learning to Tell Tales: A Data-driven Approach to Story Generation. In Proceedings of the Joint Conference of the 47th Annual Meeting of the ACL and the 4th International Joint Conference on Natural Language Processing of the AFNLP, 11–20. Singapore. http://www.aclweb.org/anthology/P/P09/P09-1025.pdf
Sequence Models for NLP
Core:
An Introduction to Conditional Random Fields for Relational Learning. Charles Sutton and Andrew McCallum. In Introduction to Statistical Relational Learning. Edited by Lise Getoor and Ben Taskar. MIT Press. 2006. http://www.cs.umass.edu/%7Emccallum/papers/crf-tutorial.pdf
Depth:
Collins, 2002; Discriminative Training Methods for Hidden Markov Models: Theory and Experiments with Perceptron Algorithms. http://people.csail.mit.edu/mcollins/papers/tagperc.ps
References:
Klein & Manning ACL tutorial on Maxent models, conditional estimation, and optimization. http://www.cs.berkeley.edu/~klein/papers/maxent-tutorial-slides-6.pdf
Discourse
Core:
- Chapter 21 "Computational Discourse" in Jurafsky and Martin 2nd edition.
Aria Haghighi and Dan Klein. 2009. Simple Coreference Resolution with Rich Syntactic and Semantic Features. In Proceedings of EMNLP 2009. http://aclweb.org/anthology-new/D/D09/D09-1120.pdf
Depth:
- Mann, W. C. and S. A. Thompson. 1988. Rhetorical Structure Theory: Toward a Functional Theory of Text Organization. Text 8(3) (243-281).
Marcu and Echihabi. An Unsupervised Approach to Recognizing Discourse Relations. ACL 2002. http://acl.ldc.upenn.edu/P/P02/P02-1047.pdf
Sentiment Analysis
Core:
Chapter 4 of: Bo Pang and Lillian Lee. 2008. Opinion mining and sentiment analysis. Foundations and Trends in Information Retrieval 2(1-2):1-135. http://www.cs.cornell.edu/home/llee/opinion-mining-sentiment-analysis-survey.html
Bing Liu. "Sentiment Analysis and Subjectivity." In the Handbook of Natural Language Processing, Second Edition. March, 2010. http://www.cs.uic.edu/~liub/FBS/NLP-handbook-sentiment-analysis.pdf
Depth:
1. Burt L. Monroe, Michael P. Colaresi, and Kevin M. Quinn. 2008. Fightin’ Words: Lexical Feature Selection and Evaluation for Identifying the Content of Political Conflict. Political Analysis (2008) 16:372–403.
And either
1. Reza Zafarani, William Cole, and Huan Liu. 2010. Sentiment extraction in social networks: a case study in LiveJournal. In Advances in Social Computing, 6007:413–420.
Or
1. Potts, Christopher. 2011. On the negativity of negation. In Nan Li and David Lutz, eds., Proceedings of Semantics and Linguistic Theory 20, 636-659. www.stanford.edu/~cgpotts/papers/potts-salt20-negation.pdf
Information Retrieval
Core:
"Inverted files for text search engines", J. Zobel and A. Moffat, ACM Computing Surveys, 38(2):1-56, 2006. http://portal.acm.org/citation.cfm?doid=1132956.1132959
Depth:
Document Language Models, Query Models, and Risk Minimization for Information Retrieval. John Lafferty and Chengxiang Zhai. 2001.
A Markov Random Field Model for Term Dependencies. 2005. Donald Metzler and W Bruce Croft.
Text Clustering
Core:
Chapter 16 "Flat Clustering", In Manning, Raghavan, and Schuetze. http://nlp.stanford.edu/IR-book/pdf/chapter16-flatclust.pdf
Chapter 17 "Hierarchical Clustering", In Manning, Raghavan, and Schuetze. http://nlp.stanford.edu/IR-book/pdf/chapter17-hclust.pdf
Depth:
Chapter 18 "Dimensionality Reduction and Latent Semantic Indexing", In Manning, Raghavan, and Schuetze. http://nlp.stanford.edu/IR-book/pdf/chapter18-lsi.pdf
Latent Dirichlet Allocation. Blei, Ng, and Jordan. http://citeseer.ist.psu.edu/blei03latent.html
References: Bonus paper for the bibliography. Hierarchical Dirichlet Processes. Teh, Jordan, Beal, Blei. www.gatsby.ucl.ac.uk/~ywteh/research/npbayes/jasa2006.pdf
Text Categorization
Core:
Chapter 13 "Text Classification and Naive Bayes", In Manning, Raghavan, and Schuetze. http://nlp.stanford.edu/IR-book/pdf/chapter13-nbayes.pdf
Machine Learning in Automated Text Categorization. Sebastiani. http://citeseer.ist.psu.edu/518620.html
Depth:
F. Li and Y. Yang. A loss function analysis for classification methods in text categorization. The Twentieth International Conference on Machine Learning (ICML'03), pp. 472-479, 2003. http://citeseer.ist.psu.edu/618984.html
Information Extraction
Core:
Sunita Sarawagi. Chapters 1, 2, and 3 of Information Extraction, Foundations and Trends in Databases. http://www.it.iitb.ac.in/~sunita/papers/ieSurvey.pdf
Depth:
Sunita Sarawagi. Chapter 4 of Information Extraction, Foundations and Trends in Databases. http://www.it.iitb.ac.in/~sunita/papers/ieSurvey.pdf
Unsupervised Named Entity Extraction from the Web: An Experimental Study. Etzioni et al. http://www.cs.washington.edu/research/knowitall/papers/KnowItAll_AIJ.pdf
Dialogue
Core:
- Chapter 24: Dialog and Conversational Agents. Jurafsky and Martin 2nd Edition.
James Allen, Nathanael Chambers, George Ferguson, Lucian Galescu, Hyuckchul Jung, Mary Swift, William Taysom, PLOW: A Collaborative Task Learning Agent, Proceedings of the AAAI Conference on Artificial Intelligence: Special Track on Integrated Intelligence, 2007.
Depth:
Matthew Frampton and Oliver Lemon. 2009. Recent research advances in Reinforcement Learning in Spoken Dialogue Systems. Knowledge Engineering Review, 24(4): 375-408, 2009. http://journals.cambridge.org/repo_A67PB6Tu
- S. Young, M. Gasic, S. Keizer, F. Mairesse, J. Schatzmann, B. Thomson and K. Yu (2010). "The Hidden Information State Model: a practical framework for POMDP-based spoken dialogue management." Computer Speech and Language, 24(2): 150-174
Speech Recognition
Core:
- Chapter 9 "Automatic Speech Recognition" in Jurafsky and Martin 2nd edition.
- Chapter 10 "Speech Recognition: Advanced Topics" in Jurafsky and Martin 2nd edition.
Depth:
- Yves Normandin. Maximum mutual information estimation of hidden Markov models. In: Chin-Hui Lee, F. K. Soong, and K. K. Paliwal, "Automatic Speech and Speaker Recognition", Kluwer 1996, pp. 57-82.
- Chapter 13 "Large-Vocabulary Search Algorithms" in Xuedong Huang, Alex Acero, and Hsiao-Wuen Hon, "Spoken Language Processing", Prentice Hall 2001.
References: The following reading is not required but may be useful to clarify any confusions in Chapter 13 of Huang et al.: Richard Schwartz, L. Nguyen, and J. Makhoul. Multiple-Pass Search Strategies. In: Chin-Hui Lee, F. K. Soong, and K. K. Paliwal, "Automatic Speech and Speaker Recognition", Kluwer 1996, pp. 429-456.
Other Areas that have been used in the past in special cases
Web Search
Core:
The Anatomy of a Large-Scale Hypertextual Web Search Engine. 1998. Sergey Brin, Lawrence Page. http://citeseer.ist.psu.edu/brin98anatomy.html Computer Networks and ISDN Systems
Depth:
Link Analysis and Web Search. Chapter 14 in the book Networks, Crowds, and Markets: Reasoning about a Highly Connected World. David Easley and Jon Kleinberg.
Adapting Ranking SVM to Document Retrieval. Yunbo CAO, Jun XU, Tie-Yan LIU, Hang LI, Yalou HUANG and Hsiao-Wuen HON.
Blogosphere and Social Networks
Core:
Complex networks and decentralized search algorithms. 2006. Jon Kleinberg.
Depth:
1. D. Kempe, J. Kleinberg, E. Tardos. Maximizing the Spread of Influence through a Social Network. Proc. 9th ACM SIGKDD Intl. Conf. on Knowledge Discovery and Data Mining, 2003.
2. Cost-effective Outbreak Detection in Networks by J. Leskovec, A. Krause, C. Guestrin, C. Faloutsos, J. VanBriesen, N. Glance. ACM SIGKDD International Conference on Knowledge Discovery and Data Mining (KDD), 2007.
3. Chapter 14, "Link Analysis and Web Search", of Easley and Kleinberg.
