Information Retrieval Resources

Information on Information Retrieval (IR) books, courses, conferences and other resources.

Books on Information Retrieval (General)
Introduction to Information Retrieval. C.D. Manning, P. Raghavan, H. Schütze. Cambridge UP, 2008. Classical and web information retrieval systems: algorithms, mathematical foundations and practical issues.
Modern Information Retrieval. R. Baeza-Yates, B. Ribeiro-Neto. Addison-Wesley, 1999. Widely used and cited.
Information Retrieval: Algorithms and Heuristics. D.A. Grossman, O. Frieder. Springer, 2004. Excellent textbook.
Managing Gigabytes. I.H. Witten, A. Moffat, T.C. Bell. Morgan Kaufmann, 1999. The authority on index construction and compression.
Finding Out About. R. Belew. Cambridge UP, 2001. More suitable for undergraduate classes than other books listed here.
Information Retrieval: A Health and Biomedical Perspective. W.R. Hersh. Springer, 2002. As the title says: a health/biomedical perspective.
TREC: Experiment and Evaluation in Information Retrieval. E.M. Voorhees, D.K. Harman. MIT Press, 2005. A survey of recent research results.
Language Modeling for Information Retrieval. W.B. Croft, J. Lafferty. Springer, 2003. Language models are of increasing importance in IR.
Readings in Information Retrieval. K. Sparck Jones, P. Willett. Morgan Kaufmann, 1997. A collection of classical IR papers.
Recommended Reading for IR Research Students. A. Moffat, J. Zobel, D. Hawking. SIGIR Forum, 39(2), 2005. Not a book, but a collection of seminal papers, more up-to-date than the Sparck Jones and Willett collection.
Information Storage and Retrieval Systems. G. Kowalski, M.T. Maybury. Springer, 2005. "... takes a system approach, discussing all aspects of an Information Retrieval System."
The Geometry of Information Retrieval. C.J. van Rijsbergen. Cambridge UP, 2004. An ambitious attempt to develop quantum mechanics as a new foundation for IR.
Introduction to Modern Information Retrieval. G.G. Chowdhury. Neal-Schuman, 2003. Intended for students of library and information studies.
Text Information Retrieval Systems. C.T. Meadow, B.R. Boyce, D.H. Kraft, C.L. Barry. Academic Press, 2007. Also takes a library/information science perspective.
More Books

Books on Web Information Retrieval
Search Engines: Information Retrieval in Practice. B. Croft, D. Metzler, T. Strohman. Pearson Education, 2009.
Mining the Web: Analysis of Hypertext and Semi Structured Data. S. Chakrabarti. Morgan Kaufmann, 2002. The best introduction for web-centric IR.
Google's PageRank and Beyond: The Science of Search Engine Rankings. A.N. Langville, C.D. Meyer. Princeton UP, 2006. More focused on the algorithms of PageRank, but also covers general web IR.
Modeling the Internet and the Web: Probabilistic Methods and Algorithms. P. Baldi, P. Frasconi, P. Smyth. Wiley, 2003. A bit terse. Recommended for those who have a good foundation in probability theory, but are new to IR.

Good books for implementing a search engine
Managing Gigabytes (see above)
Building Search Applications: Lucene, Lingpipe, and Gate. M. Konchady. Mustru Publishing, 2008.
Lucene in Action. O. Gospodnetic, E. Hatcher. Manning Publications, 2004.
Spidering Hacks. K. Hemenway, T. Calishain. O'Reilly, 2003.

Online Books - Browsable
Introduction to Information Retrieval (see above)
Finding Out About (see above)
Information Retrieval. C. J. van Rijsbergen. Butterworths, 1979. The classic. Thirty years old, but still worth reading.
Information Retrieval. T. van der Weide. 2004. Introduction to IR and hypertext.

Online Books - PDF
Introduction to Information Retrieval (see above)
Search Engines: Information Retrieval in Practice. B. Croft, D. Metzler, T. Strohman. Pearson Education, 2009. (two chapters)
Information Retrieval. C. J. van Rijsbergen. Butterworths, 1979.
Information Retrieval Interaction. P. Ingwersen. Taylor Graham, 1992. Focuses on user interaction in IR.
Information Retrieval: A Survey. Ed Greengrass. 2000. Good survey of "classical" IR, but little or no coverage of recent work (e.g., language models, PageRank, SVMs).
Various tutorials at Mi Islita

Research Centers
CMU (LTI)
Dublin CU
Geneva (Viper)
Glasgow
Helsinki Institute for Information Technology
IBM
Illinois Institute of Technology
Information Retrieval Facility (IRF)
Microsoft Research
NIST
Peking
Pittsburgh
Queen Mary
Sheffield
UIUC
UMASS

Courses
Berkeley (SIMS)
CMU
Cornell
DePaul
IIT
Johns Hopkins I
Johns Hopkins II
Maryland
MPI
Otago
Pittsburgh
Princeton
Stanford
Stuttgart
Texas
UMASS

Problem Sets / Assignments
IIR exercises
Bilkent
DePaul
Minas Gerais
North Texas
Stuttgart
Tennessee

Web Information Retrieval
webir.org
Search Engine Watch
Users' Guide to Web Searching
PageRank

Subareas, Applications, Methods
Graphical interfaces to support information search
Information Retrieval & Extraction
Information Retrieval & Machine Learning
Text Mining & Web Mining
INEX: XML retrieval
Geographic Information Retrieval
Music Information Retrieval
CLIR & Multilingual Information Retrieval
Cross-Language Information Retrieval (CLIR) Resources
N-Grams in Information Retrieval
Agent-based Information Retrieval
Audio Information Retrieval
Adversarial Information Retrieval

Conferences
TREC
Cross Language Evaluation Forum (CLEF)
SIGIR 2008 (last), SIGIR 2009 (next)
CIKM 2008, CIKM 2009
WWW 2008, WWW 2009
JCDL 2008, JCDL 2009
RIAO 2007, RIAO 2010
ECIR 2009, ECIR 2010
SPIRE 2008, SPIRE 2009
Norbert Fuhr's IR conference calendar

Journals
ACM Transactions on Information Systems (TOIS): dblp home
Information Processing and Management (IP&M): dblp home
Information Retrieval: dblp home
International Journal on Digital Libraries: dblp home
Journal of the American Society for Information Science and Technology (JASIST): dblp home
SIGIR Forum: dblp home
Journal of Documentation
D-Lib Magazine
Data & Knowledge Engineering: dblp home
Information Processing Letters: dblp home
Information Research
Information Systems: dblp home
Journal of Intelligent Information Systems: dblp home
Knowledge and Information Systems: dblp home
Foundations and Trends in Information Retrieval: home

Popular Articles
Wikipedia: Information Retrieval
A. Singhal: Modern Information Retrieval: A Brief Overview
D. Austin: How Google Finds Your Needle in the Web's Haystack
S.E. Robertson, K. Sparck Jones: Simple, proven approaches to text retrieval
Bruce Croft: What Do People Want From IR
Information Retrieval on the World Wide Web
Michael Lesk: The Seven Ages of Information Retrieval

Software
C. Middleton, R. Baeza-Yates: A Comparison of Open Source Search Engines (contains an up-to-date list of available search engine software)
Doug Oard's list of available text retrieval systems
Avi Rappoport: open source search engines
MySQL full text search
Text to Matrix Generator, a MATLAB toolbox for indexing, retrieval and other text processing tasks

Collections
U. of Glasgow list of available text retrieval collections
NLP/IR corpus list at NUS
NLP/IR corpus list at Edinburgh
SMART at Cornell (downloads of a number of collections, stop lists, SMART retrieval system etc.)
Internet archive (limited availability)
Linguistic Data Consortium

Professional Organizations
ACM SIGIR
BCS IRSG

Other Collections of Information Retrieval Links
ACM SIGIR
David Karger

Other Resources
Glossary (Modern Information Retrieval)
Information retrieval research links @ Search Tools
BUBL: Information Retrieval Links
LSU: Information Retrieval Systems
Open Directory: Information Retrieval Links
UBC: Indexing Resources
IR & Neural Networks, Symbolic Learning, Genetic Algorithms
A stop list (a list of stop words)
Chris Manning's NLP resources
Weiguo Patrick Fan's text mining links

2009.04.12