INFO 556: Text Retrieval and Web Search


Most web data today consists of unstructured text. This data is useful only if it is made available in a way that lets users quickly find information relevant to their needs. This course covers the fundamental knowledge necessary to build text retrieval and web search systems that do this, including web crawling; index construction and compression; Boolean, vector-based, and probabilistic retrieval models; text classification and clustering; link analysis algorithms such as PageRank; and computational advertising. Students will also complete one programming project, in which they construct a complex application that combines multiple algorithms into a system that solves real-world problems. Graduate-level requirements include implementing more complex, state-of-the-art algorithms for the programming project, which might require additional reading of research articles. Written assignments will have additional questions for graduate students.