Most web data today consists of unstructured text. This course covers the fundamental knowledge necessary to organize such texts, search them in a meaningful way, and extract relevant information from them. It teaches natural language processing through the design and development of end-to-end natural language understanding applications, including sentiment analysis (e.g., is this review positive or negative?), information extraction (e.g., extracting named entities and their relations from text), and question answering (retrieving exact answers to natural language questions such as "What is the capital of France?" from large document collections). We will use several natural language processing toolkits, such as NLTK and Stanford's CoreNLP. The main programming language used in the course will be Python, but code written in Java or Scala will be accepted as well. Graduate-level requirements include implementing more complex, state-of-the-art algorithms for the three proposed projects. This will require additional reading of conference papers and journal articles.