Data Science

INFO 531: Data Warehousing and Analytics in the Cloud

Data Warehousing and Analytics in the Cloud covers concepts, frameworks, and best practices for designing a cloud-based data warehousing solution and explores how to use analytical tools to analyze your data. In the first half of the course, the instructor provides an overview of the field of cloud computing and its main concepts, and students get hands-on experience through projects that use cloud computing platforms.

INFO 539: Statistical Natural Language Processing (Cross-listed LING 539)

This course introduces the key concepts underlying statistical natural language processing. Students will learn a variety of techniques for the computational modeling of natural language, including n-gram models, smoothing, hidden Markov models, Bayesian inference, expectation maximization, the Viterbi algorithm, the inside-outside algorithm for probabilistic context-free grammars, and higher-order language models. Graduate-level requirements include assignments of greater scope than undergraduate assignments.

INFO 556: Text Retrieval and Web Search

Most web data today consists of unstructured text. The existence of this data matters only if users can quickly find the information relevant to their needs. This course covers the fundamental knowledge necessary to build systems that make this possible, including web crawling; index construction and compression; Boolean, vector-based, and probabilistic retrieval models; text classification and clustering; link analysis algorithms such as PageRank; and computational advertising.

INFO 565: Information Architecture and Controlled Vocabularies (3 credits)

Introduction to organization systems that use controlled vocabularies. Principles, standards, design and maintenance of thesauri using computer software are studied. The use of controlled vocabularies in website design and digital libraries is also explored.

INFO 570: Data Base Development and Management

This course covers theory, methods, and techniques widely used to design and develop a relational database system. Students will develop a broad understanding of modern database management systems. Applications of fundamental database principles in a stand-alone database environment using MS Access and Windows are emphasized. Applications in an Internet environment will be discussed using MySQL on the Linux platform.

INFO 557: Neural Networks

Neural networks are a branch of machine learning that combines a large number of simple computational units to allow computers to learn from and generalize over complex patterns in data. Students in this course will learn how to train and optimize feedforward, convolutional, and recurrent neural networks for tasks such as text classification, image recognition, and game playing.

INFO 555: Applied Natural Language Processing

Most web data today consists of unstructured text. This course will cover the fundamental knowledge necessary to organize such texts, search them in a meaningful way, and extract relevant information from them.

INFO 550: Artificial Intelligence

This course provides a broad technical introduction to the tools, techniques and concepts of artificial intelligence. The course will focus on methods for automating decision making under a variety of conditions, including full and partial information, and dealing with uncertainty. Students will gain practical experience writing programs that use these techniques to solve a variety of problems.

INFO 523: Data Mining and Discovery

This course will introduce students to the concepts and techniques of data mining for knowledge discovery. It includes methods developed in the fields of statistics, large-scale data analytics, machine learning, pattern recognition, database technology and artificial intelligence for automatic or semi-automatic analysis of large quantities of data to extract previously unknown interesting patterns.
