Emphasis - Data Science

ISTA 431: Data Warehousing and Analytics in the Cloud

Data Warehousing and Analytics in the Cloud will utilize concepts, frameworks, and best practices for
designing a cloud-based data warehousing solution and explore how to use analytical tools to perform
analysis on your data. In the first half of the course, I will provide an overview of the field of Cloud
Computing and its main concepts, and students will get hands-on experience through projects that utilize
cloud computing platforms. In the second half of the course, we will examine the construction of a cloud-based

ISTA 456: Text Retrieval and Web Search

Most of the web data today consists of unstructured text. Of course, the fact that this data exists is irrelevant unless it is made available such that users can quickly find information that is relevant to their needs. This course will cover the fundamental knowledge necessary to build such systems, including web crawling, index construction and compression, Boolean, vector-based, and probabilistic retrieval models, text classification and clustering, link analysis algorithms such as PageRank, and computational advertising.

ESOC 414: Computational Social Science

This course will guide students through advanced applications of computational methods for social science research. Students will be encouraged to consider social problems from across sectors, like health science, education, environmental policy and business. Particular attention will be given to the collection and use of data to study social networks, online communities, electronic commerce and digital marketing.

ISTA 320: Applied Data Visualization

This course will introduce students to the fundamental concepts and tools used to convey the information contained within large, complex data sets through a variety of visualization techniques. Students will learn the fundamentals of data exploration via visualizations, how to manipulate and reshape data to make it suitable for visualization, and how to prepare everything from simple single-variable visualizations to large multi-tiered and interactive visualizations.

ISTA 439: Statistical Natural Language Processing (Cross-listed LING 439)

This course introduces the key concepts underlying statistical natural language processing. Students will learn a variety of techniques for the computational modeling of natural language, including n-gram models, smoothing, Hidden Markov models, Bayesian inference, Expectation Maximization, the Viterbi algorithm, the Inside-Outside algorithm for Probabilistic Context-Free Grammars, and higher-order language models. Graduate-level requirements include assignments of greater scope than undergraduate assignments.

ISTA 322: Data Engineering

This course welcomes students from across disciplines, who will learn how to use industry-standard tools and practices to make large data sets usable for scientists and other decision makers. From data collection and preparation to the creation of big data stores, databases, or systems to make data flow, this course will focus on the practical work needed to prepare big data for analyses across contexts. Students will be introduced to a variety of technical tools for data management, storage, use, and manipulation.

ISTA 410: Bayesian Modeling and Inference

Bayesian modeling and inference is a powerful modern approach to representing the statistics of the world, reasoning about the world in the face of uncertainty, and learning about it from data. It cleanly separates the notions of representation, reasoning, and learning. It provides a principled framework for combining multiple sources of information, such as prior knowledge about the world, with evidence about a particular case in observed data.

ISTA 457: Neural Networks

Neural networks are a branch of machine learning that combines a large number of simple computational units to allow computers to learn from and generalize over complex patterns in data. Students in this course will learn how to train and optimize feedforward, convolutional, and recurrent neural networks for tasks such as text classification, image recognition, and game playing.

ISTA 450: Artificial Intelligence

This course covers the methods and tools of Artificial Intelligence used to provide systems with the ability to autonomously solve problems and reason with uncertain information. Topics include: problem solving (search spaces, uninformed and informed search, games, constraint satisfaction), principles of knowledge representation and reasoning (propositional and first-order logic, logical inference, planning), and representing and reasoning with uncertainty (Bayesian networks, probabilistic inference, decision theory).

ISTA 455: Applied Natural Language Processing

Most of the web data today consists of unstructured text. This course will cover the fundamental knowledge necessary to organize such texts, search them in a meaningful way, and extract relevant information from them.
