As data continue to grow in volume and penetrate everything we do in contemporary work across many professions, employers are seeking data scientists to extract meanings and patterns from large quantities of data. This user-friendly course will provide an introduction to a variety of skills required for data analytics in organizations, education, health contexts, and the sciences. Specifically, this course examines information management in the context of massive sets of data, provides students proficiency with a variety of data analysis tools, and exposes learners to varied data platforms as well as skills and concepts related to data mining and statistical analysis. Particular attention will be given to toolkits embedded in R and other platforms.
Data Science and Visualization Certificate
About the Program
The purpose of this certificate is to appeal to a wide variety of learners from across the campus. While a certificate is not a professional certification, this certificate will signal to employers that students have dedicated the time and energy necessary to develop the skills and confidence for tackling messy, real-world data problems using modern programming languages.
The Data Science and Visualization Certificate will provide undergraduate students the confidence and training they need in data collection, exploration, manipulation and storage, analysis, and presentation in order to navigate data-rich workplace environments. In completing the Certificate, students will obtain practical experience using a variety of data science techniques and software applications; gain hands-on experience working with real-world data sets drawn from science, social media, and business; and build on basic statistical and programming knowledge to become familiar with the tools used for advanced work in today’s data-rich landscape.
Up to 6 units may be shared with a degree requirement (major, minor, General Education) or second certificate.
9 units must be completed at UArizona (transfer units do not count).
A certificate can be completed as a stand-alone program or alongside an undergraduate degree. There are no additional application requirements for the Data Science and Visualization certificate.
If you are a current UArizona student, you can declare your certificate using the button above.
If you are not a current UArizona student, you can apply for admission as a certificate-seeking student.
Data Science and Visualization is available on the Main campus and through Arizona Online.
Students must meet the same general UArizona admissions criteria as degree-seeking students. The requirements and expectations are the same as those for a first-year, transfer, or readmit student, depending on the student's admit type. Students must complete the application in full and submit all required transcripts and requested materials. Certificate-seeking students (those seeking only a certificate, not a certificate as part of a degree) are not eligible for merit aid or financial aid, and if they apply as degree-seeking students in the future, they are considered “readmits.”
Students are required to maintain a 2.0 or C average in certificate coursework to complete the certificate. Students who wish to graduate with an undergraduate certificate should submit a certificate application to graduation services when they have finished the certificate requirements. There is a one-time $15 graduation fee. There are no deadlines to apply to graduate with a certificate.
- 12 units are required for the certificate
All students, including Information Science students, may 'double use' at most 6 units toward another program of study (major, minor, General Education, or another certificate).
Choose either ESOC 214 or ISTA 116, then take ISTA 320 Data Visualization and ISTA 321 Data Mining. One elective is also required.
Understanding uncertainty and variation in modern data: data summarization and description, rules of counting and basic probability, data visualization, graphical data summaries, working with large data sets, prediction of stochastic outputs from quantitative inputs. Operations with statistical computer packages such as R.
This course will introduce students to the fundamental concepts and tools used to convey the information contained within large, complex data sets through a variety of visualization techniques. Students will learn the fundamentals of exploring data via visualizations, how to manipulate and reshape data to make it suitable for visualization, and how to prepare everything from simple single-variable visualizations to large multi-tiered and interactive visualizations. Visualization theory will be presented alongside the technical aspects of the course to develop a holistic understanding of the topic.
This course introduces students to the theory and practice of data mining for knowledge discovery. This includes methods developed in the fields of statistics, large-scale data analytics, machine learning, and artificial intelligence for automatic or semi-automatic analysis of large quantities of data to extract previously unknown and interesting patterns. Topics include understanding varieties of data, classification, association rule analysis, cluster analysis, and anomaly detection. We will use software packages for data mining, explaining the underlying algorithms and their use and limitations. The course will include laboratory exercises, with data mining case studies using data from biological sequences and networks, social networks, linguistics, ecology, geo-spatial applications, marketing and psychology.
- Complete 1 additional course (3 units)
This course introduces biostatistical methods and applications, covering descriptive statistics, probability, and inferential techniques necessary for appropriate analysis and interpretation of data relevant to health sciences. Students will use a statistical software package.
This course will explore broad research paradigms and theoretical approaches that inform contemporary social research, varying study designs, as well as the systematic methods utilized in differing types of data analyses. Though this course will introduce research processes across the academic spectrum, quantitative analysis of both small and large data sets will be emphasized. Therefore, students will learn about basic statistical analyses and will be introduced to the emerging worlds of data science and social media analytics. Students will also consider related topics such as data visualization or research presentations.
This course will guide students through advanced applications of computational methods for social science research. Students will be encouraged to consider social problems from across sectors, like health science, education, environmental policy and business. Particular attention will be given to the collection and use of data to study social networks, online communities, electronic commerce and digital marketing. Students will consider the many research designs used in contemporary social research and will learn to think critically about claims of causality, mechanisms, and generalization in big data studies.
An introduction to computational techniques and using a modern programming language to solve current problems drawn from science, technology, and the arts. Topics include control structures, elementary data structures, and effective program design and implementation techniques. Weekly laboratory.
Programming-intensive course; College Algebra recommended.
This course will be inviting for a wide variety of students from across disciplines, and they will learn how to use industry standard tools and practices to make large data sets usable for scientists and other decision makers. From data collection and preparation, to the creation of big data stores, databases, or systems to make data flow, this course will focus on the practical work needed to prepare big data for analyses across contexts. Students will be introduced to a variety of technical tools for data management, storage, use, and manipulation.
This course surveys the techniques central to the modern practice of extracting useful patterns and models from large bodies of data and the theory behind these techniques. Students will learn the purpose, power, and limitations of models, with concrete examples from business and science. Course subject matter may include classification and regression, logistic regression, supervised segmentation and decision trees, similarity/distance metrics and recommender systems, clustering and nearest neighbors, support vector machines, understanding and avoiding overfitting, natural language processing and sentiment analysis, machine learning, neural networks, and AI.
Machine learning describes algorithms which can modify their internal parameters (i.e., "learn") to recognize patterns and make decisions based on examples or through interaction with the environment. This course will introduce the fundamentals of machine learning, will describe how to implement several practical methods for pattern recognition, feature selection, clustering, and decision making for reward maximization, and will provide a foundation for the development of new machine learning algorithms.
Students will learn from experts on projects that have developed widely adopted foundational cyberinfrastructure resources, followed by hands-on laboratory exercises focused on those resources. Students will use these resources and gain practical experience from laboratory exercises for a final project using a data set and meeting requirements provided by domain scientists. Students will be provided access to computing resources at UA campus clusters, the iPlant Collaborative, and NSF XSEDE. Students will also learn to write a proposal for obtaining future allocations on large-scale national resources through XSEDE.
This course covers theory, methods, and techniques widely used to design and develop a relational database system. Students will develop a broad understanding of modern database management systems. Applications of fundamental database principles in a stand-alone database environment using MS Access and Windows are emphasized. Applications in an Internet environment will be discussed using MySQL on the Linux platform.
Do you want to live permanently in Antarctica? Now is your chance: apply for Mission Antarctica! The ice is melting, the penguins are marching; it seems like a perfect time to settle, but many challenges await. Data can help you live and thrive in this changing environment and not be eaten by a leopard seal. However, most of us do not know how to organize, analyze, and translate real-life data into decisions. In this class, we work through a series of scenarios that teach you how to use data to design interventions and evaluate whether we are making a difference in our new society. These scenarios include case studies related to disease, food security, conservation, sustainability, and nutrition. Through a combination of lectures, hands-on problem solving, and collaboration, this course teaches introductory data literacy skills such as data management, analytics, and visualization that are useful for decision making and for your career. No programming experience is required, and students are encouraged to bring laptops to class for in-class activities and assignments. All readings and supplemental material are open source or free to students. Most importantly, no penguins will be harmed in this adventure, we promise.