Statistics & Machine Learning
- COS 302/SML 305/ECE 305: Mathematics for Numerical Computing and Machine Learning
  This course provides a comprehensive and practical background for students interested in continuous mathematics for computer science. The goal is to prepare students for higher-level subjects in artificial intelligence, machine learning, computer vision, natural language processing, graphics, and other topics that require numerical computation. This course is intended for students who wish to pursue these more advanced topics but who have not taken (or do not feel comfortable with) university-level multivariable calculus (e.g., MAT 201/203) and probability (e.g., ORF 245 or ORF 309). See "Other Information"
- COS 513/SML 513: Foundations of Probabilistic Modeling
  This course covers fundamental topics in probabilistic modeling and allows you to contribute to this important area of machine learning and apply it to your work. We learn how to model data arising from different fields and devise algorithms to learn the structure underlying these data for the purpose of prediction and decision making. We cover several model classes, including deep generative models, and several inference algorithms, including variational inference and Hamiltonian Monte Carlo. Finally, we cover evaluation methods for probabilistic modeling as well as tools to challenge our models' assumptions.
- SML 201: Introduction to Data Science
  Introduction to Data Science provides a practical introduction to the burgeoning field of data science. The course introduces students to the essential tools for conducting data-driven research, including the fundamentals of programming techniques and the essentials of statistics. Students will work with real-world datasets from various domains; write computer code to manipulate, explore, and analyze data; use basic techniques from statistics and machine learning to analyze data; learn to draw conclusions using sound statistical reasoning; and produce scientific reports. No prior knowledge of programming or statistics is required.
- SML 310: Research Projects in Data Science
  A project-based seminar course in which students work individually or in small teams to tackle data science and machine learning problems, working with real-world datasets. The course emphasizes critical thinking about experiments and large dataset analysis and the ability to clearly communicate one's research. This course is intended to support students in developing the analytical skills necessary for quantitative independent work; students should consult with their home department about how this course could appropriately complement, but not replace, their independent work requirements.
- SML 505/AST 505: Modern Statistics
  The course provides an introduction to modern statistics and data analysis. It addresses the question, "What should I do if these are my data and this is what I want to know?" The course adopts a model-based, largely Bayesian, approach. It introduces the computational means and software packages to explore data and infer underlying parameters from them. Emphasis is placed on streamlining model specification and evaluation by leveraging probabilistic programming frameworks. The topics are exemplified by real-world applications drawn from across the sciences.
- SML 510: Graduate Research Seminar
  This course is for graduate students enrolled in the CSML Graduate Certificate Program and is part of the certificate requirements. Students enrolled in the certificate must enroll, attend, and present their research during at least one semester. Each week features a presentation by a student, invited faculty, or external visitors. All students are required to read materials prior to the workshop and come prepared to engage in conversation. In weeks when a student presents, a second student introduces the speaker and gives background on the work, and a third student moderates the post-presentation discussion.
- SOC 306/SML 306: Machine Learning with Social Data: Opportunities and Challenges
  This is a class about using the tools of machine learning to study social data. The power of machine learning tools lies in their applicability across a wide range of tasks. There are huge opportunities for applying these tools to learn and make decisions about real people, but there are also important challenges. This course aims to (1) show social scientists and digital humanities scholars the potential of machine learning to help them learn about humans, make policy, and help people, while also (2) showing computer scientists how a social science research design perspective can improve their work and give them new outlets for their skills.