# Probability for Data Science

Probability is a key mathematical concept that is essential for modeling and understanding computer system performance and real-world data generated from day-to-day activities and interactions. Data science, machine learning, natural language processing and computer vision rely heavily on probabilistic models.

This introductory course in probability is designed to provide the necessary background for learning and understanding machine learning and data science concepts. It will introduce the concept of probability, provide an overview of discrete and continuous random variables and describe how to compute expectation and variance. The course will also discuss specific distributions such as geometric, binomial, Poisson, uniform, exponential and normal distributions.

### LEARNING OUTCOMES

At the end of the course, participants will:

• Be able to describe basic probability concepts such as mean, variance, conditional probability, Bayes' rule and statistical independence.
• Be able to compute the mean and variance of random variables.
• Be able to describe discrete and continuous distributions such as geometric, binomial, Poisson, uniform, exponential and normal.
• Be able to derive and compute properties of functions of random variables.
• Be able to understand how real-world phenomena can be modeled using probability distributions.

Anand Seetharam is an assistant professor in computer science in the Thomas J. Watson College of Engineering and Applied Science at Binghamton University. Dr. Seetharam is broadly interested in the field of computer networking. His research interests encompass wireless networks, information-centric networks, ubiquitous computing, Internet of Things (IoT) and smart grids.

