## Data Science Essentials Suite

### COURSE INFORMATION

#### ONLINE AND SELF-PACED

Complete the four courses and you will receive the Data Science Essentials Suite Badge and a Binghamton University-issued completion certificate.

- Introduction to Python Programming
- Probability for Data Science
- Introduction to Algorithms
- Introduction to Machine Learning

The Data Science Essentials Suite is for those who would like to learn the basic, in-demand skills required of an entry-level data scientist. You will be introduced to Python programming, one of the most popular and adaptable programming languages in industry today, as well as important concepts in probability, machine learning and algorithms.

In today’s world, a competent data scientist can use machine learning to analyze large volumes of data with algorithms and data-driven models, and is familiar with machine-learning techniques such as supervised and unsupervised learning, decision trees and logistic regression.

If you are not able to complete all courses in the bundle, you can retake a course for a fee of $45 (students) or $75 (non-students).

#### DELIVERY FORMAT

Each course has pre-recorded learning modules, self-assessment quizzes (ungraded) and a final graded online exam or assignment.

#### CREDENTIALS

If you successfully complete all four courses, you will receive the Data Science Essentials Microcertificate Badge and a Binghamton University-issued completion certificate. You must pass each course with a grade of 70% or higher to receive the credentials.

#### PREREQUISITES

Learners should have an interest in STEM (Science, Technology, Engineering, and Math) and should have completed the math courses typically covered in the first and second years of college programs.

#### COURSES

**Introduction to Python Programming**

This course introduces computer programming in Python for those who have little or no prior computer programming experience. Topics covered include an introduction and software installation, language and Jupyter Notebook basics, variables and data structures, algorithms, flow control and function definition, object-oriented programming, plotting and visualization, file input/output and manipulation, and the use of external modules. No coursebook is required.

**Probability for Data Science**

This introductory course in probability is designed to provide the necessary background for learning and understanding machine-learning and data-science concepts. It will introduce the concept of probability, provide an overview of discrete and continuous random variables, and describe how to compute expectation and variance. The course will also discuss specific distributions such as geometric, binomial, Poisson, uniform, exponential and normal distributions. No coursebook is required.

**Introduction to Algorithms**

This course provides an introduction to the design and analysis of algorithms. Upon successful completion of this course, students will be able to understand, explain and apply key algorithmic concepts and principles, including sorting algorithms (selection sort, bubble sort and insertion sort), time and space complexity analysis (big-O, omega and theta notations), recursive algorithms and the master theorem, divide-and-conquer algorithms (merge sort, quick sort and matrix multiplication), and trees (binary search trees, AVL trees and red-black trees). No coursebook is required.

**Introduction to Machine Learning**

This course provides an introduction to machine learning. Upon successful completion of this course, students will be able to understand, explain and apply key machine-learning concepts and algorithms, including a probability review, an introduction to the different types of machine learning and to supervised learning, the decision tree algorithm, the Naïve Bayes algorithm, the logistic regression algorithm, and machine-learning concepts such as regularization, overfitting and Laplace smoothing. No coursebook is required.

#### INSTRUCTORS

**Hiroki Sayama** (Introduction to Python Programming) is a Professor in the Department of Systems
Science and Industrial Engineering. His research interests include complex dynamical
networks, collective behaviors, social systems modeling, artificial life/chemistry,
mathematical biology, and computer and information sciences.

- Director, Center for Collective Dynamics of Complex Systems
- Director, Advanced Graduate Certificate Program in Complex Systems Science and Engineering
- Professor, Faculty of Commerce, Waseda University, Japan
- Affiliate, New England Complex Systems Institute
- Chancellor's Award for Excellence in Teaching (2015–2016)

**Arti Ramesh** (Introduction to Machine Learning) is a former assistant professor in the Department
of Computer Science at Binghamton University. She received her PhD in computer science
from the University of Maryland, College Park. Ramesh’s primary research interests
are in the fields of machine learning, data mining and natural language processing,
particularly statistical relational models and deep learning. Her research focuses
on building structured, fair, and interpretable models for reasoning about interconnectedness,
structure, and heterogeneity in networked data. She has published papers in peer-reviewed
conferences such as IJCAI, AAAI, ACL, WWW, ECAI, and DSAA. She has served as a TPC member or reviewer
for notable conferences such as ICML, IJCAI, AAAI, NIPS, SDM, and EDM.

**Anand Seetharam** (Probability for Data Science and Introduction to Algorithms) is a former assistant
professor in computer science in the Thomas J. Watson College of Engineering and Applied
Science at Binghamton University. Dr. Seetharam is broadly interested in the field
of computer networking. His research interests include wireless networks, information-centric
networks, ubiquitous computing, the Internet of Things (IoT) and smart grids.