What is Data Science?

Data science is an interdisciplinary field that uses scientific methods, processes, algorithms and systems to extract knowledge and insights from structured and unstructured data, and to apply that knowledge and those actionable insights across a broad range of application domains.

The Four Pillars of Data Science

While data scientists come from many different educational and professional backgrounds, most should be strong in, or ideally expert in, four fundamental areas. In no particular order of importance, these are:

  • Business/Domain

  • Mathematics (includes statistics and probability)

  • Computer science (e.g., software/data architecture and engineering)

  • Communication (both written and verbal)


There are other skills and expertise that are highly desirable as well, but these are the primary four in my opinion. These will be referred to as the data scientist pillars for the rest of this article.

In reality, people are often strong in one or two of these pillars, but rarely equally strong in all four. If you do happen to meet a data scientist who is truly an expert in all four, then you’ve essentially found yourself a unicorn.

Based on these pillars, my definition of a data scientist is a person who can leverage existing data sources, and create new ones as needed, to extract meaningful information and actionable insights. A data scientist does this through business domain expertise, effective communication and interpretation of results, and the use of any relevant statistical techniques, programming languages, software packages and libraries, and data infrastructure. The insights that data scientists uncover should be used to drive business decisions and to guide actions intended to achieve business goals.

What You’ll Learn

Here is an overview of what you’ll learn:

  • Introduction to Data Science
  • Understanding Exploratory Data Analysis
  • Machine Learning
  • Model Selection and Evaluation
  • Data Warehousing
  • Data Mining
  • Data Visualization
  • Cloud Computing
  • Business Intelligence
  • Storytelling with Data
  • Intro to Python
  • Communication and Presentation
  • and much more

