What is this thing called data science?
Learning objectives of the course
This module provides students with a hands-on introduction to the methods of data science, with an emphasis on applying these methods to solve business problems.
Lesson 1.1 - The Big Picture
By the end of this course you will:
• Know how to approach business problems from a data science perspective
• Understand the fundamental principles behind extracting useful knowledge from data
• Gain hands-on experience with mining data for insights
In this course you are going to learn several skills:
• Python programming and core libraries for data analysis, visualisation, and modelling
• Working with data: collecting, cleaning, transforming
• Creating and interpreting descriptive statistics
• Creating and interpreting data visualisations
• Creating statistical models for inference
• Practical machine learning
What is Data Science?
Data science is about extracting useful information and knowledge from large volumes of data in order to improve business decision-making.
It is an interdisciplinary subject with three key areas:
• Statistics
• Computer science
• Domain expertise
Why is Data Science Important?
In the past, data analysis was typically slow: it needed teams of statisticians, analysts, etc. to explore data manually.
Today: the volume, velocity, and variety of data make manual analysis impossible, but fast computers and good algorithms allow much deeper analyses than before.
--> data-driven decision making
--> base decisions on analysis of data, not intuition
How is data science performed?
• Iterative process
• Non-sequential
• Early termination
• Established processes, e.g. CRISP-DM (https://bit.ly/1tX6508)
Typical data science workflow
Raw data, little value --> Data exploration --> Model building and analysis --> Reporting, Automation
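The workflow above can be sketched in a few lines of Python. This is only an illustrative toy using the standard library: the data set (advertising spend versus sales), the function names `clean`, `explore`, `fit_line`, and `report`, and the choice of a simple least-squares line as the "model" are all assumptions made for this example, not part of the course material.

```python
import statistics

# Hypothetical raw data: (advertising spend, sales) as text, one value missing.
raw_rows = [
    ("10", "25"),
    ("20", "45"),
    ("30", ""),    # missing sales figure
    ("40", "85"),
    ("50", "105"),
]

def clean(rows):
    """Drop rows with missing fields and convert the rest to floats."""
    return [(float(x), float(y)) for x, y in rows if x and y]

def explore(data):
    """Data exploration: summarise the data with descriptive statistics."""
    xs, ys = zip(*data)
    return {
        "n": len(data),
        "mean_spend": statistics.mean(xs),
        "mean_sales": statistics.mean(ys),
    }

def fit_line(data):
    """Model building: least-squares fit of sales = slope * spend + intercept."""
    xs, ys = zip(*data)
    mean_x, mean_y = statistics.mean(xs), statistics.mean(ys)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in data)
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

def report(summary, model):
    """Reporting: turn the numbers into a sentence a decision-maker can use."""
    slope, _ = model
    return (
        f"Based on {summary['n']} clean records, each extra unit of spend "
        f"is associated with about {slope:.1f} extra units of sales."
    )

# Raw data --> data exploration --> model building --> reporting
data = clean(raw_rows)
summary = explore(data)
model = fit_line(data)
text = report(summary, model)
print(text)
```

In practice the steps are rarely run once in this order: a finding during exploration or modelling often sends you back to collect or clean more data, which is why the process is described as iterative and non-sequential.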