Data Science

lesson01-lesson15


Set of flashcards Details

Flashcards 118
Language English
Category Computer Science
Level University
Created / Updated 15.06.2020 / 28.12.2022
Licencing Not defined
Weblink
https://card2brain.ch/box/20200615_data_science

Which trends are driving the data science "revolution"?

Mainly Big Data and Machine Learning.

Give a definition of Data science:

1.
Data science is about the extraction of useful
information and knowledge from large
volumes of data, in order to improve
business decision-making.

2.
Data science is an interdisciplinary subject with 3 key areas:
- Statistics
- Computer Science
- Domain expertise

Why is Data Science important?

In the past, data analysis was typically slow: it needed teams of statisticians, analysts, etc. to explore data manually.

Today, volume, velocity and variety make manual analysis impossible, but fast computers and good algorithms allow much deeper analyses than before.

--> data-driven decision making
--> base decisions on analysis of data, not intuition

Draw the data science process:

- Iterative process
- Non-sequential
- Early termination
- Established processes, e.g. CRISP-DM

Name the approximate decade in which Machine Learning, Deep Learning and Artificial Intelligence emerged:

  • AI 1950's
    Creation of first "intelligent" algorithms and programs
  • ML 1980's
    Statistical models and algorithms that can learn from data
  • DL 2010's
    Statistical models and algorithms inspired by neurones that can learn from data

Name the 3 main branches of ML and some of their applications:

  • Supervised Learning
    • Classification
      • Diagnostics
      • Customer Retention (Kundenbindung)
      • Image Classification
    • Regression
      • Estimating life expectancy
      • Population Growth Prediction
      • Market Forecasting
  • Unsupervised Learning
    • Clustering
      • Recommender System
      • Customer Segmentation
      • Targeted marketing
    • Dimensionality Reduction
      • Big data Visualisation
      • Structure Discovery
  • Reinforcement Learning
    • Game AI
    • Robot Navigation
    • Real-time decisions

Explain supervised learning:

In supervised learning, the training data consists of input / output pairs, and we train a function to map the inputs to the outputs. The predicted variable is either continuous, such as price, cost or weight (regression problems), or categorical, such as A, B or C, or dogs vs. cats (classification problems).
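
A minimal sketch of both cases in plain Python may help to memorise the idea. The data, the names `fit_line`, `predict` and `classify`, and the choice of least-squares and nearest-neighbour methods are assumptions made for illustration only; they are not taken from the lessons.

```python
# Regression: learn a function from input/output pairs (continuous output).
# Hypothetical training data: (flat size in m^2, price in thousands).
pairs = [(50.0, 150.0), (60.0, 180.0), (80.0, 240.0), (100.0, 300.0)]


def fit_line(pairs):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(pairs)
    mean_x = sum(x for x, _ in pairs) / n
    mean_y = sum(y for _, y in pairs) / n
    cov_xy = sum((x - mean_x) * (y - mean_y) for x, y in pairs)
    var_x = sum((x - mean_x) ** 2 for x, _ in pairs)
    slope = cov_xy / var_x
    intercept = mean_y - slope * mean_x
    return slope, intercept


slope, intercept = fit_line(pairs)


def predict(x):
    """Predict a continuous output for an unseen input."""
    return slope * x + intercept


# Classification: learn a function from input/label pairs (categorical output).
# Hypothetical training data: (body weight in kg, label).
labelled = [(3.0, "cat"), (4.5, "cat"), (20.0, "dog"), (30.0, "dog")]


def classify(x):
    """Return the label of the closest training example (1-nearest neighbour)."""
    nearest = min(labelled, key=lambda pair: abs(pair[0] - x))
    return nearest[1]


print(predict(70.0))   # price estimate for an unseen flat size
print(classify(5.0))   # label for an unseen weight
```

In both cases the labels in the training data are what makes the learning "supervised": the function is fitted so that its outputs match the known outputs.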

Explain unsupervised learning:

In unsupervised learning, no labels are available; insights are gained without prior knowledge.

In anomaly / outlier detection, the task is to find samples in a dataset that raise suspicion.
The problem is that you usually do not know what you are looking for.
The solution is to use statistics and characteristics of the dataset to find outliers.
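
One simple way to use the statistics of a dataset is a z-score rule, sketched below. The sample values, the function name `find_outliers` and the threshold of 2 standard deviations are assumptions for illustration; the lessons may use a different method.

```python
import statistics

# Hypothetical, unlabelled measurements; nothing marks which value is "wrong".
data = [10, 11, 9, 10, 12, 10, 11, 50]


def find_outliers(values, threshold=2.0):
    """Return values lying more than `threshold` standard deviations from the mean."""
    mean = statistics.mean(values)
    std = statistics.pstdev(values)  # population standard deviation
    if std == 0:
        return []  # all values identical, so nothing stands out
    return [v for v in values if abs(v - mean) / std > threshold]


print(find_outliers(data))
```

No label is needed here: a sample is flagged only because it is unusual relative to the rest of the dataset. Note that the mean and standard deviation are themselves distorted by extreme values, so more robust statistics such as the median are often preferred in practice.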