IAI | HSLU | Magdalena Picariello

Introduction to AI | HSLU



Flashcard set details

Flashcards 92
Language English
Category Computer Science
Level University
Created / Updated 17.10.2023 / 02.11.2023
Weblink
https://card2brain.ch/box/20231017_iai_%7C_hslu_%7C_magdalena_picariello

Case Study: Deployment: Sedimentum

●  Stream smartphone data

●  Store smartphone data

●  Combine existing and new model

●  Simplify model to bring it to the edge

●  Provide live inference from data streams

●  Generate initial map with smartphone

●  Generate periodic map updates

First steps: Sedimentum

Error analysis in ML 

Improvement ideas? 

●  Collect more data

●  Collect more diverse training set

●  Train algorithm longer with gradient descent

●  Try Adam instead of gradient descent

●  Try bigger network

●  Try smaller network

●  Try dropout

●  Add L2 regularization

●  Change network architecture
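A minimal sketch of three of these ideas in PyTorch (dropout, L2 regularization via weight decay, and Adam instead of plain gradient descent); the architecture and hyperparameter values are illustrative, not from the course:

```python
import torch
import torch.nn as nn

# Illustrative feed-forward classifier with dropout between layers.
model = nn.Sequential(
    nn.Linear(784, 256),   # input size is an assumption (e.g. 28x28 images)
    nn.ReLU(),
    nn.Dropout(p=0.5),     # "try dropout": randomly zero 50% of activations
    nn.Linear(256, 10),
)

# "Add L2 regularization": in PyTorch this is the weight_decay term;
# "try Adam instead of gradient descent" is a one-line optimizer swap.
optimizer = torch.optim.Adam(model.parameters(), lr=1e-3, weight_decay=1e-4)
```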

What problem does orthogonalization address in machine learning?

Orthogonalization addresses the challenge of having numerous options to explore when fine-tuning a model.

What is the main goal of orthogonalization?

The primary goal of orthogonalization is to determine which parameters to adjust to achieve specific effects without causing side effects on other components.

How can you define orthogonalization in the context of machine learning?

In machine learning, orthogonalization is a system design principle that ensures modifying one component of an algorithm does not create or propagate unintended side effects to other components.

 

What are the assumptions in ML? 

1. Fit training set well on cost function

2. Fit dev set well on cost function

3. Fit test set well on cost function

4. Perform well in the real world

How are train/dev/test sets split?

Old way vs. today: with small datasets, typical splits were 70/30 train/test or 60/20/20 train/dev/test. Today, with very large datasets, the dev and test sets only need to be big enough for reliable evaluation, so splits like 98/1/1 are common.
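A minimal sketch of the modern split with scikit-learn, assuming a large dataset; the 98/1/1 ratio and synthetic data are illustrative:

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Synthetic stand-in for a large dataset (shapes are illustrative).
X = np.random.randn(100_000, 20)
y = np.random.randint(0, 2, size=100_000)

# Modern splitting: carve off 2% for dev+test, then halve it -> ~98/1/1.
X_train, X_hold, y_train, y_hold = train_test_split(
    X, y, test_size=0.02, random_state=0)
X_dev, X_test, y_dev, y_test = train_test_split(
    X_hold, y_hold, test_size=0.5, random_state=0)

print(len(X_train), len(X_dev), len(X_test))  # 98000 1000 1000
```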

Why are different distributions across the dev and test sets problematic?

Different distributions can lead to issues because you may end up optimizing your model for the dev set and achieving poor performance on the test set, which is the true measure of your model's generalization.

What is the guideline for selecting a development set and test set in machine learning?

Choose a dev set and test set to reflect data you expect to get in the future and consider important to do well on.

An example of mismatched train/test distributions

How can handling a mismatch in train/test distribution benefit long-term performance in machine learning?

Mismatches between train/test distributions are often unavoidable, but ensuring that dev/test sets have the same distributions can lead to better long-term performance. This practice helps align the model evaluation with real-world scenarios, leading to more robust and reliable results.

What can be done if the model is not fitting well on the cost function during training?

  • Try using a bigger neural network to increase model capacity.
  • Experiment with other optimization algorithms to improve convergence.

 

What steps should be taken if the model is not fitting well on the cost function for the dev set?

  • Apply regularization techniques to prevent overfitting.
  • Increase the training set size to enhance model learning.

How to improve the model's performance if it is not fitting well on the cost function for the test set?

  • Consider using a bigger dev set to ensure it represents the target distribution.

What measures can be taken to ensure the model performs well in the real world when it is not fitting well on the cost function?

  • Evaluate if a change in the dev set is required to make it more representative of real-world data.
  • Investigate the possibility of modifying the cost function to better align the model with the desired task.

Bias and variance

High bias and high variance

Bias-variance analysis

Error = Bias² + Variance + Noise
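Spelled out for squared loss (a standard identity; here $f$ is the true function, $\hat{f}$ the learned predictor, and the expectation runs over training sets and the label noise $\varepsilon$ with variance $\sigma^2$, where $y = f(x) + \varepsilon$):

```latex
\mathbb{E}\!\left[(y-\hat{f}(x))^2\right]
= \underbrace{\left(\mathbb{E}[\hat{f}(x)]-f(x)\right)^2}_{\text{Bias}^2}
+ \underbrace{\mathbb{E}\!\left[\left(\hat{f}(x)-\mathbb{E}[\hat{f}(x)]\right)^2\right]}_{\text{Variance}}
+ \underbrace{\sigma^2}_{\text{Noise}}
```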

What does high variance indicate in a machine learning model?

High variance suggests a model is overfitting the training data, resulting in a large gap between train set and dev set errors.

low train error + high dev error + big gap 

What does high bias signify in a machine learning model?

High bias suggests a model is underfitting the data, causing small gaps between train set and dev set errors, but both have high error rates.

high train error + high dev error + small gap 

What does it mean when a model exhibits both high bias and high variance?

When a model has both high bias and high variance, it underfits in some regions of the input space while overfitting in others, resulting in high error rates and a large gap between train and dev errors.

High train error + high dev error + big gap

What is the desired scenario for a machine learning model?

The ideal scenario is low bias and low variance, where the model finds a balance between complexity and generalization, leading to accurate predictions.

Low train and low dev errors + small gap

Bias-variance analysis

When do you find avoidable bias and variance?

Avoidable bias is the gap between training error and human-level error (a proxy for Bayes error); variance is the gap between dev error and training error.

What can you do if you have high bias? 

-  Bigger neural network

-  Train longer

-  (NN architecture search)

What can you do if you don't have high bias but high variance?

-  More data

-  Regularization

-  (NN architecture search)
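A minimal sketch that ties this decision together, assuming you have train/dev error rates and a human-level error as a Bayes-error proxy (the threshold and suggested remedies are illustrative):

```python
def diagnose(train_err, dev_err, human_err, tol=0.02):
    """Split total error into avoidable bias and variance."""
    avoidable_bias = train_err - human_err   # gap to human-level performance
    variance = dev_err - train_err           # gap between dev and train
    if avoidable_bias > tol and avoidable_bias >= variance:
        return "focus on bias: bigger network, train longer"
    if variance > tol:
        return "focus on variance: more data, regularization"
    return "close to human-level: consider new targets"

# Example: 8% train error, 10% dev error, 1% human-level error.
print(diagnose(0.08, 0.10, 0.01))  # -> focus on bias: ...
```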

Bias & variance analysis

Comparing to human-level performance

Focus on Bias

  • A strategy to address high bias, often related to underfitting, by improving the model's capacity and complexity.

Focus on Variance

  • A strategy to address high variance, often related to overfitting, by reducing the model's complexity or adding regularization.

Avoidable Bias

  • The part of the error above human-level performance that can be reduced through better algorithms, more data, or better features.

Variance

  • The model's sensitivity to small fluctuations or noise in the data.

How do you evaluate a single idea?

Look at dev set examples to evaluate ideas. Error analysis:

●  Get ~ 100 mislabeled dev set examples

●  Count up how many are muffins


 

Improvement Ceiling

  • The maximum potential improvement that can be achieved by addressing the identified mislabeled examples.

In Scenario 1:

  • 50 out of 100 mislabeled examples were muffins.
  • The model error is 10%.
  • The improvement ceiling is calculated as 50% * 10% = 5%.

In Scenario 2:

  • 5 out of 100 mislabeled examples were muffins.
  • The model error is 10%.
  • The improvement ceiling is calculated as 5% * 10% = 0.5%.
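A minimal sketch of the whole procedure, tallying hand-labeled error categories and computing the ceiling; the tags and counts mirror the two scenarios above (the function name is illustrative):

```python
from collections import Counter

def improvement_ceiling(category_fraction, model_error):
    """Max error reduction if every mistake in the category were fixed."""
    return category_fraction * model_error

# ~100 mislabeled dev examples, tagged by hand with an error category.
mislabeled = ["muffin"] * 50 + ["other"] * 50          # Scenario 1
frac_muffin = Counter(mislabeled)["muffin"] / len(mislabeled)

print(improvement_ceiling(frac_muffin, 0.10))   # 0.05  -> at most 5%
print(improvement_ceiling(5 / 100, 0.10))       # 0.005 -> at most 0.5% (Scenario 2)
```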

What is the main goal of evaluating a single idea in machine learning?

The main goal is to determine whether the idea is worth pursuing further or investing effort into.

 

What are the key criteria for deciding to evaluate a single idea in a machine learning project?

You should consider evaluating a single idea in your machine learning project if the following criteria are met:

  • Your algorithm is not performing at Human Level Performance.
  • You have a specific idea or change to test.
  • You can manually examine the mistakes your algorithm is making to assess the idea's potential impact.

Q1: What is the purpose of evaluating multiple ideas for improving a machine learning model's performance?

Q2: How is the "Gap to HLP" calculated, and why is it significant?

Q3: What does the "Improvement ceiling" refer to, and how is it calculated?

Q4: Why is it essential to evaluate the performance gap in different categories, such as Dog, Wolf, Instagram, and Blurry, when improving a machine learning model?

A1: The purpose is to identify and prioritize ideas for enhancing the model's performance.

A2: The "Gap to HLP" represents the difference between the model's accuracy and human-level performance. It is crucial because it indicates how close the model is to achieving human-level performance.

 

A3: The "Improvement ceiling" represents the maximum potential improvement that can be achieved for a specific category. It is calculated by multiplying the gap to human-level performance by the percentage of data in that category.

A4: Evaluating the performance gap in different categories allows for a focused approach to improving the model's accuracy in areas where it lags behind human-level performance.

How to decide on most important categories to work on?

●  How much room for improvement there is

●  How frequently that category appears

●  How easy it is to improve accuracy in that category

●  How important it is to improve this category
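A minimal sketch quantifying the first two criteria, multiplying each category's gap to HLP by its share of the data; the categories echo the Dog/Wolf/Instagram/Blurry example, and all numbers are hypothetical:

```python
# Hypothetical per-category accuracies and data frequencies.
categories = {
    # name:     (model_acc, human_acc, share_of_data)
    "Dog":       (0.97, 0.99, 0.40),
    "Wolf":      (0.90, 0.98, 0.10),
    "Instagram": (0.85, 0.95, 0.30),
    "Blurry":    (0.72, 0.80, 0.20),
}

for name, (model, human, share) in categories.items():
    gap = human - model      # gap to human-level performance
    ceiling = gap * share    # weight by how often the category occurs
    print(f"{name:10s} gap={gap:.2f} ceiling={ceiling:.3f}")
```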

What is the general approach for tuning hyperparameters in a machine learning model?

The approach involves exploring hyperparameter choices by trying different values in a systematic manner.

Why is it essential to consider "lots of choices" when tuning hyperparameters?

Exploring a broad range of hyperparameter values is crucial to find the settings that optimize the model's performance.

Why does "order matter" in the process of tuning hyperparameters?

The sequence in which hyperparameters are tuned can impact the final result, as adjustments can interact with each other.

What is the strategy of "trying random values" when tuning hyperparameters?

It involves randomly sampling hyperparameter values to avoid potential biases and to ensure a thorough search.

What does the concept of "coarse to fine" mean in hyperparameter tuning?

It refers to starting with a wide exploration of hyperparameter values (coarse) and gradually narrowing down the search to find the optimal values (fine).
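A minimal sketch of coarse-to-fine random search for a learning rate, sampled on a log scale; the score function below is a placeholder for an actual training run:

```python
import numpy as np

rng = np.random.default_rng(0)

def sample_lr(low_exp, high_exp, n):
    """Sample learning rates log-uniformly between 10^low and 10^high."""
    return 10.0 ** rng.uniform(low_exp, high_exp, size=n)

# Coarse pass: random values over a wide range, e.g. 1e-5 .. 1e0.
coarse = sample_lr(-5, 0, 20)
# Train one model per value and keep the best (placeholder score here,
# which just pretends lr = 1e-3 is optimal).
scores = {lr: -abs(np.log10(lr) + 3) for lr in coarse}
best = max(scores, key=scores.get)

# Fine pass: zoom into a narrower log-window around the coarse winner.
center = np.log10(best)
fine = sample_lr(center - 0.5, center + 0.5, 20)
```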

What is the primary goal of data augmentation in machine learning?

Data augmentation aims to generate additional training examples to enhance the performance of a machine learning model. The goal is to create realistic examples that (1) the algorithm currently does poorly on, but (2) humans do well on.
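A minimal sketch of an image-augmentation pipeline using torchvision; the specific transforms and parameters are illustrative, not from the course:

```python
from torchvision import transforms

# Each transform yields a realistic variant that a human would still
# label correctly, while challenging the current model.
augment = transforms.Compose([
    transforms.RandomHorizontalFlip(p=0.5),
    transforms.RandomRotation(degrees=15),
    transforms.ColorJitter(brightness=0.2, contrast=0.2),
    transforms.RandomResizedCrop(size=224, scale=(0.8, 1.0)),
    transforms.ToTensor(),
])
# Applied on the fly during training, e.g.:
# dataset = torchvision.datasets.ImageFolder(path, transform=augment)
```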