IAI | HSLU | Magdalena Picariello
Introduction to AI | HSLU
Set of flashcards Details
Flashcards | 92 |
---|---|
Language | English |
Category | Computer Science |
Level | University |
Created / Updated | 17.10.2023 / 02.11.2023 |
Weblink | https://card2brain.ch/box/20231017_iai_%7C_hslu_%7C_magdalena_picariello |
What is feature engineering in machine learning?
Feature engineering means designing input features that make the relevant signal explicit to the model. It is typically done with structured data.
Example: Restaurant recommender
Problem: vegetarians are frequently recommended restaurants with only meat options.
Possible features to add:
● Is the person vegetarian (based on past orders)?
● Does the restaurant have vegetarian options (based on the menu)?
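The two features from the restaurant example could be derived roughly as follows. This is a minimal sketch, not the course's implementation: the `MEAT_ITEMS` set, the function names, and the sample orders and menu are all hypothetical and chosen only for illustration.

```python
# Hypothetical list of meat dishes; a real system would need a proper dish taxonomy.
MEAT_ITEMS = {"beef burger", "chicken wings", "pork ribs"}


def is_vegetarian(past_orders):
    """Feature 1: True if the person has past orders and none of them is a meat dish."""
    return bool(past_orders) and not any(item in MEAT_ITEMS for item in past_orders)


def has_vegetarian_options(menu):
    """Feature 2: True if at least one menu item is not a meat dish."""
    return any(item not in MEAT_ITEMS for item in menu)


def build_features(past_orders, menu):
    """Combine both engineered features into one feature row for a recommender."""
    return {
        "person_is_vegetarian": is_vegetarian(past_orders),
        "restaurant_has_veg_options": has_vegetarian_options(menu),
    }


# Illustrative (made-up) person and restaurant.
row = build_features(
    past_orders=["falafel wrap", "veggie pizza"],
    menu=["beef burger", "pork ribs"],
)
print(row)
```

With both features available, a recommender can learn (or be told by a rule) not to pair a vegetarian person with a restaurant that has no vegetarian options.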
What is the role of data iteration in the feature engineering process?
Data iteration involves continuously revising and improving features based on the results of error analysis, user feedback, and benchmarking. It helps enhance the quality and relevance of features.
Why can error analysis be more challenging without a good baseline (HLP)?
Without a good baseline such as human-level performance (HLP), you lack a reliable reference point for comparison. The baseline shows how well a human performs the task, which is essential for identifying where the model falls short.
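One way to use an HLP baseline is to compare it with model accuracy per data category and look at where the gap is largest. The sketch below assumes per-category accuracies are already available; the category names and all numbers are invented for illustration and are not measurements from the course.

```python
# Illustrative numbers only; they are not real measurements.
hlp_accuracy = {"clear speech": 0.99, "car noise": 0.93, "cafe noise": 0.80}
model_accuracy = {"clear speech": 0.98, "car noise": 0.85, "cafe noise": 0.78}


def gap_to_hlp(hlp, model):
    """Return (category, gap) pairs sorted by gap to HLP, largest first."""
    gaps = {name: round(hlp[name] - model[name], 4) for name in hlp}
    return sorted(gaps.items(), key=lambda item: item[1], reverse=True)


for category, gap in gap_to_hlp(hlp_accuracy, model_accuracy):
    print(f"{category}: model is {gap:.0%} below HLP")
```

A large gap suggests room for improvement in that category, while a small gap suggests the model is already close to what a human achieves there.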
Q1: How can user feedback contribute to the data iteration process in feature engineering?
A1: User feedback is valuable for identifying which features users find relevant or missing. It provides insights that guide the selection and engineering of features to enhance user satisfaction and model performance.
Q2: Why is benchmarking against competitors a useful source of inspiration for feature engineering?
A2: Benchmarking against competitors helps identify areas where your model can gain a competitive edge. Analyzing competitor performance can inspire the development of unique features that set your model apart in the field.
Build your system quickly, then iterate
What directions can you take to improve a speech recognition system?
● Noisy background:
○ Cafe noise
○ Car noise
● Accented speech
● Far from microphone
● Young children
● Stuttering
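To decide which of these directions to pursue first, one common approach is to tag misrecognised clips by hand and count how often each tag occurs. The sketch below assumes such a hand-tagged log exists; the clip names and tags are hypothetical.

```python
from collections import Counter

# Hypothetical error-analysis log: each misrecognised clip is tagged by hand.
misrecognised_clips = [
    {"clip": "a.wav", "tags": ["car noise"]},
    {"clip": "b.wav", "tags": ["cafe noise", "accented speech"]},
    {"clip": "c.wav", "tags": ["car noise", "far from microphone"]},
    {"clip": "d.wav", "tags": ["car noise"]},
]


def tag_share(clips):
    """Fraction of misrecognised clips carrying each tag, most common first."""
    counts = Counter(tag for clip in clips for tag in clip["tags"])
    return [(tag, count / len(clips)) for tag, count in counts.most_common()]


for tag, share in tag_share(misrecognised_clips):
    print(f"{tag}: {share:.0%} of errors")
```

Because a clip can carry several tags, the shares do not need to add up to 100%.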
What is the typical mindset regarding data versus modeling?
- Traditional machine learning research is driven by improving performance on benchmark datasets.
- Researchers often work on a fixed dataset they download.
- This approach has led to significant progress in machine learning.
- In production systems, the dataset doesn't need to remain fixed.
- It's common to edit the training and test sets to enhance data quality for better system performance.
Questions:
- In what cases can focusing on optimizing the data and hyperparameters be more effective than code optimization?
- What does a machine learning system comprise in terms of components?
- How does taking a non-model-centric approach differ in terms of optimization?
- What is a crucial step during the modeling phase of machine learning?
- How can error analysis be utilized in the context of data improvement?
- Why is collecting more data not always the most efficient solution?
- How can error analysis contribute to a high-accuracy model?
Brainstorming framework
- Think about automating tasks rather than automating jobs, e.g. call center routing, radiologists
- What are the main drivers of business value?
- What are the main pain points in the business?
Key Questions in Technical diligence:
● Can we meet desired performance?
● Can we use pre-existing components?
● How much data is needed?
● What resources are needed?
● What are the dependencies?
● Are there any legal constraints?
Key Questions in Business diligence:
● Does it lower costs?
● Does it generate revenue?
● Does it enable launching a new product?
● Does it generate ENOUGH value?
Is the project technically feasible?
Questions to ask yourself:
Do other people solve similar problems?
What performance do they achieve?
With my skills, what performance can I achieve?
Do I need additional resources?
Can I do it in a reasonable time?
Can I make an AI project without big data?
Yes, you can make progress without big data.
Is having more data beneficial for AI projects?
Having more data never hurts and can often enhance AI performance.
What is the downside of relying on big data?
Gathering large volumes of data can be very expensive and resource-intensive.
Can limited data be valuable for AI projects?
Yes, you may be able to bring value to your AI project even with the limited data that you have. It depends on the specific project's goals and requirements.
Why are ML models from scientific publications often irreproducible?
● There is no obligation to publish code, model, and data
● Even if the model (inference code) is published, the training code may not be
● The exact training datasets are rarely available
● Even if they are, the preprocessing code may be missing
● Code quality is often poor (code does not compile, dependencies are missing, etc.)
● Described models are tweaked to optimize a particular metric
Machine Learning Workflow
Example: Echo / Alexa
● Collect data
○ Collect audio clips of people saying “Alexa”
○ Collect audio clips of people saying other stuff
● Train model
○ Classify audio clips (Alexa/Not Alexa)
○ Iterate many times till good enough
● Deploy model
○ Put ML software in the smart speaker
○ Get data back for failing cases
○ Maintain/update model
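The collect, train, deploy loop can be sketched with a toy stand-in for the wake-word model. This is not how a real smart speaker works: each audio clip is reduced here to a single made-up score (a hypothetical "similarity to the wake word"), and the "model" is just a threshold on that score. All names and numbers are assumptions for illustration.

```python
# Step 1: collect data. Each clip is a (score, is_wake_word) pair; scores are made up.
clips = [(0.9, True), (0.8, True), (0.7, True), (0.4, False), (0.3, False), (0.1, False)]


def accuracy(threshold, data):
    """Share of clips classified correctly by the rule 'score >= threshold'."""
    return sum((score >= threshold) == label for score, label in data) / len(data)


def train(data, target=0.95, step=0.1):
    """Step 2: raise the threshold step by step until the model is good enough."""
    threshold = 0.0
    while accuracy(threshold, data) < target and threshold < 1.0:
        threshold = round(threshold + step, 2)
    return threshold


def deploy(threshold, incoming):
    """Step 3: run on new clips and return the failing cases for later retraining."""
    return [(score, label) for score, label in incoming if (score >= threshold) != label]


threshold = train(clips)
failures = deploy(threshold, [(0.95, True), (0.45, True), (0.2, False)])
print(threshold, failures)
```

The failing cases returned by `deploy` correspond to the "get data back for failing cases" step: they would be added to the training data before the model is updated.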