Machine Learning
Set of flashcards: Details

| Flashcards | 22 |
|---|---|
| Language | German |
| Category | Computer Science |
| Level | University |
| Created / Updated | 30.04.2020 / 02.05.2020 |
| Weblink | https://card2brain.ch/box/20200430_machine_learning |
realizable
if the hypothesis space contains the true function
Trade-off: Accuracy vs. Generalization
Trade-off between complex hypotheses that fit the training data well and simpler hypotheses that generalize well.
Supervised learning (x,y)? y=f(x)? and h?
(x, y): input-output pair
y = f(x): the true function to be approximated
h: hypothesis, the learner's approximation of f
Inductive learning
learning a general function from specific examples (generalization)
Deductive learning
going from a known general function to a new rule that follows logically from it
Reinforcement learning
The agent learns how to act or behave when given occasional rewards; it must decide on its own whether it did the right thing (example: a taxi driver who gets a tip or no tip).
No free lunch theorem
There is no universally best model. A set of assumptions that works well for problem A does not necessarily work well for problem B. Different models lead to different algorithms.
Types of Machine Learning
Supervised: learning from input-output value pairs
Unsupervised: detecting patterns in input-only data (clustering)
Reinforcement: learning how to act from occasional rewards
What is Machine Learning
A set of methods that can automatically detect patterns in data and use them to perform prediction or other types of decision making.
Unsupervised Learning
Only input data is given.
The task is to detect patterns in the data (clustering).
There is no obvious error signal to measure against, since no target outputs are given.
Generalization Error
Expected value of the misclassification rate when averaged over future data; estimated on a test set.
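In symbols (a sketch, assuming 0/1 loss and an underlying data distribution D, which the card does not name):

```latex
\mathrm{GenErr}(h) = \mathbb{E}_{(x,y)\sim\mathcal{D}}\big[\mathbb{1}[h(x) \neq y]\big]
```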
Hypothesis Space
It contains all possible hypotheses that can be built with the chosen representation.
- Representation
- Evaluation
- Optimization
Representation: the classifier must be represented in some formal language; this defines the hypothesis space of the learner.
Evaluation: an objective function / scoring function is needed to distinguish good classifiers from bad ones.
Optimization: a method to search among the classifiers in the language for the highest-scoring one. The choice of optimization algorithm is key to performance.
Bias Variance Trade Off
If our model is too simple and has very few parameters, it may have high bias and low variance. On the other hand, if our model has a large number of parameters, it is going to have high variance and low bias. So we need to find the right balance without overfitting or underfitting the data.
This trade-off in complexity is why there is a trade-off between bias and variance. An algorithm can't be more complex and less complex at the same time.
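One common way to make this precise (a sketch, assuming squared loss, y = f(x) + noise with variance σ², and expectations taken over training sets; none of this is stated on the card):

```latex
\mathbb{E}\big[(y - \hat{h}(x))^2\big]
  = \underbrace{\big(\mathbb{E}[\hat{h}(x)] - f(x)\big)^2}_{\text{bias}^2}
  + \underbrace{\mathbb{E}\big[(\hat{h}(x) - \mathbb{E}[\hat{h}(x)])^2\big]}_{\text{variance}}
  + \underbrace{\sigma^2}_{\text{noise}}
```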
- True Error
- Empirical Error
True error: the error over the true data distribution; it is not observable.
Empirical error: the proportion of examples from a sample S drawn from the distribution D that are misclassified by h. It approximates the true error better with more data.
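In symbols (a sketch; the sample notation S = {(x_1, y_1), ..., (x_N, y_N)} is added here and not spelled out on the card):

```latex
\mathrm{err}_S(h) = \frac{1}{N}\sum_{i=1}^{N}\mathbb{1}[h(x_i) \neq y_i]
```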
Overfitting
Overfitting becomes more likely as the size of the hypothesis space and the number of input attributes grow, and less likely as we increase the number of training examples.
- The distributions of the training set and the test set are not the same.
AnTeDe: This is often a hint of overfitting (or that the data in the validation set does not match the underlying data in the training set). In this lab, however, the initial parameters are poorly chosen and the model also leaves room for improvement.
Decision Tree pruning
Pruning combats overfitting by eliminating nodes that are not clearly relevant.
Wrapper (model selection)
The wrapper enumerates models according to a parameter, e.g. size. For each size, it uses cross-validation on the learner to compute the average error rate on the training and test sets. The cross-validation procedure then selects the model size with the lowest validation-set error.
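A minimal sketch of the wrapper idea in Python, assuming scikit-learn is available and using tree depth as the "size" parameter (the card does not name a concrete learner or parameter):

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeClassifier

# toy data, illustrative only (not from the card)
X, y = make_classification(n_samples=300, n_features=10, random_state=0)

best_size, best_score = None, -np.inf
for size in range(1, 11):  # enumerate models by a "size" parameter (here: tree depth)
    learner = DecisionTreeClassifier(max_depth=size, random_state=0)
    # 5-fold cross-validation: average validation accuracy for this model size
    score = cross_val_score(learner, X, y, cv=5).mean()
    if score > best_score:
        best_size, best_score = size, score

print(f"selected size={best_size} with CV accuracy {best_score:.3f}")
```

In practice, the model size with the best cross-validation score would then be retrained on the full training set.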
Regularization
Explicitly penalize complex hypotheses: look for a function that is more regular, i.e. less complex. The total cost combines a loss function and a complexity function.
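As a sketch, with λ as a hypothetical hyperparameter weighting the two terms (the card does not name it):

```latex
\mathrm{Cost}(h) = \mathrm{EmpLoss}(h) + \lambda\,\mathrm{Complexity}(h)
```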
PAC: probably approximately correct.
The underlying principle is that any hypothesis that is seriously wrong will almost certainly be “found out” with high probability after a small number of examples, because it will make an incorrect prediction. Thus, any hypothesis that is consistent with a sufficiently large set of training examples is unlikely to be seriously wrong: that is, it must be probably approximately correct.
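A standard way to quantify this (a sketch, assuming a finite hypothesis space H, error tolerance ε, and failure probability δ, none of which the card introduces): a learner that returns a hypothesis consistent with

```latex
N \geq \frac{1}{\varepsilon}\left(\ln\frac{1}{\delta} + \ln\lvert\mathcal{H}\rvert\right)
```

examples will, with probability at least 1 − δ, output a hypothesis with true error at most ε.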
Batch / Batch Size
When all training samples are used to create one batch, the learning algorithm is called batch gradient descent. When the batch is the size of one sample, the learning algorithm is called stochastic gradient descent. When the batch size is more than one sample and less than the size of the training dataset, the learning algorithm is called mini-batch gradient descent.
- Batch Gradient Descent. Batch Size = Size of Training Set
- Stochastic Gradient Descent. Batch Size = 1
- Mini-Batch Gradient Descent. 1 < Batch Size < Size of Training Set
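A minimal NumPy sketch of the three regimes for linear regression (the toy data and names like `batch_size` are assumptions, not from the card); setting `batch_size` to `len(X)`, `1`, or something in between yields batch, stochastic, or mini-batch gradient descent respectively:

```python
import numpy as np

rng = np.random.default_rng(0)
X = rng.normal(size=(200, 3))            # toy data: 200 samples, 3 features (illustrative)
true_w = np.array([1.5, -2.0, 0.5])
y = X @ true_w + 0.1 * rng.normal(size=200)

def train(X, y, batch_size, lr=0.1, epochs=50):
    w = np.zeros(X.shape[1])
    n = len(X)
    for _ in range(epochs):
        idx = rng.permutation(n)         # shuffle the samples each epoch
        for start in range(0, n, batch_size):
            b = idx[start:start + batch_size]
            # gradient of the mean squared error on this batch only
            grad = 2 * X[b].T @ (X[b] @ w - y[b]) / len(b)
            w -= lr * grad
    return w

w_batch = train(X, y, batch_size=len(X))  # batch gradient descent
w_sgd   = train(X, y, batch_size=1)       # stochastic gradient descent
w_mini  = train(X, y, batch_size=32)      # mini-batch gradient descent
```

One full pass of the inner loop over all batches corresponds to one epoch (next card).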
Epoch
The number of epochs is a hyperparameter that defines the number of times that the learning algorithm will work through the entire training dataset.
One epoch means that each sample in the training dataset has had an opportunity to update the internal model parameters. An epoch is comprised of one or more batches. For example, as above, an epoch that has one batch is called the batch gradient descent learning algorithm.
You can think of a for-loop over the number of epochs where each loop proceeds over the training dataset. Within this for-loop is another nested for-loop that iterates over each batch of samples, where one batch has the specified “batch size” number of samples.
The number of epochs is traditionally large, often hundreds or thousands, allowing the learning algorithm to run until the error from the model has been sufficiently minimized. You may see examples of the number of epochs in the literature and in tutorials set to 10, 100, 500, 1000, and larger.
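A small worked example (the numbers are illustrative, not from the card): with 200 training samples, a batch size of 5, and 1,000 epochs, each epoch consists of 200 / 5 = 40 batches, so the model parameters are updated 40 times per epoch and 40 × 1,000 = 40,000 times over the whole run.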