StatUoB
This is stat
Flashcard set details
Cards | 44 |
---|---|
Language | English |
Category | Mathematics |
Level | University |
Created / Updated | 08.10.2017 / 29.10.2017 |
Web link | https://card2brain.ch/box/20171008_statuob |
What is the normal distribution, and what share of values lies within mean ± 1 sd, 2 sd and 3 sd?
- Bell Shaped, Symmetrical, Mean = Median = Mode
- Location is determined by the mean, μ
- Spread is determined by the standard deviation, σ
- The random variable has an infinite theoretical range
- μ ± 1σ encloses about 68% of X’s
- μ ± 2σ covers about 95% of X’s
- μ ± 3σ covers about 99.73% of X’s
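The three coverage figures can be checked numerically. The following is a minimal sketch using only Python's standard library; the helper name `coverage_within` is made up for this illustration. It relies on the identity P(|Z| <= k) = erf(k / sqrt(2)) for a standard normal Z.

```python
import math

def coverage_within(k: float) -> float:
    """Share of a normal distribution lying within k standard deviations of the mean."""
    # For a standard normal Z, P(|Z| <= k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

# Coverage for 1, 2 and 3 standard deviations around the mean.
coverage = {k: coverage_within(k) for k in (1, 2, 3)}

for k, share in coverage.items():
    print(f"mu ± {k} sigma covers {share:.2%}")
```

The result does not depend on μ or σ, because any normal variable can be standardized to Z = (X - μ) / σ.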
What are discrete probability distributions, and what are the rules?
Recap: Discrete variables --> variables producing outcomes that come from a counting process.
Rules (binomial distribution):
- A fixed number of observations, n
- e.g. 15 tosses of a coin
- e.g. 10 light bulbs taken from a warehouse
- Constant probability for the event of interest occurring (π) for each observation
- e.g. Probability of getting a tail is the same each time we toss the coin.
- Each observation is categorized as to whether the “event of interest” occurred or not.
- e.g. head or tail in each toss of a coin
- e.g. defective or not defective light bulb
- When the probability of the event of interest is represented as π, then the probability of the event of interest not occurring is 1 – π.
- Observations are independent
- The outcome of one observation does not affect the outcome of the other
Shape: negatively skewed (left-skewed) when p > 0.5; symmetric when p = 0.5 (e.g. n = 10 and p = 0.5); positively skewed (right-skewed) when p < 0.5. Here p is the probability π of the event of interest.
Mean = np
Var = np(1 - p)
Sd = sqrt(Var)
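As a rough illustration of these formulas, the sketch below computes the mean, variance and standard deviation for n = 10 and p = 0.5, and builds the probability of each possible count k. The function name `binomial_pmf` and the chosen values are assumptions for this example only.

```python
import math

def binomial_pmf(k: int, n: int, p: float) -> float:
    """P(X = k): probability of exactly k events of interest in n observations."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5  # example values, e.g. 10 tosses of a fair coin

mean = n * p               # np
var = n * p * (1 - p)      # np(1 - p)
sd = math.sqrt(var)        # sqrt(Var)

# Probabilities for k = 0, 1, ..., n; they sum to 1.
pmf = [binomial_pmf(k, n, p) for k in range(n + 1)]

print(mean, var, round(sd, 4))
```

With p = 0.5 the list `pmf` is symmetric around k = 5; changing p to a value above or below 0.5 shifts the mass and produces the skew described above.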
What is probability?
- A quantitative measure of uncertainty
- A measure of the strength of belief in the occurrence of an uncertain event
- A measure of the degree of chance or likelihood of occurrence of an uncertain event
- Measured by a number between 0 and 1 (or between 0% and 100%)
What are a set, the empty set, the universal set, complement, intersection, union, mutually exclusive sets and a partition in terms of probability?
Set: a collection of elements or objects of interest.
Empty set: a set containing no elements.
Universal set (S): a set containing all possible elements.
Complement (NOT): the complement of A is written Abar and is the set containing all elements of S not in A.
Intersection (AND): the set containing all elements in both A and B, written AnB.
Union (OR): the set containing all elements in A or B (or both), written AuB.
Mutually exclusive (disjoint) sets: sets having no elements in common, i.e. whose intersection is the empty set.
Partition: a collection of mutually exclusive sets which together include all possible elements, i.e. whose union is the universal set. AKA collectively exhaustive.
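These definitions map directly onto Python's built-in `set` type. The sketch below uses the outcomes of a six-sided die as the universal set; the sets `A` and `B` are chosen only for illustration.

```python
# Universal set: all outcomes of one roll of a six-sided die.
S = {1, 2, 3, 4, 5, 6}
A = {2, 4, 6}   # even numbers
B = {1, 2, 3}   # numbers up to three
empty = set()

complement_A = S - A    # elements of S not in A
intersection = A & B    # elements in both A and B
union = A | B           # elements in A or B (or both)

# A and its complement share no elements, so they are mutually exclusive.
mutually_exclusive = A.isdisjoint(complement_A)

# They are also collectively exhaustive, so together they partition S.
is_partition = mutually_exclusive and (A | complement_A) == S

print(complement_A, intersection, union, mutually_exclusive, is_partition)
```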
What is an experiment?
- Process that leads to one of several outcomes
- Each trial of an experiment has a single observed outcome
- The precise outcome of a random experiment is unknown before a trial.
What are events in probability?
Sample space or event set: the set of all possible outcomes (universal set) for a given experiment. --> Roll a six-sided die: S = {1,2,3,4,5,6}
Event: a collection of outcomes having a common characteristic. --> Even numbers: A = {2,4,6}
Probability of an event: the sum of the probabilities of the outcomes of which it consists. --> P(A) = P(2) + P(4) + P(6)
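The die example can be worked through as a short sketch. It assumes a fair die, so each outcome has probability 1/6; that assumption is not stated on the card. Exact fractions avoid rounding noise.

```python
from fractions import Fraction

# Sample space of one roll; a fair die is assumed here.
S = [1, 2, 3, 4, 5, 6]
P = {outcome: Fraction(1, len(S)) for outcome in S}

# Event A: the roll is an even number.
A = {2, 4, 6}

# P(A) = P(2) + P(4) + P(6)
p_A = sum(P[outcome] for outcome in A)

print(p_A)
```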
Permutations and combinations: which formulas?
Order important:
- with replacement: n^k
- without replacement: n!/(n-k)!
Order not important:
- with replacement: (n-1+k)!/((n-1)! k!) --> not important
- without replacement: (n k) = n!/(k! (n-k)!)
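The four counting formulas can be compared side by side. This sketch uses `math.perm` and `math.comb` (available from Python 3.8); the values n = 5 and k = 3 are arbitrary example choices.

```python
import math

n, k = 5, 3  # choose k items from n distinct items (example values)

# Order important
ordered_with_replacement = n**k                        # n^k
ordered_without_replacement = math.perm(n, k)          # n!/(n-k)!

# Order not important
unordered_with_replacement = math.comb(n - 1 + k, k)   # (n-1+k)!/((n-1)! k!)
unordered_without_replacement = math.comb(n, k)        # n!/(k! (n-k)!)

print(ordered_with_replacement, ordered_without_replacement,
      unordered_with_replacement, unordered_without_replacement)
```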
What is regression analysis used for?
Regression analysis is used for
- explaining the impact of changes in independent variables on the dependent variable.
- predicting the value of a dependent variable based on the value of the independent variable.
Dependent variable --> the variable we wish to predict or explain. --> Y
Independent variable --> the variable used to predict or explain the dependent variable. --> X
What is the coefficient of determination and how can we calculate it?
The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable.
The coefficient of determination is also called r-squared and is denoted r^2. It is calculated as r^2 = SSR / SST, the regression sum of squares divided by the total sum of squares.
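To make the calculation concrete, here is a minimal sketch that fits a simple least-squares line and computes r^2 as SSR / SST, where SSR is the regression (explained) sum of squares and SST the total sum of squares. The data points are invented for illustration and do not come from the card set.

```python
# Invented example data (not from the card set).
x = [1, 2, 3, 4, 5]
y = [2, 4, 5, 4, 5]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Least-squares slope and intercept for Yhat = b0 + b1 * X.
b1 = sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y)) / sum(
    (xi - x_mean) ** 2 for xi in x
)
b0 = y_mean - b1 * x_mean
y_hat = [b0 + b1 * xi for xi in x]

sst = sum((yi - y_mean) ** 2 for yi in y)               # total variation
ssr = sum((yh - y_mean) ** 2 for yh in y_hat)           # explained variation
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # unexplained variation

r_squared = ssr / sst
print(round(r_squared, 4))
```

For a least-squares line with an intercept, SST = SSR + SSE, so r^2 always lies between 0 and 1.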
What are the assumptions of regression (regression diagnostics)? --> When these are not satisfied, we have probably missed something in the model.
Linearity: the relationship between x and y is linear
- Plot the residuals against x and check that they scatter randomly around zero with no systematic (e.g. curved) pattern.
Independence:
- Error values are statistically independent
- Durbin-Watson test (dwtest()): p > 5% is needed, so that the null hypothesis of no autocorrelation is not rejected.
Normality
- Error values are normally distributed for any given value of x
- Examine the histogram of the residuals
- Construct a normal probability plot (QQ plot) of the residuals
- If the QQ plot approximately follows a straight line, the residuals can be regarded as normally distributed
- Use the Shapiro-Wilk normality test --> p > 5% is needed, so that the null hypothesis of normality is not rejected.
Equal variance (homoscedasticity)
- Use the Breusch-Pagan test against heteroskedasticity, e.g. bptest() {lmtest} in R
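For the independence check, the Durbin-Watson statistic itself is simple to compute by hand. The sketch below is an illustration in plain Python and only returns the statistic d, not the p-value that a full test such as R's dwtest() reports. The function name and the residual lists are made up for this example.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic d; values near 2 suggest no first-order autocorrelation."""
    # Sum of squared differences between consecutive residuals ...
    numerator = sum(
        (residuals[t] - residuals[t - 1]) ** 2 for t in range(1, len(residuals))
    )
    # ... divided by the sum of squared residuals.
    denominator = sum(e**2 for e in residuals)
    return numerator / denominator

# Made-up residuals: neighbours with the same sign (positive autocorrelation)
# versus residuals that flip sign every step (negative autocorrelation).
d_positive = durbin_watson([1, 1, -1, -1])
d_negative = durbin_watson([1, -1, 1, -1])

print(d_positive, d_negative)
```

The statistic ranges from 0 to 4: values well below 2 point to positive autocorrelation, values well above 2 to negative autocorrelation.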
If data are spread out, the range, variance and standard deviation will decrease (T/F)?
F
If the data values are all the same, the range, variance and standard deviation will be zero (T/F)?
T
The range, variance and standard deviation can never be negative (T/F)?
T
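The three true/false answers above can be checked with a small sketch using Python's `statistics` module. The data sets and the helper name `spread` are invented for illustration, and the population variance is used for simplicity.

```python
import statistics

def spread(data):
    """Return (range, population variance, population standard deviation)."""
    return (
        max(data) - min(data),
        statistics.pvariance(data),
        statistics.pstdev(data),
    )

tight = [9, 10, 11]     # values close together
wide = [0, 10, 20]      # same mean, more spread out
constant = [7, 7, 7]    # all values identical

print(spread(tight))
print(spread(wide))
print(spread(constant))
```

The more spread-out data set gives larger values for all three measures, the constant one gives zero for all three, and none of the measures can be negative because they are built from a difference max - min and from squared deviations.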
Which of the following statistics is not a measure of central tendency?
Which of the following statements about the median is NOT true?
In a perfectly shaped symmetrical distribution
Which one is a categorical variable?
Does a high correlation imply there is a causality between two variables? (T/F)
Which does not describe correlation accurately?
Which statement is true?
What is the error (residual)? Y = 40 and Yhat = 20
Cost = 25.2 - 4.4 Capacity
Which one best describes the equation above?
Can regression explain all the variation in the data?
For a given SST, which model do you prefer in regression modelling?
If you calculate SSR / SST, does it fairly describe how much of the variation in the data the model explains?