Flashcard deck StatUoB

Cards	44
Language	English
Category	Mathematics
Level	University
Created / Updated	08.10.2017 / 29.10.2017
Web link	https://card2brain.ch/box/20171008_statuob

What is the normal distribution, and what is the probability of falling within the mean ± 1 sd, 2 sd and 3 sd?

Bell Shaped, Symmetrical, Mean = Median = Mode
Location is determined by the mean, μ
Spread is determined by the standard deviation, σ
The random variable has an infinite theoretical range
μ ± 1σ encloses about 68% of X’s
μ ± 2σ covers about 95% of X’s
μ ± 3σ covers about 99.73% of X’s
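
The three coverage figures above can be checked numerically. This is a minimal sketch using only the Python standard library; the helper name prob_within is made up for illustration. It relies on the identity P(|Z| < k) = erf(k / sqrt(2)) for a standard normal Z.

```python
import math

def prob_within(k):
    """Probability that a normal variable lies within k standard deviations of its mean."""
    # For a standard normal Z: P(-k < Z < k) = erf(k / sqrt(2)).
    return math.erf(k / math.sqrt(2))

for k in (1, 2, 3):
    # Prints roughly 68.27%, 95.45% and 99.73%.
    print(f"mean +/- {k} sd covers {prob_within(k):.2%}")
```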

What is the student t distribution? What are degrees of freedom?

Degrees of freedom: the number of observations that are free to vary after the sample mean has been calculated (n - 1 for a single sample). The t distribution has heavier tails than the normal distribution when n is small, because σ is estimated from the sample, and it approaches the normal distribution as n gets bigger.
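
The idea of "free to vary" can be illustrated with a small sketch. The function name last_value and the numbers are assumptions chosen for illustration: once the sample mean is fixed, n - 1 values can be picked freely, but the last one is forced.

```python
def last_value(free_values, sample_mean):
    """Return the one remaining value that is forced by a fixed sample mean."""
    n = len(free_values) + 1          # total sample size
    # The n values must sum to n * mean, so the last value is determined.
    return n * sample_mean - sum(free_values)

free = [8.0, 12.0, 9.0, 11.0]         # 4 values chosen freely (n - 1 = 4 degrees of freedom)
forced = last_value(free, 10.0)       # the 5th value has no freedom left
print(forced)                         # 10.0
```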

What is a discrete probability distribution, and what are the rules of the binomial distribution?

Recap: Discrete variables --> variables producing outcomes that come from a counting process.

Rules:

A fixed number of observations, n
1. e.g. 15 tosses of a coin
2. e.g. 10 light bulbs taken from a warehouse
Constant probability for the event of interest occurring (π) for each observation
1. e.g. Probability of getting a tail is the same each time we toss the coin.
Each observation is categorized as to whether the “event of interest” occurred or not.
1. e.g. head or tail in each toss of a coin
2. e.g. defective or not defective light bulb
3. When the probability of the event of interest is represented as π, then the probability of the event of interest not occurring is 1 – π.
Observations are independent
1. The outcome of one observation does not affect the outcome of the other

Shape (with p = π): negatively skewed (left-skewed) when p > 0.5; symmetric when p = 0.5 (e.g. n = 10 and p = 0.5); positively skewed (right-skewed) when p < 0.5

Mean = np

Var = np(1-p)

Sd = sqrt(var)
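
The formulas above can be checked against the binomial probabilities themselves. This is a sketch under assumed values n = 10 and p = 0.5; binom_pmf is a helper written for this example, not a library function.

```python
import math

def binom_pmf(k, n, p):
    """P(X = k) for a binomial variable with n observations and event probability p."""
    return math.comb(n, k) * p**k * (1 - p) ** (n - k)

n, p = 10, 0.5                                   # assumed example values
pmf = [binom_pmf(k, n, p) for k in range(n + 1)]

mean = sum(k * pk for k, pk in enumerate(pmf))   # should match n * p
var = sum((k - mean) ** 2 * pk for k, pk in enumerate(pmf))  # should match n * p * (1 - p)
sd = math.sqrt(var)

print(mean, n * p)
print(var, n * p * (1 - p))
print(sd)
```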

What is probability?

A quantitative measure of uncertainty
A measure of the strength of belief in the occurrence of an uncertain event
A measure of the degree of chance or likelihood of occurrence of an uncertain event
Measured by a number between 0 and 1 (or between 0% and 100%)

What are a set, the empty set, the universal set, complement, intersection, union, mutually exclusive sets and a partition in terms of probability?

set: a collection of elements or objects of interest

empty set : a set containing no elements

universal set: a set containing all possible elements

complement (NOT): the complement of A is written Ā and is the set containing all elements of S not in A.

intersection (AND): a set containing all elements in both A and B, written A ∩ B.

union (OR): a set containing all elements in A or B (or both), written A ∪ B.

mutually exclusive (disjoint) sets: sets having no elements in common, i.e. whose intersection is the empty set.

Partition: a collection of mutually exclusive sets which together include all possible elements, whose union is the universal set. AKA collectively exhaustive.
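
The set operations above map directly onto Python's built-in set type. The sets S, A and B below are assumed example values chosen for illustration.

```python
S = {1, 2, 3, 4, 5, 6}          # universal set (assumed example)
A = {2, 4, 6}
B = {1, 2, 3}

complement_A = S - A            # elements of S not in A -> {1, 3, 5}
intersection = A & B            # elements in both A and B -> {2}
union = A | B                   # elements in A or B -> {1, 2, 3, 4, 6}

# A and its complement share no elements, so they are mutually exclusive.
mutually_exclusive = A.isdisjoint(complement_A)

# Together they also cover S, so {A, complement_A} is a partition of S.
is_partition = mutually_exclusive and (A | complement_A) == S

print(complement_A, intersection, union, mutually_exclusive, is_partition)
```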

What is an experiment?

Process that leads to one of several outcomes
Each trial of an experiment has a single observed outcome
The precise outcome of a random experiment is unknown before a trial.

What are events in probability?

Sample Space or Event Set: Set of all possible outcomes (universal set) for a given experiment. --> Roll a six-sided die: S = {1,2,3,4,5,6}

Event: Collection of outcomes having a common characteristic. --> Even numbers: A = {2,4,6}.

Probability of an event: Sum of the probabilities of the outcomes of which it consists --> P(A) = P(2) + P(4) + P(6)
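
The die example can be worked through in a few lines. This sketch assumes a fair die, so every outcome has probability 1/6; the card itself does not state the outcome probabilities.

```python
from fractions import Fraction

S = [1, 2, 3, 4, 5, 6]                       # sample space of one die roll
P = {outcome: Fraction(1, 6) for outcome in S}  # assumption: fair die

A = {outcome for outcome in S if outcome % 2 == 0}  # event "even number"

# P(A) = P(2) + P(4) + P(6)
p_A = sum(P[outcome] for outcome in A)
print(p_A)  # 1/2
```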

What are the basic rules of Probability (Basic and conditional probability)?

[image]

When are events statistically independent?

[image]

What is Bayes' theorem, and what is it used for?

Permutations and combinations: what are the formulas?

Order important:

With replacement: n^k
Without replacement: n!/(n-k)!

Order not important:

With replacement: (n-1+k)!/((n-1)! k!) --> not important
Without replacement: (n choose k) = n!/(k! (n-k)!)
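
The four counting formulas can be evaluated with the standard library. The values n = 5 and k = 3 are assumed for illustration.

```python
import math

n, k = 5, 3  # assumed example: choose k = 3 items from n = 5

ordered_with_replacement = n**k                        # n^k
ordered_without_replacement = math.perm(n, k)          # n! / (n-k)!
unordered_with_replacement = math.comb(n - 1 + k, k)   # (n-1+k)! / ((n-1)! k!)
unordered_without_replacement = math.comb(n, k)        # n! / (k! (n-k)!)

print(ordered_with_replacement,        # 125
      ordered_without_replacement,     # 60
      unordered_with_replacement,      # 35
      unordered_without_replacement)   # 10
```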

What is regression analysis used for?

Regression analysis is used for

explaining the impact of changes in independent variables on the dependent variable.
predicting the value of a dependent variable based on the value of an independent variable.

Dependent variable --> the variable we wish to predict or explain. --> Y

Independent variable --> the variable used to predict or explain the dependent variable. --> X

How is variation measured in regression?

What is the coefficient of determination and how can we calculate it?

The coefficient of determination is the portion of the total variation in the dependent variable that is explained by variation in the independent variable.

The coefficient of determination is also called r-squared and is denoted r^2. It is calculated as r^2 = SSR / SST, where SST = SSR + SSE.
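
As a rough sketch of how r^2 comes out of the sums of squares, the snippet below fits a simple least-squares line by hand. The data points are invented for illustration and all variable names are assumptions.

```python
# Invented example data: x is the independent variable, y the dependent variable.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 4.1, 5.9, 8.2, 9.8]

n = len(x)
x_mean = sum(x) / n
y_mean = sum(y) / n

# Least-squares slope and intercept for y_hat = intercept + slope * x.
slope = (sum((xi - x_mean) * (yi - y_mean) for xi, yi in zip(x, y))
         / sum((xi - x_mean) ** 2 for xi in x))
intercept = y_mean - slope * x_mean
y_hat = [intercept + slope * xi for xi in x]

sst = sum((yi - y_mean) ** 2 for yi in y)               # total variation
ssr = sum((yh - y_mean) ** 2 for yh in y_hat)           # explained variation
sse = sum((yi - yh) ** 2 for yi, yh in zip(y, y_hat))   # unexplained variation

r2 = ssr / sst
print(f"SST={sst:.3f} SSR={ssr:.3f} SSE={sse:.3f} r^2={r2:.4f}")
```

For a fixed SST, a smaller SSE means a larger SSR and therefore a larger r^2.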

What are the assumptions of regression (regression diagnostics)? --> When these are not satisfied, we probably missed something in the model.

Linearity: the relationship between x and y is linear

Plot the residuals against x (or the fitted values) and check that they scatter randomly around zero with no systematic pattern.

Independence:

Error values are statistically independent
Durbin-Watson test (dwtest()): the null hypothesis is no autocorrelation, so p > 5% is consistent with independent errors.

Normality

Error values are normally distributed for any given value of x
Examine the histogram of the residuals
Construct a normal probability plot of the residuals
- In the normal probability (QQ) plot, if the points approximately follow a straight line, the residuals can be regarded as normally distributed
Use the Shapiro-Wilk normality test: the null hypothesis is normality, so p > 5% is consistent with normally distributed errors.

Equal variance (homoscedasticity)

Use the Breusch-Pagan test against heteroskedasticity, e.g. bptest() from the lmtest package in R
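
For the independence check listed above, the Durbin-Watson statistic itself is simple to compute from the residuals. The sketch below computes only the statistic, not the p-value that dwtest() reports in R; the function name and the residual values are assumptions for illustration.

```python
def durbin_watson(residuals):
    """Durbin-Watson statistic: sum of squared successive differences over sum of squares."""
    numerator = sum((residuals[t] - residuals[t - 1]) ** 2
                    for t in range(1, len(residuals)))
    denominator = sum(e ** 2 for e in residuals)
    return numerator / denominator

# The statistic lies between 0 and 4. Values near 2 suggest no first-order
# autocorrelation, values near 0 positive and values near 4 negative autocorrelation.
print(durbin_watson([1.0, -1.0, 1.0, -1.0]))  # alternating residuals -> 3.0
print(durbin_watson([1.0, 1.0, 1.0, 1.0]))    # identical residuals -> 0.0
```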

If data are spread out, the range, variance and standard deviation will decrease (T/F)?

F

If the data values are all the same, the range, variance and standard deviation will be zero (T/F)?

T

The range, variance and standard deviation can never be negative (T/F)?

T

Which of the following statistics is not a measure of central tendency?

Mean

Median

Mode

Variance

Answer: Variance

Which of the following statements about the median is NOT true?

It is less affected by extreme values than the mean

It is a measure of central tendency

It is equal to variance

It is the same as the mode in a bell-shaped distribution

Answer: It is equal to variance

In a perfectly symmetrical distribution

the range equals the variance

the range equals the mean

the mean equals the median

the variance equals the standard deviation

Answer: the mean equals the median

Which one is a categorical variable?

Temperature

Salary

Shoe brand

Weight

Answer: Shoe brand

Does a high correlation imply causality between two variables? (T/F)

True

False

Answer: False

Which does not describe correlation accurately?

It indicates the relationship between two variables only.

It tests the linear relationship only.

The correlation equals the slope of the straight line fitted to the data.

Answer: The correlation equals the slope of the straight line fitted to the data.

Which statement is true?

Y is an estimate and Ŷ is an actual

Mean(Y) is an actual and Ŷ is an estimate

Y is an actual and Ŷ is an estimate

Answer: Y is an actual and Ŷ is an estimate

What is the error (residual) when Y = 40 and Ŷ = 20?

0.5

2

20

-10

Answer: 20 (residual = Y - Ŷ = 40 - 20)

Cost = 25.2 - 4.4 Capacity

Which one best describes the equation above?

Unit rise in capacity increases cost by 4.4.

Unit rise in capacity increases cost by 25.2.

-4.4 means that capacity and cost have a positive relationship.

25.2 can be thought of as the y-intercept.

Answer: 25.2 can be thought of as the y-intercept.

Can regression explain all the variation in the data?

True

False

Answer: False

For a given SST, which model do you prefer in regression modelling?

Smaller SSE and larger SSR

Larger SSE and smaller SSR

Answer: Smaller SSE and larger SSR

If you calculate SSR / SST, does it fairly describe how much of the variation in the data a model explains?

True

False

Answer: True
