# Flashcards

88 cards, 1 learner, English, university level, 21.07.2018 / 27.08.2018, licensing: not specified
0 exact answers, 88 text answers, 0 multiple-choice answers

Axioms of probability (Axioms of Kolmogorov)

Probability $$P:\Omega\rightarrow\mathbb{R}$$ (the probability P is a map from the event space to the real numbers)

Given events A in an event space $$\Omega$$, i.e., $$A\subset \Omega$$ (A is a subset of Omega; Omega is a superset of A)

1. $$0 \leq P(A) \leq 1$$
2. $$P(\Omega)=1$$
3. given $$A_i\cap A_j =\emptyset$$ for $$i \neq j$$, then $$P(\bigcup_iA_i)=\sum_i P(A_i)$$  (if the subsets are pairwise disjoint, i.e. every pairwise intersection is empty, then the probability of the union is just the sum of the probabilities of the subsets)

consequences of the Axioms of Kolmogorov

1. $$P(\bar{A})=1-P(A)$$, where $$\bar{A}$$ is the complement of A
2. $$P(\emptyset)=0$$
3. if A and B are exclusive, then $$P(A\cup B)=P(A)+P(B)$$
4. in general $$P(A\cup B)=P(A)+P(B)-P(A\cap B)$$ (additive law of probability)
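
The consequences above can be checked by counting outcomes. The following is a minimal sketch, assuming a single roll of a fair six-sided die; the events `A` and `B` and the helper `prob` are made up for illustration.

```python
from fractions import Fraction

# Assumed example: one roll of a fair six-sided die.
omega = {1, 2, 3, 4, 5, 6}

def prob(event):
    """P(event) when all outcomes in omega are equally likely."""
    return Fraction(len(event & omega), len(omega))

A = {2, 4, 6}   # "even number"
B = {4, 5, 6}   # "greater than 3"

complement_A = omega - A
lhs = prob(A | B)                          # P(A ∪ B)
rhs = prob(A) + prob(B) - prob(A & B)      # additive law

print(prob(complement_A) == 1 - prob(A))   # consequence 1
print(prob(set()) == 0)                    # consequence 2
print(lhs == rhs)                          # consequence 4
```

Because A and B overlap in {4, 6}, simply adding P(A) and P(B) would count that overlap twice, which is what the subtracted term corrects.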

Independent events

Two events are independent when the following is valid:

$$P(A\cap B)=P(A)*P(B)$$

Conditional probability of two events

The conditional probability of an event A, given an event B is:

$$P(A|B)=P(A\cap B)/P(B)$$

If A and B are independent, then:

$$P(A|B)=P(A)$$
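
A small counting example may make the two definitions concrete. This sketch assumes two fair dice; the events (first die even, sum equal to 7) are chosen for illustration because they happen to be independent.

```python
from fractions import Fraction
from itertools import product

# Assumed example: two fair dice, all 36 ordered outcomes equally likely.
omega = set(product(range(1, 7), repeat=2))

def prob(event):
    return Fraction(len(event), len(omega))

A = {w for w in omega if w[0] % 2 == 0}      # first die is even
B = {w for w in omega if w[0] + w[1] == 7}   # sum equals 7

p_A_given_B = prob(A & B) / prob(B)              # conditional probability P(A|B)
independent = prob(A & B) == prob(A) * prob(B)   # product rule for independence

print(p_A_given_B, prob(A), independent)
```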

Bayes' theorem

$$P(A_j|B)=\frac{P(B|A_j)P(A_j)}{P(B)}$$
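
As a worked illustration, the sketch below applies the theorem to a diagnostic test. All numbers are hypothetical, and it assumes the events $$A_j$$ ("ill", "healthy") are mutually exclusive and cover all cases, so that $$P(B)$$ can be obtained from the law of total probability.

```python
from fractions import Fraction

# Hypothetical numbers, chosen only for illustration.
prior = {"ill": Fraction(1, 100), "healthy": Fraction(99, 100)}   # P(A_j)
# P(B | A_j), where B = "test is positive"
likelihood = {"ill": Fraction(95, 100), "healthy": Fraction(5, 100)}

# Law of total probability: P(B) = sum_j P(B | A_j) P(A_j)
p_B = sum(likelihood[a] * prior[a] for a in prior)

# Bayes' theorem for every A_j
posterior = {a: likelihood[a] * prior[a] / p_B for a in prior}

print(posterior["ill"], float(posterior["ill"]))
```

Even with a fairly accurate test, the posterior probability of being ill stays small here because the prior $$P(A_j)$$ is small.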

What types of random variables exist?

1. discrete: number of wet days
2. continuous (not really!): temperature
3. categorical: heads or tails?

Cumulative distribution function (CDF)

$$F_X(x)=P(X\leq x)$$ continuous random variables

$$F_X(x)=\sum_{x_i\leq x}P(X=x_i)$$  discrete random variables

1. $$F_X$$ monotonically increasing ($$0\leq F_X(x)\leq 1$$)
2. $$\lim_{x\rightarrow -\infty}F_X(x)=0,\;\;\lim_{x\rightarrow \infty}F_X(x)=1$$
3. $$P(X \in [a,b])=P(a\leq X\leq b)=F_X(b)-F_X(a)$$ (for continuous X; for discrete X the difference gives $$P(a< X\leq b)$$)
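
For a discrete variable the CDF is a step function that can be built by summing probability mass. A minimal sketch, assuming a fair die as the example distribution (the names `pmf` and `cdf` are illustrative):

```python
from fractions import Fraction

# Assumed example: fair die, P(X = k) = 1/6 for k = 1..6.
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def cdf(x):
    """F_X(x) = P(X <= x): sum the probability mass of all values x_i <= x."""
    return sum((p for xi, p in pmf.items() if xi <= x), Fraction(0))

print(cdf(0), cdf(3), cdf(3.5), cdf(6))   # monotonically increasing from 0 to 1

# Difference of CDF values: for a discrete variable this is P(2 < X <= 5)
print(cdf(5) - cdf(2))
```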

Probability distribution function

Probability mass function (only for discrete variables!):

$$f_X(x)=P(X=x)$$

Probability density function (PDF, for continuous random variables!):

$$f_X(x)=\frac{dF_X(x)}{dx}$$

Properties:

1. $$f_X(x)\geq 0$$
2. $$\int f_X(x)\,dx=1\;\text{(cont.)},\;\;\sum_{x\in \Omega}f_X(x)=1\;\text{(discrete)}$$
3. $$P(X\in [a,b])=P(a\leq X\leq b)=\int_a^b f_X(x)\,dx=F_X(b)-F_X(a)$$ (cont.)

Independent random variables

continuous random variables:

Random variables X and Y are independent if for any x and y:

$$P(X\leq x, Y\leq y)=P(X\leq x)P(Y\leq y)=F(x)G(y)$$

where F(x) and G(y) are the corresponding CDFs of X and Y.

discrete random variables:

Random variables X and Y are independent if for any $$x_i$$ and $$y_j$$:

$$P(X\leq x_i,Y\leq y_j)=P(X\leq x_i)P(Y\leq y_j)$$

Define the expressions Quantile, Percentile, Median and Quartile

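Quantile: the p-quantile $$x_p$$ is the value below which a fraction p of the probability lies, i.e. $$F_X(x_p)=p$$

Percentile: a quantile expressed as a percentage; the 0.2-quantile is the 20th percentile

Quartiles: the 25th and 75th percentiles (lower and upper quartile)

Median: the 0.5-quantile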
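
For a finite sample these quantities can be read off the sorted data. The sketch below uses one simple convention (the smallest data value whose empirical CDF reaches p); other conventions interpolate between values and may give slightly different numbers. The helper `quantile` and the data are hypothetical.

```python
import math

def quantile(sample, p):
    """Empirical p-quantile: smallest value x with (fraction of data <= x) >= p.
    This is one of several common conventions."""
    if not 0 < p <= 1:
        raise ValueError("p must be in (0, 1]")
    ordered = sorted(sample)
    k = math.ceil(p * len(ordered))   # 1-based rank of the quantile
    return ordered[k - 1]

data = [7, 1, 3, 9, 5, 2, 8, 4, 6, 10]   # made-up sample

q20 = quantile(data, 0.2)       # 0.2-quantile = 20th percentile
median = quantile(data, 0.5)    # 0.5-quantile
quartiles = (quantile(data, 0.25), quantile(data, 0.75))

print(q20, median, quartiles)
```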

What is a moment?

The nth moment $$\mu_n$$ of a probability density $$f_X(x)$$ is defined as:

• (cont.):  $$\mu_n=E(X^n)=\int x^n*f_X(x)dx$$
• (discr.):  $$\mu_n=E(X^n)=\sum x^n_k * f_X(x_k)$$

The nth central moment $$\mu'_n$$ of a probability density $$f_X(x)$$ is defined with respect to the first moment ($$\mu$$) as

$$\mu_n'=E((X-\mu)^n)=\int (x-\mu)^n * f_X(x)dx$$

How are the expected value and the variance defined?

The expected value, also called the mean, is defined as the first moment:

$$\mu=E(X)=\int x*f_X(x)dx$$

The expected value can be interpreted physically as the center of mass of the distribution.

The variance is defined as the second central moment:

$$\sigma^2=Var(X)=E((X-\mu)^2)=E(X^2)-\mu^2$$

The variance gives the spread around the expected value.
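
Both ways of writing the variance can be compared on a small discrete example. A minimal sketch, assuming a fair die and using the discrete form of the moment definition (the helper `moment` is illustrative):

```python
from fractions import Fraction

# Assumed example: fair die with probability mass f_X(k) = 1/6.
pmf = {k: Fraction(1, 6) for k in range(1, 7)}

def moment(n):
    """n-th moment E(X^n) = sum_k x_k^n * f_X(x_k)."""
    return sum(x**n * p for x, p in pmf.items())

mu = moment(1)                                                 # expected value
var_central = sum((x - mu) ** 2 * p for x, p in pmf.items())   # E((X - mu)^2)
var_shortcut = moment(2) - mu**2                               # E(X^2) - mu^2

print(mu, var_central, var_shortcut)
```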

Explain Skewness!

The skewness is the third central moment divided by the third power of the standard deviation:

$$\gamma_1=E\left[\left(\frac{X-\mu}{\sigma }\right)^3 \right]=\frac{E[(X-\mu )^3]}{(E[(X-\mu )^2])^{3/2}}$$

What is the fourth central moment?

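Kurtosis (a measure of peakedness): the fourth central moment divided by the fourth power of the standard deviation, $$E\left[\left(\frac{X-\mu}{\sigma}\right)^4\right]=\frac{E[(X-\mu)^4]}{\sigma^4}$$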

The kurtosis of any univariate normal distribution is 3. It is common to compare the kurtosis of a distribution to this value. Distributions with kurtosis less than 3 are said to be platykurtic, although this does not imply the distribution is "flat-topped" as sometimes reported. Rather, it means the distribution produces fewer and less extreme outliers than does the normal distribution. An example of a platykurtic distribution is the uniform distribution, which does not produce outliers.

The excess (excess kurtosis) is the difference between the kurtosis of the distribution under consideration and the kurtosis of the density of a normally distributed random variable.
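
Skewness and kurtosis are both standardized central moments, so one helper can compute either. This sketch assumes discrete distributions given as `{value: probability}`; the two example distributions are made up, and `standardized_moment` is a hypothetical helper name.

```python
from fractions import Fraction

def standardized_moment(pmf, n):
    """E[((X - mu) / sigma)^n] for a discrete distribution {value: probability}."""
    mu = sum(x * p for x, p in pmf.items())
    var = sum((x - mu) ** 2 * p for x, p in pmf.items())
    central = sum((x - mu) ** n * p for x, p in pmf.items())
    return float(central) / float(var) ** (n / 2)

# Assumed examples: a fair die (discrete uniform) and a right-skewed toy distribution.
die = {k: Fraction(1, 6) for k in range(1, 7)}
skewed = {0: Fraction(7, 10), 1: Fraction(2, 10), 10: Fraction(1, 10)}

skew_die = standardized_moment(die, 3)      # symmetric distribution: zero skewness
kurt_die = standardized_moment(die, 4)      # below 3: platykurtic
excess_die = kurt_die - 3                   # excess relative to the normal distribution
skew_skewed = standardized_moment(skewed, 3)

print(skew_die, kurt_die, excess_die, skew_skewed)
```

The die behaves like the uniform distribution mentioned above: its kurtosis is below 3, so its excess is negative.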

What is the Mode?

The mode is the value that appears most often in a set of data. For a continuous probability distribution it is the peak.

What are the probability density and the cumulative distribution function of the uniform distribution?

probability density function (PDF):

$$f_X(x)=\frac{1}{b-a}\;\;\text{for}\;x\in[a,b],\;\;\;f_X(x)=0\;\;\text{otherwise}$$

cumulative distribution function (CDF):

$$F_X(x)=\left\{ \begin{array}{ll} 0 & \text{for}\;x< a\\ \frac{x-a}{b-a} & \text{for}\;a\leq x< b\\ 1 & \text{for}\;x\geq b \end{array} \right.$$

Normal (Gaussian) distribution

$$f_X(x; \mu,\sigma)=\frac{1}{\sqrt{2\pi\sigma^2}}\exp \left(-\frac{(x-\mu) ^2}{2\sigma^2}\right)=\mathcal{N}(\mu,\sigma)$$
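
The density can be written out directly and compared against Python's standard-library implementation. In this sketch `normal_pdf` is an illustrative helper, the parameter values are arbitrary, and the second parameter is taken to be the standard deviation $$\sigma$$, as in the formula above.

```python
import math
from statistics import NormalDist

def normal_pdf(x, mu, sigma):
    """Density of the normal distribution with mean mu and standard deviation sigma."""
    return math.exp(-(x - mu) ** 2 / (2 * sigma**2)) / math.sqrt(2 * math.pi * sigma**2)

mu, sigma = 1.0, 2.0   # arbitrary illustration values
x = 2.5

own = normal_pdf(x, mu, sigma)
reference = NormalDist(mu, sigma).pdf(x)   # standard-library implementation

print(own, reference)
```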

What is intermittency?

A signal is said to be intermittent if rare events of large magnitude are separated by long periods with events of low magnitude. Spatial intermittency implies that the signal displays localized regions with events of large magnitude, and wide areas with events of low magnitude.

• PDFs of intermittent flows are not Gaussian.
• Kurtosis is often used as a measure of intermittency (high intermittency means high kurtosis)

Can the variance be zero?

Yes, then:

• the distribution consists of a single constant value
• mean, median and mode are the same

Tell me a distribution where no moments exist:

The Cauchy distribution

• expected value, variance and standard deviation do not exist, since the integrals that define them do not converge.

Law of large numbers

Given a sequence of independent and identically distributed random variables $$X_1,X_2,...$$ with mean $$\mu$$, the sample average converges to $$\mu$$:

$$\frac{1}{n}\sum^{n}_{i=1}X_i\rightarrow \mu\;\;\text{for}\;\;n\rightarrow \infty$$
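
A quick simulation can illustrate the statement. The sketch assumes independent rolls of a fair die (true mean 3.5); the sample sizes and the seed are arbitrary choices.

```python
import random

random.seed(0)   # fixed seed so the run is reproducible

def mean_of_die_rolls(n):
    """Average of n simulated rolls of a fair die (true mean 3.5)."""
    return sum(random.randint(1, 6) for _ in range(n)) / n

means = {n: mean_of_die_rolls(n) for n in (10, 1_000, 100_000)}

for n, m in means.items():
    print(n, m, abs(m - 3.5))   # the deviation from 3.5 tends to shrink as n grows
```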

Central limit theorem

Given a sequence of independent and identically distributed random variables $$X_1,...,X_n$$ with expected value $$\mu$$ and variance $$\sigma^2$$, the distribution of $$S_n=\frac{1}{n}(X_1+...+X_n)$$ is approximately normal with mean $$\mu$$ and variance $$\frac{1}{n}\sigma^2$$, or

$$\sqrt{n}\left(\frac{1}{n}\sum\limits^n_{i=1}X_i-\mu \right)\xrightarrow d\mathcal{N}(0,\sigma)$$

The symbol $$\xrightarrow d$$ reads "converges in distribution to".

How large must n be chosen? That depends on the underlying distribution of the $$X_i$$.
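
The theorem can be explored numerically by simulating many sample means. This sketch assumes $$X_i$$ uniform on [0, 1] (so $$\mu=1/2$$ and $$\sigma^2=1/12$$); the values of `n` and `repeats` are arbitrary.

```python
import random
import statistics

random.seed(1)

n = 30             # sample size per mean (assumed)
repeats = 20_000   # number of simulated sample means (assumed)

# X_i ~ Uniform(0, 1): mu = 1/2, sigma^2 = 1/12
sample_means = [
    sum(random.random() for _ in range(n)) / n
    for _ in range(repeats)
]

mean_of_means = statistics.fmean(sample_means)
var_of_means = statistics.variance(sample_means)
print(mean_of_means, var_of_means, (1 / 12) / n)   # compare with mu and sigma^2 / n

# Fraction of means within one standard deviation of mu;
# for a normal distribution this is about 0.68.
sd = ((1 / 12) / n) ** 0.5
within = sum(abs(m - 0.5) <= sd for m in sample_means) / repeats
print(within)
```

Rerunning with a strongly skewed underlying distribution is one way to see why the required n depends on the distribution.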

Chebyshev's inequality

For any random variable X with finite variance and any c>0:

$$P(|X-E(X)|\geq c)\leq\frac{Var(X)}{c^2}$$
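
The bound is usually far from tight, which a small exact computation shows. A minimal sketch, assuming a fair die and c = 2:

```python
from fractions import Fraction

pmf = {k: Fraction(1, 6) for k in range(1, 7)}   # assumed example: fair die

mu = sum(x * p for x, p in pmf.items())
var = sum((x - mu) ** 2 * p for x, p in pmf.items())

c = 2
exact = sum(p for x, p in pmf.items() if abs(x - mu) >= c)   # P(|X - E(X)| >= c)
bound = var / c**2                                           # Var(X) / c^2

print(exact, bound, exact <= bound)
```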

Empirical vs. theoretical quantities

Quantities estimated from a given sample are often referred to as empirical or sample quantities, e.g. $$\hat{\mu}$$.

The corresponding true or model quantities are often referred to as theoretical quantities, e.g. $$\mu$$.

Given a sample $$x_1,...,x_N$$ of a random variable X, consider a parameter $$\Theta$$ of X, e.g., the mean $$\mu$$.

Then the estimator $$\hat{\Theta}$$ is a function of the sample (i.e., a statistic) that assigns to the sample a value whose distribution depends on (and is concentrated close to) $$\Theta$$.

Estimators: What are the formulas for the sample mean, sample variance (known mean and not known mean) and sample standard deviation?

sample mean: $$\bar{x}=\widehat{\mu}=\frac{1}{N}\sum\limits^{N}_{i=1}x_i$$

sample variance: $$\widehat{Var}(x)=\frac{1}{N-1}\sum\limits^{N}_{i=1}(x_i-\bar{x})^2$$

sample variance with known $$\mu$$: $$\widehat{Var}(x)=\frac{1}{N}\sum\limits^{N}_{i=1}(x_i-\mu)^2$$

standard deviation: $$\widehat{s}=\sqrt{\widehat{Var}(x)}$$
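
The estimators translate directly into code. In this sketch the function names and the data are made up; when the true mean is known, it is used in place of the sample mean and the sum is divided by N instead of N - 1.

```python
import math

def sample_mean(xs):
    return sum(xs) / len(xs)

def sample_variance(xs, mu=None):
    """Sample variance. With a known true mean mu, divide by N;
    otherwise use the sample mean and divide by N - 1."""
    n = len(xs)
    if mu is not None:
        return sum((x - mu) ** 2 for x in xs) / n
    x_bar = sample_mean(xs)
    return sum((x - x_bar) ** 2 for x in xs) / (n - 1)

data = [2, 4, 4, 4, 5, 5, 7, 9]   # made-up sample

x_bar = sample_mean(data)
var_unknown_mu = sample_variance(data)
var_known_mu = sample_variance(data, mu=5.0)   # pretending the true mean is known to be 5
s = math.sqrt(var_unknown_mu)                  # sample standard deviation

print(x_bar, var_unknown_mu, var_known_mu, s)
```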

Given independent random variables X and Y with expected values $$\mu_X$$ and $$\mu_Y$$ and variances $$\sigma^2_X$$ and $$\sigma^2_Y$$.

How are the expected value and variance of Z calculated if

1. $$Z=\alpha+X$$
2. $$Z=\alpha X$$
3. $$Z=X+Y$$
4. $$Z=X*Y$$

1. $$Z=\alpha+X$$
• $$\mu_Z=\alpha+\mu_X,\;\;\;\sigma_Z^2=\sigma_X^2$$
2. $$Z=\alpha X$$
• $$\mu_Z=\alpha\mu_X,\;\;\;\sigma_Z^2=\alpha^2\sigma^2_X$$
3. $$Z=X+Y$$
• $$\mu_Z=\mu_X+\mu_Y,\;\;\;\sigma_Z^2=\sigma^2_X+\sigma^2_Y$$
4. $$Z=X*Y$$
• $$\mu_Z=\mu_X*\mu_Y,\;\;\;\sigma_Z^2=\sigma^2_X\sigma^2_Y+\sigma^2_X\mu^2_Y+\sigma^2_Y\mu^2_X$$

Note: the density function and the cumulative distribution function of composed random variables (such as X+Y or XY) are in general not easy to determine, although mean and variance can easily be determined.
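
For small discrete cases the full distribution of Z can still be enumerated, which allows the rules for mean and variance to be checked exactly. A sketch, assuming X and Y are two independent fair dice; `mean_var` and `combine` are hypothetical helpers.

```python
from fractions import Fraction
from itertools import product

die = {k: Fraction(1, 6) for k in range(1, 7)}   # assumed: X and Y are independent fair dice

def mean_var(dist):
    """Mean and variance of a discrete distribution {value: probability}."""
    mu = sum(z * p for z, p in dist.items())
    return mu, sum((z - mu) ** 2 * p for z, p in dist.items())

def combine(op):
    """Distribution of Z = op(X, Y); independence gives P(x, y) = P(x) * P(y)."""
    dist = {}
    for (x, px), (y, py) in product(die.items(), repeat=2):
        z = op(x, y)
        dist[z] = dist.get(z, 0) + px * py
    return dist

mu_x, var_x = mean_var(die)
mu_sum, var_sum = mean_var(combine(lambda x, y: x + y))
mu_prod, var_prod = mean_var(combine(lambda x, y: x * y))

print(mu_sum, var_sum)     # compare with 2 * mu_x and 2 * var_x
print(mu_prod, var_prod)   # compare mu_prod with mu_x ** 2
```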

What is the estimator of the probability density function?

The histogram, which shows for each bin the relative frequency divided by the bin width.
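
Dividing by the bin width is what makes the bars integrate to one, like a density. A minimal sketch with equal-width bins on synthetic Gaussian data; `histogram_density` is an illustrative helper and the bin count of 20 is arbitrary.

```python
import random

def histogram_density(sample, k):
    """Histogram estimate of the PDF with k equal-width bins:
    relative frequency per bin divided by the bin width.
    Returns (left bin edges, bin width, density values)."""
    lo, hi = min(sample), max(sample)
    width = (hi - lo) / k
    counts = [0] * k
    for x in sample:
        i = min(int((x - lo) / width), k - 1)   # the maximum goes into the last bin
        counts[i] += 1
    n = len(sample)
    edges = [lo + i * width for i in range(k)]
    density = [c / n / width for c in counts]
    return edges, width, density

random.seed(2)
sample = [random.gauss(0.0, 1.0) for _ in range(5_000)]   # synthetic data

edges, width, density = histogram_density(sample, k=20)
area = sum(d * width for d in density)   # total area under the histogram
print(area)
```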

Choice of the number of bins k for a histogram

Non-trivial; common rules for a sample of size n:

1. square-root choice: $$k=\sqrt{n}$$
2. Sturges' formula (assumes Gaussian data): $$k=\log_2 n+1$$
3. Rice rule: $$k=\lceil 2n^{1/3}\rceil$$
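
The three rules can give quite different answers for the same sample size. In this sketch all results are rounded up to an integer, which is an assumption for the square-root and Sturges rules; `bin_counts` is a hypothetical helper.

```python
import math

def bin_counts(n):
    """Number of bins suggested by three common rules for a sample of size n.
    Rounding up is an assumption for the square-root and Sturges rules."""
    return {
        "square_root": math.ceil(math.sqrt(n)),
        "sturges": math.ceil(math.log2(n) + 1),
        "rice": math.ceil(2 * n ** (1 / 3)),
    }

rules = bin_counts(500)
print(rules)
```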

When is an estimator consistent?

The estimator $$\widehat{\Theta}$$, as a function of the random variable X, is again a random variable. Therefore every estimator has an expected value and variance.

An estimator is called consistent if:

$$P(|\widehat{\Theta}-\Theta|>\epsilon)\rightarrow0\;\;for\;\;N\rightarrow\infty$$

for all $$\epsilon >0$$

Example: The estimator for the expected value $$\widehat{\Theta}=\widehat{\mu}$$ (the sample mean) is consistent (law of large numbers).

$$\widehat{\mu}=\frac{1}{N}\sum\limits^N_{i=1}x_i$$
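
Consistency of the sample mean can be made visible by estimating the exceedance probability for growing N. The sketch assumes uniform data on [0, 1] (true mean 0.5); `eps`, the sample sizes and the number of repetitions are arbitrary choices.

```python
import random

random.seed(3)

def exceed_frequency(n, eps=0.1, repeats=2_000):
    """Estimate P(|mu_hat - mu| > eps) for the sample mean of n Uniform(0, 1) draws (mu = 0.5)."""
    hits = 0
    for _ in range(repeats):
        mu_hat = sum(random.random() for _ in range(n)) / n
        hits += abs(mu_hat - 0.5) > eps
    return hits / repeats

freq = {n: exceed_frequency(n) for n in (5, 50, 500)}
print(freq)   # the estimated probability falls towards 0 as N grows
```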

What is the Mean Squared Error (MSE) and Variance of an estimator?

$$MSE(\widehat{\Theta})=E[(\widehat{\Theta}-\Theta)^2]$$

The MSE is also called the risk.

$$Var(\widehat{\Theta})=E[(\widehat{\Theta}-E(\widehat{\Theta}))^2]$$
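
Both quantities can be approximated by simulating many samples. This sketch uses the sample mean of N uniform draws as the estimator (true value 0.5); sample size and repetition count are assumptions. It also prints the bias $$E(\widehat{\Theta})-\Theta$$, since the MSE equals the variance plus the squared bias.

```python
import random
import statistics

random.seed(4)

N = 20             # sample size (assumed)
repeats = 20_000   # number of simulated samples (assumed)
theta = 0.5        # true mean of Uniform(0, 1)

# One estimate (the sample mean) per simulated sample
estimates = [
    sum(random.random() for _ in range(N)) / N
    for _ in range(repeats)
]

mse = sum((t - theta) ** 2 for t in estimates) / repeats      # E[(theta_hat - theta)^2]
mean_est = statistics.fmean(estimates)                        # E(theta_hat)
var = sum((t - mean_est) ** 2 for t in estimates) / repeats   # E[(theta_hat - E(theta_hat))^2]
bias = mean_est - theta

print(mse, var, bias)   # for the (unbiased) sample mean, MSE and variance nearly coincide
```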