Analysis of Sequential Data

MSE Module TSM_AnSeqDa


Set of flashcards Details

Flashcards 96
Language German
Category Maths
Level University
Created / Updated 17.01.2021 / 08.09.2023
Weblink
https://card2brain.ch/box/20210117_tsmanseqda

What can we forecast?

  • Sales of pills/medicine
  • Electricity demand and availability
  • Weather
  • Sales of a product/service
  • Customer churn

 

Which factors affect forecastability?

  • Something is easier to forecast if:
    • we have a good understanding of the factors that contribute to it
    • there is lots of data available
    • the forecasts cannot affect the thing we are trying to forecast
    • there is relatively low natural/unexplainable random variation
    • the future is somewhat similar to the past

 

What are time series data?

  • Daily stock prices
  • Monthly rainfall
  • Annual business profits
  • Production, e.g. quarterly Australian beer production

 

What is forecasting about?

Forecasting is estimating how the sequence of observations will continue into the future.

What do we need to add to the forecast?

An uncertainty range

Why is providing an uncertainty of the forecast in forecasting important?

If you have just the point forecast (e.g. the 50% quantile), you do not know the deviation. You could, for example, produce at the 80% quantile of the forecast. The worst case is that customers order or demand more than that: if you cannot provide the requested product or piece, the loss (see image) is bigger than if you had produced a bit too much, which you can hold back and put on sale in the end.

When is it ok to use the gaussian distribution in forecasting?

As long as the number to be predicted is far away from 0 (hundreds of thousands or, better, millions)

Name the definitions of statistical forecasting:

How is a time series stored in R

In a ts object:

  • A list of numbers
  • Information about times those numbers were recorded

 

What do you need to add for observations that are more frequent than once per year and how is it entered in R?

A frequency argument (e.g. daily - 365, monthly - 12, quarterly - 4)

 

What is the command for the time series package in R and what does it include?

library(fpp2)

Loads:

  • some data for use in examples and exercises
  • forecast package (for forecasting functions)
  • ggplot2 package (for graphics functions)
  • fma package (for lots of time series data)
  • expsmooth package (for more time series data)

 

How do you plot seasons in a ts?

With seasonal plots:

  • Data plotted against the individual "seasons" in which the data were observed (in this case a "season" is a month).
  • Something like a time plot, except that the data from each season are overlapped.
  • Enables the underlying seasonal pattern to be seen more clearly, and also allows any substantial departures from the seasonal pattern to be easily identified.
  • In R: ggseasonplot()

 

What are the different time series patterns? Name and explain them!

  • Trend
    • Pattern exists when there is a long-term increase or decrease in the data
  • Seasonal
    • Pattern exists when a series is influenced by seasonal factors (e.g., the quarter of the year, the month, or the day of the week)
  • Cyclic
    • Pattern exists when data exhibit rises and falls that are not of fixed period (duration usually of at least 2 years)

 

What are the differences between seasonal and cyclic patterns?

  • Seasonal pattern has constant length; cyclic pattern has variable length
  • Average length of a cycle is longer than the length of the seasonal pattern
  • Magnitude of a cycle is more variable than the magnitude of the seasonal pattern

The timing of peaks and troughs is predictable with seasonal data, but unpredictable in the long term with cyclic data.

 

What are covariance and correlation, as well as autocovariance and autocorrelation, about?

Covariance and correlation: measure the extent of a linear relationship between two variables (y and x).

Autocovariance and autocorrelation: measure linear relationship between lagged values of a time series y.

We measure the relationship between:

  • yt and yt-1
  • yt and yt-2
  • yt and yt-3
  • etc.
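The lagged relationships above can be made concrete. The course works in R, but the following is a minimal plain-Python sketch of the sample autocorrelation r_k at lag k (the data here are hypothetical):

```python
# Sketch: lag-k sample autocorrelation of a series y (hypothetical data).
# r_k = sum_{t=k+1..T} (y_t - ybar)(y_{t-k} - ybar) / sum_t (y_t - ybar)^2

def autocorr(y, k):
    n = len(y)
    mean = sum(y) / n
    num = sum((y[t] - mean) * (y[t - k] - mean) for t in range(k, n))
    den = sum((v - mean) ** 2 for v in y)
    return num / den

y = [1.0, 2.0, 3.0, 4.0, 5.0, 4.0, 3.0, 2.0, 1.0, 2.0]
r1 = autocorr(y, 1)  # adjacent values move together, so r1 is positive
```

Computing r1, r2, r3, ... for increasing k gives exactly the sequence plotted in an ACF/correlogram.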

 

Explain the following autocorrelation graphic:

  • r4 is higher than for the other lags. This is due to the seasonal pattern in the data: the peaks tend to be 4 quarters apart and the troughs tend to be 4 quarters apart
  • r2 is more negative than for the other lags because troughs tend to be 2 quarters behind peaks
  • Together, the autocorrelations at lags 1, 2, ..., make up the autocorrelation function or ACF
  • The plot is known as a correlogram

 

What statements about trend and seasonality in ACF plots can be made?

  • When data have a trend, the autocorrelations for small lags tend to be large and positive.
  • When data are seasonal, the autocorrelations will be larger at the seasonal lags (e.g. at multiples of the seasonal frequency).
  • When data are trended and seasonal, you see a combination of these effects.
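The first statement can be checked numerically. A minimal Python sketch with hypothetical data (the course itself would use R): a purely trended series has a lag-1 autocorrelation close to 1.

```python
# Sketch: a trended series has large positive autocorrelation at small lags.
def autocorr(y, k):
    n = len(y)
    m = sum(y) / n
    num = sum((y[t] - m) * (y[t - k] - m) for t in range(k, n))
    den = sum((v - m) ** 2 for v in y)
    return num / den

trended = [float(t) for t in range(30)]   # pure upward trend
r1 = autocorr(trended, 1)                 # close to 1
```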

 

What can be said about the following ACF from the monthly electricity production?

  • Time plot shows clear trend and seasonality
  • The same features are reflected in the ACF
  • The slowly decaying ACF indicates trend
  • The ACF peaks at lags 12, 24, 36, ... indicate seasonality of length 12

 

Allocate trend, seasonality and cyclic to the following timeseries and the corresponding ACF:

  1. B; Trend and no seasonality
  2. A; Seasonality and no trend
  3. D; Seasonality and a trend
  4. C; Cyclic and no trend

 

What do the blue lines in the ACF plots indicate?

The 95% interval: for white noise the autocorrelations should lie within about ±1.96/√T.

How does white noise in a time series look like?

  • We expect each autocorrelation to be close to zero
  • All autocorrelation coefficients lie within the 95% interval, confirming that the data is white noise (more precisely, the data cannot be distinguished from white noise)

How can you show that a stock price (e.g. Google) is driven by white noise?

  • The stock can be modelled by the random walk model yt+1 = yt + εt
  • where εt ~ N(0, σ²) and εt is i.i.d.: hence εt is independent of εt-1, εt-2, ...
  • By differencing: yt+1 - yt = εt
    • which is indeed a white noise time series
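The differencing argument can be reproduced directly. A minimal Python sketch with simulated data (not real stock prices):

```python
import random

random.seed(42)
# Sketch: a random walk y_{t+1} = y_t + eps_t. Differencing the walk
# recovers the white-noise shocks (up to floating-point rounding).
eps = [random.gauss(0, 1) for _ in range(100)]
y = [0.0]
for e in eps:
    y.append(y[-1] + e)

diffs = [y[t + 1] - y[t] for t in range(100)]
# diffs matches eps: the differenced series is the white noise itself
```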

 

Name some simple forecastin methods and give some explanation about them:

  • Average method
  • Naive method
  • Seasonal naive method
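In fpp2 these methods are provided by meanf(), naive() and snaive(). A plain-Python sketch of the three forecasts (hypothetical data) looks like this:

```python
# Sketch of the three simple benchmark methods (hypothetical data):
#  - average: forecast = mean of all observations
#  - naive: forecast = last observation
#  - seasonal naive: forecast = observation from the same season last cycle
def average_forecast(y):
    return sum(y) / len(y)

def naive_forecast(y):
    return y[-1]

def seasonal_naive_forecast(y, m, h=1):
    # value from the same season, looking back within the last cycle of
    # length m (the (h-1) % m term repeats the last observed cycle)
    return y[len(y) - m + (h - 1) % m]

y = [10, 20, 30, 40, 12, 22, 32, 42]   # quarterly-looking data, m = 4
```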

How is the drift method implemented?
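The card's picture is not reproduced here. As a reminder (standard definition; in fpp2 this is rwf(y, drift=TRUE)), the drift method extrapolates the straight line through the first and last observations. A Python sketch with hypothetical data:

```python
def drift_forecast(y, h):
    # Drift method: y_{T+h} = y_T + h * (y_T - y_1) / (T - 1),
    # i.e. the last value plus h times the average historical change.
    T = len(y)
    slope = (y[-1] - y[0]) / (T - 1)
    return y[-1] + h * slope

y = [2.0, 4.0, 6.0, 8.0, 10.0]   # average change of 2 per step
forecast = drift_forecast(y, 3)  # 10 + 3 * 2 = 16
```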

What is the y in the picture about?

What means et?

Residuals in forecasting: the difference between an observed value and its fitted value.

What are the assumptions and properties of the residuals when forecasting is done well?

  • Assumptions
    • Residuals are uncorrelated. If they aren't, then information left in residuals that should be used in computing forecasts.
    • Residuals have mean zero. If they don't, then forecasts are biased.
  • Useful properties (for prediction intervals)
    • Residuals have constant variance
    • Residuals are normally distributed

Note: et are one-step-forecast residuals
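A minimal Python sketch of one-step residuals for the naive method, where the fitted value is simply the previous observation (hypothetical data):

```python
# Sketch: one-step residuals e_t = y_t - yhat_t for the naive method,
# whose fitted value is yhat_t = y_{t-1}.
def naive_residuals(y):
    return [y[t] - y[t - 1] for t in range(1, len(y))]

y = [5, 6, 8, 7, 9]
e = naive_residuals(y)   # [1, 2, -1, 2]
```

If the forecasting is done well, a list like `e` should look like white noise: roughly mean zero and uncorrelated.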

What is the ACF about?

  • We assume that the residuals are white noise (uncorrelated, mean zero, constant variance). If they aren't, then there is information left in the residuals that should be used in computing forecasts.
  • So a standard residual diagnostic is to check the ACF of the residuals of a forecasting method, where the spikes should lie within the blue 95% bounds.
  • We expect these to look like white noise

 

What is the Ljung-Box test about?

  • Consider a whole set of rk values, and develop a test to see whether the set is significantly different from a zero set.
  • If each rk is close to zero, Q will be small
  • If some rk values are large (positive or negative), Q will be large
  • Note: the rk are the autocorrelation coefficients of the residuals at lag k (not the residuals themselves)
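The statistic itself is Q = T(T+2) Σ_{k=1..h} rk² / (T−k). A plain-Python sketch with deliberately autocorrelated "residuals" (hypothetical data):

```python
# Sketch of the Ljung-Box Q statistic on a residual series e.
def autocorr(e, k):
    n = len(e)
    m = sum(e) / n
    num = sum((e[t] - m) * (e[t - k] - m) for t in range(k, n))
    den = sum((v - m) ** 2 for v in e)
    return num / den

def ljung_box_q(e, h):
    # Q = T(T+2) * sum_{k=1..h} r_k^2 / (T - k)
    T = len(e)
    return T * (T + 2) * sum(autocorr(e, k) ** 2 / (T - k)
                             for k in range(1, h + 1))

e = [1.0, -1.0] * 20     # strongly autocorrelated "residuals"
q = ljung_box_q(e, 2)    # large Q: clearly not white noise
```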

 

What are the recommended defaults for h?

  • h = 10 for non-seasonal data
  • h = 2m for seasonal data, where m is the length of the season

 

How is the Portmanteau test (Box-Ljung) interpreted?

Note: this is done automatically by the checkresiduals() function

  • The test checks the null hypothesis that the data is white noise
  • Small p-values lead to rejecting the null hypothesis: they are evidence of significant autocorrelation
  • Large p-values instead lead to accepting (more precisely, failing to reject) the null hypothesis
  • Typical threshold decision:
    • p-value > 0.05 -> accept the null hypothesis (white noise)
    • p-value < 0.05 -> reject the null hypothesis, concluding that there is significant autocorrelation

 

Why do you use a training and a test set?

  • A model which fits the training data well will not necessarily forecast well
  • A perfect fit can always be obtained by using a model with enough parameters
  • Over-fitting a model to data is just as bad as failing to identify a systematic pattern in the data
  • The test set must not be used for any aspect of model development or calculation of forecasts
  • Forecast accuracy is based only on the test set
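A minimal Python sketch of the split (hypothetical data): the end of the series is held out, the forecast is built from the training part only, and accuracy is measured on the test part only.

```python
# Sketch: hold out the last observations as a test set.
def train_test_split(y, test_size):
    return y[:-test_size], y[-test_size:]

y = list(range(1, 21))           # 20 hypothetical observations
train, test = train_test_split(y, 4)
naive = train[-1]                # naive forecast from training data only
mae = sum(abs(v - naive) for v in test) / len(test)
```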

 

What are the characteristics of forecast errors?

What are the measures of forecast accuracy?

Additionally:

  • MAE, MSE, RMSE are all scale dependent
  • MAPE is scale independent but is only sensible if yt >> 0 for all t, and y has a natural zero
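The measures themselves can be sketched in a few lines of Python (hypothetical data; the formulas are the standard ones over the errors et = yt − ŷt):

```python
# Sketch of standard forecast accuracy measures.
def mae(y, yhat):
    return sum(abs(a - b) for a, b in zip(y, yhat)) / len(y)

def mse(y, yhat):
    return sum((a - b) ** 2 for a, b in zip(y, yhat)) / len(y)

def rmse(y, yhat):
    return mse(y, yhat) ** 0.5

def mape(y, yhat):
    # percentage error: only sensible if y >> 0 and y has a natural zero
    return 100 * sum(abs((a - b) / a) for a, b in zip(y, yhat)) / len(y)

y, yhat = [100, 200, 400], [110, 190, 380]
```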

 

What are the differences of the MASE and MAE regarding non-seasonal and seasonal time series?

True or false?

What is cross-validation with time series about?

  • Forecast accuracy averaged over test sets (time step by time step)
  • Also known as "evaluation on a rolling forecasting origin"
  • A good way to choose the best forecasting model is to find the model with the smallest RMSE computed using time series cross-validation
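In R this is provided by tsCV() in the forecast package. A simplified Python sketch for a 1-step naive forecast (hypothetical data): the training window grows one step at a time, and the squared test errors are averaged.

```python
# Sketch: rolling-origin cross-validation RMSE for a 1-step naive forecast.
def ts_cv_rmse_naive(y, min_train=3):
    errors = []
    for t in range(min_train, len(y)):
        forecast = y[t - 1]          # naive: last value of the training window
        errors.append(y[t] - forecast)
    return (sum(e ** 2 for e in errors) / len(errors)) ** 0.5
```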

What are prediction intervals about?

  • Point forecasts are often useless without prediction intervals
  • Prediction intervals require a stochastic model (with random errors, etc.)
  • Multi-step forecasts for time series require a more sophisticated approach (with PI getting wider as the forecast horizon increases)

 

How are the prediction intervals computed with the simple forecasting models?

Show the concept of predicting h steps ahead with the naive model:

The expected value of yt+h is the same (yt) for any value of h, but its variance grows with h.
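For the naive model the h-step standard deviation grows like σ√h, so the 95% prediction interval widens with the horizon. A Python sketch with a hypothetical last value and residual standard deviation:

```python
# Sketch: 95% prediction interval for the naive model h steps ahead.
# The point forecast is always the last value y_T; its standard
# deviation grows like sigma * sqrt(h).
def naive_interval(last, sigma, h, z=1.96):
    half = z * sigma * h ** 0.5
    return (last - half, last + half)

lo1, hi1 = naive_interval(100.0, 5.0, 1)   # width 2 * 1.96 * 5
lo4, hi4 = naive_interval(100.0, 5.0, 4)   # twice as wide (sqrt(4) = 2)
```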