Time Series Analysis

Any metric that is measured over time is a time series. Time series analysis matters in industry largely because of forecasting (demand, sales, supply, etc.). A series can be broken down into its components so that it can be forecast systematically. This is a beginner's introduction to time series analysis, answering fundamental questions such as: what is a stationary time series, how do you decompose, de-trend and de-seasonalize a time series, and what is autocorrelation?

What is a Time Series ?

Any metric that is measured at regular time intervals forms a time series. Common examples include weather data, stock prices and industry forecasts.

How To Create A Time Series In R ?

After importing your data into R, convert it to a time series with the ts() function as follows. The inputData used here should ideally be a numeric vector of class ‘numeric’ or ‘integer’.

ts(inputData, frequency = 4, start = c(1959, 2))  # frequency 4 => quarterly data, starting in Q2 of 1959
ts(1:10, frequency = 12, start = 1990)  # frequency 12 => monthly data; the start and end params are optional
ts(inputData, start = c(2009), end = c(2014), frequency = 1)  # yearly data
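
To see what ts() actually attaches to a vector, here is a minimal sketch. The values in inputData are invented purely for illustration; only base R functions are used.

```r
# Hypothetical quarterly figures, invented purely for illustration
inputData <- c(12, 15, 11, 18, 14, 17, 13, 20)

# Quarterly series whose first observation is the 2nd quarter of 1959
quarterlyTS <- ts(inputData, frequency = 4, start = c(1959, 2))

print(quarterlyTS)       # printed as a year-by-quarter grid
frequency(quarterlyTS)   # observations per year
start(quarterlyTS)       # year and quarter of the first observation
end(quarterlyTS)         # year and quarter of the last observation
time(quarterlyTS)        # time stamp of every observation, as fractional years
```

The data itself is unchanged; ts() only adds the time base, which functions such as decompose(), stl() and lag() rely on later.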

Understanding Your Time Series

Each data point (Yt) in a time series can be expressed as either a sum or a product of 3 components, namely Seasonality (St), Trend (Tt) and Error (et). The error component is what remains after the other two are removed and is ideally white noise.

For Additive Time Series,
               Yt = St + Tt + et
For Multiplicative Time Series,
               Yt = St * Tt * et

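A multiplicative time series can be converted to an additive one by taking the log of the series, because log(St * Tt * et) = log(St) + log(Tt) + log(et). This requires all values in the series to be positive.

additiveTS <- log(multiplicativeTS)  # convert multiplicative to additive time series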
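
The log conversion can be checked numerically with a small sketch. The trend, seasonal and noise components below are assumed values made up for the example; they do not come from any real dataset.

```r
# Hypothetical components for 3 years of monthly data (assumed values)
idx      <- 1:36
trend    <- 100 + 2 * idx                        # T_t: linear upward trend
seasonal <- 1 + 0.2 * sin(2 * pi * idx / 12)     # S_t: multiplicative seasonal factor
set.seed(1)                                      # arbitrary seed, only for reproducibility
noise    <- exp(rnorm(36, sd = 0.02))            # e_t: positive multiplicative error

multiplicativeTS <- ts(seasonal * trend * noise, frequency = 12)
additiveTS <- log(multiplicativeTS)

# log(S * T * e) = log(S) + log(T) + log(e), so the logged series is additive
all.equal(as.numeric(additiveTS), log(seasonal) + log(trend) + log(noise))
```

Any forecast made on the logged series has to be transformed back with exp() to return to the original scale.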

Figure: Multiplicative Time Series Pattern
Figure: Additive Time Series Pattern


What Is A Stationary Time Series ?

A time series is said to be stationary if it satisfies the following conditions.

    1. The mean value of the time series is constant over time, which implies that there is no trend component.
    2. The variance is constant over time (it neither increases nor decreases).
    3. There is no seasonality, or its effect is minimal.
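
The conditions above can be checked informally by comparing summary statistics over different stretches of a series. The sketch below uses two simulated series (white noise and a random walk); halfStats() is a hypothetical helper written for this example, not a library function.

```r
set.seed(42)                        # arbitrary seed, only for reproducibility
whiteNoise <- rnorm(400)            # stationary by construction
randomWalk <- cumsum(whiteNoise)    # non-stationary: a running sum of the noise

# Summarise a series separately over its first and second half
halfStats <- function(x) {
  half <- rep(c("first", "second"), each = length(x) / 2)
  rbind(mean = tapply(x, half, mean), variance = tapply(x, half, var))
}

halfStats(whiteNoise)   # both halves should look alike
halfStats(randomWalk)   # the halves will typically differ noticeably
```

This is only an eyeball check on simulated data, not a formal test.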

How to extract the trend, seasonality and error?

decompose() and stl() split the time series into seasonality, trend and error components.
tsData <- EuStockMarkets[, 1]  # ts data
decomposedRes <- decompose(tsData, type = "multiplicative")  # use type = "additive" for additive components
plot(decomposedRes)  # see plot below
stlRes <- stl(tsData, s.window = "periodic")

# First few rows of stlRes$time.series
Time Series:
Start = c(1991, 130) 
End = c(1998, 169) 
Frequency = 260 
             seasonal    trend    remainder
1991.496   43.1900952 1602.604  -17.0445950
1991.500   55.3795008 1603.064  -44.8134914
1991.504   61.2914064 1603.523  -58.3048878
1991.508   68.4470620 1603.983  -51.3900342
1991.512   68.4527176 1604.442  -54.7351806
1991.515   70.8396232 1604.902  -65.1315770
Figure: Additive decomposition of the time series

Figure: Multiplicative decomposition of the same time series. Notice the scale of the seasonal component.
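
Once a series has been decomposed, the individual components can be pulled out of the returned objects. The following is a sketch using the same EuStockMarkets column and base R only; it uses the additive form so that the components can be summed back.

```r
tsData <- EuStockMarkets[, 1]   # first column of the EuStockMarkets dataset shipped with R

# Classical decomposition returns a list with one series per component
decomposedRes <- decompose(tsData, type = "additive")
head(decomposedRes$seasonal)
head(decomposedRes$trend)       # NA at both ends, where the moving average is undefined
head(decomposedRes$random)

# stl() stores its components as columns of a single multivariate series
stlRes <- stl(tsData, s.window = "periodic")
head(stlRes$time.series)

# The three stl components add back up to the original series
all.equal(as.numeric(rowSums(stlRes$time.series)), as.numeric(tsData))
```

Because the stl() components sum exactly to the data, removing any one of them is a simple subtraction.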


How to create lags of a time-series ?

When the time base is shifted by a given number of periods, a lag of the time series is created. Lags of a time series are often used as explanatory variables to model the series itself. The underlying reasoning is that the state of the time series a few periods back may still have an influence on its current state.
laggedTS <- lag(tsData, 3)  # shifted 3 periods earlier. Use -3 to shift 3 periods forward (later).
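
The direction of the shift is easy to get wrong, so here is a small sketch on a made-up yearly series. It assumes lag() refers to the base R version, stats::lag() (some packages mask it with a function that behaves differently).

```r
x <- ts(1:5, start = 2001)   # hypothetical yearly series covering 2001 to 2005

time(lag(x, 3))    # 1998 ... 2002: same values, time stamps 3 periods earlier
time(lag(x, -3))   # 2004 ... 2008: same values, time stamps 3 periods later

# Aligning the series with its own past: each row pairs x_t with x_{t-1}
cbind(current = x, previous = lag(x, -1))
```

Note that lag() never changes the values, only the time base, so the effect becomes visible only when series are aligned with cbind() or similar.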

What Is Autocorrelation and Partial-Autocorrelation?

Autocorrelation is the correlation of a time series with lags of itself. It is an important metric because it is commonly used to judge whether a time series is stationary. For a stationary time series the autocorrelation falls to zero fairly quickly, whereas for a non-stationary series it drops off gradually.

Partial autocorrelation is the correlation of the time series with a lag of itself, with the linear dependence on all the intermediate lags removed.

acfRes <- acf(TS)  # acf() and pacf() both draw a plot by default
pacfRes <- pacf(TS)
ccfRes <- ccf(TS1, TS2) # computes cross correlation between 2 timeseries.
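
To look at the numbers behind the plots, the estimates can be returned without drawing anything. This sketch compares two simulated series; the names stationaryTS and nonStationaryTS are assumptions made for the example.

```r
set.seed(7)                                  # arbitrary seed, only for reproducibility
stationaryTS    <- ts(rnorm(300))            # white noise
nonStationaryTS <- ts(cumsum(rnorm(300)))    # random walk

# plot = FALSE returns the estimates without drawing anything
acfStat    <- acf(stationaryTS, lag.max = 10, plot = FALSE)
acfNonStat <- acf(nonStationaryTS, lag.max = 10, plot = FALSE)

round(as.numeric(acfStat$acf), 2)      # lag 0 is always 1; later lags should sit near 0
round(as.numeric(acfNonStat$acf), 2)   # typically decays slowly from 1

# pacf() starts at lag 1, so it returns one value fewer than acf()
pacfNonStat <- pacf(nonStationaryTS, lag.max = 10, plot = FALSE)
length(acfNonStat$acf)
length(pacfNonStat$acf)
```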

Figure: Autocorrelation of a non-stationary time series
Figure: Autocorrelation of a stationary time series
Figure: Partial autocorrelation of a non-stationary time series (same series as above)

How To De-Trend a Time Series ?

Use linear regression to model the time series against its time index. The model residuals will usually be free of the trend component. If some trend is still visible in the residuals (as seems to be the case with the ‘JohnsonJohnson’ data below), you may wish to add a few more predictors to the lm() call (such as seasonal dummies, Fourier terms or a lag of the series itself) until the trend is filtered out.

trModel <- lm(TS ~ seq_along(TS))  # regress the series on its time index
plot(resid(trModel), type="l")  # resid(trModel) contains the de-trended series.
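
As one possible way of adding predictors, the sketch below fits the JohnsonJohnson series with a quadratic trend term and quarterly dummies. The choice of a quadratic term is an assumption made for illustration, not a recommendation for every series.

```r
TS <- JohnsonJohnson                 # quarterly earnings per share, shipped with R
timeIndex <- seq_along(TS)           # 1, 2, ..., n
quarter   <- factor(cycle(TS))       # seasonal dummy: which quarter each point falls in

# Linear trend only
trModel <- lm(TS ~ timeIndex)

# Assumed extension: a quadratic trend term plus quarterly dummies
trModel2 <- lm(TS ~ timeIndex + I(timeIndex^2) + quarter)

# Put the residuals back on the original time axis before plotting
detrended <- ts(resid(trModel2), start = start(TS), frequency = frequency(TS))
plot(detrended)

summary(trModel)$r.squared
summary(trModel2)$r.squared   # cannot be lower, because the first model is nested in it
```

Whether the residuals are really trend-free is still best judged from the plot.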

Figure: JohnsonJohnson earnings time series, showing the trend
Figure: JohnsonJohnson de-trended time series

How To De-Seasonalize a Time Series in R?

De-seasonalizing sheds light on the effect of the seasonal pattern in the time series and helps to model the data without the seasonal effects, which can be handled separately later.

Step 1: Decompose the time series using stl()
Step 2: Use seasadj() from the ‘forecast’ package

library(forecast)  # provides seasadj() and seasonplot()
ts.stl <- stl(TS, s.window = "periodic")  # decompose the TS
ts.sa <- seasadj(ts.stl)  # de-seasonalize
seasonplot(ts.sa, 12, col = rainbow(12), year.labels = TRUE)  # seasonal frequency set as 12 for monthly data.
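
If the ‘forecast’ package is not available, the same idea can be sketched in base R by subtracting the seasonal component that stl() estimates. This is assumed to correspond to what seasadj() returns for an stl object. Taking the log of AirPassengers first is a choice made here because its seasonal swings grow with the level of the series.

```r
TS <- log(AirPassengers)   # log turns the multiplicative seasonality into an additive one

ts.stl <- stl(TS, s.window = "periodic")          # decompose the TS
seasonal <- ts.stl$time.series[, "seasonal"]

# Seasonally adjusted series = original minus the seasonal component
ts.sa <- TS - seasonal

# With a periodic window, the seasonal pattern repeats identically every year
all.equal(as.numeric(seasonal[1:12]), as.numeric(seasonal[13:24]))

plot(cbind(original = TS, adjusted = ts.sa), main = "Before and after seasonal adjustment")
```

Apply exp() to ts.sa to bring the adjusted series back to the original passenger scale.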

Seasonal Timeseries
Seasonal Timeseries (AirPassengers)
De-seasonalised version of AirPassengers timeseries
De-seasonalised version of same timeseries (AirPassengers)

How To Difference A Time Series ?

Differencing a time series means subtracting each data point in the series from its successor. It is commonly used to make a time series stationary. For most time series, differencing once or twice is enough to make the series stationary.
differencedTS <- diff(tsData)
differencedTwice <- diff(tsData, differences = 2)  # differenced twice
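
A tiny worked example makes the arithmetic concrete. The series of squares below is made up for illustration; the last line is an optional extra showing the lag argument of diff() on the AirPassengers data.

```r
x <- ts(c(1, 4, 9, 16, 25, 36))   # hypothetical series: the squares of 1 to 6

diff(x)                    # 3 5 7 9 11 -> still trending upward
diff(x, differences = 2)   # 2 2 2 2    -> constant, the trend is gone

# Each round of differencing shortens the series by one observation
length(x)
length(diff(x))
length(diff(x, differences = 2))

# lag = 12 on monthly data gives a seasonal difference: x_t - x_{t-12}
seasonalDiff <- diff(AirPassengers, lag = 12)
```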

How to test if a time series is stationary?

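Use the Augmented Dickey-Fuller test (ADF test), available as adf.test() in the ‘tseries’ package. Its null hypothesis is that the series has a unit root (is non-stationary), so a p-value of less than 0.05 indicates that the series is stationary.

library(tseries)  # provides adf.test()
adf.test(tsData)  # p-value < 0.05 indicates the TS is stationary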
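
To see the test respond to differencing, here is a sketch on a simulated random walk. It assumes the ‘tseries’ package is installed and skips the test otherwise. The p-values depend on the simulated data, so the comments describe only what is typical.

```r
set.seed(123)                          # arbitrary seed, only for reproducibility
randomWalk <- ts(cumsum(rnorm(200)))   # simulated non-stationary series

if (requireNamespace("tseries", quietly = TRUE)) {
  # adf.test() warns when the p-value falls outside its lookup table, hence suppressWarnings()
  pLevel <- suppressWarnings(tseries::adf.test(randomWalk))$p.value
  pDiff  <- suppressWarnings(tseries::adf.test(diff(randomWalk)))$p.value

  cat("p-value, original series:   ", pLevel, "\n")   # typically well above 0.05
  cat("p-value, differenced series:", pDiff, "\n")    # typically below 0.05
} else {
  message("Install the 'tseries' package to run adf.test()")
}
```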
