Any metric that is measured over time is a time series. It is of high importance because of industrial relevance especially w.r.t forecasting (demand, sales, supply etc). It can be broken down to its components so as to systematically forecast it. This is a beginners introduction to time series analysis, answering fundamental questions such as: what is a stationary time series, how to decompose it, how to de-trend, de-seasonalize a time series, what is auto correlation, etc.
What is a Time Series ?
Any metric that is measured over regular time intervals makes a Time Series. Example: Weather data, Stock prices, Industry forecasts, etc are some of the common ones.
How To Create A Time Series In R ?
Upon importing your data into R, use ts() function as follows. The inputData used here is ideally a numeric vector of the class ‘numeric’ or ‘integer’.
ts (inputData, frequency = 4, start = c(1959, 2)) # frequency 4 => Quarterly Data
ts (1:10, frequency = 12, start = 1990) # freq 12 => Monthly data. The start and end params are optional.
ts (inputData, start=c(2009), end=c(2014), frequency=1) # Yearly Data
Understanding Your Time Series
Each datapoint (Yt) in a Time Series can be expressed as either a sum or a product of 3 components, namely, Seasonality(St), Trend(Tt) and Error(et) (a.k.a White Noise).
For Additive Time Series,
Yt = St + Tt + et
For Multiplicative Time Series,
Yt = St * Tt * et
A multiplicative time series can be converted to additive by taking a log of the time series.
additiveTS <- log (multiplcativeTS) # convert multiplcative to additive time series
What Is A Stationary Time Series ?
A time series is said to be stationary if it holds the following conditions true.
- The mean value of time-series is constant over time, which implies, the trend component is nullified.
- The variance does not increase over time.
- Seasonality effect is minimal.
How to extract the trend, seasonality and error?
decompose() and stl() splits the time series into seasonality, trend and error components.
tsData <- EuStockMarkets[, 1] # ts data
decomposedRes <- decompose(tsData, type="mult") # use type = "additive" for additive components
plot (decomposedRes) # see plot below
stlRes <- stl(tsData, s.window = "periodic")
# Few rows of stlRes $time.series Time Series: Start = c(1991, 130) End = c(1998, 169) Frequency = 260 seasonal trend remainder 1991.496 43.1900952 1602.604 -17.0445950 1991.500 55.3795008 1603.064 -44.8134914 1991.504 61.2914064 1603.523 -58.3048878 1991.508 68.4470620 1603.983 -51.3900342 1991.512 68.4527176 1604.442 -54.7351806 1991.515 70.8396232 1604.902 -65.1315770
How to create lags of a time-series ?
When the time base is shifted by a given number of periods, a Lag of time series is created. Lags of a time series are often used as explanatory variables to model the actual timeseries itself. The underlying reasoning is that the state of the time series few periods back may still has an influence on the serie’s current state.
laggedTS <- lag(tsData, 3) # shifted 3 periods earlier. Use “-3″ to shift by 3 periods forward (later).
What Is Autocorrelation and Partial-Autocorrelation?
Autocorrelation is the correlation of a Time Series with lags of itself. This is a significant metric because, it is used commonly to determine if the time series is stationary or not. A stationary time series will have the autocorrelation fall to zero fairly quickly but for a non-stationary series it drops gradually.
Partial Autocorrelation is the correlation of the timeseries with a lag of itself, with the linear dependence of all the lags between them removed.
acfRes <- acf(TS) # both acf() and pacf() generates plots
pacfRes <- pacf(TS)
ccfRes <- ccf(TS1, TS2) # computes cross correlation between 2 timeseries.
How To De-Trend a Time Series ?
Use linear regression to model the Time Series data with indices. The model residuals will usually be devoid of the trend component. If some trend is left over to be seen in the residuals (like what it seems to be with ‘JohnsonJohnson’ data below), then you might wish to add few predictors to the lm() call (like a seasonal dummy, fourier transform or may be a lag of the series itself), until the trend is filtered.
trModel <- lm(TS ~ c(1:length(TS)))
plot(resid(trModel), type="l") # resid(trModel) contains the de-trended series.
How To De-Seasonalize a Time Series in R?
De-seasonalizing throws insight about the effects seasonal pattern in the time series and helps to model the data without the seasonal effects, which can later be customized.
Step 1: De-compose the Time series using stl()
Step 2: use seasadj() from ‘forecast’ package
ts.stl <- stl(TS,"periodic") # decompose the TS
ts.sa <- seasadj(ts.stl) # de-seasonalize
seasonplot(ts.sa,12, col=rainbow(12),year.labels=TRUE) # seasonal frequency set as 12 for monthly data.
How To Difference A Time Series ?
‘Differencing‘ a time series means, to subtract each data point in the series from its successor. It is commonly used to make a Time Series Stationary. For most time series patterns, 1 or 2 differencing is necessary to make it a stationary series.
differencedTS <- diff(ts)
differencedTwice <- diff(ts, differences= 2) # diffferenced twice
How to test if a time series is stationary?
Use Augmented Dickey-Fuller Test (adf test). A p-Value of less than 0.05 in adf.test() indicates that it is stationary.
adf.test(tsData) # p-value < 0.05 indicates the TS is stationary