Autoregressive and Moving Average Models (CFA Level 1): Autoregressive (AR) Models, Core Concept, and Why AR Models Matter. Key definitions, formulas, and exam tips.
Time-series analysis often feels like a magical crystal ball—peering into the past to divine the future. But, in all honesty, it’s less about magic and more about math and common sense. I remember once chatting with a friend about forecasting gold prices. He said, “Well, if yesterday’s price was high, maybe tomorrow will also be high.” In a nutshell, that’s kind of where autoregressive (AR) and moving average (MA) models come in. They give us a systematic way to harness patterns in a time series—be it asset returns, macroeconomic indicators, or consumption data—and project it forward, while staying humble about uncertainty.
Below, we’ll talk about the fundamentals of AR and MA models, how to build them, and why they matter for finance. We’ll also bring up some cautionary tales and practical tips you can use, especially when you’re preparing for the CFA exam or dealing with real-world investments.
An autoregressive model expresses a time series as a function of its own past values. In other words, the current observation depends on a linear combination of one or more previous observations, plus some randomness (often called “white noise”).
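In mathematical terms, an AR(p) model can be written as:
\[ Y_t = c + \phi_1 Y_{t-1} + \phi_2 Y_{t-2} + \cdots + \phi_p Y_{t-p} + \varepsilon_t \]
where:
- \(Y_t\) is the value of the series at time \(t\),
- \(c\) is a constant (intercept),
- \(\phi_1, \ldots, \phi_p\) are the autoregressive coefficients,
- \(\varepsilon_t\) is a white-noise error term.
For instance, with \( p = 1 \), the AR(1) model becomes:
\[ Y_t = c + \phi_1 Y_{t-1} + \varepsilon_t \]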
AR models are particularly helpful in capturing “momentum” or “inertia” in a time series. Many economic and financial variables—like GDP growth rates, inflation, or even certain stock returns—often show some correlation over time. If last period’s return was high, there might be a slightly higher chance that this period’s return will also be above average. AR models let us quantify this dependency formally.
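As a short worked illustration, consider a stationary AR(1) process \(Y_t = c + \phi_1 Y_{t-1} + \varepsilon_t\) with \(|\phi_1| < 1\) and white-noise variance \(\sigma^2\). The symbols \(\mu\), \(\sigma^2\), and \(\rho_k\) are introduced here for illustration only. Taking expectations of both sides gives the long-run mean, and the autocorrelation at lag \(k\) decays geometrically:
\[ \mu = E[Y_t] = \frac{c}{1 - \phi_1}, \qquad \operatorname{Var}(Y_t) = \frac{\sigma^2}{1 - \phi_1^2}, \qquad \rho_k = \phi_1^{\,k} \]
For example, with \(c = 1\) and \(\phi_1 = 0.6\), the long-run mean is \(1 / 0.4 = 2.5\), and the lag-1, lag-2, and lag-3 autocorrelations are 0.6, 0.36, and 0.216. The influence of a past value fades steadily rather than switching off abruptly.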
To illustrate the flow of dependency in an AR model, consider an AR(1) structure:
```mermaid
flowchart LR
    A["Y_(t-1)"] --> B["Y_t = c + φ₁Y_(t-1) + ε_t"]
    C["Error term ε_t"] --> B
```
The arrow from \(Y_{t-1}\) to \(Y_t\) highlights how the previous value influences the current one.
Below is a brief Python snippet (using the Statsmodels library) showing how you might simulate and fit an AR(1) model. Of course, in a real-world scenario, you’d want to test stationarity, do diagnostic checks, and possibly compare models.
```python
import numpy as np
import pandas as pd
from statsmodels.tsa.arima.model import ARIMA

# Simulate an AR(1) process: Y_t = phi * Y_{t-1} + epsilon_t
np.random.seed(42)
n = 200
epsilon = np.random.normal(0, 1, n)  # white-noise shocks

phi = 0.6
Y = [epsilon[0]]  # start the series at the first shock
for t in range(1, n):
    Y.append(phi * Y[t - 1] + epsilon[t])

series = pd.Series(Y)

# Fit an AR(1) model, i.e. ARIMA with order (p=1, d=0, q=0)
model = ARIMA(series, order=(1, 0, 0))
results = model.fit()
print(results.summary())
```
A big caveat: AR models require stationarity. A (covariance-)stationary series has a constant mean, a constant variance, and autocovariances that depend only on the lag between observations, not on time itself. For an AR(1) model, the condition for stationarity is \(|\phi_1| < 1\). If \(\phi_1 = 1\), the series is a random walk (it has a unit root) and shocks never die out; if \(|\phi_1| > 1\), the series becomes explosive. Even when \(\phi_1\) is below but close to 1, the series is highly persistent and hard to distinguish from a unit-root process in small samples. For practical investment applications, it’s crucial to test stationarity by looking at the series’ mean reversion behavior, applying formal tests like the Augmented Dickey-Fuller (ADF) test, and regularly performing residual checks.
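A moving average model relies on linear combinations of current and past error terms. An MA(q) model can be written as:
\[ Y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \cdots + \theta_q \varepsilon_{t-q} \]
where:
- \(Y_t\) is the value of the series at time \(t\),
- \(\mu\) is the mean of the series,
- \(\theta_1, \ldots, \theta_q\) are the moving average coefficients,
- \(\varepsilon_t, \varepsilon_{t-1}, \ldots, \varepsilon_{t-q}\) are white-noise error terms (the current and past “shocks”).
For instance, with \( q = 1 \), an MA(1) model is:
\[ Y_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} \]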
In an MA model, the series incorporates the effects of past “shocks.” If your time series is significantly influenced by new information or random shocks but not so much by past values directly, an MA model may fit better than an AR model. In finance, some volatility models—even the more advanced GARCH-type models—trace their lineage back to the idea of capturing the propagation of shocks from one period to the next.
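A small simulation can show how an MA(1) process carries a shock forward for exactly one period. The sketch below assumes \(\theta_1 = 0.5\), \(\mu = 0\), and standard normal shocks; the helper `sample_autocorr` is a hypothetical name defined here. It compares the sample lag-1 autocorrelation with the theoretical value \(\theta_1 / (1 + \theta_1^2)\) and checks that the lag-2 autocorrelation is close to zero.

```python
import numpy as np

# Illustrative parameters only (assumed, not taken from any data set).
rng = np.random.default_rng(1)
n, mu, theta = 20_000, 0.0, 0.5

# Draw n + 1 shocks so that every Y_t has a previous shock available.
eps = rng.normal(0.0, 1.0, n + 1)

# MA(1): Y_t = mu + eps_t + theta * eps_{t-1}
y = mu + eps[1:] + theta * eps[:-1]

def sample_autocorr(x, lag):
    """Sample autocorrelation of x at the given positive lag (hypothetical helper)."""
    d = x - x.mean()
    return float(np.dot(d[lag:], d[:-lag]) / np.dot(d, d))

rho1_theory = theta / (1 + theta**2)  # 0.4 for theta = 0.5
rho1 = sample_autocorr(y, 1)
rho2 = sample_autocorr(y, 2)

print(f"lag 1: sample = {rho1:.3f}, theory = {rho1_theory:.3f}")
print(f"lag 2: sample = {rho2:.3f}, theory = 0.000")
```

The abrupt drop to roughly zero after lag 1 is the signature that separates an MA(1) process from an AR(1) process, whose autocorrelations fade gradually.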
Below is a simplified diagram of the MA(1) process, illustrating how last period’s random shock affects the current value:
```mermaid
flowchart LR
    A["ε_(t-1)"] --> B["Y_t = μ + θ₁ε_(t-1) + ε_t"]
    C["Error term ε_t"] --> B
```
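MA models of finite order \(q\) are always stationary, whatever the values of their coefficients, because \(Y_t\) is a finite weighted sum of white-noise terms. (Invertibility, which concerns whether the shocks can be recovered from past values of the series, is a related but separate condition on the \(\theta\) coefficients.) This is one attraction of MA models: you don’t need to constrain the coefficients for stationarity. AR models do need such constraints: \(|\phi_1| < 1\) for AR(1) and, more generally for AR(p), all roots of \(1 - \phi_1 z - \cdots - \phi_p z^p = 0\) must lie outside the unit circle.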
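For AR models beyond order 1, the stationarity constraint is usually stated in terms of the roots of the polynomial \(1 - \phi_1 z - \cdots - \phi_p z^p\): the model is stationary when every root lies outside the unit circle. A minimal sketch of that check with NumPy follows; the function name `ar_is_stationary` and the example coefficients are hypothetical.

```python
import numpy as np

def ar_is_stationary(phis):
    """Return True if the AR model with coefficients phi_1..phi_p is stationary.

    Hypothetical helper for illustration, not a library function.
    """
    phis = np.asarray(phis, dtype=float)
    # Polynomial 1 - phi_1 z - ... - phi_p z^p; np.roots wants the
    # highest-degree coefficient first: [-phi_p, ..., -phi_1, 1].
    coeffs = np.concatenate((-phis[::-1], [1.0]))
    roots = np.roots(coeffs)
    return bool(np.all(np.abs(roots) > 1.0))

print(ar_is_stationary([0.6]))       # AR(1) with |phi_1| < 1 -> True
print(ar_is_stationary([1.0]))       # AR(1) random walk (unit root) -> False
print(ar_is_stationary([0.5, 0.3]))  # AR(2) -> True
print(ar_is_stationary([0.7, 0.4]))  # AR(2) -> False
```

For AR(1) this reduces to the familiar \(|\phi_1| < 1\), since the single root is \(1/\phi_1\).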
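One of the biggest questions when choosing an AR(p) or MA(q) model is: “How big should \(p\) or \(q\) be?” In practice, we often look at two tools:
- the autocorrelation function (ACF), and
- the partial autocorrelation function (PACF).
The ACF at lag \(k\) measures the correlation between \(Y_t\) and \(Y_{t-k}\). The PACF at lag \(k\) measures the correlation between \(Y_t\) and \(Y_{t-k}\) after controlling for the intermediate lags \(1, \ldots, k-1\).
As a rule of thumb, an AR(p) process shows an ACF that decays gradually and a PACF that cuts off after lag \(p\), while an MA(q) process shows the reverse: an ACF that cuts off after lag \(q\) and a PACF that decays gradually.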
In reality, data can be messy, so the patterns might not be perfectly neat. That’s where practice, experience, and sometimes additional diagnostic tools come in handy.
Forecasting Returns:
Investors might use an AR model if they believe their asset’s returns exhibit autocorrelation, such as bond returns that depend modestly on the previous day’s returns.
Modeling Shock Propagation:
MA models are useful when today’s outcome depends heavily on recent shocks—such as unexpected central bank announcements or other major market news.
Building Blocks for ARMA and More:
Realistically, many financial time series are modeled using ARMA (Autoregressive Moving Average) or ARIMA (Autoregressive Integrated Moving Average) processes, especially for interest rate or macroeconomic indicator forecasts.
Risk Management:
Evaluating how volatility clusters (or doesn’t) can start with analyzing AR and MA structures in residuals and squared residuals before employing more advanced GARCH or other stochastic volatility models.
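As a rough illustration of the forecasting use case, the sketch below simulates an AR(1) series, estimates \(c\) and \(\phi_1\) by ordinary least squares with NumPy, and iterates the fitted equation forward ten steps. All parameter values and variable names are assumptions for the example; in practice you would fit with a dedicated library and check diagnostics first.

```python
import numpy as np

# Illustrative "true" parameters for the simulation (assumed).
rng = np.random.default_rng(3)
n, c_true, phi_true = 1_000, 1.0, 0.6

# Simulate an AR(1) series: Y_t = c + phi * Y_{t-1} + eps_t
y = np.empty(n)
y[0] = c_true / (1 - phi_true)  # start at the long-run mean
for t in range(1, n):
    y[t] = c_true + phi_true * y[t - 1] + rng.normal()

# Estimate c and phi by OLS: regress Y_t on a constant and Y_{t-1}.
X = np.column_stack((np.ones(n - 1), y[:-1]))
c_hat, phi_hat = np.linalg.lstsq(X, y[1:], rcond=None)[0]
mean_hat = c_hat / (1 - phi_hat)

# Iterate the fitted equation forward, feeding each forecast back in.
horizon = 10
forecasts = []
last = y[-1]
for _ in range(horizon):
    last = c_hat + phi_hat * last
    forecasts.append(last)
forecasts = np.array(forecasts)

print(f"c_hat = {c_hat:.3f}, phi_hat = {phi_hat:.3f}, implied mean = {mean_hat:.3f}")
print("forecasts:", np.round(forecasts, 3))
```

Each forecast sits closer to the implied long-run mean than the one before it, because the gap shrinks by a factor of \(\hat{\phi}_1\) per step. This is why AR forecasts carry little information beyond a few periods when the coefficient is modest.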
Selecting a suitable model doesn’t end with “set \(p\) or \(q\).” We also need to verify that our final choice captures the data’s information content. Some standard diagnostic steps include:
Overfitting:
It can be tempting to add more lags until you match every wiggle in the historical data. But a “perfect” fit in-sample often means poor predictive power out-of-sample.
Nonstationarity:
If your time series isn’t stationary (e.g., you’re working with price levels that trend upward over time), applying these techniques directly could lead to spurious forecasts. Consider differencing or other transformations.
Ignoring Structural Changes:
Financial markets can shift behavior after major regulatory or geopolitical changes. A single AR(1) or MA(1) might not hold across drastically different regimes.
Violating the CFA Institute Code of Ethics and Standards of Professional Conduct:
If you’re presenting forecast performance, ensure no data snooping or misrepresentation. Provide standard disclosures and disclaimers regarding model uncertainty and assumptions.
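The overfitting pitfall can be sketched numerically. The example below simulates a true AR(1) process, fits AR(p) models for \(p = 1, \ldots, 8\) by least squares on the first 200 observations, and compares in-sample error with one-step-ahead error on the held-out remainder. The helper `lag_matrix`, the split point, and all parameters are hypothetical choices for illustration.

```python
import numpy as np

# Illustrative settings (assumed): true process is AR(1) with phi = 0.6.
rng = np.random.default_rng(5)
n, phi, max_p = 300, 0.6, 8

# Because the truth is AR(1), higher-order fits can only chase noise.
y = np.zeros(n)
for t in range(1, n):
    y[t] = phi * y[t - 1] + rng.normal()

def lag_matrix(series, p, start):
    """Rows [1, Y_{t-1}, ..., Y_{t-p}] for t = start, ..., len(series) - 1."""
    cols = [np.ones(len(series) - start)]
    cols += [series[start - k : len(series) - k] for k in range(1, p + 1)]
    return np.column_stack(cols)

split = 200  # first 200 points for fitting, the rest held out
train = y[:split]

in_mse, out_mse = [], []
for p in range(1, max_p + 1):
    # Use the same target rows for every p so the fits are comparable.
    X_train = lag_matrix(train, p, max_p)
    beta = np.linalg.lstsq(X_train, train[max_p:], rcond=None)[0]
    in_mse.append(np.mean((train[max_p:] - X_train @ beta) ** 2))

    # One-step-ahead forecasts on the hold-out period using actual lags.
    X_test = lag_matrix(y, p, split)
    out_mse.append(np.mean((y[split:] - X_test @ beta) ** 2))

for p, (a, b) in enumerate(zip(in_mse, out_mse), start=1):
    print(f"p = {p}: in-sample MSE = {a:.4f}, out-of-sample MSE = {b:.4f}")
```

In-sample error can only fall or stay flat as lags are added, since each model nests the previous one. Out-of-sample error carries no such guarantee and depends on the particular sample, which is why hold-out evaluation or information criteria are used to choose the order.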
When facing time-series questions on the CFA exam (the topic is often introduced at Level I and may reappear in greater depth at later levels), keep these strategies in mind: