Risk Model Assembly: Factor Covariance, Specific Risk, and the Full Forecast

Chapters 3–7 produce for every period a vector of factor returns $\hat f_t$ and a vector of residuals $\hat\epsilon_t$. A risk model is what you get by turning those two histories into forecasts: a factor covariance matrix $F$, a specific-risk matrix $\Delta$, and, through $\Sigma = XFX^\top + \Delta$, a risk forecast for any portfolio. This chapter covers the estimation choices, the corrections that production models apply, and how to know whether the resulting forecasts are any good.

Throughout, the estimation principle is the same: risk is more stable than return, but not stable enough to treat all history equally. Most of the techniques below answer one question: how fast should the model forget? A few answer a second: when there’s too little history to forget, what do you predict from instead?

8.1 The factor covariance matrix $F$

The naive estimator: With a history $\{\hat f_t\}_{t=1}^T$, the sample covariance

$\hat F = \frac{1}{T-1} \sum_{t=1}^T (\hat f_t - \bar f)(\hat f_t - \bar f)^\top$

is unbiased but equal-weights 1995 and last month. Volatility is strongly time-varying (clustered), so the sample estimator is both stale in crises and haunted by them afterward.

Exponentially weighted moving average (EWMA) is the most common way to address this. Weight the observation at lag $s$ (counting back from today, $s = 0, 1, 2, \dots$) by $\lambda^s$, with decay $\lambda \in (0,1)$:

$\hat F = \frac{\sum_{s\ge0} \lambda^s\, \tilde f_{t-s} \tilde f_{t-s}^\top}{\sum_{s\ge0} \lambda^s}, \qquad \tilde f = \hat f - \bar f.$

The decay is parameterized by its half-life $h$, the lag at which a weight has fallen by half: $\lambda = 2^{-1/h}$. The effective number of observations, measured as $(\sum_s w_s)^2 / \sum_s w_s^2 = (1+\lambda)/(1-\lambda)$, is roughly $2h / \ln 2 \approx 2.89h$, so a 90-day half-life behaves like ~260 equally-weighted days. Short half-life = responsive but noisy. Long = stable but slow. Typical production values: 20–60 days (short-horizon models) up to 24–48 months (long-horizon models).
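The EWMA estimator and the half-life arithmetic can be sketched in a few lines of NumPy. This is a minimal illustration, not a production implementation: the function names, the synthetic factor history, and the choice of an equal-weighted mean for $\bar f$ are assumptions made for the example.

```python
import numpy as np

def ewma_weights(n_obs: int, half_life: float) -> np.ndarray:
    """Normalized weights lambda^s for lags s = 0 (newest) .. n_obs - 1 (oldest)."""
    lam = 2.0 ** (-1.0 / half_life)
    w = lam ** np.arange(n_obs)
    return w / w.sum()

def ewma_cov(returns: np.ndarray, half_life: float) -> np.ndarray:
    """EWMA covariance of a (T, K) history whose rows run oldest to newest."""
    n_obs = returns.shape[0]
    w = ewma_weights(n_obs, half_life)[::-1]    # reverse: newest row gets lag 0
    demeaned = returns - returns.mean(axis=0)   # assumption: equal-weighted mean
    return (demeaned * w[:, None]).T @ demeaned

# Effective sample size (sum w)^2 / sum w^2 for a 90-day half-life.
w90 = ewma_weights(5000, 90.0)
n_eff = 1.0 / np.sum(w90 ** 2)                  # weights already sum to 1

# Synthetic factor returns, purely illustrative.
rng = np.random.default_rng(0)
f_hist = rng.normal(scale=0.01, size=(1000, 3))
F_ewma = ewma_cov(f_hist, half_life=90.0)
print(f"effective observations: {n_eff:.1f}")
```

The computed effective sample size should land close to $2h/\ln 2$ for $h = 90$, which is where the "~260 days" figure comes from.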

Different half-lives for variances and correlations. A standard refinement exploits an empirical regularity: volatilities move fast, correlations move slowly. So estimate factor volatilities with a short half-life and the correlation matrix with a long one, then recombine: $F = D_\sigma\, C\, D_\sigma$ with $D_\sigma$ the diagonal of fast volatilities and $C$ the slow correlations. This gets crisis responsiveness without correlation noise.
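A sketch of the recombination $F = D_\sigma C D_\sigma$, under stated assumptions: the half-lives of 21 and 252 days, the function names, and the synthetic history with a late volatility spike are all hypothetical choices for illustration.

```python
import numpy as np

def ewma_cov(returns: np.ndarray, half_life: float) -> np.ndarray:
    """EWMA covariance of a (T, K) history whose rows run oldest to newest."""
    lam = 2.0 ** (-1.0 / half_life)
    n_obs = returns.shape[0]
    w = lam ** np.arange(n_obs - 1, -1, -1)     # newest (last) row gets lam^0
    w = w / w.sum()
    x = returns - returns.mean(axis=0)
    return (x * w[:, None]).T @ x

def split_halflife_cov(returns, vol_half_life, corr_half_life):
    """Fast volatilities combined with slow correlations: F = D C D."""
    fast = ewma_cov(returns, vol_half_life)
    slow = ewma_cov(returns, corr_half_life)
    fast_vol = np.sqrt(np.diag(fast))
    slow_vol = np.sqrt(np.diag(slow))
    corr = slow / np.outer(slow_vol, slow_vol)  # slow correlation matrix C
    return np.outer(fast_vol, fast_vol) * corr  # elementwise form of D C D

# Synthetic history: volatility triples over the last 60 days.
rng = np.random.default_rng(0)
scale = np.where(np.arange(1000)[:, None] < 940, 0.01, 0.03)
f_hist = rng.normal(size=(1000, 3)) * scale

F_split = split_halflife_cov(f_hist, vol_half_life=21.0, corr_half_life=252.0)
F_slow = ewma_cov(f_hist, 252.0)
print(np.sqrt(np.diag(F_split)), np.sqrt(np.diag(F_slow)))
```

Because $C$ is a valid correlation matrix and $D_\sigma$ is a positive diagonal, the recombined matrix stays positive semi-definite. In this synthetic example the split estimate's volatilities react to the spike while its correlations remain those of the slow estimate.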

Horizon scaling and serial correlation (Newey–West): A monthly-horizon forecast built from daily data cannot just be “daily covariance × 21”. Factor returns are serially correlated: illiquidity and lead–lag effects induce positive autocorrelation, so daily scaling understates monthly risk. The Newey–West estimator adds weighted autocovariance terms:

$\hat F^{NW} = \hat\Gamma_0 + \sum_{q=1}^{Q} \left(1 - \frac{q}{Q+1}\right)\left(\hat\Gamma_q + \hat\Gamma_q^\top\right),$

where $\hat\Gamma_q$ is the lag-$q$ autocovariance matrix of factor returns and $Q$ is the maximum lag included. The triangular (Bartlett) weights guarantee the result stays positive semi-definite.
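The estimator above can be sketched as follows. The equal-weighted autocovariances, the AR(1) synthetic data, and the lag choice $Q = 5$ are assumptions for illustration. A production model would typically combine this with EWMA weighting.

```python
import numpy as np

def newey_west_cov(returns: np.ndarray, max_lag: int) -> np.ndarray:
    """Newey-West covariance with Bartlett weights for a (T, K) history."""
    n_obs = returns.shape[0]
    x = returns - returns.mean(axis=0)
    cov = x.T @ x / n_obs                       # Gamma_0
    for q in range(1, max_lag + 1):
        gamma_q = x[q:].T @ x[:-q] / n_obs      # lag-q autocovariance
        weight = 1.0 - q / (max_lag + 1.0)      # Bartlett weight
        cov += weight * (gamma_q + gamma_q.T)
    return cov

# Synthetic daily factor returns with positive autocorrelation (AR(1), phi = 0.3).
rng = np.random.default_rng(1)
n_days, n_factors, phi = 5000, 2, 0.3
shocks = rng.normal(scale=0.01, size=(n_days, n_factors))
f_hist = np.zeros((n_days, n_factors))
for t in range(1, n_days):
    f_hist[t] = phi * f_hist[t - 1] + shocks[t]

F_daily_naive = newey_west_cov(f_hist, max_lag=0)   # plain Gamma_0
F_daily_nw = newey_west_cov(f_hist, max_lag=5)
F_monthly = 21 * F_daily_nw                         # scale the adjusted matrix
print(F_daily_nw[0, 0] / F_daily_naive[0, 0])
```

Dividing every autocovariance by the same $T$ (rather than $T - q$) is what keeps the Bartlett-weighted sum positive semi-definite in finite samples. With positively autocorrelated data the adjusted variance comes out noticeably larger than $\hat\Gamma_0$, which is the understatement the text describes.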

Shrinkage and conditioning: With $K = 70$ factors, $K(K+1)/2 = 2{,}485$ entries are estimated. Sampling error makes the extreme eigenvalues of $\hat F$ too extreme (largest biased up, smallest down), exactly the directions an optimizer will exploit (Chapter 11). Ledoit–Wolf shrinkage pulls the estimate toward a structured target: $F^{shrunk} = \delta\, T_{\text{target}} + (1-\delta)\, \hat F$, with the target a constant-correlation or diagonal matrix and $\delta \in [0,1]$ chosen to minimize expected estimation error (closed form in their papers). Even without formal shrinkage, production models floor small eigenvalues and verify positive semi-definiteness after all corrections are applied. Each correction is individually safe, but combinations are not guaranteed to be, so the final matrix is checked.
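A minimal sketch of linear shrinkage toward a diagonal target followed by an eigenvalue floor. Note the assumptions: $\delta$ is fixed by hand at 0.3 rather than computed from the Ledoit–Wolf closed form, the floor ratio of 0.05 is arbitrary, and the function names and synthetic data are hypothetical.

```python
import numpy as np

def shrink_to_diagonal(F_hat: np.ndarray, delta: float) -> np.ndarray:
    """F_shrunk = delta * target + (1 - delta) * F_hat, with a diagonal target."""
    target = np.diag(np.diag(F_hat))
    return delta * target + (1.0 - delta) * F_hat

def floor_eigenvalues(F: np.ndarray, floor_ratio: float) -> np.ndarray:
    """Raise eigenvalues below floor_ratio * (largest eigenvalue) to that floor."""
    eigval, eigvec = np.linalg.eigh(0.5 * (F + F.T))   # symmetrize first
    floored = np.maximum(eigval, floor_ratio * eigval.max())
    return (eigvec * floored) @ eigvec.T

# 70 factors, only 100 observations; the true covariance is the identity.
rng = np.random.default_rng(2)
n_factors, n_obs = 70, 100
sample = rng.normal(size=(n_obs, n_factors))
F_hat = np.cov(sample, rowvar=False)

F_shrunk = shrink_to_diagonal(F_hat, delta=0.3)        # delta chosen by hand
F_safe = floor_eigenvalues(F_shrunk, floor_ratio=0.05)

for name, M in [("sample", F_hat), ("shrunk", F_shrunk), ("floored", F_safe)]:
    ev = np.linalg.eigvalsh(M)
    print(f"{name}: min eig {ev.min():.3f}, max eig {ev.max():.3f}")
```

Every true eigenvalue here equals 1, so the spread of the sample eigenvalues is pure sampling error. Shrinkage pulls both extremes back toward the middle, and the final eigenvalue check is the positive semi-definiteness verification the text calls for.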

Regime dynamics: Beyond EWMA, models add GARCH/DCC-type dynamics per factor, or a volatility regime adjustment: a multiplier on the whole matrix calibrated to very recent cross-sectional forecast errors, so that the model re-levels quickly when the world changes faster than the half-life allows. Chapter 14 shows how to detect the need for this in bias statistics.
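One way such a multiplier could be built is sketched below. The construction is an assumption for illustration, not a specification from this chapter: each day's cross-sectional mean of squared standardized factor returns is averaged with a short EWMA, and the result rescales the variance level of $F$. The half-life of 20 days and all names are hypothetical.

```python
import numpy as np

def regime_multiplier(factor_returns, predicted_vols, half_life):
    """Variance multiplier: EWMA over time of the cross-sectional mean of z^2.

    factor_returns, predicted_vols: (T, K) arrays, rows oldest to newest.
    A value near 1 means recent factor forecasts were calibrated on average.
    """
    z2 = (factor_returns / predicted_vols) ** 2
    daily_bias2 = z2.mean(axis=1)                    # cross-sectional average
    lam = 2.0 ** (-1.0 / half_life)
    w = lam ** np.arange(len(daily_bias2) - 1, -1, -1)
    return float(np.sum(w * daily_bias2) / np.sum(w))

# Synthetic case: the model keeps predicting 1% vol, but true vol doubles
# to 2% over the last 100 days.
rng = np.random.default_rng(3)
n_days, n_factors = 500, 10
predicted_vols = np.full((n_days, n_factors), 0.01)
true_vols = np.where(np.arange(n_days)[:, None] < 400, 0.01, 0.02)
factor_returns = rng.normal(size=(n_days, n_factors)) * true_vols

multiplier = regime_multiplier(factor_returns, predicted_vols, half_life=20.0)
F_stale = np.diag(np.full(n_factors, 0.01 ** 2))
F_adjusted = multiplier * F_stale                    # rescale the whole matrix
print(f"variance multiplier: {multiplier:.2f}")
```

Pooling across factors is what lets the multiplier use a much shorter half-life than any single factor's variance estimate could tolerate. In the synthetic case the variance multiplier should move most of the way toward 4, the true ratio of recent to predicted variance.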

8.2 Specific risk $\Delta$

Per stock, the time-series route mirrors the factor route: EWMA variance of the stock’s residuals $\hat\epsilon_{it}$, possibly Newey–West-adjusted, scaled to the model horizon. It works for liquid stocks with long, clean residual histories, and fails exactly where Chapter 5 said it would: new listings, thinly traded names, and small caps with short or noisy histories.

The structural (cross-sectional) model: Predict specific volatility from current characteristics rather than own history: regress (log) realized specific volatility on size, volatility descriptors, leverage, liquidity, industry, etc., across the estimation universe. Apply the fitted function to any stock with characteristics, IPO included. Structural estimates are stable and available everywhere, but blind to a specific company’s own turbulence.
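A sketch of the structural route as an ordinary least-squares regression of log specific volatility on characteristics. The two characteristics (standardized log size and leverage), the coefficients used to generate the synthetic universe, and the function names are all hypothetical. A production model would use more descriptors and typically weighted or robust regression.

```python
import numpy as np

def fit_structural_model(characteristics, realized_spec_vol):
    """OLS of log(realized specific vol) on characteristics plus an intercept."""
    design = np.column_stack([np.ones(len(characteristics)), characteristics])
    coef, *_ = np.linalg.lstsq(design, np.log(realized_spec_vol), rcond=None)
    return coef

def predict_structural_vol(coef, characteristics):
    """Apply the fitted function to any stock that has characteristics."""
    design = np.column_stack([np.ones(len(characteristics)), characteristics])
    return np.exp(design @ coef)

# Synthetic estimation universe: smaller and more levered stocks are noisier.
rng = np.random.default_rng(4)
n_stocks = 2000
log_size = rng.normal(size=n_stocks)      # standardized log market cap
leverage = rng.normal(size=n_stocks)      # standardized leverage
chars = np.column_stack([log_size, leverage])
log_vol = (np.log(0.30) - 0.15 * log_size + 0.05 * leverage
           + rng.normal(scale=0.2, size=n_stocks))
coef = fit_structural_model(chars, np.exp(log_vol))

# A new listing with no residual history: small (-1 sd) and somewhat levered.
ipo_chars = np.array([[-1.0, 0.5]])
ipo_vol = predict_structural_vol(coef, ipo_chars)
print(coef, ipo_vol)
```

Exponentiating a fitted log gives a geometric-mean-style prediction. The retransformation correction that would recover the arithmetic mean is omitted here for brevity.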

The blend: Production models combine the two with credibility weights: $\hat\sigma_{\epsilon_i}^2 = \gamma_i\, \hat\sigma^2_{TS,i} + (1 - \gamma_i)\, \hat\sigma^2_{STR,i}$, where $\gamma_i$ grows with the length and quality of stock $i$’s residual history. This is Bayesian shrinkage toward a peer-group prior in all but name, and it is also exactly the coverage-universe machinery of Chapter 5: a coverage asset is just a stock with $\gamma_i = 0$.
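The blend in code, as a sketch. The linear ramp for $\gamma_i$ (full credibility at 252 observations) is a hypothetical rule chosen for illustration. The text only requires that $\gamma_i$ grow with the length and quality of the history. The three stocks are made up.

```python
import numpy as np

def credibility_weight(n_obs, full_credibility_obs=252):
    """Hypothetical rule: gamma ramps linearly from 0 to 1 with history length."""
    return np.clip(np.asarray(n_obs, dtype=float) / full_credibility_obs, 0.0, 1.0)

def blend_specific_var(ts_var, str_var, gamma):
    """gamma * time-series variance + (1 - gamma) * structural variance."""
    ts_var = np.where(gamma > 0, ts_var, 0.0)   # coverage assets may have NaN history
    return gamma * ts_var + (1.0 - gamma) * str_var

# Three illustrative stocks: a new listing, a half-year history, a long history.
n_obs = np.array([0, 126, 600])
ts_vol = np.array([np.nan, 0.40, 0.20])         # time-series specific vols
str_vol = np.array([0.35, 0.30, 0.22])          # structural specific vols

gamma = credibility_weight(n_obs)
blended_vol = np.sqrt(blend_specific_var(ts_vol ** 2, str_vol ** 2, gamma))
print(gamma, blended_vol)
```

The blend is taken in variance space, as in the formula, so a 50/50 weighting of 40% and 30% volatilities gives about 35.4% rather than 35%. The new listing, with $\gamma = 0$, receives the structural estimate unchanged.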

The mini example’s $\Delta$: The MiniModel stipulates annualized specific volatilities ranging from 16% to 38%, with small caps noisier, as the structural model would predict (AXIOM 18%, JUNIPER 32%, the small-cap DIGIT highest at 38%), so $\Delta = \mathrm{diag}(0.18^2, \dots, 0.32^2)$. Here $\Delta$ is genuinely diagonal. A real universe needs off-diagonal blocks where issuers are linked (dual-class shares or ADRs, Ch. 5). Realistic magnitudes: large-cap specific vol 15–25%, small-cap 30–50%.

8.3 The assembled model, and the MiniModel’s risk forecasts

The MiniModel’s factor covariance $F$ is built from stipulated annualized factor volatilities and correlations (full matrices in the appendix, magnitudes chosen to be realistic):

vols: MKT 16%, TECH 9%, FIN 7%, CONS 5%, VALUE 4%, MOM 6%, SIZE 4%
key correlations: VALUE–MOM −0.45 (the classic value/momentum hedge), TECH–FIN −0.40, MKT–VALUE −0.20.

With $X$ (Chapter 3), $F$, and $\Delta$ assembled, $\Sigma = XFX^\top + \Delta$ yields a risk forecast for any portfolio $w$ through $\sigma_w^2 = w^\top \Sigma\, w$. For the three portfolios of the mini example (manager $w_p$, cap-weighted benchmark $w_b$, active $w_a = w_p - w_b$; holdings tabulated in the appendix):

	factor vol	specific vol	total vol	specific share of variance
Portfolio	15.99%	7.22%	17.55%	17%
Benchmark	16.57%	7.38%	18.14%	17%
Active	4.21%	3.42%	5.42%	40%

The pattern is the usual one: total risk is dominated by factor risk (the market, mostly), while active risk is split much more evenly. Subtracting the benchmark cancels most of the common factor exposure between portfolio and benchmark, leaving the deliberate tilts and the stock-specific bets. Chapter 9 dissects these numbers line by line.
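The decomposition behind the table is $\sigma^2_{\text{total}} = w^\top XFX^\top w + w^\top \Delta w$, so the factor and specific volatilities add in variance, not in volatility. The sketch below implements that split and then checks the table's rows from their own factor and specific columns. The three-stock, two-factor inputs are hypothetical stand-ins, since the MiniModel's $X$ and holdings live in the appendix.

```python
import numpy as np

def risk_decomposition(w, X, F, spec_var):
    """Factor vol, specific vol, total vol, specific share of variance."""
    exposure = X.T @ w                              # portfolio factor exposures
    factor_var = float(exposure @ F @ exposure)
    specific_var = float(np.sum(w ** 2 * spec_var))  # diagonal Delta
    total_var = factor_var + specific_var
    return (np.sqrt(factor_var), np.sqrt(specific_var),
            np.sqrt(total_var), specific_var / total_var)

# Hypothetical toy model (not the MiniModel): 3 stocks, 2 factors.
X = np.array([[1.0, 0.5], [1.0, -0.2], [1.0, 0.1]])
F = np.diag([0.16 ** 2, 0.05 ** 2])
spec_var = np.array([0.18, 0.25, 0.32]) ** 2
w = np.array([0.5, 0.3, 0.2])
factor_vol, specific_vol, total_vol, share = risk_decomposition(w, X, F, spec_var)

# Consistency check on the table: (factor vol %, specific vol %) per row.
table = {"Portfolio": (15.99, 7.22), "Benchmark": (16.57, 7.38), "Active": (4.21, 3.42)}
totals = {name: float(np.hypot(fv, sv)) for name, (fv, sv) in table.items()}
shares = {name: sv ** 2 / totals[name] ** 2 for name, (fv, sv) in table.items()}
for name in table:
    print(f"{name}: total {totals[name]:.2f}%, specific share {shares[name]:.0%}")
```

Recomputed from the rounded inputs, the totals agree with the table to within a rounding unit in the last digit, and the specific shares round to 17%, 17%, and 40%.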

8.4 Forecast quality: does the model tell the truth?

A risk model’s forecasts are statements about the distribution of future returns. They can be scored. The core instrument is the bias statistic. For any test portfolio, form the standardized returns $z_t = r_{p,t} / \hat\sigma_{p,t-1}$ (realized return over predicted volatility). If forecasts are calibrated, $z_t$ has standard deviation 1.

$b = \mathrm{std}(z_t) \qquad \begin{cases} b \approx 1 & \text{calibrated} \\ b > 1 & \text{risk underforecast} \\ b < 1 & \text{risk overforecast} \end{cases}$

with approximate 95% acceptance band $1 \pm \sqrt{2/T}$ under normality (e.g. $T = 52$ weeks gives the band [0.80, 1.20]). Production validation computes $b$ on rolling windows, across many test portfolios (random long-only, factor-tilted, optimized, and clients’ actual portfolios). A model can be calibrated on average yet biased for exactly the portfolios that matter, optimized ones especially (Chapter 11 and Chapter 13).
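The bias statistic and its acceptance band take only a few lines. In this sketch the function names and the synthetic test portfolio, whose true weekly volatility is twice the forecast, are assumptions for illustration.

```python
import numpy as np

def bias_statistic(realized, predicted_vol):
    """Standard deviation of standardized returns z_t = r_t / sigma_hat_{t-1}."""
    z = np.asarray(realized) / np.asarray(predicted_vol)
    return float(np.std(z, ddof=1))

def acceptance_band(n_obs):
    """Approximate 95% band 1 +/- sqrt(2 / T) under normality."""
    half_width = np.sqrt(2.0 / n_obs)
    return 1.0 - half_width, 1.0 + half_width

lo, hi = acceptance_band(52)                      # one year of weekly data

# Synthetic portfolio: model forecasts 1% weekly vol, true vol is 2%.
rng = np.random.default_rng(5)
predicted = np.full(52, 0.01)
realized = rng.normal(scale=0.02, size=52)
b = bias_statistic(realized, predicted)
verdict = "underforecast" if b > hi else "overforecast" if b < lo else "calibrated"
print(f"band [{lo:.2f}, {hi:.2f}], b = {b:.2f}: {verdict}")
```

The alignment matters in practice: the forecast in the denominator must be the one made before the return was realized, which is what the $t-1$ subscript in $\hat\sigma_{p,t-1}$ encodes.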

Supporting instruments: Q-statistics / log-likelihood comparisons between candidate models on the same test set (which model assigns higher density to realized outcomes), portfolio-level exceedance counts (how often did returns breach 2σ, should be ~5%), and bias-by-segment breakdowns (by size band, by volatility band) to find structural biases that aggregate numbers hide.
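An exceedance count is a small variation on the same standardized returns. The sketch below uses synthetic normal returns, which is an assumption. Under exact normality the two-sided 2σ breach rate is about 4.6%, which the text rounds to ~5%.

```python
import numpy as np

def exceedance_rate(realized, predicted_vol, threshold=2.0):
    """Fraction of periods in which |return| exceeded threshold * predicted vol."""
    z = np.abs(np.asarray(realized) / np.asarray(predicted_vol))
    return float(np.mean(z > threshold))

rng = np.random.default_rng(6)
n_obs = 200_000
true_vol = 0.01
returns = rng.normal(scale=true_vol, size=n_obs)

# Calibrated forecast versus one that is 20% too low.
rate_calibrated = exceedance_rate(returns, np.full(n_obs, true_vol))
rate_underforecast = exceedance_rate(returns, np.full(n_obs, 0.8 * true_vol))
print(f"calibrated: {rate_calibrated:.3f}, underforecast: {rate_underforecast:.3f}")
```

A forecast that is 20% too low roughly doubles the breach rate in this setup, so the count is a sensitive check, but it needs many observations: with only 52 weekly returns the expected number of breaches is between two and three.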

Known failure modes: risk underforecast after long calm regimes (EWMA has forgotten the last storm: 2007, 2021); correlation spikes in crises that no half-life fully anticipates (diversification evaporates exactly when needed); and optimized-portfolio bias (the optimizer found the matrix’s too-small eigenvalues). The first two are inherent trade-offs to be managed with regime adjustments and stress tests (Chapter 9). The third is treatable (shrinkage here, alpha-alignment in Chapter 11). Chapter 14 builds the full evaluation framework on these instruments.

8.5 Short-horizon vs. long-horizon variants

Vendors ship the same factor structure at multiple horizons because one covariance matrix cannot forecast both next week and next year:

	Short-horizon model	Long-horizon model
Data / half-lives	daily returns, half-lives of weeks	monthly or overlapped daily, half-lives of 1–4 years
Responds to a vol spike in	days	quarters
Forecast stability	low, risk numbers move daily	high
Right for	trading desks, hedging ratios, short-term risk limits	strategic allocation, IC reporting, long-horizon mandates

Using a short-horizon model to set a long-term allocation chases noise. A long-horizon model sizing a hedge in a fast market does the opposite: it sets today’s hedge from last year’s correlations. “Matched horizon” is the first line of the fit-for-purpose checklist in Chapter 14.

8.6 Summary

$F$: EWMA (with half-life as the central dial), separate dynamics for vols vs. correlations, Newey–West for horizon scaling, shrinkage/conditioning for optimizer-safety, regime adjustment for level errors.
$\Delta$: time-series EWMA where history is good, structural prediction where it is not, credibility-blended. Coverage-universe assets are the $\gamma = 0$ limit.
The assembled $\Sigma = XFX^\top + \Delta$ produced the mini example’s central numbers: portfolio 17.55%, benchmark 18.14%, tracking error 5.42%.
Forecasts are scoreable (bias statistics, exceedances, likelihoods) and have known failure modes. Validation runs continuously, and Chapter 14 builds it out.

The model is now built: $X$ (Ch. 3), estimation discipline (Ch. 5–6), factor portfolios (Ch. 7), $F$ and $\Delta$ (Ch. 8). The next four chapters put it to work.
