Financial time series often exhibit nonlinear behavior, where the dynamics change depending on market conditions. One common example is the asymmetric response of returns during bull versus bear markets. In bullish regimes, returns may follow a smoother and more persistent pattern, while in bearish regimes, they may be more volatile and mean-reverting. Traditional linear autoregressive (AR) models cannot capture such regime-dependent dynamics effectively.
The Threshold Autoregressive (TAR) model offers a flexible framework for capturing this behavior. It partitions the state space of a threshold variable (often past returns) into distinct regimes and fits a separate AR model in each regime, so the dynamics can differ depending on whether the threshold variable is above or below a chosen level. In this article we use TAR-style regime detection, without estimating the AR coefficients, to switch between bull and bear exposure, and we combine it with volatility targeting to smooth risk-adjusted returns.
A self-exciting two-regime TAR model can be written as:
\[ x_t = \begin{cases} \phi_0^{(1)} + \sum_{i=1}^p \phi_i^{(1)} x_{t-i} + a_t^{(1)}, & \text{if } z_{t-d} \le \gamma, \\ \phi_0^{(2)} + \sum_{i=1}^p \phi_i^{(2)} x_{t-i} + a_t^{(2)}, & \text{if } z_{t-d} > \gamma, \end{cases} \]
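where \(x_t\) is the series being modeled, \(\phi_0^{(j)}\) and \(\phi_i^{(j)}\) are the intercept and AR coefficients in regime \(j\), \(p\) is the AR order, \(a_t^{(j)}\) is the innovation in regime \(j\), \(z_{t-d}\) is the threshold variable observed with delay \(d\), and \(\gamma\) is the threshold.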
The model is self-exciting because the threshold variable is taken from the series itself. For more than two regimes, we extend the partition of the state space with additional thresholds.
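To make the two-regime equation concrete, here is a minimal sketch that simulates a self-exciting TAR process with AR(1) dynamics in each regime and then recovers the threshold by a grid search over least-squares fits. The function names (`simulate_setar`, `fit_setar`), the AR(1) order, the trimming fraction, and the parameter values are illustrative assumptions, not part of the strategy script in this article, which does not estimate AR coefficients.

```python
import numpy as np

def simulate_setar(n, phi_low, phi_high, gamma=0.0, d=1, sigma=1.0, seed=0):
    """Simulate x_t = phi0 + phi1 * x_{t-1} + a_t, where (phi0, phi1) is
    phi_low if x_{t-d} <= gamma and phi_high otherwise."""
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    for t in range(max(1, d), n):
        phi0, phi1 = phi_low if x[t - d] <= gamma else phi_high
        x[t] = phi0 + phi1 * x[t - 1] + sigma * rng.standard_normal()
    return x

def fit_setar(x, d=1, trim=0.15, n_grid=50):
    """For each candidate gamma, fit an AR(1) by OLS in each regime and keep
    the gamma with the smallest total sum of squared residuals."""
    start = max(1, d)
    y = x[start:]                      # x_t
    lag1 = x[start - 1:-1]             # x_{t-1}
    z = x[start - d:len(x) - d]        # threshold variable x_{t-d}
    X = np.column_stack([np.ones_like(lag1), lag1])
    best = None
    # candidate thresholds: interior quantiles so both regimes keep enough data
    for g in np.quantile(z, np.linspace(trim, 1 - trim, n_grid)):
        ssr, coefs = 0.0, []
        for mask in (z <= g, z > g):
            beta, *_ = np.linalg.lstsq(X[mask], y[mask], rcond=None)
            ssr += np.sum((y[mask] - X[mask] @ beta) ** 2)
            coefs.append(beta)
        if best is None or ssr < best[0]:
            best = (ssr, g, coefs)
    return best[1], best[2]

x = simulate_setar(5000, phi_low=(0.0, 0.6), phi_high=(0.0, -0.4), gamma=0.0)
gamma_hat, (beta_low, beta_high) = fit_setar(x)
print("estimated gamma:", round(float(gamma_hat), 3))
print("AR(1) slope, lower regime:", round(float(beta_low[1]), 3))
print("AR(1) slope, upper regime:", round(float(beta_high[1]), 3))
```

With these assumed parameters the lower regime is persistent (positive slope) and the upper regime is mean-reverting (negative slope), so the two fitted slopes should come out with opposite signs.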
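In equity markets, returns often behave differently depending on whether recent returns are above or below certain levels: after strong recent returns (a bull regime) they tend to be smoother and more persistent, while after weak recent returns (a bear regime) they tend to be more volatile and mean-reverting.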
By setting a return threshold (e.g., ±0.5% daily), we can model each regime separately and tailor trading decisions accordingly.
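As a small illustration of the ±0.5% idea, the sketch below labels each day as bull, bear, or neutral from the average of the previous returns. The helper name `label_regimes`, the three-way labeling, and the sample returns are assumptions made for this example; the script later in the article uses a single quantile-based threshold instead.

```python
import pandas as pd

def label_regimes(returns, window=1, lower=-0.005, upper=0.005):
    """Label each day from the average of the previous `window` returns:
    +1 (bull) above `upper`, -1 (bear) below `lower`, 0 (neutral) otherwise."""
    r = pd.Series(returns, dtype=float)
    # threshold variable: rolling mean, lagged one day so it only uses past data
    z = r.rolling(window).mean().shift(1)
    labels = pd.Series(0, index=r.index, dtype=int)
    labels[z > upper] = 1
    labels[z < lower] = -1
    return z, labels

z, labels = label_regimes([0.010, -0.020, 0.001, 0.008, -0.006])
print(labels.tolist())  # [0, 1, -1, 0, 1]
```

The first day has no history, so it falls back to the neutral label; every later label depends only on returns observed before that day.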
We scale positions using volatility targeting:
\[ w_t = \min\left( \text{max\_lever}, \frac{\sigma_{\text{target,daily}}}{\hat{\sigma}_t} \right) \cdot \mathbf{1}_{\text{bull}} \]
where \(\hat{\sigma}_t\) is the EWMA volatility estimate.
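The weight formula can be sketched in a few lines. The version below assumes a zero-mean EWMA variance recursion with decay 0.94 and uses illustrative values for the target and the leverage cap; the function names are hypothetical, and in a backtest the volatility estimate would need to be lagged so that it only uses past returns.

```python
import numpy as np

def ewma_vol(returns, lam=0.94):
    """EWMA volatility: var_t = lam * var_{t-1} + (1 - lam) * r_t**2,
    seeded with the sample variance (zero mean assumed in the recursion)."""
    r = np.asarray(returns, dtype=float)
    out = np.empty_like(r)
    prev = r.var()
    for t, x in enumerate(r):
        prev = lam * prev + (1.0 - lam) * x * x
        out[t] = prev
    return np.sqrt(out)

def vol_target_weight(sigma_hat, bull, ann_target=0.15, max_lever=1.25):
    """w_t = min(max_lever, sigma_target_daily / sigma_hat_t) * 1_bull."""
    sigma_target_daily = ann_target / np.sqrt(252.0)
    raw = sigma_target_daily / np.asarray(sigma_hat, dtype=float)
    return np.minimum(max_lever, raw) * np.asarray(bull, dtype=float)

# daily vol of 1%, 0.5%, and 1%; the last day is in the bear regime
w = vol_target_weight([0.01, 0.005, 0.01], bull=[1, 1, 0])
print(w)
```

In this example the first weight is the daily target divided by 1% (a little under 1), the second hits the leverage cap because volatility is low, and the third is zero because the regime indicator is off.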
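This weighting rule ensures that exposure shrinks when estimated volatility rises, that leverage never exceeds the cap \(\text{max\_lever}\), and that the position is zero outside the bull regime.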
import warnings
warnings.filterwarnings("ignore")

import numpy as np
import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt

# =========================
# Config
# =========================
SYMBOL = "^GSPC"          # e.g. SPY, ^GSPC, BTC-USD
START = "2015-01-01"
TRAIN_END = "2022-12-31"  # optimize on <= TRAIN_END, test after

# Transaction costs and risk controls
TC_BPS_PER_TURNOVER = 20.0  # per-unit turnover (bps)
USE_VOL_TARGETING = True
ANN_VOL_TARGET = 0.25       # annualized volatility target (25%)
MAX_LEVER = 2.0

# Optimization grids
LAG_AVG_WINDOWS = [1, 3, 5]              # average of past k daily returns for threshold variable
GAMMA_QUANTS = [0.35, 0.45, 0.55, 0.65]  # threshold = quantile(z, q) on train
HYSTERESIS_K = [1, 3, 5]                 # days required to confirm a regime switch

np.set_printoptions(suppress=True, precision=6)

# =========================
# Data
# =========================
df = yf.download(SYMBOL, start=START, auto_adjust=True, progress=False)
if isinstance(df.columns, pd.MultiIndex):
    df = df.droplevel(1, axis=1)
if df.empty:
    raise ValueError("No data downloaded.")

px = df["Close"].rename("px")
r_log = np.log(px).diff()
r_simple = px.pct_change().rename("ret")
df = pd.concat([px, r_log.rename("r_log"), r_simple], axis=1).dropna()

# =========================
# Helpers
# =========================
def ann_vol(rets):
    return rets.std() * np.sqrt(252)

def sharpe(rets, rf=0.0):
    ex = rets - rf/252.0
    s = ex.std()
    return (ex.mean()/s)*np.sqrt(252) if s > 0 else np.nan

def max_dd(rets):
    eq = (1+rets).cumprod()
    return (eq/eq.cummax() - 1).min()

def ewma_vol(r, lam=0.94):
    v = pd.Series(index=r.index, dtype=float)
    m = r.mean()
    s2 = r.var()
    prev = s2
    for t in r.index:
        x = r.loc[t] - m
        prev = lam*prev + (1-lam)*(x*x)
        v.loc[t] = np.sqrt(prev)
    return v

def apply_hysteresis(raw_regime, k):
    # require k consecutive different labels to switch regime
    out = raw_regime.copy()
    cur = raw_regime.iloc[0]
    count = 0
    for i in range(1, len(raw_regime)):
        if raw_regime.iloc[i] != cur:
            count += 1
            if count >= k:
                cur = raw_regime.iloc[i]
                count = 0
        else:
            count = 0
        out.iloc[i] = cur
    return out

def simulate_strategy(px, r, z, q_gamma, k_hyst, use_vol=True, ann_target=0.15, max_lever=1.25):
    # threshold from training distribution of z
    train_mask = z.index <= pd.Timestamp(TRAIN_END)
    gamma = z[train_mask].quantile(q_gamma)

    # regimes from lagged/averaged returns (TAR via return threshold)
    raw_regime = (z > gamma).astype(int)  # 1 = bull, 0 = bear
    regime = apply_hysteresis(raw_regime, k_hyst)

    # base long/cash signal: bull → 1, bear → 0
    sig = regime.astype(float)

    # vol targeting (optional): scale weight by daily target / EWMA vol
    if use_vol:
        # use EWMA vol of simple returns, shift to avoid look-ahead
        vol = ewma_vol(r).shift(1)
        sigma_target_daily = ann_target / np.sqrt(252.0)
        w = (sigma_target_daily / vol).clip(0, max_lever)
        w = w.fillna(0.0)
        w = w * sig
    else:
        w = sig

    # next-day execution (no look-ahead)
    w_tr = w.shift(1).fillna(0.0)

    # transaction costs on turnover
    turnover = w_tr.diff().abs().fillna(0.0)
    cost = turnover * (TC_BPS_PER_TURNOVER / 1e4)

    strat_gross = w_tr * r
    strat = strat_gross - cost
    return strat, regime, gamma

def score_train(strat, z):
    mask = z.index <= pd.Timestamp(TRAIN_END)
    return sharpe(strat[mask])

# =========================
# Build threshold variable z_{t-1} (averaged returns then lagged)
# =========================
best = {"score": -1e9}
for kavg in LAG_AVG_WINDOWS:
    if kavg == 1:
        z0 = df["r_log"]
    else:
        z0 = df["r_log"].rolling(kavg).mean()
    z = z0.shift(1).dropna()
    # align returns to z
    r = df["ret"].reindex(z.index)

    for q in GAMMA_QUANTS:
        for k_h in HYSTERESIS_K:
            strat, regime, gamma = simulate_strategy(df["px"].reindex(z.index), r, z, q, k_h,
                                                     use_vol=USE_VOL_TARGETING,
                                                     ann_target=ANN_VOL_TARGET,
                                                     max_lever=MAX_LEVER)
            sc = score_train(strat, z)
            if np.isfinite(sc) and sc > best["score"]:
                best = dict(score=sc, kavg=kavg, q=q, k_h=k_h, gamma=float(gamma))
                best["strat"] = strat
                best["z"] = z
                best["regime"] = regime

print("Best (train) Sharpe:", round(best["score"], 3),
      "| kavg:", best["kavg"], "q:", best["q"], "hysteresis:", best["k_h"], "gamma:", round(best["gamma"], 6))

# =========================
# Final out-of-sample run with best params
# =========================
# rebuild z/r with chosen kavg on full sample
if best["kavg"] == 1:
    z0 = df["r_log"]
else:
    z0 = df["r_log"].rolling(best["kavg"]).mean()
z = z0.shift(1).dropna()
r = df["ret"].reindex(z.index)

strat, regime, gamma = simulate_strategy(df["px"].reindex(z.index), r, z,
                                         best["q"], best["k_h"],
                                         use_vol=USE_VOL_TARGETING,
                                         ann_target=ANN_VOL_TARGET,
                                         max_lever=MAX_LEVER)

bh = r.copy()

# Metrics
def metrics_block(rets):
    return pd.Series({
        "TotalReturn": (1+rets).prod()-1,
        "AnnVol": ann_vol(rets),
        "Sharpe": sharpe(rets),
        "MaxDD": max_dd(rets)
    })

train_mask = z.index <= pd.Timestamp(TRAIN_END)
met_train = pd.DataFrame({
    "Buy&Hold": metrics_block(bh[train_mask]),
    "TAR_Overlay": metrics_block(strat[train_mask])
}).T
met_full = pd.DataFrame({
    "Buy&Hold": metrics_block(bh),
    "TAR_Overlay": metrics_block(strat)
}).T

print("\nTraining metrics:")
with pd.option_context('display.float_format', '{:0.6f}'.format):
    print(met_train)
print("\nFull-period metrics:")
with pd.option_context('display.float_format', '{:0.6f}'.format):
    print(met_full)

# =========================
# Plots
# =========================
eq_bh = (1+bh).cumprod()
eq_st = (1+strat).cumprod()

fig, axes = plt.subplots(3, 1, figsize=(12, 8), sharex=True)
axes[0].plot(df.index, df["r_log"], lw=0.8)
axes[0].set_ylabel("r_t (log)"); axes[0].grid(True, alpha=0.3)

axes[1].plot(z.index, z.values, lw=0.8, label=f"z (avg {best['kavg']})")
axes[1].axhline(gamma, linestyle="--", label="gamma")
axes[1].fill_between(z.index, z*0+gamma, z.values,
                     where=(regime.values==1), alpha=0.15, step="pre", label="Bull regime")
axes[1].set_ylabel("Threshold var"); axes[1].legend(); axes[1].grid(True, alpha=0.3)

axes[2].plot(eq_bh.index, eq_bh.values, lw=1.0, label="Buy & Hold")
axes[2].plot(eq_st.index, eq_st.values, lw=1.0, label="TAR Overlay")
axes[2].legend(); axes[2].grid(True, alpha=0.3)
plt.tight_layout(); plt.show()

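The confirmation rule behind `apply_hysteresis` is easier to see on a toy label sequence. The sketch below is a list-based restatement of the same idea (the name `confirm_switches` and the sample labels are made up for illustration): the regime changes only after k consecutive observations that differ from the current regime.

```python
def confirm_switches(raw, k):
    """Keep the current regime until k consecutive raw labels differ from it,
    then switch to the latest label and reset the counter."""
    cur, count, out = raw[0], 0, [raw[0]]
    for label in raw[1:]:
        if label != cur:
            count += 1
            if count >= k:
                cur, count = label, 0
        else:
            count = 0
        out.append(cur)
    return out

raw = [1, 1, 0, 1, 0, 0, 0, 1]
print(confirm_switches(raw, k=1))  # [1, 1, 0, 1, 0, 0, 0, 1]
print(confirm_switches(raw, k=2))  # [1, 1, 1, 1, 1, 0, 0, 0]
```

With k=1 every raw flip is accepted, while k=2 ignores the one-day dips and switches to the bear label only on the second consecutive bear day, which reduces turnover and therefore transaction costs.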
In our backtest:
Full-period metrics:
             TotalReturn    AnnVol    Sharpe      MaxDD
Buy&Hold        2.162186  0.181087  0.692332  -0.339250
TAR_Overlay     3.282037  0.179230  0.858016  -0.202298
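This matters: the overlay did not just add return (about 328% versus 216% in total), it also cut the maximum drawdown from about 34% to about 20% at nearly the same annualized volatility (17.9% versus 18.1%), lifting the Sharpe ratio from 0.69 to 0.86.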
This TAR overlay can be:
We can further enhance the method:
Threshold Autoregressive models provide a powerful tool for detecting nonlinear dynamics in financial time series. By setting thresholds on lagged returns, we can detect bull and bear regimes and design trading strategies that adapt accordingly. Unlike static models, TAR captures structural changes in market behavior and offers a practical edge in both regime detection and decision-making.