
Threshold Autoregressive Models for Bull/Bear Regime Detection via Return Thresholds

Financial time series often exhibit nonlinear behavior, where the dynamics change depending on market conditions. One common example is the asymmetric response of returns during bull versus bear markets. In bullish regimes, returns may follow a smoother and more persistent pattern, while in bearish regimes, they may be more volatile and mean-reverting. Traditional linear autoregressive (AR) models cannot capture such regime-dependent dynamics effectively.

The Threshold Autoregressive (TAR) model offers a flexible framework for capturing this behavior. It partitions the state space of a threshold variable (often past returns) into distinct regimes and fits a separate AR model in each regime, so the dynamics can differ depending on whether the threshold variable is above or below a chosen level. In this article we use only the regime-detection part of TAR: we threshold lagged returns to label bull and bear regimes, without estimating the AR coefficients, and use those labels to switch exposure on and off. Volatility targeting is layered on top to keep risk more stable.

TAR Model Formulation

A self-exciting two-regime TAR model can be written as:

\[ x_t = \begin{cases} \phi_0^{(1)} + \sum_{i=1}^p \phi_i^{(1)} x_{t-i} + a_t^{(1)}, & \text{if } z_{t-d} \le \gamma, \\ \phi_0^{(2)} + \sum_{i=1}^p \phi_i^{(2)} x_{t-i} + a_t^{(2)}, & \text{if } z_{t-d} > \gamma, \end{cases} \]

where:

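\(\phi_0^{(j)}\) and \(\phi_i^{(j)}\) are the intercept and AR coefficients of regime \(j\), and \(p\) is the AR order,
\(z_{t-d}\) is the threshold variable (often \(x_{t-d}\), the lagged return),
\(\gamma\) is the threshold value,
\(d\) is the delay parameter,
\(a_t^{(j)}\) are white noise errors for regime \(j\).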

The model is self-exciting because the threshold variable is taken from the series itself. For more than two regimes, we extend the partition of the state space with additional thresholds.
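To make the formulation concrete, here is a minimal sketch that simulates a two-regime self-exciting TAR with \(p = 1\) and \(d = 1\), then recovers the coefficients of each regime by ordinary least squares. The function names `simulate_setar` and `fit_setar`, the coefficient values, and the assumption that \(\gamma\) is known are illustrative choices and not part of the strategy code in this article.

```python
import numpy as np

def simulate_setar(n, gamma=0.0, phi_low=(0.0, -0.4), phi_high=(0.0, 0.5),
                   sigma=1.0, seed=0):
    """Simulate a two-regime SETAR(1) with delay d=1 and z_{t-1} = x_{t-1}.

    phi_low / phi_high are (intercept, AR(1) coefficient) pairs (assumed values).
    """
    rng = np.random.default_rng(seed)
    x = np.zeros(n)
    for t in range(1, n):
        # pick the regime from the lagged value of the series itself
        c, phi = phi_low if x[t - 1] <= gamma else phi_high
        x[t] = c + phi * x[t - 1] + sigma * rng.standard_normal()
    return x

def fit_setar(x, gamma):
    """Fit a separate AR(1) by OLS in each regime, treating gamma as known."""
    y, lag = x[1:], x[:-1]
    out = {}
    for name, mask in (("low", lag <= gamma), ("high", lag > gamma)):
        X = np.column_stack([np.ones(mask.sum()), lag[mask]])
        coef, *_ = np.linalg.lstsq(X, y[mask], rcond=None)
        out[name] = coef  # [intercept, AR(1) coefficient]
    return out

x = simulate_setar(20000)
est = fit_setar(x, gamma=0.0)
print({k: np.round(v, 3) for k, v in est.items()})
```

In practice \(\gamma\) is unknown and is usually chosen by searching over candidate values, which is what the quantile grid in the implementation below does for the trading rule.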

Why TAR for Bull/Bear Regimes?

In equity markets, returns often behave differently depending on whether recent returns are above or below certain levels:

Bull regime: Higher persistence, lower volatility, momentum-driven.
Bear regime: Stronger mean reversion, higher volatility, faster corrections.

By setting a return threshold (for example a fixed level such as 0.5% daily, or a quantile of past returns as in the implementation below), we can treat each regime separately and tailor trading decisions accordingly.
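One way to check whether a given threshold actually separates different behavior is to compare simple statistics of next-day returns in each regime. The sketch below does this for a hypothetical helper `regime_stats`. The data is synthetic i.i.d. noise, so no asymmetry should be expected in its output; with real returns, differences in `vol` and `lag1_corr` between the two rows would be the quantities to look at.

```python
import numpy as np
import pandas as pd

def regime_stats(returns, gamma):
    """Summarize today's return conditional on yesterday's return vs gamma."""
    df = pd.DataFrame({"r": returns, "z": returns.shift(1)}).dropna()
    df["regime"] = np.where(df["z"] > gamma, "bull", "bear")
    rows = {}
    for name, g in df.groupby("regime"):
        rows[name] = {
            "n_days": len(g),
            "mean": g["r"].mean(),
            "vol": g["r"].std(),
            "lag1_corr": g["r"].corr(g["z"]),  # persistence (+) vs mean reversion (-)
        }
    return pd.DataFrame(rows).T

# synthetic daily returns, for illustration only
rng = np.random.default_rng(1)
rets = pd.Series(rng.normal(0.0004, 0.01, 1500))
stats = regime_stats(rets, gamma=0.0)
print(stats)
```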

Position Sizing

We scale positions using volatility targeting:

\[ w_t = \min\left( \text{max\_lever}, \frac{\sigma_{\text{target,daily}}}{\hat{\sigma}_t} \right) \cdot \mathbf{1}_{\text{bull}} \]

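where \(\hat{\sigma}_t\) is the EWMA estimate of daily volatility, \(\sigma_{\text{target,daily}}\) is the annualized volatility target divided by \(\sqrt{252}\), and \(\mathbf{1}_{\text{bull}}\) equals 1 in the bull regime and 0 otherwise.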

This ensures:

Larger size in calm conditions.
Smaller size when volatility spikes.
Optional cap on leverage.
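A small worked example of the sizing rule, assuming a 15% annualized target and a 2x leverage cap (both illustrative values; `vol_target_weight` is a hypothetical helper, not a function from the implementation below):

```python
import numpy as np

def vol_target_weight(sigma_hat_daily, ann_target=0.15, max_lever=2.0, bull=True):
    """w = min(max_lever, sigma_target_daily / sigma_hat) * 1_bull."""
    sigma_target_daily = ann_target / np.sqrt(252.0)
    w = min(max_lever, sigma_target_daily / sigma_hat_daily)
    return w if bull else 0.0

calm = vol_target_weight(0.006)                # 0.6% daily vol -> about 1.57
stressed = vol_target_weight(0.020)            # 2.0% daily vol -> about 0.47
capped = vol_target_weight(0.003)              # ratio would be about 3.15, capped at 2.0
bear = vol_target_weight(0.006, bull=False)    # bear regime -> 0.0
print(round(calm, 2), round(stressed, 2), capped, bear)
```

The daily target here is 0.15 / sqrt(252), roughly 0.94%, so the weight is above 1 whenever estimated daily volatility is below that level.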

Python Implementation

import warnings
warnings.filterwarnings("ignore")

import numpy as np
import pandas as pd
import yfinance as yf
import matplotlib.pyplot as plt

# =========================
# Config
# =========================
SYMBOL     = "^GSPC"           # e.g. SPY, ^GSPC, BTC-USD
START      = "2015-01-01"
TRAIN_END  = "2022-12-31"      # optimize on <= TRAIN_END, test after

# Transaction costs and risk controls
TC_BPS_PER_TURNOVER = 20.0      # per-unit turnover (bps)
USE_VOL_TARGETING   = True
ANN_VOL_TARGET      = 0.25     # 25% annualized volatility target
MAX_LEVER           = 2.

# Optimization grids
LAG_AVG_WINDOWS = [1, 3, 5]           # average of past k daily returns for threshold variable
GAMMA_QUANTS    = [0.35, 0.45, 0.55, 0.65]   # threshold = quantile(z, q) on train
HYSTERESIS_K    = [1, 3, 5]           # days required to confirm a regime switch

np.set_printoptions(suppress=True, precision=6)

# =========================
# Data
# =========================
df = yf.download(SYMBOL, start=START, auto_adjust=True, progress=False)
if isinstance(df.columns, pd.MultiIndex):
    df = df.droplevel(1, axis=1)
if df.empty:
    raise ValueError("No data downloaded.")

px = df["Close"].rename("px")
r_log = np.log(px).diff()
r_simple = px.pct_change().rename("ret")
df = pd.concat([px, r_log.rename("r_log"), r_simple], axis=1).dropna()

# =========================
# Helpers
# =========================
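def ann_vol(rets):
    # annualized volatility from daily returns (252 trading days)
    return rets.std() * np.sqrt(252)

def sharpe(rets, rf=0.0):
    # annualized Sharpe ratio; rf is an annual risk-free rate
    ex = rets - rf/252.0
    s = ex.std()
    return (ex.mean()/s)*np.sqrt(252) if s > 0 else np.nan

def max_dd(rets):
    # maximum drawdown of the compounded equity curve (a negative number)
    eq = (1+rets).cumprod()
    return (eq/eq.cummax() - 1).min()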

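def ewma_vol(r, lam=0.94):
    # EWMA volatility with decay lam; v[t] includes r[t], so callers must lag it
    v = pd.Series(index=r.index, dtype=float)
    # NOTE: the mean and the seed variance use the whole series passed in,
    # which is a mild form of look-ahead
    m = r.mean()
    prev = r.var()
    for t in r.index:
        x = r.loc[t] - m
        prev = lam*prev + (1-lam)*(x*x)
        v.loc[t] = np.sqrt(prev)
    return v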

def apply_hysteresis(raw_regime, k):
    # require k consecutive different labels to switch regime
    out = raw_regime.copy()
    cur = raw_regime.iloc[0]
    count = 0
    for i in range(1, len(raw_regime)):
        if raw_regime.iloc[i] != cur:
            count += 1
            if count >= k:
                cur = raw_regime.iloc[i]
                count = 0
        else:
            count = 0
        out.iloc[i] = cur
    return out

def simulate_strategy(px, r, z, q_gamma, k_hyst, use_vol=True, ann_target=0.15, max_lever=1.25):
    # threshold from training distribution of z
    train_mask = z.index <= pd.Timestamp(TRAIN_END)
    gamma = z[train_mask].quantile(q_gamma)

    # regimes from lagged/averaged returns (TAR via return threshold)
    raw_regime = (z > gamma).astype(int)   # 1 = bull, 0 = bear
    regime = apply_hysteresis(raw_regime, k_hyst)

    # base long/cash signal: bull → 1, bear → 0
    sig = regime.astype(float)

    # vol targeting (optional): scale weight by daily target / EWMA vol
    if use_vol:
        # use EWMA vol of simple returns, shift to avoid look-ahead
        vol = ewma_vol(r).shift(1)
        sigma_target_daily = ann_target / np.sqrt(252.0)
        w = (sigma_target_daily / vol).clip(0, max_lever)
        w = w.fillna(0.0)
        w = w * sig
    else:
        w = sig

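    # next-day execution (no look-ahead); z and vol are already lagged one day,
    # so this extra shift is conservative: weights trade with a two-day lag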
    w_tr = w.shift(1).fillna(0.0)

    # transaction costs on turnover
    turnover = w_tr.diff().abs().fillna(0.0)
    cost = turnover * (TC_BPS_PER_TURNOVER / 1e4)

    strat_gross = w_tr * r
    strat = strat_gross - cost
    return strat, regime, gamma

def score_train(strat, z):
    mask = z.index <= pd.Timestamp(TRAIN_END)
    return sharpe(strat[mask])

# =========================
# Build threshold variable z_{t-1} (averaged returns then lagged)
# =========================
best = {"score": -1e9}
for kavg in LAG_AVG_WINDOWS:
    if kavg == 1:
        z0 = df["r_log"]
    else:
        z0 = df["r_log"].rolling(kavg).mean()
    z = z0.shift(1).dropna()
    # align returns to z
    r = df["ret"].reindex(z.index)

    for q in GAMMA_QUANTS:
        for k_h in HYSTERESIS_K:
            strat, regime, gamma = simulate_strategy(df["px"].reindex(z.index), r, z, q, k_h,
                                                     use_vol=USE_VOL_TARGETING,
                                                     ann_target=ANN_VOL_TARGET,
                                                     max_lever=MAX_LEVER)
            sc = score_train(strat, z)
            if np.isfinite(sc) and sc > best["score"]:
                best = dict(score=sc, kavg=kavg, q=q, k_h=k_h, gamma=float(gamma))
                best["strat"] = strat
                best["z"] = z
                best["regime"] = regime

print("Best (train) Sharpe:", round(best["score"], 3),
      "| kavg:", best["kavg"], "q:", best["q"], "hysteresis:", best["k_h"], "gamma:", round(best["gamma"], 6))

# =========================
# Final run on the full sample (train + test) with best params
# =========================
# rebuild z/r with chosen kavg on full sample
if best["kavg"] == 1:
    z0 = df["r_log"]
else:
    z0 = df["r_log"].rolling(best["kavg"]).mean()
z = z0.shift(1).dropna()
r = df["ret"].reindex(z.index)

strat, regime, gamma = simulate_strategy(df["px"].reindex(z.index), r, z,
                                         best["q"], best["k_h"],
                                         use_vol=USE_VOL_TARGETING,
                                         ann_target=ANN_VOL_TARGET,
                                         max_lever=MAX_LEVER)

bh = r.copy()

# Metrics
def metrics_block(rets):
    return pd.Series({
        "TotalReturn": (1+rets).prod()-1,
        "AnnVol": ann_vol(rets),
        "Sharpe": sharpe(rets),
        "MaxDD": max_dd(rets)
    })

train_mask = z.index <= pd.Timestamp(TRAIN_END)
met_train = pd.DataFrame({
    "Buy&Hold": metrics_block(bh[train_mask]),
    "TAR_Overlay": metrics_block(strat[train_mask])
}).T

met_full = pd.DataFrame({
    "Buy&Hold": metrics_block(bh),
    "TAR_Overlay": metrics_block(strat)
}).T

print("\nTraining metrics:")
with pd.option_context('display.float_format', '{:0.6f}'.format):
    print(met_train)
print("\nFull-period metrics:")
with pd.option_context('display.float_format', '{:0.6f}'.format):
    print(met_full)

# =========================
# Plots
# =========================
eq_bh = (1+bh).cumprod()
eq_st = (1+strat).cumprod()

fig, axes = plt.subplots(3, 1, figsize=(12, 8), sharex=True)
axes[0].plot(df.index, df["r_log"], lw=0.8)
axes[0].set_ylabel("r_t (log)"); axes[0].grid(True, alpha=0.3)

axes[1].plot(z.index, z.values, lw=0.8, label=f"z (avg {best['kavg']})")
axes[1].axhline(gamma, linestyle="--", label="gamma")
axes[1].fill_between(z.index, z*0+gamma, z.values,
                     where=(regime.values==1), alpha=0.15, step="pre", label="Bull regime")
axes[1].set_ylabel("Threshold var"); axes[1].legend(); axes[1].grid(True, alpha=0.3)

axes[2].plot(eq_bh.index, eq_bh.values, lw=1.0, label="Buy & Hold")
axes[2].plot(eq_st.index, eq_st.values, lw=1.0, label="TAR Overlay")
axes[2].legend(); axes[2].grid(True, alpha=0.3)
plt.tight_layout(); plt.show()

Results

In our backtest:

Full-period metrics:
             TotalReturn   AnnVol   Sharpe     MaxDD
Buy&Hold        2.162186 0.181087 0.692332 -0.339250
TAR_Overlay     3.282037 0.179230 0.858016 -0.202298

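Over the full period, which includes the training window used to pick the parameters, the overlay raised total return from 216% to 328% and the Sharpe ratio from 0.69 to 0.86 at nearly the same annualized volatility (17.9% vs. 18.1%). The maximum drawdown shrank from -33.9% to -20.2%. The strategy did not just add return; it also smoothed the ride.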
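Because the full-period figures mix in-sample and out-of-sample days, it can help to report the two windows separately. The following is a minimal sketch of such a split, assuming a daily return series indexed by date. The helper names `perf` and `split_report` are hypothetical, and the demo series is random noise rather than the strategy's returns.

```python
import numpy as np
import pandas as pd

def perf(rets):
    """Total return, annualized vol, Sharpe (rf = 0) and max drawdown."""
    eq = (1 + rets).cumprod()
    s = rets.std()
    return pd.Series({
        "TotalReturn": eq.iloc[-1] - 1,
        "AnnVol": s * np.sqrt(252),
        "Sharpe": rets.mean() / s * np.sqrt(252) if s > 0 else np.nan,
        "MaxDD": (eq / eq.cummax() - 1).min(),
    })

def split_report(rets, train_end):
    """Metrics on dates <= train_end (train) and > train_end (test)."""
    cut = pd.Timestamp(train_end)
    return pd.DataFrame({
        "train": perf(rets[rets.index <= cut]),
        "test": perf(rets[rets.index > cut]),
    }).T

# synthetic daily returns on business days, for illustration only
idx = pd.bdate_range("2015-01-01", "2024-12-31")
rng = np.random.default_rng(7)
demo = pd.Series(rng.normal(0.0004, 0.01, len(idx)), index=idx)
report = split_report(demo, "2022-12-31")
print(report)
```

Applied to the strategy, the same split would be called with the strategy's return series and the `TRAIN_END` date from the config.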

Practical Applications

This TAR overlay can be:

Applied to index futures (ES, NQ) for tactical allocation.
Combined with bond overlays to create an equity-bond rotation.
Used on crypto where volatility regimes are extreme and exploitable.
Stacked with trend filters (e.g., 200-day SMA) to add extra defense.

We can further enhance the method:

Run walk-forward optimization to avoid parameter staleness.
Incorporate multi-asset TAR overlay for portfolio-level regime control.
Add transaction cost models specific to the market traded.
Blend TAR with machine learning classifiers for regime prediction.
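As an illustration of the walk-forward idea from the list above, the sketch below generates rolling train/test windows over a date index. Parameters would be re-optimized on each training window and applied only to the test window that follows. The function name `walk_forward_windows` and the 5-year/1-year window lengths are assumptions chosen for the example.

```python
import pandas as pd

def walk_forward_windows(index, train_years=5, test_years=1):
    """Yield (train_idx, test_idx) pairs that roll forward through time."""
    start = index[0]
    while True:
        train_end = start + pd.DateOffset(years=train_years)
        test_end = train_end + pd.DateOffset(years=test_years)
        train = index[(index >= start) & (index < train_end)]
        test = index[(index >= train_end) & (index < test_end)]
        if len(test) == 0:
            break  # no data left to test on
        yield train, test
        # slide both windows forward by one test period
        start = start + pd.DateOffset(years=test_years)

idx = pd.bdate_range("2015-01-01", "2024-12-31")
folds = list(walk_forward_windows(idx))
for train, test in folds:
    print(train[0].date(), "->", train[-1].date(), "| test:",
          test[0].date(), "->", test[-1].date())
```

Each test window starts strictly after its training window ends, so concatenating the per-fold test returns gives a track record that never uses parameters fitted on future data.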

Conclusion

Threshold Autoregressive models provide a useful tool for describing nonlinear dynamics in financial time series. By setting thresholds on lagged returns, we can label bull and bear regimes and design trading strategies that adapt accordingly. Unlike a single linear AR model, a TAR specification lets the dynamics differ across regimes. In the backtest above, the simplified threshold rule combined with volatility targeting improved both return and drawdown relative to buy and hold.