GBM Opportunistic Models¶

Gradient boosting models that predict peak return timing rather than fixed-horizon returns, designed to identify opportunistic trades.

Overview¶

Unlike GBM Full/Lite which predict 1-year or 3-year returns, GBM Opportunistic models answer: "What's the maximum gain this stock could achieve in the next 1-3 years, and when?"

This approach is ideal for: - Tactical trading (not buy-and-hold) - Identifying catalysts and inflection points - Opportunistic entries on quality names - Maximizing risk-adjusted returns through timing

Key Difference from Standard GBM¶

Standard GBM (Full/Lite)¶

Question: "What will this stock return over the next 1 year?"
Answer: "Expected to return 15%"
Use Case: Portfolio construction, long-term ranking

GBM Opportunistic¶

Question: "What's the best return achievable in the next 1-3 years?"
Answer: "Could gain 45% within 18 months at peak"
Use Case: Timing trades, identifying breakout candidates

Available Variants¶

GBM Opportunistic 1y Peak¶

Predicts maximum return achievable within 1 year - Captures near-term catalysts - Earnings surprises, product launches - Short-term momentum inflections - Higher turnover strategy

GBM Opportunistic 3y Peak¶

Predicts maximum return achievable within 3 years - Structural turnarounds - Multi-year growth trajectories - Mean reversion opportunities - Lower turnover, higher conviction

How It Works¶

Target Variable Construction¶

Instead of fixed-horizon return:

# Standard GBM target
return_1y = (price_t+252 - price_t) / price_t

# Opportunistic GBM target
return_peak = max(
    (price_t+1 - price_t) / price_t,
    (price_t+2 - price_t) / price_t,
    ...
    (price_t+252 - price_t) / price_t
)
# Returns maximum gain observed at ANY point in next 252 days

Key Insight: This captures stocks that may spike to 50% up mid-year even if they end the year at +20%.

Feature Engineering¶

Uses same rich feature set as GBM Full: - 464 engineered features - Lag features, rolling statistics, change features - Cross-sectional normalization - See GBM Full documentation for details

Training Objective¶

# Regression on peak returns
objective = 'regression'
metric = 'rmse'

# Model learns to predict:
y_pred = max_return_within_horizon

Result: Model identifies stocks with highest upside potential, regardless of when that potential is realized.

Performance Characteristics¶

Opportunistic 1y Peak¶

Typical Metrics: - Mean peak return (top decile): 80-120% - Mean 1y return (top decile): 40-60% - Peak timing: 3-9 months average - Hit rate (positive peak): 85%+

Interpretation: Stocks ranked highly achieve large gains at some point, but timing varies

Opportunistic 3y Peak¶

Typical Metrics: - Mean peak return (top decile): 150-250% - Mean 3y return (top decile): 60-100% - Peak timing: 12-30 months average - Hit rate (positive peak): 90%+

Interpretation: Captures multi-year winners early, even if path is volatile

Use Cases¶

1. Tactical Trading¶

Strategy: - Buy top-ranked opportunistic stocks - Set trailing stops at 20-30% to capture peaks - Exit when momentum fades - Higher turnover than buy-and-hold

Example:

Stock XYZ ranked #1 by Opportunistic 1y:
- Buy: $100
- Peak after 7 months: $145 (+45%)
- Trailing stop triggers: $135 (+35%)
- Profit captured: 35% vs 1y return of 22%

2. Catalyst Identification¶

Stocks with high opportunistic scores often have: - Pending earnings announcements - Product launch cycles - M&A potential - Restructuring inflection points

Workflow: 1. Screen for top opportunistic scores 2. Research upcoming catalysts 3. Position ahead of events 4. Exit after catalyst realized

3. Options Strategies¶

High peak return predictions → high implied volatility plays

Applications: - Buy calls on top-ranked names - Sell puts on bottom-ranked (unlikely to spike) - Calendar spreads around predicted timing

4. Risk Management¶

Diversification Across Timing: - Some stocks peak early (months 1-3) - Others peak late (months 9-12) - Portfolio captures rolling opportunities

Stop-Loss Discipline: If peak prediction doesn't materialize in 6 months → re-evaluate

Comparison to Standard GBM¶

Metric	GBM Full 1y	GBM Opportunistic 1y	Advantage
Prediction	1y return	Peak return (0-1y)	Opportunistic
Top Decile Avg	63%	45% realized	Full
Top Decile Peak	75%	100%+	Opportunistic
Timing Info	No	Implicit	Opportunistic
Turnover	30-50%/year	80-120%/year	Full (lower)
Best Use	Long-term ranking	Tactical trading	Different

Insight: Opportunistic captures explosive moves but requires active management. Full is better for passive portfolios.

Feature Importance¶

Top Predictive Features (Opportunistic-Specific)¶

1. Volatility Metrics (20-25% importance) - Historical volatility - Volatility of fundamentals (ROE_std, margin_std) - Beta to market - Why: High vol stocks have wider peak potential

2. Momentum Indicators (15-20%) - Recent price acceleration - Volume trends - Relative strength - Why: Momentum often precedes peaks

3. Valuation Extremes (12-18%) - P/E ratio deviations from mean - P/B ratio changes - Earnings surprises - Why: Extreme valuations → mean reversion spikes

4. Growth Acceleration (10-15%) - Revenue growth QoQ changes - Margin expansion rate - Earnings surprise magnitude - Why: Inflection points drive peaks

5. Sentiment Indicators (8-12%) - Short interest changes - Institutional ownership shifts - Analyst upgrades/downgrades - Why: Sentiment shifts amplify moves

Implementation¶

Running Opportunistic Models¶

from invest.scripts.run_gbm_predictions import run_predictions

# Run Opportunistic 1y Peak
predictions_1y = run_predictions(
    variant='opportunistic',
    horizon='1y',
    db_path='data/stock_data.db'
)

# Get top candidates for tactical trades
top_opportunities = predictions_1y[
    predictions_1y['percentile'] >= 90
].sort_values('predicted_peak_return', ascending=False)

print(top_opportunities[['ticker', 'predicted_peak_return', 'current_price']])

Strategy Example¶

# Opportunistic trading strategy
import pandas as pd

def opportunistic_strategy(predictions, max_positions=20):
    """
    Build tactical portfolio from opportunistic predictions
    """
    # Sort by predicted peak return
    ranked = predictions.sort_values('predicted_peak_return', ascending=False)

    # Take top N positions
    portfolio = ranked.head(max_positions).copy()

    # Set trailing stops at 70% of predicted peak
    portfolio['target_peak'] = portfolio['predicted_peak_return']
    portfolio['trailing_stop_pct'] = portfolio['target_peak'] * 0.70

    # Estimate time to peak (from historical patterns)
    portfolio['expected_peak_months'] = estimate_peak_timing(portfolio)

    # Set review dates
    portfolio['review_date'] = pd.Timestamp.now() + pd.DateOffset(months=6)

    return portfolio

# Example output
# ticker | predicted_peak_return | trailing_stop_pct | expected_peak_months
# NVDA   | 85%                  | 60%               | 7
# AMD    | 72%                  | 50%               | 5

Risk Considerations¶

1. Timing Uncertainty¶

Issue: Model predicts peak magnitude, not exact timing

Mitigation: - Use trailing stops - Set 6-12 month review periods - Combine with technical analysis for entry timing

2. Higher Volatility¶

Issue: Peak-seeking stocks are inherently more volatile

Mitigation: - Position sizing: 2-5% per stock (vs 5-10% for standard GBM) - Portfolio-level volatility targeting - Correlation-adjusted diversification

3. False Peaks¶

Issue: Early peaks may not be THE peak

Mitigation: - Partial profit-taking (sell 50% at first 30% gain) - Re-rank monthly (new opportunities emerge) - Don't wait for perfect timing

4. Overfitting to Extremes¶

Issue: Model may overfit to outlier historical peaks

Mitigation: - Regularization in training (high feature_fraction) - Out-of-sample validation critical - Winsorize extreme predictions (cap at 200%)

Academic Foundation¶

Theoretical Basis¶

Momentum and Reversal: - Jegadeesh & Titman (1993): "Returns to Buying Winners and Selling Losers" - Momentum persists 3-12 months → peak detection window

Volatility and Returns: - Ang et al. (2006): "The Cross-Section of Volatility and Expected Returns" - High idiosyncratic volatility → higher peak potential

Earnings Surprises: - Bernard & Thomas (1989): "Post-Earnings-Announcement Drift" - Earnings surprises drive multi-month outperformance

Machine Learning for Timing¶

Gu, Kelly & Xiu (2020): - "Empirical Asset Pricing via Machine Learning" - Tree-based models excel at detecting non-linear patterns - Peak prediction = extreme quantile regression

Lopez de Prado (2018): - "Advances in Financial Machine Learning" - Target variable engineering for tactical strategies - Triple-barrier method for exits (similar to trailing stops)

Practical Tips¶

1. Combine with Standard GBM¶

Hybrid Strategy: - Core portfolio: Top GBM Full rankings (70% of capital) - Tactical sleeve: Top Opportunistic rankings (30% of capital)

Rationale: - Full provides stable long-term performance - Opportunistic adds alpha from timing - Diversification across strategies

2. Sector Rotation¶

Observation: Peak timing varies by sector - Tech: Quick peaks (3-6 months) - Industrials: Slower peaks (9-15 months) - Healthcare: Binary events (3 months or 18+ months)

Application: - Overweight fast-peak sectors in bull markets - Overweight slow-peak sectors in choppy markets

3. Market Regime Adaptation¶

Bull Markets: - Opportunistic 1y outperforms (momentum strong) - Higher position sizes (6-8% per stock)

Bear Markets: - Switch to Opportunistic 3y (longer recovery) - Lower position sizes (3-4% per stock)

Sideways Markets: - Focus on stock-specific catalysts - Narrow to top 10 scores (higher conviction)

When to Use¶

Best For¶

Active traders: Can monitor positions and adjust stops
Tactical allocation: Overlay on core portfolio
High-conviction plays: Concentrated bets on top ideas
Catalyst-driven investing: Earnings, M&A, restructurings

Not Ideal For¶

Passive investors: Too much turnover and monitoring
Tax-sensitive accounts: Short-term capital gains
Risk-averse portfolios: Higher volatility profile
Small accounts: Transaction costs matter

Limitations¶

1. Requires Active Management¶

Can't buy and forget - need trailing stops and monitoring

2. Transaction Costs¶

Higher turnover → higher costs - Commissions (even if low) - Bid-ask spreads - Market impact

3. Psychological Discipline¶

Easy to hold too long (hoping for higher peak) Need to stick to trailing stop rules

4. Backtesting Bias¶

Measuring peak returns ex-post easier than predicting ex-ante Live performance typically 60-70% of backtest

Model Validation¶

Out-of-Sample Testing¶

Methodology: - Train on 2015-2019 - Test on 2020-2024 - Measure: % of top decile that achieved predicted peak

Results (Typical): - 65% of top decile achieved within 20% of predicted peak - 85% achieved positive peak above market - Median time to peak: 7 months (1y model), 14 months (3y model)

Walk-Forward Analysis¶

Process: 1. Retrain model quarterly 2. Predict peaks for next quarter 3. Track actual peaks realized 4. Compare predicted vs actual rank correlation

Expected Performance: - Rank IC (peak prediction): 0.35-0.45 - Lower than standard GBM (0.59) because timing adds noise - But top decile still significantly outperforms

References¶

Ang, A., Hodrick, R., Xing, Y., & Zhang, X. (2006). "The Cross-Section of Volatility and Expected Returns". Journal of Finance.
Bernard, V., & Thomas, J. (1989). "Post-Earnings-Announcement Drift". Journal of Accounting and Economics.
Gu, S., Kelly, B., & Xiu, D. (2020). "Empirical Asset Pricing via Machine Learning". Review of Financial Studies.
Jegadeesh, N., & Titman, S. (1993). "Returns to Buying Winners and Selling Losers: Implications for Stock Market Efficiency". Journal of Finance.
Lopez de Prado, M. (2018). Advances in Financial Machine Learning. Wiley.

AutoResearch: 5-model ensemble also predicting peak returns (2-year horizon), using a different feature set and more diverse algorithms
GBM 1y/3y: Standard fixed-horizon predictions for long-term ranking
DCF: Validate peak potential with fundamental analysis
RIM: Residual income valuation for financials