Model Evaluation & Backtesting Results

Evaluated on AAPL, GOOGL, TSLA test sets · January 2015 – January 2025

1. Predictive Performance

Model                                     RMSE ↓   MAE ↓   R² ↑    vs Baseline
Standard LSTM (Price Only)                3.452    2.120   0.824   baseline
XGBoost Regressor                         3.105    1.984   0.851   -10.1% RMSE
Hybrid LSTM + VADER                       2.850    1.755   0.887   -17.4% RMSE
Proposed Multi-modal (FinBERT + TLSTM)    2.341    1.428   0.942   -32.2% RMSE 🏆

* Evaluated on AAPL test set · January 2015 – January 2025 · Lower RMSE = better · Our model: 32.2% RMSE improvement over LSTM baseline
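The three error metrics and the "vs Baseline" column follow standard definitions, so they can be recomputed from any pair of actual and predicted price series. The sketch below is a minimal, dependency-free illustration; the function names are hypothetical and are not taken from the project's code.

```python
import math

def rmse(actual, predicted):
    """Root mean squared error (lower is better)."""
    n = len(actual)
    return math.sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n)

def mae(actual, predicted):
    """Mean absolute error (lower is better)."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)

def r2(actual, predicted):
    """Coefficient of determination (higher is better)."""
    mean_a = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_a) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot

def pct_change_vs_baseline(model_rmse, baseline_rmse):
    """Relative RMSE change in percent; negative means lower error than baseline."""
    return 100.0 * (model_rmse - baseline_rmse) / baseline_rmse

# RMSE values copied from the table above.
for name, value in [("XGBoost", 3.105), ("Hybrid", 2.850), ("Proposed", 2.341)]:
    print(name, round(pct_change_vs_baseline(value, 3.452), 1))
```

With the rounded RMSE values from the table, this reproduces the -10.1%, -17.4%, and -32.2% entries in the last column.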

2. Backtest: Predicted vs Actual

3. Trading Strategy Performance (2015–2025)

PPO Agent Total Return: 315.6% (vs 210.4% Buy-and-Hold)

Sharpe Ratio: 2.45 (risk-adjusted return)

Max Drawdown: -12.4% (largest peak-to-trough decline)
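For readers who want to check how figures like these are usually derived, the following sketch computes an annualized Sharpe ratio from daily returns and a maximum drawdown from an equity curve. The source does not state its conventions, so the 252-day annualization factor, the zero risk-free rate, and the function names are all assumptions made for illustration.

```python
import math

TRADING_DAYS = 252  # assumption: daily returns annualized over 252 trading days

def sharpe_ratio(daily_returns, risk_free_daily=0.0):
    """Annualized Sharpe ratio: mean excess return over its sample std deviation."""
    excess = [r - risk_free_daily for r in daily_returns]
    mean = sum(excess) / len(excess)
    var = sum((x - mean) ** 2 for x in excess) / (len(excess) - 1)
    return mean / math.sqrt(var) * math.sqrt(TRADING_DAYS)

def max_drawdown(equity_curve):
    """Largest peak-to-trough decline as a negative fraction (e.g. -0.25 = -25%)."""
    peak = equity_curve[0]
    worst = 0.0
    for value in equity_curve:
        peak = max(peak, value)                 # highest value seen so far
        worst = min(worst, value / peak - 1.0)  # decline relative to that peak
    return worst

# Toy equity curve: the worst decline is from the peak of 120 down to 90.
print(max_drawdown([100, 120, 90, 130, 117]))
```

`sharpe_ratio` needs at least two returns with non-zero variance; a constant return series would divide by zero.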

Strategy Comparison — Total Return (%)

Strategy               Return    Sharpe   Drawdown   Win Rate
Buy-and-Hold           210.4%    1.2      -33.9%     -
Standard LSTM Trader   245.8%    1.55     -28.2%     54.2%
Our PPO Agent          315.6%    2.45     -12.4%     63.7%
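Total return and win rate are simple ratios, sketched below. The source does not define "win rate"; this example assumes it is the share of closed trades with positive profit, and both function names are hypothetical.

```python
def total_return_pct(start_value, end_value):
    """Total return in percent over the whole backtest period."""
    return 100.0 * (end_value / start_value - 1.0)

def win_rate_pct(trade_pnls):
    """Percentage of trades with strictly positive profit (assumed definition)."""
    wins = sum(1 for pnl in trade_pnls if pnl > 0)
    return 100.0 * wins / len(trade_pnls)

# Illustrative numbers only: a portfolio growing from 100 to 415.6 units
# corresponds to a total return of about 315.6%.
print(round(total_return_pct(100.0, 415.6), 1))
print(win_rate_pct([12.0, -5.0, 3.5, -1.0]))
```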

4. Ablation Study

💡 Key Finding: The FinBERT sentiment module contributes a +45.8% Sharpe improvement (2.45 vs 1.68), and transductive LSTM weighting reduces RMSE by 17.8% (2.341 vs 2.850).

Sharpe Ratio Comparison

RMSE Comparison (lower = better)

5. COVID-19 Crash Case Study (Feb–Apr 2020)

📌 The system correctly switched to Sell/Hold on Feb 24, 2020, as FinBERT detected negative news sentiment, protecting the portfolio from the subsequent 36% market crash.

Portfolio value indexed to 100 at Feb 3, 2020
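Indexing to 100 rescales each portfolio so that its value on the start date equals 100, which makes strategies with different starting capital directly comparable on one chart. A minimal sketch, with a hypothetical function name and made-up input values:

```python
def index_to_base(values, base=100.0):
    """Rescale a value series so that its first observation equals `base`."""
    first = values[0]
    return [base * v / first for v in values]

# Made-up portfolio values; the first entry plays the role of the Feb 3, 2020 value.
print(index_to_base([50.0, 55.0, 45.0]))
```

An indexed value of 90 then reads directly as a 10% loss since the start date.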