Alpha Lab

Quantitative research experiments for the qshare library. This repository contains Jupyter notebooks and analysis scripts for exploring trading strategies and machine learning models.

Philosophy

  • Notebook-centric: Experiments are interactive notebooks, not rigid scripts
  • Minimal abstraction: Simple functions over complex class hierarchies
  • Self-contained: Each task directory is independent
  • Ad-hoc friendly: Easy to modify for exploration

Structure

alpha_lab/
├── common/              # Shared utilities (keep minimal!)
│   ├── __init__.py
│   ├── paths.py         # Path management
│   └── plotting.py      # Common plotting functions
│
├── cta_1d/             # CTA 1-day return prediction
│   ├── __init__.py     # Re-exports from src/
│   ├── config.yaml     # Task configuration
│   ├── src/            # Implementation modules
│   │   ├── __init__.py
│   │   ├── loader.py   # CTA1DLoader
│   │   ├── train.py    # Training functions
│   │   ├── backtest.py # Backtest functions
│   │   └── labels.py   # Label blending utilities
│   ├── 01_data_check.ipynb
│   ├── 02_label_analysis.ipynb
│   ├── 03_baseline_xgb.ipynb
│   └── 04_blend_comparison.ipynb
│
├── stock_15m/          # Stock 15-minute return prediction
│   ├── __init__.py     # Re-exports from src/
│   ├── config.yaml     # Task configuration
│   ├── src/            # Implementation modules
│   │   ├── __init__.py
│   │   ├── loader.py   # Stock15mLoader
│   │   └── train.py    # Training functions
│   ├── 01_data_exploration.ipynb
│   └── 02_baseline_model.ipynb
│
└── results/            # Output directory (gitignored)
    ├── cta_1d/
    └── stock_15m/

Setup

# Install dependencies
pip install -r requirements.txt

# Create environment file
cp .env.template .env
# Edit .env with your settings

Usage

Interactive (Notebooks)

Start Jupyter and run notebooks interactively:

jupyter notebook

Each task directory contains numbered notebooks that follow this general progression (a task may not have every stage yet):

  • 01_*.ipynb - Data loading and exploration
  • 02_*.ipynb - Analysis and baseline models
  • 03_*.ipynb - Advanced experiments
  • 04_*.ipynb - Comparisons and ablations

Command Line

Train models from config files:

# Module paths include src/ because train.py and backtest.py live there (see Structure)

# CTA 1D
python -m cta_1d.src.train --config cta_1d/config.yaml --output results/cta_1d/exp01

# Stock 15m
python -m stock_15m.src.train --config stock_15m/config.yaml --output results/stock_15m/exp01

# CTA Backtest
python -m cta_1d.src.backtest \
    --model results/cta_1d/exp01/model.json \
    --dt-range 2023-01-01 2023-12-31 \
    --output results/cta_1d/backtest_01
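
The backtest flags above could be parsed with argparse along these lines. This is a minimal sketch, not the repository's actual entry point: build_parser is a hypothetical name, and the only details taken from the command above are the flag names and the fact that --dt-range takes two dates.

```python
import argparse
from datetime import date


def build_parser() -> argparse.ArgumentParser:
    """Hypothetical parser mirroring the backtest flags shown above."""
    parser = argparse.ArgumentParser(description="Backtest CLI sketch")
    parser.add_argument("--model", required=True, help="Path to a saved model file")
    # nargs=2 accepts "--dt-range START END"; date.fromisoformat rejects
    # anything that is not a valid YYYY-MM-DD date.
    parser.add_argument(
        "--dt-range",
        nargs=2,
        type=date.fromisoformat,
        metavar=("START", "END"),
        required=True,
    )
    parser.add_argument("--output", required=True, help="Directory for results")
    return parser


# Parse the same arguments as the command-line example.
args = build_parser().parse_args([
    "--model", "results/cta_1d/exp01/model.json",
    "--dt-range", "2023-01-01", "2023-12-31",
    "--output", "results/cta_1d/backtest_01",
])
start, end = args.dt_range
```

Parsing the dates up front means a malformed range fails before any data is loaded.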

Python API

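# Import from task root (re-exports from src/)
# Both tasks export train_model and TrainConfig, so alias them when importing both
from cta_1d import CTA1DLoader, train_model as train_cta, TrainConfig as CTATrainConfig
from stock_15m import Stock15mLoader, train_model as train_stock, TrainConfig as StockTrainConfig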
from common import create_experiment_dir
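
The signature of create_experiment_dir is not documented here. As a rough illustration only, a helper of this kind might look like the sketch below, assuming it takes a task name and an experiment name and returns a directory under results/. The function name create_experiment_dir_sketch and its parameters are hypothetical; the real helper in common/ may differ.

```python
import tempfile
from pathlib import Path


def create_experiment_dir_sketch(task: str, name: str, root: Path = Path("results")) -> Path:
    """Create <root>/<task>/<name>/ and return its path (assumed layout)."""
    exp_dir = root / task / name
    # exist_ok=False raises FileExistsError on a rerun, so an earlier
    # experiment is never silently overwritten.
    exp_dir.mkdir(parents=True, exist_ok=False)
    return exp_dir


# Demonstrate against a temporary root so nothing is written to the repo.
with tempfile.TemporaryDirectory() as tmp:
    exp_dir = create_experiment_dir_sketch("cta_1d", "exp01", root=Path(tmp))
    relative = exp_dir.relative_to(tmp).as_posix()
    existed = exp_dir.is_dir()
```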

Experiment Tracking

Experiments are tracked manually in results/{task}/README.md:

## 2025-01-15: Baseline XGB
- Notebook: `cta_1d/03_baseline_xgb.ipynb` (cells 1-50)
- Config: eta=0.5, lambda=0.1
- Train IC: 0.042
- Test IC: 0.038
- Notes: Dual normalization, 4 trades/day
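
Entries in this format can also be generated from a notebook instead of typed by hand. The sketch below is one way to do that; format_entry and append_entry are hypothetical helpers, not part of the repository.

```python
from datetime import date
from pathlib import Path


def format_entry(day: date, title: str, fields: dict) -> str:
    """Render one log entry in the same shape as the example above."""
    lines = [f"## {day.isoformat()}: {title}"]
    lines += [f"- {key}: {value}" for key, value in fields.items()]
    return "\n".join(lines) + "\n"


def append_entry(readme: Path, entry: str) -> None:
    """Append an entry to a results README, creating the file if needed."""
    readme.parent.mkdir(parents=True, exist_ok=True)
    with readme.open("a", encoding="utf-8") as fh:
        fh.write("\n" + entry)


# Rebuild the example entry shown above.
entry = format_entry(
    date(2025, 1, 15),
    "Baseline XGB",
    {
        "Notebook": "`cta_1d/03_baseline_xgb.ipynb` (cells 1-50)",
        "Config": "eta=0.5, lambda=0.1",
        "Train IC": 0.042,
        "Test IC": 0.038,
        "Notes": "Dual normalization, 4 trades/day",
    },
)
```

Calling append_entry(Path("results/cta_1d/README.md"), entry) at the end of a notebook would then add the entry to the log.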

Adding a New Task

  1. Create directory: mkdir my_task
  2. Add src/ subdirectory with:
    • __init__.py - Export public APIs
    • loader.py - Dataset loader class
    • Other modules as needed
  3. Add root __init__.py that re-exports from src/
  4. Create numbered notebooks
  5. Add entry to results/my_task/README.md
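
Steps 1 to 3 and 5 above can be scripted. The following is a minimal sketch under stated assumptions: scaffold_task and the placeholder class MyTaskLoader are hypothetical names, and the generated files are stubs to be filled in by hand.

```python
import tempfile
from pathlib import Path


def scaffold_task(root: Path, task: str) -> Path:
    """Create the skeleton for a new task directory under root."""
    task_dir = root / task
    src_dir = task_dir / "src"
    src_dir.mkdir(parents=True, exist_ok=False)  # steps 1 and 2

    # Placeholder loader; a real one would wrap the task's data source.
    (src_dir / "loader.py").write_text("class MyTaskLoader:\n    pass\n")
    (src_dir / "__init__.py").write_text(
        "from .loader import MyTaskLoader\n\n__all__ = ['MyTaskLoader']\n"
    )
    # Step 3: the root __init__.py re-exports everything listed in src/__all__.
    (task_dir / "__init__.py").write_text("from .src import *  # noqa: F401,F403\n")

    # Step 5: start an empty results log for the task.
    readme = root / "results" / task / "README.md"
    readme.parent.mkdir(parents=True, exist_ok=True)
    readme.touch()
    return task_dir


# Run against a temporary directory and list what was created.
with tempfile.TemporaryDirectory() as tmp:
    scaffold_task(Path(tmp), "my_task")
    created = sorted(
        p.relative_to(tmp).as_posix() for p in Path(tmp).rglob("*") if p.is_file()
    )
```

Notebooks (step 4) are left out because they are easier to create from Jupyter itself.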

Best Practices

  1. Keep it simple: Move code into common/ only after it has been copied into three or more places
  2. Module organization: Place implementation in src/, re-export from root __init__.py
  3. Notebook configs: Define CONFIG dict in first cell for easy modification
  4. Document results: Update results README after significant runs
  5. Git discipline: Don't commit large files, results, or credentials
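
As an illustration of practice 3, a first notebook cell might look like the sketch below. The key names and the with_overrides helper are hypothetical; the eta and lambda values are borrowed from the log example under Experiment Tracking.

```python
# First notebook cell: every tunable lives in one dict.
CONFIG = {
    "task": "cta_1d",
    "model": {"eta": 0.5, "lambda": 0.1},
    "dt_range": ("2023-01-01", "2023-12-31"),
    "output_dir": "results/cta_1d/exp01",
}


def with_overrides(config: dict, **overrides) -> dict:
    """Return a copy with top-level keys replaced, leaving the original intact."""
    unknown = set(overrides) - set(config)
    if unknown:
        # Fail loudly on typos instead of silently adding a new key.
        raise KeyError(f"Unknown config keys: {sorted(unknown)}")
    return {**config, **overrides}


# A variant run that only changes where results are written.
variant = with_overrides(CONFIG, output_dir="results/cta_1d/exp02")
```

Keeping variants as copies of one base dict makes it easy to see which settings differ between two runs.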