# CTA 1D Parquet Dataset

This directory contains requirements for CTA (Commodity Trading Advisor) futures Parquet datasets used by alpha_lab.

## Tables

### cta_alpha158_1d

Alpha158 features for CTA futures.

- **Source**: `dfs://daily_stock_run.stg_1day_tinysoft_cta_alpha159_0_7_beta`
- **Output**: `/data/parquet/dataset/cta_alpha158_1d/`
- **Columns**: ~163 feature columns + code, m_nDate

### cta_hffactor_1d

High-frequency factor features (8 columns).

- **Source**: `dfs://daily_stock_run.stg_1day_tinysoft_cta_hffactor`
- **Output**: `/data/parquet/dataset/cta_hffactor_1d/`
- **Transformation**: Pivot from long to wide format
  - Input columns: code, m_nDate, factor_name, value
  - Output columns: code, m_nDate, vol_1min, skew_1min, ... (8 features)
- **Filter**: Only include factor_name in [vol_1min, skew_1min, volp_1min, volp_ratio_1min, voln_ratio_1min, trend_strength_1min, pv_corr_1min, flowin_ratio_1min]

### cta_dom_1d

Dominant contract mapping for continuous contracts.

- **Source**: `dfs://daily_stock_run.dwm_1day_cta_dom`
- **Output**: `/data/parquet/dataset/cta_dom_1d/`
- **Filter**: version = 'vp_csmax_roll2_cummax'
- **Aggregation**: GROUP BY m_nDate, code_init; SELECT first(code) as code

### cta_labels_1d

Return labels for different return types.

- **Source**: `dfs://daily_stock_run.stg_1day_tinysoft_cta_hfvalue`
- **Output**: `/data/parquet/dataset/cta_labels_1d/`
- **Filter**: indicator in [twap_open1m@1_twap_close1m@1, twap_open1m@1_twap_open1m@2]
- **Columns**: code, m_nDate, indicator, value

## Consumer

Used by: `alpha_lab/cta_1d/src/loader_parquet.py`

The alpha_lab project will create a parallel loader that reads from these Parquet tables instead of DolphinDB.
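
The long-to-wide pivot for `cta_hffactor_1d` and the first-per-group aggregation for `cta_dom_1d` described above can be sketched in plain Python on in-memory rows. This is a minimal illustration, not the actual export code: the function names `pivot_hffactor` and `first_code_per_group` are hypothetical, rows are assumed to be dictionaries keyed by column name, missing factors are assumed to become `None`, and "first" is assumed to mean first in input order.

```python
# Factor names taken from the cta_hffactor_1d filter list.
HF_FACTORS = [
    "vol_1min", "skew_1min", "volp_1min", "volp_ratio_1min",
    "voln_ratio_1min", "trend_strength_1min", "pv_corr_1min",
    "flowin_ratio_1min",
]


def pivot_hffactor(rows):
    """Pivot long rows (code, m_nDate, factor_name, value) into wide rows.

    Rows whose factor_name is not in HF_FACTORS are dropped. A factor that
    is missing for a (code, m_nDate) pair is left as None; that gap-handling
    rule is an assumption, since the requirements do not specify one.
    """
    wide = {}
    for row in rows:
        if row["factor_name"] not in HF_FACTORS:
            continue  # apply the factor_name filter
        key = (row["code"], row["m_nDate"])
        if key not in wide:
            # One output row per (code, m_nDate), with all 8 factor columns.
            wide[key] = {
                "code": row["code"],
                "m_nDate": row["m_nDate"],
                **dict.fromkeys(HF_FACTORS),
            }
        wide[key][row["factor_name"]] = row["value"]
    return list(wide.values())


def first_code_per_group(rows, version="vp_csmax_roll2_cummax"):
    """Keep the first code seen per (m_nDate, code_init) for one version.

    "First" here means first in input order, which is an assumption about
    how the source table is ordered.
    """
    first = {}
    for row in rows:
        if row["version"] != version:
            continue  # apply the version filter
        key = (row["m_nDate"], row["code_init"])
        first.setdefault(key, row["code"])  # later rows do not overwrite
    return [
        {"m_nDate": date, "code_init": code_init, "code": code}
        for (date, code_init), code in first.items()
    ]
```

Both functions return lists of dictionaries, so the result can be handed to whichever Parquet writer the export job uses.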