Installation
Five minutes to a passing test suite.
Cost Predictor is pure Python. Pick your OS below for the path of least resistance. The smoke run takes roughly seven seconds; the full Synthea generation path is opt-in.
§Prerequisites — common to all platforms
- Python 3.10+ — the runner is tested on 3.14; any 3.10+ should work.
- git — the TimesFM install pulls from a GitHub source URL.
- ~5 GB free disk — venv + the TimesFM 200M-parameter checkpoint (~800 MB) + Synthea output if you generate it.
- Optional: JDK 17+ if you intend to regenerate the Synthea population (Temurin 25 LTS is what's pinned upstream).
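The prerequisites above can be checked before you clone anything. The following is a minimal sketch, not part of the project: the function name `check_prereqs` and its parameters are hypothetical, and the 5 GB threshold is simply the rough figure quoted in the list.

```python
import shutil
import sys

MIN_PYTHON = (3, 10)
MIN_FREE_GB = 5  # rough figure from the prerequisites list, not a hard limit


def check_prereqs(version=None, free_bytes=None, which=shutil.which, need_java=False):
    """Return a list of human-readable problems; an empty list means ready.

    All arguments are optional overrides so the check can be exercised
    without touching the real system.
    """
    version = tuple(sys.version_info[:2]) if version is None else tuple(version[:2])
    if free_bytes is None:
        # Free space on the filesystem holding the current directory.
        free_bytes = shutil.disk_usage(".").free

    problems = []
    if version < MIN_PYTHON:
        problems.append(f"Python {version[0]}.{version[1]} found; 3.10+ required")
    if which("git") is None:
        problems.append("git not found on PATH")
    if free_bytes < MIN_FREE_GB * 1024**3:
        problems.append(f"less than {MIN_FREE_GB} GB free in the current directory")
    # Java is only needed for the optional Synthea generation path.
    if need_java and which("java") is None:
        problems.append("java not found on PATH")
    return problems
```

Calling `check_prereqs()` with no arguments inspects the running interpreter and the current directory; pass `need_java=True` only if you plan to regenerate the Synthea population. The sketch checks that a `java` executable exists, not that it is version 17 or later.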
macOS
1. Clone
git clone https://github.com/GrayBeamTechnology/cost-predictor.git
cd cost-predictor
2. Virtual environment
python3 -m venv venv
source venv/bin/activate
python -m pip install --upgrade pip
3. Install dependencies
Install the base requirements first, then TimesFM from source. On Apple Silicon (M-series), TimesFM's PyPI package is not compatible — pull from GitHub.
pip install -r requirements.txt
pip install "timesfm[torch] @ git+https://github.com/google-research/timesfm.git"
The PyPI timesfm package caps at Python <3.12 and ships a JAX/lingvo
dependency tree that won't resolve on ARM macOS. The GitHub HEAD pulls
timesfm 2.0+ with lazy JAX imports: only the covariate path needs JAX,
and we don't use it.
4. Run the test suite
python -m pytest cost_predictor
Expect 158 tests passing in ~14 seconds.
5. Smoke-run the experiment runner
python -m cost_predictor.experiments.runner \
--dataset fixture --model seasonal_naive --fast
Writes a run dir under runs/<UTC>_<gitsha7>/
with config.json, metrics.json,
forecasts.parquet, and REPORT.md. The
full Synthea generation path is documented at the bottom of this
page; it requires the JDK toolchain.
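To confirm the smoke run wrote everything it should, a short script can look for the four files named above. This is a sketch under assumptions: `latest_run` and `missing_artifacts` are hypothetical helpers, and picking the newest run by sorting directory names assumes the UTC prefix sorts lexicographically, which this page does not state.

```python
from pathlib import Path

# File names taken from the description of the run directory.
EXPECTED = ("config.json", "metrics.json", "forecasts.parquet", "REPORT.md")


def latest_run(runs_root="runs"):
    """Return the last run directory by name, or None if there are none.

    Assumption: the UTC prefix in the directory name sorts lexicographically.
    """
    dirs = sorted(p for p in Path(runs_root).glob("*") if p.is_dir())
    return dirs[-1] if dirs else None


def missing_artifacts(run_dir):
    """List the expected files that are absent from run_dir."""
    return [name for name in EXPECTED if not (Path(run_dir) / name).is_file()]
```

Run from the repository root, `missing_artifacts(latest_run())` should return an empty list after a successful smoke run, provided `latest_run()` found a directory.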
6. Read the bonus-pool summary
python main.py --latest --reference-monthly 5000000
Loads the most recent run and prints the one-page sizing
summary. Pass --bonus-rate 0.05 to override the
default 5%, or --run-id <run> to load a
specific run.
Linux
1. Clone
git clone https://github.com/GrayBeamTechnology/cost-predictor.git
cd cost-predictor
2. Virtual environment
python3 -m venv venv
source venv/bin/activate
python -m pip install --upgrade pip
3. Install dependencies
pip install -r requirements.txt
pip install "timesfm[torch] @ git+https://github.com/google-research/timesfm.git"
On x86_64 Linux, the PyPI timesfm wheel works for
Python 3.10/3.11. The GitHub source path is recommended for
Python 3.12+ (which is most current distros), and is what the
project tracks.
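The compatibility notes on this page boil down to one rule of thumb, sketched here as a small function. It is a heuristic built only from the statements above, not an official compatibility matrix, and `pypi_wheel_likely_ok` is a hypothetical name.

```python
import platform
import sys


def pypi_wheel_likely_ok(system=None, machine=None, version=None):
    """Guess whether the PyPI timesfm wheel is expected to install.

    Per the notes on this page: x86_64 Linux with Python 3.10 or 3.11 only.
    Everything else should use the GitHub source install.
    """
    system = platform.system() if system is None else system
    machine = platform.machine() if machine is None else machine
    version = sys.version_info if version is None else version

    on_x86_linux = system == "Linux" and machine == "x86_64"
    return on_x86_linux and (3, 10) <= tuple(version[:2]) <= (3, 11)
```

Even where this returns `True`, the GitHub source path remains the one the project tracks, so treat the result as a hint about why an install failed rather than as a recommendation to switch.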
4. Run the test suite
python -m pytest cost_predictor
5. Smoke-run the experiment runner
python -m cost_predictor.experiments.runner \
--dataset fixture --model seasonal_naive --fast
6. Read the bonus-pool summary
python main.py --latest --reference-monthly 5000000
Windows
1. Clone (PowerShell)
git clone https://github.com/GrayBeamTechnology/cost-predictor.git
cd cost-predictor
2. Virtual environment
python -m venv venv
venv\Scripts\activate
python -m pip install --upgrade pip
3. Install dependencies
pip install -r requirements.txt
pip install "timesfm[torch] @ git+https://github.com/google-research/timesfm.git"
If pip install fails on the TimesFM line, you have hit the same
ARM/JAX dependency knot that Apple Silicon users encounter; the
easiest fix is to do this part inside WSL2.
4. Run the test suite
python -m pytest cost_predictor
5. Smoke-run the experiment runner
python -m cost_predictor.experiments.runner `
--dataset fixture --model seasonal_naive --fast
6. Read the bonus-pool summary
python main.py --latest --reference-monthly 5000000
WSL2 fallback
If you've installed WSL2 with an Ubuntu distribution, switch to the Linux instructions above; the steps are identical from there.
§Optional · Synthea generation
The smoke run uses a small synthetic fixture. To run against the full Synthea population the project was calibrated on, you will need a JDK and a Synthea checkout. This is not required for the test suite or the bonus-pool CLI.
1. Install a JDK (17+)
On macOS we use mise; on Linux any system OpenJDK
works. Pin Temurin 25 LTS to match upstream:
# macOS / Linux with mise
mise use --global java@temurin-25
2. Clone Synthea
cd ~
git clone https://github.com/synthetichealth/synthea.git
cd synthea
Apply the project's Texas Medicaid MCO authoring patches to
payers/insurance_companies.csv and
insurance_plans.csv — see the project's
docs/synthea-mco-encoding.md for the rows to add
(IDs 110000–150000 and Plan IDs 110001–150001).
3. Generate
./run_synthea -p 1000 -s 42 \
--exporter.csv.export true \
--generate.only_alive_patients true \
"Texas" "Houston"
Outputs CSVs under output/csv/ — copy these to
cost-predictor/data/synthea/v1/csv/ (gitignored)
and the runner will pick them up via --dataset synthea.
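The copy step can be scripted so it works the same way on every platform. This is a minimal sketch: `copy_synthea_csvs` is a hypothetical helper, and the two directory arguments are wherever you cloned Synthea and cost-predictor.

```python
import shutil
from pathlib import Path


def copy_synthea_csvs(synthea_dir, project_dir):
    """Copy every CSV from Synthea's output/csv/ into data/synthea/v1/csv/.

    Returns the names of the files copied, in sorted order.
    """
    src = Path(synthea_dir) / "output" / "csv"
    dst = Path(project_dir) / "data" / "synthea" / "v1" / "csv"
    dst.mkdir(parents=True, exist_ok=True)

    copied = []
    for csv_file in sorted(src.glob("*.csv")):
        # copy2 preserves timestamps, which helps when comparing regenerated data.
        shutil.copy2(csv_file, dst / csv_file.name)
        copied.append(csv_file.name)
    return copied
```

For example, `copy_synthea_csvs(Path.home() / "synthea", Path.home() / "cost-predictor")` matches the clone locations used on this page if cost-predictor also lives in your home directory. An empty return value means Synthea wrote no CSVs, usually because the CSV exporter was not enabled.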
4. Run against Synthea
python -m cost_predictor.experiments.runner \
--dataset synthea --model timesfm_b --by-payer --fast
§Verifying the install
After the smoke run, two artifacts confirm the installation is wired correctly:
- runs/<ts>_<sha>/REPORT.md: a generated per-run report. Open it; the headline matrix should show n_eval_origins and a sub-0.15 breach_rate_p10 on the seasonal_naive fixture.
- python main.py --latest: prints the bonus-pool summary block. The dollar translation rescales to whatever you pass to --reference-monthly.
If the test suite passes but the runner errors out on timesfm, the
TimesFM install hit the JAX dependency tree. The fix is to make sure
the GitHub source install succeeded (pip show timesfm should report
version 2.0.0 or later).
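The same version check can be done from Python instead of reading `pip show` output. This is a sketch, with hypothetical function names; it uses the standard library's `importlib.metadata` and a deliberately simple version parser that ignores pre-release and local suffixes.

```python
import re
from importlib import metadata


def version_tuple(text):
    """Parse the leading numeric parts of a version string.

    '2.0.1rc1' becomes (2, 0, 1); missing parts are padded with zeros.
    """
    parts = []
    for piece in text.split("."):
        match = re.match(r"\d+", piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple((parts + [0, 0, 0])[:3])


def timesfm_source_install_ok(installed=None):
    """True if timesfm is installed at version 2.0.0 or later."""
    if installed is None:
        try:
            installed = metadata.version("timesfm")
        except metadata.PackageNotFoundError:
            return False
    return version_tuple(installed) >= (2, 0, 0)
```

Run inside the activated venv, `timesfm_source_install_ok()` returning `False` suggests re-running the GitHub source install from step 3.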