Building a Python Macroeconomic Truth Engine: How to Trade the Kalshi CPI Spread
TL;DR / Key Takeaways
- The Econ Bot does not predict inflation in the discretionary sense. It computes a composite CPI nowcast from public macro data and trades only when the Kalshi order book disagrees with the math.
- The signal stack pulls from the BLS Public Data API, BEA NIPA tables, FRED time series, Cleveland Fed inflation nowcasts, and live order-book data from Kalshi.
- The homemade CPI model decomposes inflation into shelter, food, energy, and core-proxy components, then reconstructs a weighted current-state estimate.
- A second forward-looking CPI estimate blends current shelter data with ZORI rent proxies because lease-market pressure reaches official CPI shelter with a lag.
- The Cleveland Fed nowcast is not treated as gospel. It is an independent model used for convergence scoring. Agreement increases execution confidence; material divergence penalizes it.
- The final trade decision happens in edge_calculator.py: model probability minus Kalshi ask price minus fees. If the resulting edge is negative, the bot rejects the trade.
Most people trading CPI contracts on Kalshi are not really trading CPI. They are trading headlines, Twitter summaries, stale economist consensus numbers, and whatever price the order book happens to show at the moment they look.
That is not a system. That is a reaction loop.
The Predict & Profit Econ Bot was built around a stricter idea: do not guess where inflation will settle. Build a reproducible macroeconomic truth engine that pulls the same public data every run, normalizes it, computes a composite nowcast, compares that value against an independent institutional nowcast, and then prices every Kalshi contract as a probability problem.
The output is not a narrative.
It is a number.
If that number creates a positive edge against the Kalshi ask after fees, the bot can trade. If it does not, the bot does nothing. Most cycles end with no trade, which is exactly how an automated trading system should behave when the market is already priced correctly.
The Objective: Price the Spread, Not the Story
CPI contracts are binary instruments. A contract might ask whether year-over-year CPI will be above 3.2%, below 3.5%, or inside a specific range. The market price is an implied probability.
A 44-cent YES ask is not a commentary on inflation. It is a claim that the event is worth buying at roughly 44% probability before fees and spread friction.
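The price-to-probability translation can be made explicit. Below is a minimal sketch, assuming a flat per-contract fee for illustration; Kalshi's actual fee schedule is more involved, so treat `fee_cents` as a placeholder input, not the exchange's formula.

```python
def implied_probability(ask_cents: int) -> float:
    """A Kalshi YES ask of 44 cents implies roughly a 44% event probability."""
    return ask_cents / 100

def breakeven_probability(ask_cents: int, fee_cents: float) -> float:
    """Minimum model probability for a YES buy to have positive expected
    value after fees. Assumes a flat per-contract fee (an illustrative
    simplification, not Kalshi's published fee formula)."""
    # EV = p * 100 - (ask + fee) > 0  =>  p > (ask + fee) / 100
    return (ask_cents + fee_cents) / 100
```

With a 44-cent ask and 2 cents of friction, the model needs to believe the event is better than 46% likely before the trade is even worth considering.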
The bot's job is to answer one question:
Is my statistically derived probability higher than the market's executable ask price by enough to pay fees, absorb model error, and justify the risk?
That is the entire trade.
The system is not trying to forecast the Federal Reserve, anticipate cable-news narratives, or decide whether inflation "feels sticky." Those ideas may be interesting, but they are not executable. A Kalshi contract settles against a defined government statistic. The bot therefore builds a data pipeline around measurable inputs that lead or proxy that statistic.
This is where the Python architecture matters.
The Signal Architecture
The Econ Bot uses a multi-API pipeline because no single public series explains CPI well enough to trade alone. Each source covers a different part of the inflation surface.
| Module | Source | Purpose |
| --- | --- | --- |
| bls_signals.py | BLS Public Data API | Official CPI component history and subindex behavior |
| bea_signals.py | BEA NIPA tables | PCE and national-account inflation context |
| fred_signals.py | FRED API | Market and commodity leading indicators |
| homemade_nowcast.py | Internal composite model | Weighted CPI reconstruction and forward-looking shelter adjustment |
| edge_calculator.py | Internal execution layer | Contract probability, order-book comparison, fee-aware rejection |
The BLS feed is the anchor. CPI settles from BLS data, so the pipeline starts by loading official CPI series and relevant subcomponents through bls_signals.py. This gives the bot the trailing measured structure: headline CPI, core CPI, food, energy, shelter, and related subindex movement.
The BEA feed adds a second official macro lens. bea_signals.py pulls from BEA NIPA tables to track PCE and national-account inflation data. CPI and PCE are not interchangeable, but they are related measures of the same economy. When both point in the same direction, the signal has more weight. When they separate, the divergence is a warning that category composition may matter more than headline movement.
The FRED feed supplies leading indicators through fred_signals.py. This is where the bot brings in faster-moving series that official CPI will only show later:
- Oil prices as a broad energy inflation input.
- Gasoline prices as the consumer-facing energy channel.
- 10-year breakeven inflation as the market's forward inflation expectation.
- ZORI rent proxies as a leading shelter-pressure estimate.
Those inputs do not settle the contract. They inform the model before the official settlement arrives.
That distinction matters. The bot does not treat market data as truth. It treats market data as pressure. Official CPI is the settlement target; oil, gas, breakevens, and rents are leading observations that help estimate where the next print is likely to land.
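Ingesting those FRED series starts with normalizing the raw API response. The sketch below assumes the observation shape returned by FRED's `series/observations` endpoint, where each observation carries `"date"` and `"value"` strings and missing values are reported as `"."`:

```python
from datetime import date

def parse_fred_observations(observations: list[dict]) -> list[tuple[date, float]]:
    """Convert raw FRED-style observation dicts into (date, value) pairs.
    FRED reports missing observations as ".", which must be skipped,
    not coerced to 0.0 -- a silent zero would poison the nowcast."""
    parsed = []
    for obs in observations:
        if obs["value"] == ".":
            continue  # missing observation: drop it rather than guess
        parsed.append((date.fromisoformat(obs["date"]), float(obs["value"])))
    return parsed
```

The point of the `"."` check is exactly the stale-data discipline described later: a successful HTTP response can still contain observations that should never reach the model.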
Why the Pipeline Is Built as Separate Signal Modules
A common failure mode in trading bots is a single monolithic script that fetches data, cleans it, models it, and trades from it in one pass. That is convenient until one API changes a field name, one source goes stale, or one data series reports on a different calendar than the others.
The Econ Bot separates ingestion from scoring:
```
bls_signals.py      -> official CPI component history
bea_signals.py      -> PCE and national accounts context
fred_signals.py     -> leading indicators and market proxies
homemade_nowcast.py -> composite CPI estimate
edge_calculator.py  -> probability and execution math
```
Each module has one job.
That makes the system easier to debug when a number looks wrong. If gasoline prices update but ZORI is stale, the FRED signal layer can report that. If BLS data is current but BEA has not published the latest NIPA table, the macro context layer can report that separately. The execution layer should never have to know why a time series is missing. It should receive a normalized signal packet with timestamps, values, confidence flags, and stale-data markers.
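That "normalized signal packet" could be sketched as a small dataclass. The field names here are illustrative assumptions, not the bot's actual schema:

```python
from dataclasses import dataclass
from datetime import datetime, timedelta

@dataclass
class SignalPacket:
    """One normalized observation handed from a signal module to the
    execution layer. Field names are illustrative, not the bot's schema."""
    source: str            # e.g. "fred", "bls", "bea"
    series_id: str
    value: float
    observed_at: datetime  # end of the observation period, not fetch time
    fetched_at: datetime
    max_age: timedelta     # how stale this series is allowed to be

    @property
    def is_stale(self) -> bool:
        # Staleness is judged against the observation, not the HTTP call:
        # a fresh fetch of an old number is still an old number.
        return self.fetched_at - self.observed_at > self.max_age
```

The execution layer never asks why a series is missing; it just sees `is_stale` and degrades confidence accordingly.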
This is basic data engineering discipline. Automated trading just makes the cost of getting it wrong more obvious.
The Homemade Nowcast
The core of the system lives in homemade_nowcast.py.
The model starts with a simple premise: headline CPI is not one thing. It is a weighted composition of different inflation regimes moving at different speeds.
Energy can move quickly. Food moves slower, but still reacts to commodity and supply-chain pressure. Shelter is large and lagged. Core services and goods behave differently again. A single linear extrapolation of headline CPI loses that structure.
The homemade nowcast therefore reconstructs CPI from four weighted blocks:
| Component | Weight | Role in the model |
| --- | ---: | --- |
| Shelter | 35% | Largest structural CPI block; slow-moving and lagged |
| Food | 13% | Consumer-visible inflation pressure with moderate lag |
| Energy | 7% | High-volatility input driven by oil and gasoline |
| Core-proxy | 45% | Residual core goods and services behavior |
The current-state estimate is calculated as:
```python
current_state_cpi = (
    shelter_signal * 0.35
    + food_signal * 0.13
    + energy_signal * 0.07
    + core_proxy_signal * 0.45
)
```
Those weights are intentionally coarse. The point is not to clone the BLS CPI basket line by line. The point is to build a stable tradable approximation that captures the major forces driving the next settlement.
In practical terms:
- Shelter gets the highest explicit weight because it dominates measured CPI and tends to persist.
- Food gets separated because grocery inflation can move differently from core goods and services.
- Energy gets its own block because oil and gasoline can change the headline print even when core inflation is stable.
- The core-proxy block absorbs everything else: broad goods, non-shelter services, and residual inflation pressure.
The result is a current-state CPI estimate: where the economy appears to be now based on the latest available official and leading data.
The Forward-Looking CPI Layer
The current-state estimate is not enough because CPI shelter is lagged.
Official shelter inflation does not immediately reflect current rent-market conditions. Lease renewals, survey methodology, owner-equivalent rent, and sampling lag all slow the transmission. That means a CPI model that looks only at current BLS shelter can overstate inflation when market rents have already cooled, or understate inflation when market rents are accelerating before the official series catches up.
That is why homemade_nowcast.py also computes a forward-looking CPI estimate.
The forward-looking version blends the current shelter signal with the ZORI rent proxy:
```python
adjusted_shelter = blend(current_shelter_signal, zori_rent_proxy)

forward_looking_cpi = (
    adjusted_shelter * 0.35
    + food_signal * 0.13
    + energy_signal * 0.07
    + core_proxy_signal * 0.45
)
```
ZORI does not replace shelter CPI. That would be a category error. ZORI is not the settlement series, and it is not measured with the same methodology. It is used as a directional pressure proxy for where shelter inflation is likely to move as the lag burns off.
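The article does not define `blend`, but the natural minimal implementation is a convex combination. The sketch below assumes that form; the default ZORI weight of 0.4 is illustrative, not the production value:

```python
def blend(current_shelter: float, zori_proxy: float,
          zori_weight: float = 0.4) -> float:
    """Convex combination of the official shelter signal and the ZORI
    rent proxy. zori_weight controls how hard forward rent pressure
    pulls the estimate; the 0.4 default is illustrative only."""
    if not 0.0 <= zori_weight <= 1.0:
        raise ValueError("zori_weight must be in [0, 1]")
    return (1.0 - zori_weight) * current_shelter + zori_weight * zori_proxy
```

A weight of 0 recovers the pure current-state shelter signal; a weight of 1 would commit the category error the paragraph above warns against.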
This gives the bot two views:
- current_state_cpi: what the CPI basket looks like using present measured structure.
- forward_looking_cpi: what the CPI basket looks like after adjusting shelter for real-time rent pressure.
The spread between those two values is itself useful. If current-state CPI is hot but forward-looking shelter is cooling, the bot should be cautious about blindly buying higher CPI strikes. If both current and forward-looking estimates are hot, the signal is cleaner.
Why Oil, Gas, Breakevens, and ZORI Matter
The leading indicators are not decorative. They exist because CPI is published after the economic activity has already occurred.
Oil and gasoline matter because energy is the fastest-moving headline component. A sharp move in gasoline can change the CPI print even when the rest of the basket is stable. This does not mean energy dominates the model. It only gets a 7% component weight. But because its volatility is high, even a smaller weight can move the final number.
10-year breakeven inflation is a different kind of input. It is not a CPI component. It is a market-implied long-term inflation expectation derived from nominal Treasury yields and TIPS. The bot uses it as a regime indicator. Rising breakevens do not mechanically raise next-month CPI, but they can confirm that broad inflation expectations are moving in the same direction as the component data.
ZORI is the shelter lag tool. CPI shelter is too important and too delayed to ignore real-time rent pressure. The bot uses the ZORI proxy carefully: not as a substitute for official CPI shelter, but as an adjustment layer when estimating forward-looking shelter pressure.
This is what separates a data pipeline from a spreadsheet. Each input has a defined role, a defined weight, and a defined reason for being included.
The Convergence Edge
The homemade nowcast is not the only model in the system.
The bot also compares its internal estimate against the Cleveland Fed nowcast. That comparison is not a popularity contest. It is a model-risk control.
If two independent approaches point to the same CPI neighborhood, the execution layer can treat the signal with higher confidence. If they diverge materially, one of the models is wrong, stale, or seeing a different economic regime.
The scoring rule is direct:
```python
divergence = abs(homemade_nowcast - cleveland_fed_nowcast)

if divergence >= 0.20:
    confidence_score -= model_disagreement_penalty
elif divergence <= 0.05:
    confidence_score += model_agreement_boost
```
The thresholds are deliberately tight because CPI contracts are threshold instruments. A 0.20 percentage-point disagreement is not noise when the strike spacing is narrow. It can be the difference between a YES contract expiring at $1.00 and expiring at $0.00.
If the homemade model says 3.42% and the Cleveland Fed nowcast says 3.21%, that is not a minor disagreement. It means the two model families are pricing materially different settlements. The bot penalizes execution confidence because one view is fundamentally wrong for the contract being priced.
If the homemade model says 3.36% and the Cleveland Fed says 3.39%, the models are effectively in agreement. That does not guarantee the trade is profitable, but it reduces model-risk uncertainty. The bot boosts execution confidence because two independent nowcast processes are converging on the same settlement zone.
This is the macro equivalent of ensemble agreement in the weather bot. A single model can be sharp and still wrong. Independent agreement is more valuable than internal conviction.
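The scoring rule condenses to one pure function. The penalty and boost magnitudes below are illustrative placeholders; only the 0.20/0.05 thresholds come from the rule above:

```python
def convergence_adjustment(homemade: float, cleveland: float,
                           penalty: float = 0.15, boost: float = 0.10) -> float:
    """Confidence adjustment from independent-model agreement.
    Thresholds mirror the rule in the article; penalty/boost sizes
    are illustrative, not the bot's tuned values."""
    divergence = abs(homemade - cleveland)
    if divergence >= 0.20:
        return -penalty   # models price materially different settlements
    if divergence <= 0.05:
        return +boost     # models converge on the same settlement zone
    return 0.0            # gray zone: neither reward nor punish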
Confidence Is Not Edge
One of the easiest mistakes in automated trading is confusing confidence with edge.
A model can be highly confident that CPI will settle above 3.2%, while the Kalshi order book already prices that outcome at 93 cents. That may be a correct forecast and still be a terrible trade.
The Econ Bot keeps those concepts separate:
- Confidence measures the quality and agreement of the signal.
- Edge measures the difference between model probability and executable market price.
You need both. Confidence without edge is overpaying. Edge without confidence is gambling against a bad model.
That separation is enforced in edge_calculator.py.
Execution Logic in edge_calculator.py
After the nowcast layer produces a CPI estimate and confidence score, the execution layer evaluates each Kalshi contract against its strike threshold.
For a simplified YES contract asking whether CPI will exceed a threshold, the bot computes:
```python
distance_to_strike = nowcast_value - strike_threshold

model_probability = probability_from_distribution(
    distance=distance_to_strike,
    uncertainty=nowcast_uncertainty,
    confidence=confidence_score,
)

raw_edge = model_probability - kalshi_ask_price
net_edge = raw_edge - estimated_fees
```
The actual implementation can use the model's distributional assumptions and contract-specific handling, but the principle is the same: estimate the statistical probability of settlement, compare it to the executable ask, then subtract friction.
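One plausible implementation of `probability_from_distribution` is a Gaussian centered on the nowcast. That distributional choice is an assumption for this sketch, not the bot's documented model, and the separate `confidence` input is omitted for brevity:

```python
import math

def probability_from_distribution(distance: float, uncertainty: float) -> float:
    """P(settlement lands above the strike) under a normal model centered
    on the nowcast. `distance` = nowcast - strike; `uncertainty` = model
    sigma in the same units. The Gaussian is an illustrative assumption."""
    if uncertainty <= 0:
        raise ValueError("uncertainty must be positive")
    z = distance / uncertainty
    # Standard normal CDF via the error function (stdlib only)
    return 0.5 * (1.0 + math.erf(z / math.sqrt(2.0)))
```

Under this sketch, a nowcast of 3.31% against a 3.3% strike with a sigma of 0.10 yields only about a 54% probability, which is exactly why a point estimate barely above the strike is nowhere near a certainty.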
The bot does not trade mid-price fantasy. It uses the ask because the ask is what can actually be bought.
That matters. A contract showing a 45-cent bid and 55-cent ask is not a 50-cent contract from an execution standpoint. If the bot wants to buy YES now, it is paying 55 cents. The edge calculation must use 55 cents, not the midpoint.
The rejection rule is final:
```python
if net_edge <= 0:
    reject_trade()
```
Negative edge after fees is not "almost good enough." It is a no-trade.
This is where a lot of retail prediction-market trading fails. Traders find a view they like, then negotiate with the price in their head. The bot does not negotiate. It either has positive mathematical expectancy at the executable price or it rejects the order.
A Concrete Example
Assume the homemade nowcast prints 3.38% for year-over-year CPI and the Cleveland Fed nowcast prints 3.40%. The divergence is 0.02 percentage points, inside the 0.05 agreement band, so the execution confidence receives a boost.
Now assume Kalshi has a contract:
- Contract: CPI above 3.2%
- YES ask: $0.64
The nowcast is 0.18 percentage points above the strike. Given the model uncertainty and convergence boost, suppose edge_calculator.py estimates a 76% probability that the contract settles YES.
The raw edge is:
```
raw_edge = 0.76 - 0.64
raw_edge = 0.12
```
That is 12 points before fees.
If estimated fees and execution friction consume 2 points, the net edge is still 10 points. That is a candidate trade, subject to position sizing, exposure limits, contradiction checks, and liquidity.
Now change only the market price:
- Contract: CPI above 3.2%
- YES ask: $0.78
The model still thinks the event is 76% likely. The forecast did not change. The trade did.
```
raw_edge = 0.76 - 0.78
raw_edge = -0.02
```
After fees, the trade is worse. The bot rejects it.
Same model. Same macro view. Different order book. Different decision.
That is the discipline.
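Both scenarios reduce to the same three-term subtraction, which makes the discipline easy to encode:

```python
def net_edge(model_probability: float, ask_price: float, fees: float) -> float:
    """Model probability minus executable ask minus estimated fees."""
    return model_probability - ask_price - fees

# Scenario 1: same forecast, 64-cent ask -> candidate trade (about +0.10)
edge_cheap = net_edge(0.76, 0.64, 0.02)

# Scenario 2: same forecast, 78-cent ask -> rejected (about -0.04)
edge_rich = net_edge(0.76, 0.78, 0.02)
```

Only the ask changed between the two calls, and only the sign of the result matters to the bot.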
Handling Threshold Contracts Correctly
CPI markets are dangerous because small numerical differences matter.
A model estimate of 3.29% and a model estimate of 3.31% may look nearly identical in a dashboard. For a contract struck at 3.3%, they are opposite sides of settlement. The execution layer has to treat the strike threshold as a hard boundary, not as a visual guide.
That is why edge_calculator.py converts the nowcast into a probability distribution rather than simply comparing point estimate to strike.
A point estimate of 3.31% does not mean the probability of settling above 3.3% is 100%. It means the center of the model is slightly above the strike. The probability depends on uncertainty, model agreement, recent component volatility, stale-data flags, and how close the estimate is to the boundary.
Near the strike, confidence should fall. Far from the strike, confidence can rise if the data is clean and independent models agree.
This is also why a 0.20 percentage-point disagreement between homemade and Cleveland Fed nowcasts is so serious. In CPI threshold markets, 20 basis points can cross multiple strikes.
Data Freshness and Failure Modes
The hardest bugs in macro trading systems are not syntax errors. They are stale-data errors.
A script can run successfully and still produce a bad signal if one API returned yesterday's observation, one table failed to publish, or one series has a release calendar that does not match the others. The bot has to treat timestamps as data, not metadata.
The signal stack is designed around that assumption:
- BLS values must carry official release timestamps and observation periods.
- BEA values must identify the NIPA table and period used.
- FRED values must be checked for latest observation date, not just successful HTTP response.
- ZORI proxy values must be treated as slower housing data, not intraday market data.
- Cleveland Fed nowcast values must be compared as model estimates for the same target period.
When those dates do not line up, confidence should degrade. A clean API response is not the same thing as a valid trading signal.
This is the part of automated trading that looks boring until it saves real money. Most bad trades are not caused by one dramatic failure. They come from small assumptions silently stacking on top of each other.
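One concrete way to make calendar misalignment bite is a period-alignment check across sources. This is a sketch under assumed names; the per-source penalty size is illustrative:

```python
from datetime import date

def freshness_penalty(target_period: date, signal_periods: dict[str, date],
                      penalty_per_stale: float = 0.05) -> float:
    """Confidence penalty for every input whose latest observation period
    predates the CPI reference period being priced. The 0.05-per-source
    penalty is an illustrative placeholder."""
    stale = [name for name, period in signal_periods.items()
             if period < target_period]
    return penalty_per_stale * len(stale)
```

A run where ZORI lags the target month by one period quietly loses a notch of confidence instead of quietly poisoning the nowcast.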
Position Sizing Comes After Edge
The edge calculation answers whether the trade is worth considering. It does not answer how large the trade should be.
Sizing has to come later, after the bot knows:
- Net edge after fees.
- Current exposure across related CPI strikes.
- Whether the proposed position contradicts existing positions.
- Liquidity available at the ask.
- Maximum risk allowed for the market and settlement date.
- Confidence penalties from model divergence or stale data.
This is especially important in CPI spread trading because contracts are correlated. A YES on CPI above 3.2% and a NO on CPI above 3.5% can both make sense together because they express a range view. But a YES above 3.5% and a NO above 3.2% are logically contradictory. The bot has to understand the strike ladder, not just individual contracts.
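The range-versus-contradiction rule for "CPI above X" contracts can be captured in a few lines. Representing a position as a `(strike, side)` pair is an assumption of this sketch:

```python
def contradicts(existing: tuple[float, str],
                proposed: tuple[float, str]) -> bool:
    """True when two 'CPI above strike' positions cannot both win.
    A YES at a LOWER strike plus a NO at a HIGHER strike is a range
    view; a YES at a higher (or equal) strike versus a NO at a lower
    strike is logically contradictory."""
    (s1, side1), (s2, side2) = existing, proposed
    if side1 == side2:
        return False                 # same direction: just a strike ladder
    yes_strike = s1 if side1 == "YES" else s2
    no_strike = s2 if side1 == "YES" else s1
    return yes_strike >= no_strike   # range view only if YES strike < NO strike
```

In the article's example, YES above 3.2% with NO above 3.5% passes (a range view), while YES above 3.5% with NO above 3.2% is blocked.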
The edge engine therefore sits inside a broader execution framework. Positive edge is necessary. It is not sufficient.
What This System Is Actually Trading
The Econ Bot is not trading inflation in the abstract.
It is trading divergence between three things:
- A composite Python nowcast built from public macro data.
- An independent Cleveland Fed nowcast used as a convergence check.
- The executable Kalshi order book.
When the model stack and Cleveland Fed nowcast agree, and the Kalshi ask implies a probability materially below the bot's computed probability, the system has a candidate trade.
When the models disagree, confidence drops.
When the order book already prices the event correctly, edge disappears.
When fees erase the spread, the bot rejects the trade.
That is the entire architecture in one sentence: compute the current macro state, verify convergence, price the contract, and only trade mathematical divergence.
Why This Beats Manual CPI Trading
Manual CPI trading breaks down because the trader is usually doing several incompatible jobs at once. They are reading economic commentary, watching the order book, guessing what other traders believe, remembering the last CPI print, checking gasoline prices, and trying to place an order before the price moves.
That workflow is not just stressful. It is inconsistent.
The bot does the same process every time:
- Pull BLS CPI component data.
- Pull BEA NIPA context.
- Pull FRED leading indicators.
- Update oil, gas, breakeven, and ZORI proxy signals.
- Compute current-state CPI.
- Compute forward-looking CPI.
- Compare against Cleveland Fed nowcast.
- Score convergence or divergence.
- Convert the nowcast into contract-level probability.
- Compare probability against the Kalshi ask.
- Subtract estimated fees.
- Reject negative edge.
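The tail of that checklist collapses into one deterministic decision function. This is a sketch of the control flow only; the dict keys and default bands are illustrative, not the production edge_calculator:

```python
def run_cycle(signals: dict, market: dict,
              divergence_band: float = 0.20) -> str:
    """One decision cycle, condensed from the checklist above.
    `signals` carries the nowcasts and contract probability; `market`
    carries the executable ask and estimated fees. Names are
    illustrative placeholders."""
    # Model-risk veto: material divergence kills the cycle outright
    divergence = abs(signals["homemade_nowcast"] - signals["cleveland_nowcast"])
    if divergence >= divergence_band:
        return "no_trade"
    # Edge test at the executable ask, after fees
    net_edge = signals["model_probability"] - market["ask"] - market["fees"]
    return "trade" if net_edge > 0 else "no_trade"
```

Running it on the worked example above returns "trade" at the 64-cent ask and "no_trade" at the 78-cent ask, with no human judgment anywhere in between.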
There is no mood, no narrative drift, and no temptation to "just take the trade" because the last one worked.
That is what a macroeconomic truth engine is supposed to do. It turns public data into a repeatable decision system.
The Practical Tradeoff
The downside of this architecture is that it filters aggressively.
If you want a bot that trades every CPI market every day, this is the wrong design. A serious edge engine will spend most of its time saying no. That is not a defect. Prediction markets are not priced incorrectly all the time.
The real edge appears when the order book is stale, overly confident, underreacting to leading data, or overreacting to a narrative that the component math does not support.
When that happens, the bot is ready because the pipeline has already done the work:
- BLS has defined the official component baseline.
- BEA has added macro context.
- FRED has supplied leading indicators.
- ZORI has adjusted the shelter lens.
- Cleveland Fed has provided the independent convergence check.
- Kalshi has exposed the executable price.
The trade is not based on a feeling that CPI is going higher or lower. It is based on a measured spread between computed probability and market price.
Final Thought
The best prediction-market systems are not prediction machines. They are pricing machines.
The Predict & Profit Econ Bot does not need to be philosophically right about inflation. It needs to compute a better probability than the market is offering, avoid trades where model risk is high, and refuse orders where fees destroy the edge.
That is the point of the Python macroeconomic truth engine: ingest the data, normalize the signals, compute the nowcast, test convergence, price the contract, and trade only when the order book is mathematically wrong.
The complete dual-bot architecture is available at predictandprofit.gumroad.com.
For $97, you receive the full system: the 62-member ensemble Weather Bot, the macroeconomic Econ Bot, full Python source code, and the PostgreSQL database schemas to run the exact same infrastructure locally.