The 3 Filters That Saved My Kalshi Weather Bot From Blowing Up
TL;DR / Key Takeaways
- The Kalshi fee formula (0.07 x C x P x (1 - P)) is not a footnote. It is capable of consuming your entire edge on contracts priced near 50 cents. The fee filter eliminates these trades before they cost you money.
- Ensemble confidence below 0.30 means the model is not sure enough to bet on. Below that threshold, you are guessing. The bot does not guess.
- When the physics-based GFS ensemble and the AI-based AIGEFS ensemble disagree, the atmosphere is genuinely uncertain. Cross-model agreement is not a nice-to-have. It is the primary signal filter.
- Combining all three filters eliminates roughly 80% of apparent trading opportunities. That is not a bug. That is the entire point.
The first working version of the Predict & Profit bot had one decision rule: if the GFS ensemble probability was far enough from the market price, take the trade.
That is it. No fee check. No minimum confidence requirement. No cross-model validation. Just raw edge -- ensemble probability minus market price -- above a fixed threshold, and the bot would submit an order.
It worked well enough to prove the concept. I got fills, tracked outcomes, and the math held up most of the time. But the P&L was noisier than I expected. Wins landed when they should have, and so did losses. The fees on marginal trades were eating more than I had anticipated, and I was taking positions on signals that were technically positive but practically too weak to trade with confidence.
Adding filters was not an admission that the core model was wrong. The ensemble probability approach is sound. The problem was that "sound model" and "tradeable signal" are not the same thing. You need both.
Here are the three filters that closed the gap.
Filter 1: The Fee Efficiency Check
Kalshi charges fees on every contract using this formula:
fee = 0.07 x C x P x (1 - P)
Where C is the number of contracts, P is the price per contract in dollars, and the result is the total fee in dollars.
The geometry of this formula matters. The fee is maximized when P = 0.50 -- right at the 50-cent midpoint. At that price, a 100-contract position costs $1.75 in fees. At P = 0.20 or P = 0.80, the same position costs $1.12. At P = 0.10, it drops to $0.63.
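Those numbers fall straight out of the formula. A quick sketch (plain Python, nothing bot-specific) confirms the parabola:

```python
def kalshi_fee(contracts: int, price: float) -> float:
    """Total fee in dollars for a position, per the formula above."""
    return 0.07 * contracts * price * (1 - price)

# Fee for a 100-contract position at several price points
for p in (0.50, 0.20, 0.80, 0.10):
    print(f"P = {p:.2f}: ${kalshi_fee(100, p):.2f}")
# P = 0.50: $1.75
# P = 0.20: $1.12
# P = 0.80: $1.12
# P = 0.10: $0.63
```

The fee peaks at the 50-cent midpoint and falls away quadratically toward either extreme, which is exactly why mid-priced contracts need more raw edge to survive.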
The implication is direct: if you are trading contracts priced near 50 cents, your edge has to be larger than if you are trading extreme-probability contracts. A 10-point edge at 0.50 is not the same effective edge as a 10-point edge at 0.80.
The bot calculates expected return before fees, then subtracts the fee and checks whether the net edge clears a minimum threshold. If the fee consumes more than a set percentage of the expected return, the trade is rejected.
# predictandprofit.io
def fee_efficient(contracts: int, price: float, expected_edge: float, max_fee_ratio: float = 0.25) -> bool:
    fee = 0.07 * contracts * price * (1 - price)
    expected_return = contracts * expected_edge
    if expected_return <= 0:
        return False
    return (fee / expected_return) <= max_fee_ratio
The max_fee_ratio parameter -- currently set to 0.25 -- means the fee cannot consume more than 25% of expected return. This is a tunable parameter. At 0.25, a substantial number of near-50-cent contracts get filtered out. That is by design.
Before this filter, the bot was taking positions on 12-point edges at a 48-cent price where the fee was eating 30% of the return. Those trades look profitable on paper. They are not profitable in practice at scale.
Filter 2: Minimum Ensemble Confidence
The GFS ensemble outputs a probability based on how many of the 31 (or 62, in the HGEFS configuration) members agree on an outcome. If 20 out of 31 members predict the temperature exceeds the threshold, the ensemble probability is 0.645.
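Turning member votes into a probability is a one-liner. A hedged sketch, assuming you already have each member's forecast value and the contract threshold (member_forecasts and threshold are illustrative names, not the bot's actual variables):

```python
def ensemble_probability(member_forecasts: list[float], threshold: float) -> float:
    """Fraction of ensemble members predicting the value exceeds the threshold."""
    hits = sum(1 for f in member_forecasts if f > threshold)
    return hits / len(member_forecasts)

# 20 of 31 members above an 85-degree threshold -> 0.645
members = [86.0] * 20 + [83.0] * 11
print(round(ensemble_probability(members, 85.0), 3))  # 0.645
```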
But the raw probability number does not tell you how confident the model is relative to the 50/50 baseline. A contract priced at 0.645 in the market with an ensemble reading of 0.645 has no edge. A contract priced at 0.55 with an ensemble reading of 0.645 has edge, but only 9.5 points of it.
The confidence filter operates on the ensemble probability itself, not on the edge:
# predictandprofit.io
def ensemble_confident(ensemble_prob: float, min_confidence: float = 0.30) -> bool:
    # Distance from 0.50 -- how far the model is from neutral
    return abs(ensemble_prob - 0.50) >= min_confidence
With min_confidence = 0.30, the ensemble must be reading at least 0.80 or at most 0.20 before the bot considers the trade. Anything between 0.20 and 0.80 is filtered out regardless of edge.
This sounds aggressive. It is. In practice, ensemble readings in the 0.60 to 0.75 range show up on plenty of cycle runs. The model thinks it has signal. But the historical accuracy of the GFS ensemble at those confidence levels does not justify the binary payoff structure of prediction market contracts. At 0.65, you are right roughly 65% of the time. You need to be right often enough, with enough edge remaining after fees, to make the position worth placing. Below 0.30 confidence distance from neutral, the math does not support it.
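Restating the filter so the snippet runs on its own, the cutoff behaves like this:

```python
def ensemble_confident(ensemble_prob: float, min_confidence: float = 0.30) -> bool:
    # Distance from 0.50 -- how far the model is from neutral
    return abs(ensemble_prob - 0.50) >= min_confidence

print(ensemble_confident(0.65))  # False -- inside the 0.20-0.80 dead zone
print(ensemble_confident(0.82))  # True  -- confident enough to consider
print(ensemble_confident(0.18))  # True  -- confident on the "no" side
```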
Filter 3: Cross-Model Agreement (The HGEFS Filter)
This is the filter that changed the system the most when I upgraded from 31-member GFS to the 62-member HGEFS hybrid ensemble in March 2026.
HGEFS combines the GFS physics ensemble (31 members) with NOAA's new AIGEFS AI ensemble (31 members, built on DeepMind's GraphCast architecture). Two completely independent modeling families. Two fundamentally different approaches to the same forecast question.
When both agree, you have something worth looking at. When they disagree, you have uncertainty masquerading as a signal.
The filter is simple:
# predictandprofit.io
def models_agree(gfs_prob: float, aigefs_prob: float, min_agreement_delta: float = 0.15) -> bool:
    # Both must be on the same side of 0.50
    same_side = (gfs_prob > 0.50) == (aigefs_prob > 0.50)
    # And must not diverge more than 15 percentage points from each other
    close_enough = abs(gfs_prob - aigefs_prob) <= min_agreement_delta
    return same_side and close_enough
If GFS says 0.82 and AIGEFS says 0.78, both are bullish and within 4 points. Trade is eligible.
If GFS says 0.82 and AIGEFS says 0.55, both are technically bullish (above 0.50) but the AIGEFS confidence is weak enough to indicate real uncertainty. That 27-point divergence gets caught by min_agreement_delta. Trade is filtered.
If GFS says 0.82 and AIGEFS says 0.44, the models disagree on direction entirely. This is the atmosphere telling you something. Skip it.
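The three scenarios above, run through the filter (restated here so the snippet is self-contained):

```python
def models_agree(gfs_prob: float, aigefs_prob: float, min_agreement_delta: float = 0.15) -> bool:
    same_side = (gfs_prob > 0.50) == (aigefs_prob > 0.50)
    close_enough = abs(gfs_prob - aigefs_prob) <= min_agreement_delta
    return same_side and close_enough

print(models_agree(0.82, 0.78))  # True  -- same side, 4-point spread
print(models_agree(0.82, 0.55))  # False -- same side, but 27 points apart
print(models_agree(0.82, 0.44))  # False -- models disagree on direction
```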
In the first weeks after I deployed HGEFS with this filter, trade volume dropped by roughly half compared to the 31-member GFS-only system with the same edge thresholds. That is the filter working. I am not looking for activity. I am looking for high-probability setups where two independent modeling families are telling me the same thing.
What Happens When All Three Filters Run Together
The full pre-trade check runs the filters in sequence. Any failure aborts:
# predictandprofit.io
def is_tradeable(
    gfs_prob: float,
    aigefs_prob: float,
    market_price: float,
    contracts: int,
    max_fee_ratio: float = 0.25,
    min_confidence: float = 0.30,
    min_agreement_delta: float = 0.15,
    min_edge: float = 0.10,
) -> tuple[bool, str]:
    if not ensemble_confident(gfs_prob, min_confidence):
        return False, "gfs_confidence_below_threshold"
    if not ensemble_confident(aigefs_prob, min_confidence):
        return False, "aigefs_confidence_below_threshold"
    if not models_agree(gfs_prob, aigefs_prob, min_agreement_delta):
        return False, "model_disagreement"
    hgefs_prob = (gfs_prob + aigefs_prob) / 2.0
    edge = abs(hgefs_prob - market_price)
    if edge < min_edge:
        return False, "edge_below_threshold"
    if not fee_efficient(contracts, market_price, edge, max_fee_ratio):
        return False, "fee_kills_edge"
    return True, "eligible"
Every rejected trade logs its rejection reason. Over time, those logs tell you which filter is doing the most work. On my setup, the fee filter accounts for the largest share of rejections, concentrated around mid-probability contracts. Model disagreement is the second most common cause. GFS confidence rejections happen but are less frequent, because I tend to run scans during high-confidence forecast windows.
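A minimal way to turn those logs into a filter-by-filter tally, assuming each rejection is recorded as one reason string per trade (the log contents here are illustrative, not the bot's actual output):

```python
from collections import Counter

# Hypothetical rejection log -- one reason string per rejected trade
rejections = [
    "fee_kills_edge",
    "model_disagreement",
    "fee_kills_edge",
    "gfs_confidence_below_threshold",
    "fee_kills_edge",
    "model_disagreement",
]

tally = Counter(rejections)
for reason, count in tally.most_common():
    print(f"{reason}: {count}")
# fee_kills_edge: 3
# model_disagreement: 2
# gfs_confidence_below_threshold: 1
```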
What This Changed
Before the filters, the bot was taking 15 to 20 positions per week. After all three filters, it averages 3 to 5. Those 3 to 5 positions are measurably higher quality -- higher edge, better model agreement, lower fee drag.
The P&L is still a small sample size. Prediction market trading requires patience and a lot of settled contracts before the edge distribution becomes statistically meaningful. But the noise dropped noticeably. Marginal trades that were costing me money stopped appearing in the ledger.
That is what a filter system is supposed to do. Not find more trades. Find better ones.
The full source code -- including the complete filter logic, HGEFS ensemble pipeline, and Kalshi API integration -- is available at predictandprofit.gumroad.com. Use code REDDIT for 15% off.