Sprint Velocity Is a Measure of Effort. My Bot Measures Outcomes. There Is a Difference.
TL;DR / Key Takeaways
- Sprint velocity measures how fast a team moves. It does not measure whether anyone is moving in a useful direction. These are completely different things.
- Automated trading systems have no sprint board. There is a P&L ledger. Either the bot made money this week or it did not.
- The psychological comfort of story point velocity is that it gives you something to show even when nothing ships. The discomfort of a P&L ledger is that it never lies.
- Building your own systems does not eliminate failure. It eliminates the ability to hide failure behind process metrics.
I spent years in enterprise software planning rooms. Sticky notes on walls. Fibonacci story point estimation. Velocity charts on monitors. Burndown graphs nobody looked at past Wednesday. The whole ritual.
Every two weeks the team would review its velocity. Were we moving faster than last sprint? Slower? Did we have too much carryover? The retrospective would generate action items. The action items would be turned into tickets. The tickets would be estimated in story points. And the cycle would continue.
At no point in that process did anyone ask the question I now ask my trading bot every morning: did this generate a return?
What Velocity Actually Measures
Sprint velocity, in the Scrum sense, is a count of story points completed per sprint. Story points are a relative measure of effort and complexity, not value. A five-point story might ship a feature that brings in $2 million in new revenue. It might also ship a feature that nobody uses. Velocity does not know the difference. The chart goes up either way.
This is not a knock on engineering teams. I have worked with brilliant engineers in Agile environments. The problem is structural. Velocity is easy to measure, so it becomes the proxy for productivity. Productivity is hard to measure in software, so the proxy becomes the thing that gets optimized. Teams optimize their velocity by getting better at estimation, splitting stories smaller, reducing carryover. None of that is the same as building things people actually need.
The metric is honest about what it is: a measure of throughput. The dishonesty is in treating throughput as a stand-in for value.
The Bot Does Not Have a Sprint Board
My weather trading bot runs on a headless Ubuntu VM. It pulls 62-member HGEFS ensemble data every six hours via the Open-Meteo API, scores Kalshi temperature markets on four factors, filters out anything below its minimum thresholds, and executes trades automatically. It does not have a sprint board. It does not have story points. It does not have a product owner writing acceptance criteria.
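The post does not show the scoring step itself, but the score-then-filter logic it describes can be sketched roughly like this. The four factor names and the threshold values below are my own illustrative guesses, not the bot's actual factors or production settings:

```python
# Hypothetical sketch of a score-then-filter step. Factor names and
# thresholds are illustrative placeholders, not the production values.
from dataclasses import dataclass

@dataclass
class MarketScore:
    edge: float           # model probability minus market price
    confidence: float     # agreement across ensemble members, 0..1
    liquidity: float      # contracts available at acceptable prices
    time_to_close: float  # hours until market settlement

MIN_THRESHOLDS = {
    "edge": 0.03,
    "confidence": 0.60,
    "liquidity": 50,
    "time_to_close": 2.0,
}

def passes_filters(s: MarketScore) -> bool:
    """Every factor must clear its minimum; a single failure rejects the market."""
    return (s.edge >= MIN_THRESHOLDS["edge"]
            and s.confidence >= MIN_THRESHOLDS["confidence"]
            and s.liquidity >= MIN_THRESHOLDS["liquidity"]
            and s.time_to_close >= MIN_THRESHOLDS["time_to_close"])
```

The point is the shape, not the numbers: every market gets the same deterministic check, and anything below a minimum never reaches the order logic.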
It has a PostgreSQL database with a trades table. Every executed trade has a pnl column. At the end of each day I can run one query and know exactly what happened.
```sql
SELECT
    trade_date,
    COUNT(*) AS trades_executed,
    SUM(pnl) AS daily_pnl,
    AVG(pnl) AS avg_pnl_per_trade,
    SUM(CASE WHEN pnl > 0 THEN 1 ELSE 0 END) AS winners,
    SUM(CASE WHEN pnl < 0 THEN 1 ELSE 0 END) AS losers
FROM trades
WHERE trade_date >= CURRENT_DATE - INTERVAL '14 days'
GROUP BY trade_date
ORDER BY trade_date DESC;
```
That is the retrospective. Fourteen rows. No action items. No process improvement stories. Just whether the system worked or it did not.
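The same retrospective fits in a few lines of Python, if you prefer to pull the per-trade P&L values out of the table and summarize them in code. The numbers below are made up for illustration:

```python
# Minimal sketch: summarizing a list of per-trade P&L values
# (values are illustrative, not real trade data).
def summarize(pnls: list[float]) -> dict:
    winners = sum(1 for p in pnls if p > 0)
    losers = sum(1 for p in pnls if p < 0)
    return {
        "trades": len(pnls),
        "total_pnl": round(sum(pnls), 2),
        "win_rate": round(winners / len(pnls), 3) if pnls else 0.0,
        "winners": winners,
        "losers": losers,
    }

print(summarize([1.20, -0.45, 0.80, -0.30, 0.55]))
# → {'trades': 5, 'total_pnl': 1.8, 'win_rate': 0.6, 'winners': 3, 'losers': 2}
```

Either way, the output is the same kind of answer: a number, not a narrative.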
The Discomfort Is the Feature
Here is what nobody tells you about escaping the velocity treadmill: the alternative is harder, not easier.
Story point velocity is psychologically comfortable because it rewards activity. As long as you are closing tickets, the metric is improving. The P&L ledger rewards nothing except results. You can run the bot for two weeks, deploy three new signal improvements, refactor the database layer, add a new exponential backoff pattern, and if the market conditions were unfavorable, the ledger shows a flat or negative return. The effort is invisible to the metric.
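The exponential backoff mentioned above is exactly the kind of work the ledger never sees. For readers unfamiliar with the pattern, here is a minimal sketch; the retry count and delays are illustrative, not the bot's actual settings:

```python
# Hedged sketch of retry-with-exponential-backoff; attempt counts and
# delays are illustrative placeholders, not the bot's configuration.
import time

def with_backoff(fn, max_attempts: int = 5, base_delay: float = 1.0):
    """Call fn(), retrying on exception with doubling delays: 1s, 2s, 4s, ..."""
    for attempt in range(max_attempts):
        try:
            return fn()
        except Exception:
            if attempt == max_attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * (2 ** attempt))
```

Work like this makes the system more robust, but in a bad fortnight for the markets it contributes nothing visible to the bottom line.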
That is painful in a way that burndown charts never are. When the bot loses money, there is nowhere to point except at the model. Was the ensemble spread not wide enough? Was the confidence threshold set too low? Was the fee calculation off on a batch of trades? Those are real questions with real answers, and the P&L is telling you to go find them.
In corporate engineering, a bad sprint ends with a retrospective and a list of process improvements. A bad two weeks on the bot ends with me in the SQLite logs at 11 PM figuring out why three trades that should have been filtered out were not.
One of those processes produces action items. The other produces a better engineer.
The Alignment That Actually Matters
There is a version of "stakeholder alignment" that enterprise software teams do endlessly: syncing roadmaps, getting sign-off, reviewing priorities. It exists because there are multiple people with different interests all looking at the same system.
My system has one stakeholder: the P&L. Every decision — whether to add a new data source, whether to tighten a filter threshold, whether to expand to a new market type — runs through the same question: does this improve the expected return? If yes, build it. If not, skip it.
That sounds obvious until you have spent twenty years in rooms where the right answer to "should we build this?" was "who wants it?" rather than "what does the data say?"
The Kalshi fee formula is a good example. The bot calculates the expected fee on every potential trade before submitting. If the fee exceeds a set percentage of the expected edge, the trade is filtered out. No meeting required. No escalation. The math runs in real time and either clears the filter or does not.
```python
# predictandprofit.io
def fee_is_acceptable(price: float, contracts: int, edge: float,
                      max_fee_ratio: float = 0.25) -> bool:
    """
    Kalshi fee formula: 0.07 * C * P * (1 - P).
    Returns True if the fee does not consume more than max_fee_ratio
    of the expected edge.
    """
    fee = 0.07 * contracts * price * (1 - price)
    expected_edge_dollars = edge * contracts
    if expected_edge_dollars <= 0:
        return False
    return (fee / expected_edge_dollars) <= max_fee_ratio
```
That function runs hundreds of times per day. It has no opinions, no politics, and no preference for what the answer should be. Either the trade clears the filter or it does not.
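To make the arithmetic concrete, here is the same check worked through on a single hypothetical trade (the price, size, and edge are invented for illustration):

```python
# Worked example of the fee check on a hypothetical trade
# (price, contract count, and edge are illustrative, not real data).
price = 0.40      # contract price in dollars
contracts = 100
edge = 0.05       # expected edge per contract, in dollars

fee = 0.07 * contracts * price * (1 - price)  # Kalshi fee formula
expected_edge_dollars = edge * contracts

print(f"fee = ${fee:.2f}, edge = ${expected_edge_dollars:.2f}, "
      f"ratio = {fee / expected_edge_dollars:.2f}")
# → fee = $1.68, edge = $5.00, ratio = 0.34 — above 0.25, so the trade is filtered out
```

A 34% fee-to-edge ratio fails the 25% ceiling, so this trade never reaches the exchange. No one had to decide anything.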
I have written production logic more impactful than that function in Confluence comments during alignment meetings, and watched it go nowhere for six months.
What 30 Years Built Toward
I did not come into this cynical. I believed in the process. I thought velocity charts and sprint retrospectives were genuine attempts to build better engineering culture. Some of them were. I worked on teams where Agile practices actually helped — small, focused, empowered teams where the sprint board reflected real decisions and the retrospective changed real behavior.
But those teams were the exception, not the rule. More often the process was theater. A way for management to feel like they were in control of something that does not respond well to control. Software development is inherently unpredictable. Story points are a way to make the unpredictability feel manageable by measuring the one part of it you can actually count.
When I built Predict & Profit I made a deliberate choice to skip all of that. The bot runs on a single Ubuntu VM. The architecture fits in my head. Every component exists because it demonstrably improves the outcome or it does not exist at all. The 62-member HGEFS ensemble upgrade happened because backtesting showed a measurable improvement in edge detection accuracy. Not because it was on a roadmap. Not because a product manager prioritized it. Because the data said it was worth it.
That is not a workflow. It is a discipline. And it is the most productive I have been in 30 years of building software.
This Is Not Anti-Engineering
To be clear: I am not arguing against structure. I am arguing against structure that measures the wrong thing.
Good sprint teams do exist. Good Agile processes do exist. If you are working in one, hold onto it. But if you are sitting in your third velocity planning session of the month wondering why none of it seems connected to what your users actually need, the problem is probably not that you are estimating in the wrong Fibonacci numbers.
The bot does not care what I estimated. It cares what I built. That distinction is worth something.
If you are a developer who is tired of measuring effort and wants to build a system that measures outcomes, the Predict & Profit bot source code is a working example. Real trades. Real edge logic. Real P&L. No story points.
Get the Source Code — $67 — use code REDDIT for 15% off.