< Back to Blog

Shipping a SQLite Bot to Customers When You Run PostgreSQL in Prod

TL;DR / Key Takeaways

  • The production bots run PostgreSQL with peer auth, connection pooling, and JSONB columns. Customers get SQLite. The translation is not trivial but it is manageable.
  • Key substitutions: REAL for NUMERIC, TEXT for JSONB, sqlite3.connect() for psycopg2, cursor.lastrowid for RETURNING id.
  • SQLite is not a compromise for a single-user trading bot on a VPS. It is the correct tool.
  • The Predict & Profit bot bundle ships with a migration script and SQLite schema out of the box, so customers are not guessing.

My production setup runs PostgreSQL 16 on the same Ubuntu 24 VPS as the bots. Peer authentication, connection pooling via a simple context manager, JSONB for storing ensemble breakdowns and signal payloads. It works. I know where everything is. The trade log has 12 columns and a partial index on open positions.

None of that ships to customers.

When someone buys the bot source code for $97, they get SQLite. One file. No server process. No auth config. No pg_hba.conf to fumble with at 11pm when the bot refuses to connect and CPI drops in 6 hours.

The packaging problem is real: I maintain two database layers from one codebase, and the translation rules have to be exact or the customer bot silently misbehaves. This post documents the full translation, the migration script structure, and why I stopped apologizing for shipping SQLite.


Why the Production Bot Uses PostgreSQL

The short answer is that I was already running Postgres for other things on this VPS, and the data engineering instinct is always to reach for the relational database you trust.

The longer answer involves JSONB. The Weather Bot logs a per-source probability breakdown for every trade candidate. That breakdown is a dictionary: four keys, four float values, one entry per ensemble source. In Postgres, that is a JSONB column. You can index it, query into it, aggregate across it. WHERE ensemble_data->>'ecmwf_ifs' > '0.75' just works.

The Econ Bot does something similar for signal payloads. The Cleveland Fed probability, the four FRED nudges, the BLS component read, the BEA PCE delta, the final composite. All of it lands in a single JSONB column per trade cycle. Querying it later for diagnostics is clean.

Peer auth means the bot process connects as the same OS user that owns the Postgres role. No password in the config. No password in the environment. On my VPS that is fine. For a customer on a different VPS with a different Linux user setup, it is a support ticket waiting to happen.

So: Postgres in prod, SQLite in the customer bundle. Pick your translation rules and stick to them.


The Translation Rules

JSONB to TEXT

SQLite has no native JSON column type. Everything is stored as TEXT, INTEGER, REAL, BLOB, or NULL. For the ensemble breakdown and signal payload columns, TEXT is fine. The bot serializes with json.dumps() before writing and deserializes with json.loads() after reading. The data is intact. You just lose the ability to query into the JSON at the SQL layer.

For a single-user trading bot, that is an acceptable trade. You are not running analytics across 10 million rows. You are logging trades so you can audit them later. TEXT works.

Production schema (Postgres):

ensemble_data JSONB,
signal_payload JSONB

Customer schema (SQLite):

ensemble_data TEXT,
signal_payload TEXT

In the Python layer, the write path looks like this in production:

# psycopg2 with JSONB — Postgres
cursor.execute(
    """
    INSERT INTO trade_decisions
    (market_ticker, direction, ensemble_data, signal_payload, decision, skip_reason)
    VALUES (%s, %s, %s::jsonb, %s::jsonb, %s, %s)
    """,
    (ticker, direction, json.dumps(ensemble), json.dumps(signals), decision, reason)
)

Customer version with SQLite:

# sqlite3 — customer bundle
cursor.execute(
    """
    INSERT INTO trade_decisions
    (market_ticker, direction, ensemble_data, signal_payload, decision, skip_reason)
    VALUES (?, ?, ?, ?, ?, ?)
    """,
    (ticker, direction, json.dumps(ensemble), json.dumps(signals), decision, reason)
)

Two changes: %s becomes ?, and the ::jsonb cast disappears. The json.dumps() call is identical in both paths.


NUMERIC to REAL

Postgres NUMERIC(10,4) gives you exact decimal arithmetic. It is the right type for money and probabilities when you care about rounding at the database layer. SQLite stores everything as IEEE 754 floating point under REAL. You lose the precision guarantee.

For a trading bot that is logging probabilities and dollar amounts, this matters at the margin. A probability of 0.6543 stored as Postgres NUMERIC comes back as exactly 0.6543. Stored as SQLite REAL it comes back as 0.6542999999999999 or similar depending on the float representation.

My call: accept it. The bot's decision logic uses Python floats throughout. The database is a log, not a ledger. If I were building a multi-user accounting system I would care more. For this use case, REAL is fine and the code is simpler.

Production:

probability     NUMERIC(10,6),
edge            NUMERIC(10,6),
fill_price      NUMERIC(10,4),
contracts       INTEGER,
pnl             NUMERIC(10,4)

Customer:

probability     REAL,
edge            REAL,
fill_price      REAL,
contracts       INTEGER,
pnl             REAL

psycopg2 to sqlite3

The connection layer is the biggest structural difference. Psycopg2 gives you a connection object you can configure with autocommit, server-side cursors, and connection pool hooks. The production bot uses a context manager that handles connection lifecycle and rolls back on exceptions.

Sqlite3 from the standard library is simpler. One call to sqlite3.connect() with the database file path. The context manager pattern is the same. Transactions behave slightly differently by default, but for the bot's write patterns it does not matter.

Production connection (psycopg2):

import psycopg2
from contextlib import contextmanager

@contextmanager
def get_db_connection():
    conn = psycopg2.connect(
        dbname="kalshi_trading",
        user="botuser",
        host="/var/run/postgresql"  # peer auth via Unix socket
    )
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()

Customer connection (sqlite3):

import sqlite3
from contextlib import contextmanager

DB_PATH = "trades.db"

@contextmanager
def get_db_connection():
    conn = sqlite3.connect(DB_PATH)
    conn.row_factory = sqlite3.Row  # lets you access columns by name
    try:
        yield conn
        conn.commit()
    except Exception:
        conn.rollback()
        raise
    finally:
        conn.close()

The row_factory = sqlite3.Row line is worth keeping. It makes fetched rows behave like dictionaries, so row["market_ticker"] works instead of row[0]. That matches the psycopg2 RealDictCursor behavior the rest of the code expects.


RETURNING to lastrowid

Postgres supports RETURNING id at the end of an INSERT. You get the new row's primary key back in the same round trip. Useful when you need to immediately associate child records with a parent.

# Postgres with RETURNING
cursor.execute(
    "INSERT INTO trade_decisions (...) VALUES (...) RETURNING id",
    values
)
new_id = cursor.fetchone()[0]

SQLite does not support RETURNING in older versions. The equivalent is cursor.lastrowid after the execute call, which returns the rowid of the last inserted row.

# SQLite with lastrowid
cursor.execute(
    "INSERT INTO trade_decisions (...) VALUES (...)",
    values
)
new_id = cursor.lastrowid

Note: SQLite 3.35+ added RETURNING support. Ubuntu 24 ships SQLite 3.45, so technically RETURNING works on a modern system. I still use lastrowid in the customer bundle because it keeps the code compatible with older environments and avoids surprises.


Partial Indexes

The production schema has a partial index on open positions:

CREATE INDEX idx_open_trades
ON trades (market_ticker, opened_at)
WHERE status = 'open';

SQLite supports partial indexes since version 3.8.9. The syntax is identical. This one actually ports cleanly with no changes.


Timestamps

Postgres has TIMESTAMPTZ. SQLite stores timestamps as TEXT in ISO 8601 format, or as INTEGER Unix timestamps.

I went with TEXT for the customer bundle. The bot writes timestamps using Python's datetime.utcnow().isoformat(). Sorting and filtering by timestamp string works correctly as long as the format is consistent. Querying "all trades in the last 24 hours" requires a bit more care than with a native timestamp type, but it works.

# Write
opened_at = datetime.utcnow().isoformat()

# Read and filter (SQLite)
cursor.execute(
    "SELECT * FROM trades WHERE opened_at > ? AND status = 'open'",
    (cutoff.isoformat(),)
)

The Migration Script

The customer bundle ships with setup_db.py. It creates the SQLite database and all tables from scratch. Running it twice is safe because every CREATE TABLE uses IF NOT EXISTS.

#!/usr/bin/env python3
"""
setup_db.py — Initialize the SQLite database for Predict & Profit bots.
Run once before first launch. Safe to re-run.
"""

import sqlite3
import os

DB_PATH = os.environ.get("BOT_DB_PATH", "trades.db")

SCHEMA = """
CREATE TABLE IF NOT EXISTS trade_decisions (
    id              INTEGER PRIMARY KEY AUTOINCREMENT,
    created_at      TEXT NOT NULL DEFAULT (datetime('now')),
    market_ticker   TEXT NOT NULL,
    event_ticker    TEXT,
    direction       TEXT NOT NULL,
    probability     REAL,
    edge            REAL,
    contracts       INTEGER,
    fill_price      REAL,
    decision        TEXT NOT NULL,
    skip_reason     TEXT,
    ensemble_data   TEXT,
    signal_payload  TEXT,
    source          TEXT
);

CREATE TABLE IF NOT EXISTS trades (
    id              INTEGER PRIMARY KEY AUTOINCREMENT,
    market_ticker   TEXT NOT NULL,
    direction       TEXT NOT NULL,
    contracts       INTEGER NOT NULL,
    fill_price      REAL NOT NULL,
    status          TEXT NOT NULL DEFAULT 'open',
    opened_at       TEXT NOT NULL DEFAULT (datetime('now')),
    closed_at       TEXT,
    close_price     REAL,
    pnl             REAL,
    close_reason    TEXT
);

CREATE INDEX IF NOT EXISTS idx_open_trades
ON trades (market_ticker, opened_at)
WHERE status = 'open';

CREATE INDEX IF NOT EXISTS idx_decisions_ticker
ON trade_decisions (market_ticker, created_at);
"""

def main():
    conn = sqlite3.connect(DB_PATH)
    conn.executescript(SCHEMA)
    conn.commit()
    conn.close()
    print(f"Database initialized: {DB_PATH}")

if __name__ == "__main__":
    main()

That is the entire setup. No server process, no role grants, no pg_hba.conf. Customer runs python setup_db.py and moves on.


Why SQLite Is Actually the Right Call Here

I spent more time than I should have feeling vaguely apologetic about shipping SQLite. It felt like a step down from "real" infrastructure.

It is not.

A trading bot for a single user on a single VPS has one writer at a time. The bot process inserts a row, the bot process reads it back, the bot process updates it on settlement. There is no concurrent write contention. There is no replication requirement. There is no need for server-side cursors or connection pooling across multiple application instances.

SQLite handles 100,000 writes per second on spinning rust. The bot fires maybe 20 database operations per scan cycle. The headroom is absurd.

The argument for Postgres in production is real: I run other things on the same server that share the database, I want JSONB querying for diagnostics, and I am comfortable with the operational overhead. That context does not exist for the customer. The customer is running one bot on one VPS and wants to get trades logging without spending an hour on database setup.

SQLite also gives customers something Postgres does not: a portable audit log. The trades.db file is a single binary you can copy to your laptop, open in DB Browser for SQLite, and inspect visually. No dump, no restore, no psql. That matters for people who are not database engineers.

The one real limitation is concurrent reads. If you build a separate analytics dashboard that queries the database while the bot is writing, you can hit locking issues. The fix is timeout=10 in sqlite3.connect(), which makes readers wait up to 10 seconds for the write lock to release rather than raising immediately. For a solo bot, that is more than enough.

conn = sqlite3.connect(DB_PATH, timeout=10)

Maintaining Both Layers

The practical question is how to avoid diverging the two schemas over time. My approach is simple: the SQLite schema is the canonical customer schema. When I add a column, I add it to the SQLite setup_db.py first, then add the equivalent Postgres migration. The Python bot code uses the abstracted context manager and never calls psycopg2 or sqlite3 directly outside of db/connection.py.

The only file that knows which database backend is running is db/connection.py. Everything else just calls get_db_connection() and uses ? placeholders. That discipline keeps the translation layer contained.

Is it a little annoying? Yes. Has it caused a customer bug in 12+ sales? Not one that I know of.


The real lesson from shipping this is that packaging complexity is a product problem as much as a technical one. PostgreSQL is the right tool for my production setup. SQLite is the right tool for a customer who just wants the bot running. Pretending those are the same context would have made the product worse. Pick the right tool for each environment, write the translation once, and document it so you are not re-figuring it out six months later.

The full schema and connection layer ship with the bot bundle. Everything described here is in the code.