Cron Jobs Are the Most Underrated Automation Tool in a Solo Builder's Stack
TL;DR / Key Takeaways
- Cron is older than most of your coworkers and more reliable than most scheduled task frameworks you will ever use.
- The >> logfile 2>&1 pattern is the single most useful thing you can add to any crontab entry.
- Real recurring work (DB exports, blog post generation, bot health checks, results JSON updates) belongs in cron, not in a task queue you have to babysit.
- The Predict & Profit trading bots run on a RackNerd VPS, and cron is what stitches the whole operation together between bot cycles.
Every time I see a solo developer reach for Celery, Airflow, or a cloud scheduler to run a script once a day, I feel a little tired.
Not judgment. Just fatigue. Because I did the same thing for years in corporate environments where the answer to "how do we run this nightly?" was always a new service, a new queue, a new deployment, and three JIRA tickets.
Cron does not care about any of that. Cron runs your script. Cron goes to sleep. Cron runs it again tomorrow.
That is the entire product.
What Cron Actually Is
Cron is a time-based job scheduler that has been part of Unix since the 1970s, well before Linux existed. It reads a configuration file called a crontab, checks it every minute, and runs whatever commands are scheduled for that minute.
No daemon you have to manage separately. No web UI. No concept of retries, dependencies, or DAGs. If you need retries and dependencies, you want something else. For recurring scripts that should just run, cron is the answer.
The crontab syntax trips people up once and then they never forget it.
# ┌───────────── minute (0–59)
# │ ┌───────────── hour (0–23)
# │ │ ┌───────────── day of month (1–31)
# │ │ │ ┌───────────── month (1–12)
# │ │ │ │ ┌───────────── day of week (0–7, Sunday is 0 or 7)
# │ │ │ │ │
# * * * * * command to execute
Five fields. Stars mean "every." Numbers mean "at this specific value." Commas mean "and also." Hyphens mean "from this through that." Slashes mean "every N."
A few examples before we get to the real crontab:
# Run at 6:30 AM every day
30 6 * * * /home/steve/scripts/morning_export.sh
# Run every 15 minutes
*/15 * * * * /home/steve/scripts/healthcheck.py
# Run at midnight on the first of every month
0 0 1 * * /home/steve/scripts/monthly_summary.py
# Run Monday through Friday at 8 AM
0 8 * * 1-5 /home/steve/scripts/weekday_report.py
If you can never remember the order, the mnemonic I use is: minutes before hours, like reading a clock backwards. The rest is just logic.
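If you want to sanity-check a schedule before installing it, the field rules above are small enough to sketch in Python. This is a simplified illustration, not how cron itself is implemented: it handles only numeric fields (no month or weekday names, no @daily shortcuts), and the function names expand_field and cron_matches are made up for this example.

```python
from datetime import datetime

# (min, max) for each of the five fields, in crontab order:
# minute, hour, day of month, month, day of week
FIELD_RANGES = [(0, 59), (0, 23), (1, 31), (1, 12), (0, 7)]


def expand_field(field: str, lo: int, hi: int) -> set[int]:
    """Expand one crontab field such as '*/15', '1-5' or '0,30' into a set of ints."""
    values: set[int] = set()
    for part in field.split(","):          # commas: "and also"
        step = 1
        if "/" in part:                    # slashes: "every N"
            part, step_text = part.split("/")
            step = int(step_text)
        if part == "*":                    # stars: "every"
            start, end = lo, hi
        elif "-" in part:                  # hyphens: inclusive range
            start_text, end_text = part.split("-")
            start, end = int(start_text), int(end_text)
        else:                              # plain number
            start = end = int(part)
        values.update(range(start, end + 1, step))
    return values


def cron_matches(expr: str, when: datetime) -> bool:
    """Return True if the five-field expression would fire at the given minute."""
    fields = expr.split()
    if len(fields) != 5:
        raise ValueError("expected exactly five fields")
    minute, hour, dom, month, dow = (
        expand_field(f, lo, hi) for f, (lo, hi) in zip(fields, FIELD_RANGES)
    )
    dow = {d % 7 for d in dow}             # cron accepts both 0 and 7 for Sunday
    today = when.isoweekday() % 7          # Python: Monday=1..Sunday=7 -> Sunday=0

    time_ok = when.minute in minute and when.hour in hour and when.month in month
    dom_ok = when.day in dom
    dow_ok = today in dow
    # Classic cron behavior: if both day fields are restricted, either one matching
    # is enough. (Simplification: only a bare "*" counts as unrestricted here.)
    if fields[2] != "*" and fields[4] != "*":
        return time_ok and (dom_ok or dow_ok)
    return time_ok and dom_ok and dow_ok
```

For instance, cron_matches("0 8 * * 1-5", datetime(2024, 1, 15, 8, 0)) is true because January 15, 2024 was a Monday, while the same call for Sunday the 14th is false.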
The One Pattern That Makes Cron Actually Useful
Raw cron output goes to the system mail spool by default, which nobody checks. If your script crashes at 3am, you find out the next day when results are missing. Sometimes you find out the day after that.
The fix is four characters: 2>&1
This redirects stderr to stdout, so error messages go to the same place as normal output. Then you append all of it to a log file:
30 6 * * * /home/steve/scripts/morning_export.sh >> /home/steve/logs/morning_export.log 2>&1
Now every run appends its output to a log file you can actually inspect. Your cron job failed at 3am? tail -50 /home/steve/logs/morning_export.log tells you exactly why.
One more thing: always use full absolute paths in crontab. Cron runs with a stripped-down environment. Your PATH is not what you think it is. If your script calls python3, use /usr/bin/python3. If it needs a virtualenv, activate it explicitly or use the full path to the venv's Python binary.
# Wrong: relies on PATH being set correctly
30 6 * * * python3 /home/steve/scripts/export.py
# Right: explicit path, explicit virtualenv
30 6 * * * /home/steve/venv/bin/python3 /home/steve/scripts/export.py >> /home/steve/logs/export.log 2>&1
This is the difference between a cron job that runs and one that silently does nothing.
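Both habits (absolute paths, log redirect) are easy to check mechanically. Below is a rough sketch of a linter for single crontab lines. It is deliberately naive: it uses plain string checks rather than real shell parsing, it does not understand variable assignments such as PATH=... at the top of a crontab, and lint_cron_entry is a name invented for this example.

```python
def lint_cron_entry(line: str) -> list[str]:
    """Return a list of warnings for one crontab line (empty list means it looks fine)."""
    stripped = line.strip()
    if not stripped or stripped.startswith("#"):
        return []                              # blank lines and comments are fine

    fields = stripped.split(None, 5)           # five schedule fields + the command
    if len(fields) < 6:
        return ["expected five schedule fields followed by a command"]

    command = fields[5]
    warnings = []

    # The executable should not depend on cron's minimal PATH.
    if not command.split()[0].startswith("/"):
        warnings.append("command does not start with an absolute path")

    # Look for a stdout redirect (> or >>), ignoring the '>' inside '2>&1'.
    if ">" not in command.replace("2>&1", ""):
        warnings.append("stdout is not redirected to a log file")

    # Without 2>&1, tracebacks and error messages never reach the log.
    if "2>&1" not in command:
        warnings.append("stderr is not redirected (add 2>&1 after the log redirect)")

    return warnings


if __name__ == "__main__":
    for entry in (
        "30 6 * * * python3 /home/steve/scripts/export.py",
        "30 6 * * * /home/steve/venv/bin/python3 /home/steve/scripts/export.py"
        " >> /home/steve/logs/export.log 2>&1",
    ):
        print(entry)
        for warning in lint_cron_entry(entry) or ["ok"]:
            print("   ", warning)
```

Piping the output of crontab -l through something like this before trusting a new entry catches the two most common silent failures.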
My Actual Crontab on the RackNerd VPS
This is the real one, lightly edited to remove paths that include account names. Everything else is as it runs.
# Edit with: crontab -e
# View with: crontab -l
# --- WEATHER BOT ---
# Run weather bot every 6 hours (new GFS cycles at 00z, 06z, 12z, 18z)
0 1,7,13,19 * * * /home/steve/venv/bin/python3 /home/steve/kalshiTrading/auto_trader.py --dry-run=false >> /home/steve/logs/weather_bot.log 2>&1
# Daily weather bot health check (just verify it can auth and fetch a forecast)
30 5 * * * /home/steve/venv/bin/python3 /home/steve/kalshiTrading/healthcheck.py >> /home/steve/logs/weather_healthcheck.log 2>&1
# --- ECON BOT ---
# CPI/PCE scan runs twice daily at 12:00 and 18:00 UTC (8 AM and 2 PM Eastern during daylight time)
# (Cleveland Fed nowcast updates mid-morning, BLS data drops early AM on release days)
0 12,18 * * * /home/steve/venv/bin/python3 /home/steve/kalshiEconTrading/econ_auto_trader.py >> /home/steve/logs/econ_bot.log 2>&1
# --- RESULTS ---
# Export trade results to JSON for website display
# Runs nightly at 11 PM
0 23 * * * /home/steve/venv/bin/python3 /home/steve/scripts/export_results_json.py >> /home/steve/logs/results_export.log 2>&1
# --- BLOG ---
# Automated blog post generation (calls Anthropic API with CLAUDE.md context)
# Runs Sundays at 8 AM
0 8 * * 0 /home/steve/venv/bin/python3 /home/steve/scripts/generate_blog_post.py >> /home/steve/logs/blog_gen.log 2>&1
# --- DB BACKUP ---
# SQLite DB export to /backups, keeps last 7 days
0 2 * * * /home/steve/scripts/db_backup.sh >> /home/steve/logs/db_backup.log 2>&1
# --- LOG ROTATION ---
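# Delete log files not modified in over 30 days (plain find and delete, not a logrotate config)
# Note: a log that is still being appended to never ages out under -mtime, so active logs keep growing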
0 3 * * 0 find /home/steve/logs -name "*.log" -mtime +30 -delete
A few things worth explaining here.
The weather bot runs at 01z, 07z, 13z, and 19z rather than the GFS cycle times of 00z, 06z, 12z, 18z. GFS data is not available the moment the cycle ticks over. It takes 3.5 to 5 hours to process and appear on Open-Meteo and AWS S3. So a run one hour past the tick works from the previous cycle, which has been fully published for a couple of hours, rather than from a cycle that is still uploading. That keeps the bot on the newest complete forecast without hitting partial data.
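The timing arithmetic is easy to get wrong, so here is a small sketch that works out which GFS cycle a given run can actually see. It takes the 3.5 to 5 hour publication delay above at face value; the function name and the latency_hours parameter are assumptions made for this illustration, not part of the bot.

```python
CYCLE_HOURS = (0, 6, 12, 18)  # GFS cycle start times, UTC


def newest_complete_cycle(run_hour_utc: float, latency_hours: float = 5.0) -> int:
    """Return the start hour (UTC) of the newest cycle fully published by run_hour_utc.

    A cycle counts as complete once at least latency_hours have passed since it started.
    """
    best_age = None
    best_cycle = None
    for cycle in CYCLE_HOURS:
        # Hours elapsed since this cycle last started, wrapping around midnight.
        age = (run_hour_utc - cycle) % 24
        if age >= latency_hours and (best_age is None or age < best_age):
            best_age, best_cycle = age, cycle
    return best_cycle


if __name__ == "__main__":
    for run_hour in (1, 7, 13, 19):
        cycle = newest_complete_cycle(run_hour)
        print(f"run at {run_hour:02d}z uses the {cycle:02d}z cycle")
```

With the worst-case 5 hour delay, the 07z run works from the 00z cycle and the 01z run works from the previous day's 18z cycle.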
The econ bot runs at 12z and 18z, which is 8am and 2pm Eastern during daylight time (7am and 1pm in winter, because the schedule is in UTC and does not follow DST). Cleveland Fed updates their nowcast mid-morning Eastern. BLS CPI drops at 8:30am Eastern on release days. The morning run therefore fires just before the BLS release, and the afternoon run is the first to see the new print, along with any Cleveland Fed updates that came in after the morning cycle.
The blog post generation script is the one most people ask about. It is just a Python script that reads the CLAUDE.md context file, loads a subject queue, picks the next unused subject, calls the Anthropic API, and writes the resulting markdown to the content directory. Then a separate deploy hook picks it up. Total runtime under 90 seconds. Runs Sunday morning while I am still asleep.
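The queue-handling part of that flow can be sketched without any API call at all. This is not the script from the VPS: the JSON queue format (a list of objects with "subject" and "used" keys), the slug rule, and the name publish_next_post are all assumptions for illustration, and the text generation step is passed in as a plain function so the sketch stays independent of any particular SDK.

```python
import json
import re
from pathlib import Path
from typing import Callable, Optional


def publish_next_post(
    queue_path: Path,
    content_dir: Path,
    generate: Callable[[str], str],
) -> Optional[Path]:
    """Pick the next unused subject, generate markdown for it, and mark it used.

    Assumed queue format: [{"subject": "...", "used": false}, ...]
    `generate` stands in for whatever produces the post body (e.g. an API call).
    """
    queue = json.loads(queue_path.read_text())
    entry = next((item for item in queue if not item.get("used")), None)
    if entry is None:
        print("Subject queue is exhausted; nothing to publish.")
        return None

    markdown = generate(entry["subject"])

    # Turn "Cron Jobs 101" into "cron-jobs-101" for the filename.
    slug = re.sub(r"[^a-z0-9]+", "-", entry["subject"].lower()).strip("-")
    content_dir.mkdir(parents=True, exist_ok=True)
    post_path = content_dir / f"{slug}.md"
    post_path.write_text(markdown)

    # Only mark the subject as used after the post was written successfully.
    entry["used"] = True
    queue_path.write_text(json.dumps(queue, indent=2))

    print(f"Wrote {post_path} ({len(markdown)} characters).")
    return post_path
```

Marking the subject as used only after the file is written means a failed run simply retries the same subject the following Sunday, and the print lines land in the cron log like everything else.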
The DB Backup Script
This one is worth showing because it is a common pattern and the details matter.
#!/bin/bash
# db_backup.sh
# Backs up SQLite trade database with a datestamped filename
# Keeps last 7 days, deletes older files automatically
set -euo pipefail
DB_PATH="/home/steve/kalshiTrading/trades.db"
BACKUP_DIR="/home/steve/backups"
DATE=$(date +%Y%m%d)
BACKUP_FILE="$BACKUP_DIR/trades_$DATE.db"
# Create backup directory if it doesn't exist
mkdir -p "$BACKUP_DIR"
# SQLite backup using the .backup command (safe for live DB)
sqlite3 "$DB_PATH" ".backup '$BACKUP_FILE'"
echo "Backup complete: $BACKUP_FILE ($(du -h "$BACKUP_FILE" | cut -f1))"
# Delete backups older than 7 days
find "$BACKUP_DIR" -name "trades_*.db" -mtime +7 -delete
echo "Old backups cleaned."
The sqlite3 .backup command is important. Doing a raw cp on a live SQLite database can give you a corrupt backup if a write is in progress. The .backup command uses SQLite's online backup API, which is designed to copy a database safely while it is in use.
set -euo pipefail at the top means the script exits immediately on any error, treats unset variables as errors, and catches failures in pipes. Without this, a bash script will happily continue past a failed command and log "success" when it didn't succeed.
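If you would rather keep the backup inside Python, the standard library exposes the same online backup mechanism through Connection.backup (available since Python 3.7). The sketch below is an alternative to the shell script, not a replacement the author uses; backup_sqlite is a hypothetical helper name.

```python
import sqlite3
from pathlib import Path


def backup_sqlite(src_path: Path, dest_path: Path) -> int:
    """Copy a (possibly live) SQLite database to dest_path and return its size in bytes."""
    if not src_path.exists():
        # sqlite3.connect() would silently create an empty database otherwise.
        raise FileNotFoundError(f"source database not found: {src_path}")

    dest_path.parent.mkdir(parents=True, exist_ok=True)
    src = sqlite3.connect(src_path)
    dest = sqlite3.connect(dest_path)
    try:
        # Uses SQLite's online backup API, the same mechanism behind the
        # sqlite3 shell's .backup command.
        src.backup(dest)
    finally:
        dest.close()
        src.close()
    return dest_path.stat().st_size
```

An uncaught exception here exits the script with a traceback on stderr, which is why the 2>&1 redirect in the crontab entry matters.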
The Results JSON Export
The website displays recent trade results pulled from a JSON file. This is simpler than connecting the website directly to the database. The export script queries the SQLite DB and writes a structured JSON file that the frontend reads.
#!/usr/bin/env python3
# export_results_json.py
# Runs nightly via cron. Exports last 90 days of settled trades to JSON.
import json
import sqlite3
from datetime import datetime, timedelta
from pathlib import Path
DB_PATH = Path("/home/steve/kalshiTrading/trades.db")
OUTPUT_PATH = Path("/home/steve/predictandprofit-site/public/data/results.json")

def export_results():
    cutoff = (datetime.utcnow() - timedelta(days=90)).isoformat()

    conn = sqlite3.connect(DB_PATH)
    conn.row_factory = sqlite3.Row
    cursor = conn.cursor()
    cursor.execute("""
        SELECT
            trade_date,
            market_ticker,
            side,
            contracts,
            entry_price,
            exit_price,
            pnl_cents,
            status,
            close_reason
        FROM trades
        WHERE status = 'settled'
          AND trade_date >= ?
        ORDER BY trade_date DESC
    """, (cutoff,))
    rows = [dict(row) for row in cursor.fetchall()]
    conn.close()

    summary = {
        "generated_at": datetime.utcnow().isoformat() + "Z",
        "trade_count": len(rows),
        "total_pnl_cents": sum(r["pnl_cents"] for r in rows),
        "trades": rows
    }

    OUTPUT_PATH.parent.mkdir(parents=True, exist_ok=True)
    OUTPUT_PATH.write_text(json.dumps(summary, indent=2))
    print(f"Exported {len(rows)} trades. Total PnL: {summary['total_pnl_cents']} cents.")


if __name__ == "__main__":
    export_results()
The script prints a summary line that ends up in the log. So tail -5 /home/steve/logs/results_export.log tells me exactly how many trades exported and what the total PnL over the 90-day window is. No dashboard required.
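One optional hardening step, if the frontend might read the file while the export is running: write to a temporary file and swap it into place, so readers see either the old JSON or the new one and never a half-written file. The script above writes directly with write_text; the variation below is a sketch, and write_json_atomic is a hypothetical helper name.

```python
import contextlib
import json
import os
import tempfile
from pathlib import Path


def write_json_atomic(path: Path, payload: dict) -> None:
    """Write payload as JSON to path by way of a temp file and an atomic rename."""
    path.parent.mkdir(parents=True, exist_ok=True)
    # The temp file must live in the same directory so the rename stays on
    # one filesystem, which is what makes os.replace atomic.
    fd, tmp_name = tempfile.mkstemp(dir=path.parent, suffix=".tmp")
    try:
        with os.fdopen(fd, "w") as tmp:
            json.dump(payload, tmp, indent=2)
        # mkstemp creates files readable only by the owner; a web server
        # running as another user needs read access.
        os.chmod(tmp_name, 0o644)
        os.replace(tmp_name, path)
    except BaseException:
        # Clean up the temp file if anything failed before the rename.
        with contextlib.suppress(FileNotFoundError):
            os.unlink(tmp_name)
        raise
```

In the export script this would stand in for the OUTPUT_PATH.write_text(...) line, called as write_json_atomic(OUTPUT_PATH, summary).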
Why Not Airflow, Prefect, or a Cloud Scheduler
I have used all three in production at the day job. They are good tools for the right problem.
The right problem is: complex DAGs with multiple dependent tasks, retries with backoff, a team that needs visibility into pipeline runs, and more than one machine executing work.
That is not my problem. My problem is "run this script at 7am every day and tell me if it failed."
Airflow requires a database backend, a web server, a scheduler process, and a worker process. For a solo builder on a $30/month VPS, that overhead is absurd. Prefect has a cloud service and a local agent model and still requires more moving parts than the task justifies.
Cron has been running correctly on Unix systems since before I had my first programming job. Its failure mode is well understood. It has no dependencies. When the VPS reboots, cron starts automatically.
For a solo builder running a handful of recurring scripts, cron wins on reliability and simplicity every time. Add logging with >> logfile 2>&1, use absolute paths, and test every new entry manually before trusting it to run unattended. That is the entire operational discipline required.
The One Gotcha That Will Bite You
Environment variables. Specifically, the ones your scripts depend on but cron does not have.
API keys, database connection strings, anything you set in .bashrc or .bash_profile, none of that is available to cron jobs by default. Cron runs with a minimal environment.
The fix I use is a .env file loaded explicitly at the top of each Python script:
from dotenv import load_dotenv
load_dotenv("/home/steve/.env")
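Or for bash scripts, source the env file directly. If the file holds plain KEY=value lines, wrap the source in set -a and set +a so the variables are also exported to any programs the script launches:
set -a
source /home/steve/.env
set +a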
That .env file lives on the server, never in the repository, and contains every API key the scripts need. Cron jobs can now see those variables regardless of the shell context they run in, as long as the user that owns the crontab can read the file.
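A missing variable is much easier to debug when the script refuses to start than when an API call fails ten steps later. Here is a minimal fail-fast check, assuming it runs right after the env file is loaded; require_env and the variable names in the usage note are placeholders, not names from the real scripts.

```python
import os
from typing import Mapping, Optional, Sequence


def require_env(
    names: Sequence[str],
    environ: Optional[Mapping[str, str]] = None,
) -> dict:
    """Return the requested variables, or raise if any are missing or empty.

    `environ` defaults to os.environ; passing a dict makes the check easy to test.
    """
    env = os.environ if environ is None else environ
    missing = [name for name in names if not env.get(name)]
    if missing:
        # Under cron this message lands in the log file via the 2>&1 redirect.
        raise RuntimeError("Missing environment variables: " + ", ".join(missing))
    return {name: env[name] for name in names}
```

A script would call something like require_env(["API_KEY", "DB_URL"]) once at startup, so a broken environment shows up as one clear line in the log instead of a confusing failure further down.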
Cron is not exciting. It does not have a logo or a conference or a Slack community. It is just a scheduler that has worked correctly for fifty years and will keep working correctly after every framework you evaluate this year is deprecated.
If you are running a solo project on a Linux server, open crontab -e and move your recurring scripts there. Add the log redirect. Use absolute paths. You will spend less time on infrastructure and more time on the work that actually matters.
That is the whole post.