PM2: The Process Manager That Keeps My Bots Running While I Sleep
TL;DR / Key Takeaways
- PM2 is a Node.js process manager that works perfectly for Python scripts and auto-restarts crashed processes without manual intervention.
- The pm2 startup command wires your bots into systemd so they survive reboots, without you having to write a unit file yourself.
- PM2's log rotation module, the pm2 monit dashboard, and the ecosystem.config.js file give you real operational control over long-running bots.
- The Predict & Profit Weather Bot and Econ Bot both run under PM2 on a $20/month RackNerd VPS, and they come back on their own if they crash.
I run two trading bots on a VPS. They run 24 hours a day. I am not watching them 24 hours a day. That gap, between when the bot is supposed to be running and when I am actually paying attention, is where process management matters.
For a long time I used systemd. It works. It is built into Ubuntu. But writing unit files is tedious, log handling is annoying, and checking status means running journalctl -u botname -f and squinting at timestamps. It is fine for a single service. It gets old fast when you are managing multiple processes.
PM2 is what I use now. It is technically a Node.js process manager, but it handles Python scripts just as well. You get automatic restarts, a clean dashboard, log rotation, and a startup hook that survives reboots. All of it configured with one file.
Installing PM2
You need Node.js first. On Ubuntu 24:
curl -fsSL https://deb.nodesource.com/setup_20.x | sudo -E bash -
sudo apt install -y nodejs
sudo npm install -g pm2
That is it. PM2 is a global npm package. Once it is installed you can use it with any language, Python included.
The Ecosystem File
PM2 can manage processes from the command line directly, but the right way to do it is an ecosystem config file. One file that defines all your processes, their arguments, their environment variables, and their restart policies.
This is mine, trimmed down for the blog:
// ecosystem.config.js
module.exports = {
  apps: [
    {
      name: "weather-bot",
      script: "/home/steve/kalshiTrading/run_bot.py",
      interpreter: "/home/steve/kalshiTrading/venv/bin/python",
      cwd: "/home/steve/kalshiTrading",
      args: "--min-price 0.40 --min-confidence 0.55 --min-ensemble-edge 0.20 --min-agreement 3 --early-exit-threshold 0.70",
      restart_delay: 5000,
      max_restarts: 10,
      min_uptime: "30s",
      log_date_format: "YYYY-MM-DD HH:mm:ss",
      error_file: "/home/steve/logs/weather-bot-error.log",
      out_file: "/home/steve/logs/weather-bot-out.log",
      env: {
        KALSHI_KEY_ID: "your-key-id",
        KALSHI_PRIVATE_KEY_PATH: "/home/steve/.keys/kalshi_private.pem",
        OPENMETEO_CACHE_DIR: "/home/steve/kalshiTrading/cache"
      }
    },
    {
      name: "econ-bot",
      script: "/home/steve/kalshiEconTrading/run_econ_bot.py",
      interpreter: "/home/steve/kalshiEconTrading/venv/bin/python",
      cwd: "/home/steve/kalshiEconTrading",
      restart_delay: 5000,
      max_restarts: 10,
      min_uptime: "30s",
      log_date_format: "YYYY-MM-DD HH:mm:ss",
      error_file: "/home/steve/logs/econ-bot-error.log",
      out_file: "/home/steve/logs/econ-bot-out.log",
      env: {
        KALSHI_KEY_ID: "your-key-id",
        KALSHI_PRIVATE_KEY_PATH: "/home/steve/.keys/kalshi_private.pem",
        CLEVELAND_FED_URL: "https://www.clevelandfed.org/en/indicators-and-data/inflation-nowcasting.aspx",
        FRED_API_KEY: "your-fred-key"
      }
    }
  ]
}
Two things worth explaining here.
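First, the interpreter field. Left alone, PM2 guesses the interpreter from the file extension, and for a .py file that means whatever python it finds on the PATH, not your virtualenv. You have to explicitly point it at your virtualenv's Python binary. If you skip this, the bot either fails to launch or runs under the system Python without your installed packages, and the resulting import errors are very confusing.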
Second, min_uptime. This is the grace period PM2 uses to decide whether a process actually started successfully. If your script crashes before 30 seconds, PM2 counts it as a failed start, not a normal crash. After max_restarts consecutive failed starts it marks the process as errored and stops trying. This prevents PM2 from hammering a broken process in a restart loop at 3am.
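The interaction between min_uptime and max_restarts is easier to see as a small model. The sketch below is not PM2's source code. It is a simplified simulation, assuming that a run shorter than min_uptime counts as an unstable start and that PM2 gives up once max_restarts of those happen in a row. The function name and the input shape are made up for illustration.

```javascript
// Simplified model of the restart policy (illustrative, not PM2 internals).
// Assumption: a run shorter than minUptimeMs is an unstable start, and the
// process is marked "errored" after maxRestarts unstable starts in a row.
function simulateRestartPolicy(runDurationsMs, { minUptimeMs, maxRestarts }) {
  let unstableInARow = 0;
  let restarts = 0;

  for (const duration of runDurationsMs) {
    // A long enough run resets the streak; a short one extends it.
    unstableInARow = duration < minUptimeMs ? unstableInARow + 1 : 0;

    if (unstableInARow >= maxRestarts) {
      return { status: "errored", restarts };
    }
    restarts += 1; // the process is started again after this exit
  }
  return { status: "online", restarts };
}

// Mirrors min_uptime: "30s" and max_restarts: 10 from the ecosystem file
const policy = { minUptimeMs: 30000, maxRestarts: 10 };

// One overnight crash after six hours of healthy uptime
const oneCrash = simulateRestartPolicy([6 * 60 * 60 * 1000], policy);

// A broken deploy that dies two seconds after every start
const crashLoop = simulateRestartPolicy(Array(20).fill(2000), policy);

console.log(oneCrash);
console.log(crashLoop);
```

In this model the single overnight crash leaves the bot online with one restart, while the two-second crash loop is cut off after the tenth failed start instead of running all night.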
Starting the Bots
Once the ecosystem file is written:
pm2 start ecosystem.config.js
pm2 save
pm2 save dumps the current process list to ~/.pm2/dump.pm2. This is what gets restored on reboot.
The Startup Hook
This is the part systemd handles automatically and PM2 requires one extra command for:
pm2 startup
PM2 prints a command. It looks like this:
sudo env PATH=$PATH:/usr/bin /usr/lib/node_modules/pm2/bin/pm2 startup systemd -u steve --hp /home/steve
Copy it exactly and run it; the printed command already includes sudo. It generates a systemd unit that starts PM2 itself on boot, and PM2 restores your saved process list from dump.pm2. You get the reliability of systemd without writing the unit file yourself.
After that, reboot and verify:
pm2 list
Both bots should show online.
Log Management
By default PM2 keeps all logs in ~/.pm2/logs. I redirect to a dedicated logs directory in the ecosystem file because I want to find them fast.
Log rotation is handled by a PM2 module:
pm2 install pm2-logrotate
pm2 set pm2-logrotate:max_size 50M
pm2 set pm2-logrotate:retain 7
pm2 set pm2-logrotate:compress true
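That keeps 7 rotated files per log (about a week with the module's default daily rotation, less if a log hits the size cap sooner), caps each file at 50MB, and gzips the rotated files. Without this, a verbose bot running for months will fill your VPS disk. I learned this the hard way on a previous project.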
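Those settings imply a rough worst-case disk budget. The sketch below is back-of-the-envelope arithmetic, not a measurement. It assumes retain counts rotated files per log file, that each bot writes two logs (out and error) as in the ecosystem file, and that gzip saves nothing. The function name is hypothetical.

```javascript
// Upper bound on disk used by PM2 logs under pm2-logrotate.
// Assumptions (not measured): retain counts rotated files per log file,
// every file grows to the full size cap, and compression is ignored.
function worstCaseLogMb({ logFiles, retain, maxSizeMb }) {
  // retain rotated files plus the one currently being written
  return logFiles * (retain + 1) * maxSizeMb;
}

const budgetMb = worstCaseLogMb({
  logFiles: 4, // weather-bot and econ-bot, one out log and one error log each
  retain: 7,
  maxSizeMb: 50,
});

console.log(`${budgetMb} MB`); // 1600 MB
```

Even that pessimistic bound is about 1.6GB, and compression should keep the real number well below it.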
To tail logs in real time:
pm2 logs weather-bot --lines 100
pm2 logs econ-bot --lines 100
pm2 logs --lines 50 # interleaved from all processes
The Monit Dashboard
pm2 monit
This opens a terminal dashboard. Left panel shows your processes with CPU and memory. Right panel shows a live log tail for whichever process you have selected. You navigate with arrow keys.
It is not Grafana. It is not meant to be. It is a fast way to check whether both bots are healthy and what they are currently logging without opening multiple terminal windows. I use it every morning before I start work.
What Happens at 3am
This is the whole point. Say the Weather Bot throws an unhandled exception at 3am, something like a network timeout during the AIGEFS AWS S3 fetch, or an xarray parse error on a malformed GRIB slice.
The process exits with a non-zero code. PM2 sees this, waits 5 seconds (restart_delay), and starts it again. By the time I wake up the bot has been running for hours. The crash is in the error log. I can check it in the morning.
The behavior I actually want logged:
[2026-06-04 03:17:42] App [weather-bot] exited with code [1]
[2026-06-04 03:17:47] App [weather-bot] restarted
Two lines. Five-second gap. Back online.
The restart counter in pm2 list shows how many times a process has restarted. If I see a bot that has restarted 8 times overnight, that is a signal to look at the error log before the next market cycle. Not a 3am page, just a morning investigation.
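The same morning check could be scripted. This is a minimal sketch, assuming the JSON printed by pm2 jlist exposes name, pm2_env.status, and pm2_env.restart_time for each process; verify those field names against your own PM2 version. The function name, the threshold of 3, and the sample data are hypothetical.

```javascript
// Flag processes that are not online or have restarted more than a threshold.
// Input is the parsed output of `pm2 jlist`. The field names (name,
// pm2_env.status, pm2_env.restart_time) are assumed, so check your own output.
function findNoisyProcesses(processes, maxRestarts = 0) {
  return processes
    .filter(
      (p) =>
        p.pm2_env.status !== "online" || p.pm2_env.restart_time > maxRestarts
    )
    .map((p) => ({
      name: p.name,
      status: p.pm2_env.status,
      restarts: p.pm2_env.restart_time,
    }));
}

// Hand-written sample shaped like jlist output, not real data
const sample = [
  { name: "weather-bot", pm2_env: { status: "online", restart_time: 8 } },
  { name: "econ-bot", pm2_env: { status: "online", restart_time: 0 } },
];

const flagged = findNoisyProcesses(sample, 3);
console.log(flagged);
```

On a real box the input would come from something like JSON.parse(require("child_process").execSync("pm2 jlist")). With the sample above, only weather-bot is flagged.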
Useful Commands
A few I use regularly:
pm2 list # process status table
pm2 restart weather-bot # restart one process
pm2 reload ecosystem.config.js # reload config with zero downtime
pm2 stop econ-bot # stop without removing from list
pm2 delete econ-bot # stop and remove from list
pm2 show weather-bot # detailed info for one process
pm2 flush # clear all log files
pm2 reload is different from pm2 restart. Restart stops the process and starts a new one. Reload tries to do a graceful, zero-downtime restart, which mainly pays off for Node.js apps running in cluster mode and depends on the process handling SIGINT cleanly. For trading bots where I want a clean shutdown before restart, I actually prefer pm2 restart. The bots are not handling incoming web requests, so zero-downtime reload is not meaningful here.
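If a clean shutdown matters, the window between PM2's stop signal and a forced kill is configurable through kill_timeout. The fragment below is a sketch and is not taken from the config shown earlier. It assumes the bot traps SIGINT (in Python that surfaces as KeyboardInterrupt) and needs up to ten seconds to wind down. The 10000 value and the variable name are hypothetical.

```javascript
// ecosystem.config.js fragment (hypothetical addition to the weather-bot entry)
const weatherBotShutdown = {
  name: "weather-bot",
  // Milliseconds PM2 waits after the stop signal before sending SIGKILL.
  // 10000 is an assumed value; size it to how long the bot needs to exit cleanly.
  kill_timeout: 10000,
};

console.log(weatherBotShutdown);
```

The key would sit alongside restart_delay and max_restarts in the app entry. A bot that ignores the signal still gets killed once the timeout expires.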
Why Not Just systemd
I am not against systemd. I used it for a year. The unit file is not complicated once you write one.
The reason I switched is tooling density. PM2 gives me log rotation, a monitoring dashboard, restart policies, startup hooks, and multi-process management from one config file and one command set. Getting the same out of systemd requires separate config for log rotation (logrotate), separate service files per bot, and journalctl for log access.
PM2 is a thinner abstraction. For a solo developer running two bots on a VPS, thinner wins.
The one legitimate complaint about PM2: it adds a Node.js dependency to a Python project. That is a real cost. If you are philosophically committed to a pure Python stack, use systemd. Both work. I just find PM2 faster to operate day to day.
The Actual Setup on My VPS
RackNerd VPS, Ubuntu 24, 4GB RAM, $20/month. PM2 is running weather-bot and econ-bot. Both are pointed at their own virtualenvs. Logs go to /home/steve/logs. Log rotation keeps 7 days.
I check pm2 monit in the morning. I look at the restart counter. If a bot has crashed and recovered, I read the error log. If it is still online with zero restarts, I close the terminal and go back to my day job.
That is the whole system. It is not impressive. It is just reliable.
The bots handle the trades. PM2 handles the bots. I handle my coffee.