GitHub Actions from Scratch: The Deployment Pipeline I Wish I'd Had in Corporate

TL;DR / Key Takeaways

  • The entire predictandprofit.io deployment pipeline runs in under 90 seconds using GitHub Actions, SSH, and pm2.
  • Enterprise CI/CD is slow by committee, not by necessity. A solo developer can have a better pipeline than most Fortune 500 teams.
  • The workflow does four things: connects to the VPS, pulls the latest code, rebuilds the Next.js frontend, and restarts the process manager. That's it.
  • The same discipline that keeps the Predict & Profit trading bots lean and auditable applies to the deployment infrastructure too.

I spent years at large companies watching deployment pipelines that were genuinely impressive engineering achievements, in the same way a Rube Goldberg machine is impressive. Fifteen stages. Three approval gates. A Slack bot that posted a message to four channels when anything failed. Once, I watched a senior DevOps engineer spend two days debugging why a pipeline step was taking 47 minutes to install Node dependencies that hadn't changed in six months.

Nobody asked why. That was the pipeline. The pipeline was sacred.

When I set up the deployment for predictandprofit.io, I had about two hours on a Saturday morning before my wife wanted to go grocery shopping. That constraint turned out to be a feature.

Here's what I built, why it works, and what corporate CI/CD gets wrong.


The Stack

The site runs on a RackNerd VPS. Ubuntu 24. Next.js 14 for the frontend. The trading bots run separately as systemd services on the same machine. pm2 handles the Next.js process.

The GitHub repo is the source of truth. When I push to main, the site updates. That's the entire contract.

I don't need Docker. I don't need Kubernetes. I don't need an artifact registry or a container scanning step or a deployment approval workflow that pages three people who are all in different time zones.

I need SSH, git, and npm.


The Workflow File

Here's the actual .github/workflows/deploy.yml from the repo, sanitized for the obvious secrets:

name: Deploy to Production

on:
  push:
    branches:
      - main

jobs:
  deploy:
    runs-on: ubuntu-latest
    timeout-minutes: 10

    steps:
      - name: Checkout code
        uses: actions/checkout@v4

      - name: Set up SSH key
        run: |
          mkdir -p ~/.ssh
          echo "${{ secrets.DEPLOY_KEY }}" > ~/.ssh/deploy_key
          chmod 600 ~/.ssh/deploy_key
          ssh-keyscan -H ${{ secrets.VPS_HOST }} >> ~/.ssh/known_hosts

      - name: Deploy to VPS
        run: |
          ssh -i ~/.ssh/deploy_key \
            ${{ secrets.VPS_USER }}@${{ secrets.VPS_HOST }} << 'EOF'
            set -e
            cd /var/www/predictandprofit
            git pull origin main
            npm ci --production=false
            npm run build
            pm2 restart predictandprofit --update-env
            pm2 save
          EOF

      - name: Verify deployment
        run: |
          sleep 5
          curl -f -s -o /dev/null -w "%{http_code}" \
            https://predictandprofit.io | grep -q "200"
          echo "Deployment verified."

That's roughly 40 lines, blank lines included. It runs in under 90 seconds on a push. The timeout-minutes: 10 is there because if it ever takes more than 10 minutes, something is genuinely wrong and I want the job to fail loudly rather than sit there burning runner minutes.


What Each Step Actually Does

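Checkout code is just GitHub Actions cloning your repo into the runner's working directory. Strictly speaking, this workflow doesn't need it: the deployment happens over SSH, nothing on the runner reads the repo files, and the branch and trigger information comes from the github context whether or not you check anything out. I keep it because it's cheap and the first step that does need a file from the repo will want it there.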

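Set up SSH key writes the private key from a GitHub secret to a file on the runner, sets the correct permissions (chmod 600, because SSH will refuse to use a private key that other users can read), and adds the VPS host to known_hosts via ssh-keyscan. That last part is important. Without it, SSH wants to ask whether you trust the host fingerprint, the runner has no terminal to answer, and the connection fails with a host key verification error that tells you very little. One caveat: scanning on every run trusts whatever machine answers at that address, so a stricter setup stores the host's public key in a secret and writes that to known_hosts instead.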

The DEPLOY_KEY secret is a dedicated SSH keypair I generated specifically for deployment. The private key lives in GitHub Secrets. The public key is in ~/.ssh/authorized_keys on the VPS. Not my personal SSH key. A separate key with one job.
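Generating a dedicated keypair like that takes a single command. The following is a minimal sketch, not necessarily how the key described above was created: the function name, the key comment, and the file paths are all placeholders chosen for illustration.

```shell
#!/usr/bin/env bash
# make_deploy_key PATH
# Creates an Ed25519 keypair at PATH and PATH.pub with an empty passphrase
# (the runner has nobody to type one). The -C comment is an arbitrary label
# that makes the key easy to spot in authorized_keys later.
make_deploy_key() {
  local key_path="$1"
  ssh-keygen -q -t ed25519 -N "" -C "github-actions-deploy" -f "$key_path"
  chmod 600 "$key_path"
}

# Example usage on your own machine (paths and user@host are placeholders):
#   make_deploy_key ./deploy_key
#   ssh-copy-id -i ./deploy_key.pub user@your-vps   # appends to authorized_keys
#   gh secret set DEPLOY_KEY < ./deploy_key
```

Once the private half is stored as a secret, the local copy can be deleted. If the key ever leaks, removing its one line from authorized_keys revokes it without touching any other access.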

Deploy to VPS opens an SSH session and runs a heredoc as a remote shell script. The set -e at the top is not optional. It means the script exits immediately if any command fails. Without it, npm run build could fail silently and pm2 would restart the process with broken code. set -e is the difference between a failed deployment that tells you it failed and a failed deployment that quietly serves broken pages to real visitors.

The sequence inside the heredoc:

  1. git pull origin main brings the VPS in sync with the repo.
  2. npm ci --production=false installs dependencies cleanly from package-lock.json. The --production=false flag keeps dev dependencies around because Next.js needs them to build.
  3. npm run build compiles the Next.js app.
  4. pm2 restart predictandprofit --update-env restarts the running process and picks up any environment variable changes.
  5. pm2 save persists the process list so if the VPS reboots, pm2 knows what to restart. This only helps if pm2 startup has been run once on the VPS to install the boot hook.
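The effect of set -e is easy to see in isolation. This small demonstration uses two made-up stand-ins, build (always fails) and restart (prints a message), in place of npm run build and pm2 restart, and runs the same script with and without the guard.

```shell
#!/bin/sh
# Hypothetical stand-ins for the real commands: "build" always fails,
# "restart" just reports that it ran.
deploy_script='
build()   { return 1; }
restart() { echo "restarted"; }
build
restart
'

# Without set -e the failed build is ignored and the restart still runs.
without_guard="$(sh -c "$deploy_script")"

# With set -e the script stops at the failed build; restart never runs.
# "|| true" only keeps this demo going after the expected failure.
with_guard="$(sh -c "set -e; $deploy_script" || true)"

echo "without set -e: '${without_guard}'"
echo "with set -e:    '${with_guard}'"
```

The first line of output shows 'restarted' and the second shows an empty string: with the guard in place, the restart is never reached.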

Verify deployment waits five seconds for pm2 to finish restarting, then hits the production URL with curl and checks for a 200 response. Simple. If the site is down, the job fails and I get a GitHub notification. No elaborate health check framework required.
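If a fixed five-second wait ever proves too short, one option is to poll instead of sleeping once. This is a sketch under assumptions: the function name wait_for_200 and its default attempt count and delay are invented here, and the workflow above does not use it.

```shell
#!/usr/bin/env bash
# wait_for_200 URL [ATTEMPTS] [DELAY_SECONDS]
# Polls URL until it answers 200 or the attempts run out.
wait_for_200() {
  local url="$1"
  local attempts="${2:-10}"
  local delay="${3:-3}"
  local i=1
  local code=""
  while [ "$i" -le "$attempts" ]; do
    # -s: quiet, -o /dev/null: discard the body, -w: print only the status code.
    # "|| true" keeps a refused connection from aborting a script run with set -e.
    code="$(curl -s -o /dev/null -w '%{http_code}' --max-time 10 "$url" || true)"
    if [ "$code" = "200" ]; then
      echo "Deployment verified after ${i} attempt(s)."
      return 0
    fi
    if [ "$i" -lt "$attempts" ]; then
      sleep "$delay"
    fi
    i=$((i + 1))
  done
  echo "No 200 from ${url} after ${attempts} attempts (last status: ${code:-none})." >&2
  return 1
}

# Example: wait_for_200 https://predictandprofit.io 10 3
```

The job still fails loudly if the site never comes back, but a slow restart no longer produces a false alarm.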


The Secrets Setup

GitHub Actions secrets live under Settings > Secrets and variables > Actions in your repo. I have three:

  • DEPLOY_KEY: the private half of the deployment SSH keypair (include the full PEM block, newlines and all)
  • VPS_HOST: the IP address of the RackNerd VPS
  • VPS_USER: the Linux user account on the VPS (not root)
  • Nothing else. There are no other secrets this pipeline needs.

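One thing that trips people up: when you paste a multi-line private key into a GitHub secret, the newlines have to survive intact. GitHub handles this correctly if you paste directly. If you're setting secrets via the gh CLI, feed the file in on stdin with gh secret set DEPLOY_KEY < ~/.ssh/deploy_key rather than passing it through --body "$(cat ~/.ssh/deploy_key)", which drops the trailing newline. Secrets can't be read back in the UI once saved, so the real verification is a workflow run that authenticates successfully.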
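The newline issue with command substitution can be checked locally without touching a real key. The file below is a two-line stand-in, not a PEM block; the only point is to compare byte counts.

```shell
#!/bin/sh
# Stand-in for a key file: two lines, ending in a newline like a real key.
key_file="$(mktemp)"
printf 'line-one\nline-two\n' > "$key_file"

# Command substitution strips trailing newlines, so the value that
# --body "$(cat file)" would receive is shorter than the file on disk.
via_substitution="$(cat "$key_file")"
file_bytes="$(wc -c < "$key_file" | tr -d ' ')"
substituted_bytes="$(printf '%s' "$via_substitution" | wc -c | tr -d ' ')"

echo "file: ${file_bytes} bytes, via \$(cat): ${substituted_bytes} bytes"
rm -f "$key_file"

# Reading the file on stdin avoids the shell's handling entirely:
#   gh secret set DEPLOY_KEY < ~/.ssh/deploy_key
```

This prints 18 bytes for the file and 17 for the substituted value. In the workflow above the echo that writes the key file adds a newline back, so either form happens to work there, but a step that writes the secret with printf or passes it to another tool would see the difference.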


The pm2 Configuration

The ecosystem.config.js in the repo root tells pm2 how to run the app:

module.exports = {
  apps: [
    {
      name: "predictandprofit",
      script: "node_modules/.bin/next",
      args: "start",
      cwd: "/var/www/predictandprofit",
      env: {
        NODE_ENV: "production",
        PORT: 3000,
      },
      max_restarts: 5,
      restart_delay: 2000,
      error_file: "/var/log/pm2/predictandprofit-error.log",
      out_file: "/var/log/pm2/predictandprofit-out.log",
    },
  ],
};

max_restarts: 5 and restart_delay: 2000 mean pm2 will try to restart a crashed process five times, with 2-second gaps between attempts, before giving up. Without limits like these, a bug that causes an immediate crash on startup can leave pm2 restarting the process in a tight loop that hammers the VPS CPU. I learned this the hard way on a different project.

Nginx sits in front of pm2 and proxies port 80/443 to port 3000. SSL is handled by Certbot. That's a separate topic but the short version is: sudo certbot --nginx -d predictandprofit.io and you're done.


What Enterprise CI/CD Actually Looks Like

I'm not going to name specific tools because the problem isn't the tools. The problem is organizational gravity.

A typical enterprise pipeline at a company with more than 500 engineers has:

  • A lint stage that runs the same ESLint rules that have been configured for three years and catches zero new issues
  • A unit test stage that takes 12 minutes because someone added Selenium tests to the unit test suite four years ago and nobody moved them
  • A Docker build stage that rebuilds the entire image from scratch every time because layer caching was never configured
  • A security scan stage that flags the same five vulnerabilities in a transitive dependency that was last updated in 2021 and is on the "known exceptions" list
  • An artifact upload stage that pushes the build to an S3 bucket or an artifact registry, which adds three minutes for a 200MB Next.js build
  • A staging deployment that deploys to an environment nobody looks at
  • A manual approval step where someone clicks a button in a UI
  • A production deployment that is functionally identical to the staging deployment but takes longer because the ECS cluster is configured differently
  • A post-deployment smoke test that was written for the old version of the app and hasn't been updated since the redesign

Total runtime: 40 to 55 minutes. On a good day.

The team treats this as normal. New engineers inherit it and assume this is how things work. The engineers who know better are busy firefighting or in meetings.

My pipeline does the same job in 90 seconds. The difference is not technical sophistication. The difference is that I have no organizational debt and nobody can book a meeting to add an approval gate.


The One Thing I Would Add

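Right now, if npm run build fails, the deployment fails and pm2 is never restarted, so the site keeps running the process it already had. That's mostly correct behavior (the caveat is that next build writes into the same .next directory the live process serves from). But I have no automatic rollback if the build succeeds but the app crashes after pm2 restart.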

The right fix is a pre-swap health check. Before restarting pm2, build into a staging directory, run a quick sanity check, then swap the symlink. Something like:

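# Run from the git clone on the VPS. The clone has to live somewhere other
# than /var/www/predictandprofit, because that path becomes a symlink.
set -e

# Build to a timestamped release directory
RELEASE_DIR="/var/www/releases/$(date +%Y%m%d%H%M%S)"
mkdir -p "$RELEASE_DIR"
git fetch origin main
git --work-tree="$RELEASE_DIR" checkout -f origin/main
cd "$RELEASE_DIR"
npm ci --production=false
npm run build

# Swap the symlink (-n replaces the link itself instead of writing inside its target)
ln -sfn "$RELEASE_DIR" /var/www/predictandprofit

# Restart
pm2 restart predictandprofit --update-env
pm2 save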
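The release-directory sketch above stops at the swap. A rollback step could look roughly like the following. Everything named here is an assumption for illustration: the CURRENT_LINK variable, the three helper functions, and the idea of passing the health check in as a command. It is also worth confirming on a real box that pm2 resolves the new symlink target on restart rather than holding on to the old path.

```shell
#!/usr/bin/env bash
# Hypothetical layout: CURRENT_LINK is the symlink that pm2's cwd points at.
CURRENT_LINK="${CURRENT_LINK:-/var/www/predictandprofit}"

# Print the release directory the symlink points at right now.
current_release() {
  readlink "$CURRENT_LINK"
}

# Point the symlink at a release directory.
activate_release() {
  ln -sfn "$1" "$CURRENT_LINK"
}

# Restart the app so it serves whatever the symlink now points at.
restart_app() {
  pm2 restart predictandprofit --update-env
}

# deploy_with_rollback NEW_RELEASE_DIR CHECK_COMMAND [ARGS...]
# Swaps to the new release, restarts, runs the check command, and swaps
# back to the previous release if the check fails.
deploy_with_rollback() {
  local new_release="$1"
  shift
  local previous
  previous="$(current_release || true)"
  activate_release "$new_release"
  restart_app
  if "$@"; then
    echo "Now serving ${new_release}"
    return 0
  fi
  echo "Health check failed for ${new_release}" >&2
  if [ -n "$previous" ]; then
    echo "Rolling back to ${previous}" >&2
    activate_release "$previous"
    restart_app
  fi
  return 1
}

# Example: deploy_with_rollback "$RELEASE_DIR" curl -fsS -o /dev/null http://localhost:3000
```

Because the previous release directory is left untouched on disk, rolling back is just a second symlink swap and restart, with no rebuild involved.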

I haven't done this yet because the current setup has been stable for months and I don't want to introduce complexity that I'd have to debug at 11pm. The blog post version of me says to do it right. The actual me says it's on the list.


Why This Matters Beyond the Blog

The instinct that keeps this pipeline lean is the same one baked into the trading bots. Every layer of complexity is a place where things can fail silently. The Weather Bot's agreement filter, which requires 3 of 4 ensembles to agree before placing a trade, is the set -e of automated trading. It means the system fails loudly and early rather than quietly and expensively.

If you're buying the Predict & Profit source code, you're getting both bots already deployed this way on my own VPS. The deployment patterns are documented. The systemd service files are included. You're not starting from a blank repo.

The code is $97. The 30 years of watching what not to do is free.


The Honest Closer

This pipeline is not clever. It is not impressive to anyone who has been doing DevOps for a decade. It does exactly one thing well: it ships code to production reliably without requiring me to think about it.

That's the goal. Ship fast, fail loudly, recover quickly. Everything else is theater.