Systems | Hg's Portfolio

Production runs on a small Vultr VPS: 1 CPU, 2GB RAM. Cloud Run stays as failover only. Builds never run on the box — push main, GitHub Actions does the rest.

Notes below are from a real deploy: what worked, what broke, and what we changed.

Context: what we chose and what we skipped

The project is one monorepo with two Next.js apps on one domain:

Marketing / main site (/) — Next.js at the repo root
Product app — lives in apps/web, public under /apps/<product-slug>

Two apps, two containers, one domain.

We kept GCP Cloud Run as failover — point DNS back and the old path still works. Primary production is the VPS: cheap, controllable, enough for this stage.

A 2GB VPS does not build Next.js on-server. next build eats RAM. Build on GitHub Actions, push images to GHCR, deploy to the VPS.

Architecture (compact)

Request path:

Internet → nginx (VPS, ports 80/443)
              ├─ /           → site container    (127.0.0.1:8080)
              └─ /apps/      → product container (127.0.0.1:8081)

CI/CD:

push main → build site + product images → push GHCR
         → smoke test on runner
         → ship images to VPS → docker compose up

Containers are not exposed to the internet directly. Only nginx is public. Certbot runs on the nginx host, not inside Docker — keeps TLS and routing simpler.

Infra layout in the repo

We split concerns clearly:

Path	Role
`infra/platforms/shared/`	docker-compose, deploy scripts — shared across VPS providers
`infra/platforms/vultr/`	bootstrap, nginx template, config sync — Vultr only
`infra/platforms/gcp/`	Terraform for Cloud Run — failover

Later, moving to Hetzner means copy vultr/ → hetzner/, adjust bootstrap; shared/ stays.

On the VPS, runtime lives at a fixed workspace path, for example:

/home/<deploy-user>/workspace/projects/<repo-name>/

One-time bootstrap at ~/workspace/bootstrap/vultr/. We skipped /opt — the team already had a workspace layout. Pick paths and stick to them.

GitHub: setup before the workflow

Secrets

In the repo → Settings → Secrets:

Secret	Purpose
`VULTR_HOST`	VPS IP
`VULTR_USER`	SSH user (e.g. `deploy`)
`VULTR_SSH_KEY`	Deploy private key

Create a dedicated key pair for CI and add the public key to authorized_keys on the VPS. Do not reuse your daily personal key.

Org permissions for Actions

If your GitHub org locks GITHUB_TOKEN to read-only, the build job cannot push to GHCR. An org owner must enable Read and write at org level, or the job must declare permissions: packages: write.

GHCR is lowercase: org MyOrg → image ghcr.io/myorg/my-web/site:sha-xxxxxxx. Tags use sha- + 7 commit chars so you always know which revision is running.

After the first build: Packages → connect your repo → Actions access. Skip this and smoke tests fail often.

Workflow: three jobs, one pipeline

File: .github/workflows/deploy-production.yml

Job	What it does
build	Build and push site + product images; tag `sha-xxxxxxx` and `latest`
smoke	Pull on runner, run containers, `curl` `/api/health` and `/apps/<product-slug>`
deploy	Ship images to VPS, `compose up`

If smoke passes but deploy fails, the problem is usually SSH or the VPS — not the build.

GHCR pull on VPS vs shipping the image over SSH

First attempt on the VPS — the usual pattern:

docker login ghcr.io
docker compose pull
docker compose up -d

On the runner: fine. On the VPS: 403 Forbidden — even with a PAT scoped to read:packages, even piping GITHUB_TOKEN from CI over SSH.

curl GitHub user API → 200. curl GHCR manifest → 403. The token is alive; it still cannot read the org package. Manage Actions access on the package helps workflows in that repo, not arbitrary VPS pulls with a personal PAT.

What we chose

Runner pulls (smoke already OK) → docker save → compress → scp to VPS /tmp/ → docker load → delete tarball → DEPLOY_SKIP_PULL=1 ./scripts/deploy.sh.

	Pull on VPS	Ship over SSH
GHCR auth on VPS	Required	Not required
Bandwidth	Smaller if layers cache	Larger each deploy
Reliability (our case)	Failed with 403	Works

We ship over SSH for now: small VPS, low traffic, a few extra minutes per deploy is fine. If org GHCR permissions get fixed, switch back to pull — deploy.sh supports both.

/tmp/ is only a short-lived tarball staging area, not where the app runs. Delete after docker load.

First-time VPS setup (bootstrap)

bootstrap.sh installs: Docker, nginx, certbot, ufw, 2GB swap (without swap, 2GB RAM hangs easily), copies nginx config to /etc/nginx/sites-enabled/<app-name>.

Re-SSH after bootstrap so the docker group applies.

We once had healthy containers but no site — bootstrap had not run, so no nginx config. Green docker ps does not mean production is done.

Quick checks:

ls /etc/nginx/sites-enabled/<app-name>
curl -H "Host: example.com" http://127.0.0.1/api/health

DNS and Certbot

At your DNS provider:

A @ → VPS IP
A www → VPS IP
Remove stray AAAA if the VPS has no real IPv6

Certbot failed with 404; logs showed odd IPv6 (2001:4860:...) — Let's Encrypt hit the wrong host. Remove AAAA, wait for DNS, then:

curl http://example.com/.well-known/acme-challenge/test

When HTTP is stable:

sudo certbot --nginx -d example.com -d www.example.com

CSS bug: two Next.js apps, one domain

The product app HTML loads under /apps/... (product container). Default Next.js CSS lives at /_next/static/... — nginx routes / to the site container.

Result: the product page renders text, no styles. DevTools: CSS 404 or wrong file.

Fix: assetPrefix: "/apps" in apps/web/next.config.ts. Assets become /apps/_next/static/... and hit the product upstream.

Two apps on one domain: plan static asset routing up front, not after launch.

After setup: what each release looks like

Mostly:

Merge / push main
Actions green (build → smoke → deploy)
Browser check; hard refresh if CSS/JS changed

No manual SSH deploy. No GHCR pull on the VPS (with our current ship-over-SSH path).

CI does not change nginx or certs. Reverse proxy or TLS changes → manual on VPS.

Runtime env (non-public vars in .env): edit on VPS, docker compose up -d
NEXT_PUBLIC_*: requires a new image build — baked at build time

Checklist

GitHub secrets VULTR_*, org Actions permissions
VPS: bootstrap + sync config + .env
DNS A records; no bogus AAAA
Push main, wait for CI
Certbot after HTTP works
Verify main site and product path (e.g. https://example.com/ and https://example.com/apps/<product-slug>)

Full copy-paste commands belong in your repo runbook (e.g. docs/PRODUCTION-DEPLOY.md).

Production is more than docker run: DNS, nginx, TLS, registry auth, and routing two Next.js apps on one hostname. The diagram in tutorials rarely matches org GHCR policy.

A 2GB VPS works if you build in CI, terminate TLS at nginx, and remember GHCR 403 often needs org/package access — not another PAT.

Next.js on a 2GB VPS: Real CI/CD, Real Failures, and What We Did