Zero-Downtime Deploys & Rollbacks

Every deployment on Stackpad is a zero-downtime deployment. If a new version fails its health check, Stackpad automatically rolls back to the previous working version. This page explains exactly how that works.

Zero-downtime deployments

When a new version is deployed, Stackpad follows a blue-green deployment strategy:

  1. The new container starts alongside the old one
  2. Stackpad runs a health check against the new container
  3. If the health check passes, Caddy (the reverse proxy that routes traffic to your service) switches traffic to the new container
  4. The old container is stopped after traffic drains

At no point are both containers stopped — your users always hit a running version.
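The four steps above can be sketched as a small simulation. This is an illustrative model only, not Stackpad's implementation: the `Router` class, the `deploy` function, and the event strings are hypothetical names, and the health check is passed in as a plain callable.

```python
from dataclasses import dataclass, field
from typing import Callable, List


@dataclass
class Router:
    """Hypothetical stand-in for the proxy: records which container receives traffic."""
    live: str
    events: List[str] = field(default_factory=list)


def deploy(router: Router, new: str, health_check: Callable[[str], bool]) -> bool:
    """Simulate one blue-green deployment; return True if `new` went live."""
    old = router.live
    router.events.append(f"start {new}")            # 1. new container starts alongside the old one
    if not health_check(new):                       # 2. health check runs against the new container
        router.events.append(f"stop {new}")         #    on failure the old container keeps serving
        return False
    router.live = new                               # 3. traffic switches to the new container
    router.events.append(f"switch {old} -> {new}")
    router.events.append(f"stop {old}")             # 4. old container stops after traffic drains
    return True
```

In this model `router.live` always names a running container: it changes only after the check passes, and the old container is stopped only after the switch.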

Health checks in detail

Health checks are how Stackpad decides whether a new deployment is working. The behavior depends on the service type:

Web services (HTTP check)

Stackpad sends a GET request to the service’s configured port (e.g. port 3000). The check passes if:

  • The port is accepting connections AND
  • The response status code is 2xx or 3xx

The check hits your application root; it doesn’t use a special /health endpoint. Any 2xx or 3xx response from the root path means the service is healthy.
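To try an equivalent check locally before deploying, a rough approximation might look like the following. It assumes a single GET to `/` with a 5-second per-request timeout, which is an arbitrary choice; the function names are hypothetical, and Stackpad's actual probe may differ in details such as headers or redirect handling. Python's `http.client` does not follow redirects, so a 3xx status is seen directly.

```python
import http.client


def status_is_healthy(status: int) -> bool:
    """Pass on 2xx and 3xx status codes, fail on everything else."""
    return 200 <= status < 400


def http_check(host: str, port: int, timeout: float = 5.0) -> bool:
    """Make one GET / attempt; False if the port refuses, times out, or returns 4xx/5xx."""
    conn = http.client.HTTPConnection(host, port, timeout=timeout)
    try:
        conn.request("GET", "/")
        return status_is_healthy(conn.getresponse().status)
    except (OSError, http.client.HTTPException):
        # Covers connection refused, timeouts, and malformed responses.
        return False
    finally:
        conn.close()
```

For example, `http_check("127.0.0.1", 3000)` returns False if the root route of a local app answers 404 or 500, which would also fail the check described above.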

Non-web services (TCP check)

For databases, caches, and background services, Stackpad verifies that the configured port is accepting TCP connections. No HTTP request is sent.
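A TCP check of this kind can be approximated with the standard library. This is a minimal sketch under assumptions: the `tcp_check` name and the 5-second connect timeout are illustrative, not taken from Stackpad.

```python
import socket


def tcp_check(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port can be opened, else False."""
    try:
        # The connection is closed again immediately; no data is sent.
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers connection refused, unreachable host, and timeouts.
        return False
```

Because only the connection is tested, a process that accepts connections but is not yet ready to serve requests would still pass a check like this one.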

Timeout

Health checks have a 90-second timeout. Stackpad retries the check during this window. If the service doesn’t pass the check within 90 seconds, the deployment is marked as failed.
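The retry window can be modelled as a deadline loop. In this sketch only the 90-second timeout comes from the text above; the 2-second retry interval is an assumption (the interval Stackpad uses is not stated here), and `wait_until_healthy` is a hypothetical name. The clock and sleep functions are injectable so the loop can be exercised without actually waiting.

```python
import time
from typing import Callable


def wait_until_healthy(
    check: Callable[[], bool],
    timeout: float = 90.0,      # from the text: 90-second window
    interval: float = 2.0,      # assumed retry spacing, not documented
    clock: Callable[[], float] = time.monotonic,
    sleep: Callable[[float], None] = time.sleep,
) -> bool:
    """Retry `check` until it passes or `timeout` seconds have elapsed."""
    deadline = clock() + timeout
    while True:
        if check():
            return True
        remaining = deadline - clock()
        if remaining <= 0:
            return False  # window exhausted: the deployment would be marked failed
        sleep(min(interval, remaining))
```

A service that needs 60 seconds to boot passes on a later retry; one that needs 120 seconds never gets the chance, which matches the "Startup takes longer than 90 seconds" failure cause.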

Automatic rollbacks

If a health check fails, Stackpad automatically:

  1. Keeps the old container running — your users keep seeing the previous version
  2. Stops the new container — the failed version is removed
  3. Marks the deployment as “Failed” — visible in the dashboard
  4. Preserves build logs — you can view what went wrong

No manual intervention is needed. Your users never see an error page.

When deployments fail

A deployment can fail at several stages. Here’s how to diagnose each:

Build failure

The code failed to compile or the Docker image couldn’t be built.

How to debug:

  1. Go to the Deployments tab on your service or project
  2. Click the failed deployment
  3. Read the build logs — they show the full output of the build process

Common causes:

  • TypeScript compilation errors
  • Missing dependencies
  • Incorrect build command
  • Build exceeds the 10-minute timeout (large monorepos, slow installs)

Health check failure

The build succeeded but the application didn’t start correctly.

How to debug:

  1. Check the build logs. The build itself succeeded, so look for runtime errors rather than build errors
  2. Check the service Logs tab. If the container started briefly, it may have logged errors before crashing

Common causes:

  • Missing environment variable (e.g. DATABASE_URL not set)
  • Port mismatch (app listens on 8080 but service is configured for 3000)
  • Application crash on startup (unhandled exception)
  • Startup takes longer than 90 seconds
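One way to make the first cause easier to spot in the logs is to validate configuration at the very start of the process and exit with an explicit message. The helper below is a sketch, not a Stackpad feature: `missing_env` and `preflight` are hypothetical names, and the list of required variables is whatever your application needs.

```python
import os
from typing import Iterable, List, Mapping


def missing_env(required: Iterable[str], env: Mapping[str, str] = os.environ) -> List[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


def preflight(required: Iterable[str], env: Mapping[str, str] = os.environ) -> None:
    """Exit with a clear one-line message if any required variable is missing."""
    missing = missing_env(required, env)
    if missing:
        # SystemExit with a string prints the message to stderr and exits with status 1.
        raise SystemExit(f"missing environment variables: {', '.join(missing)}")
```

Calling `preflight(["DATABASE_URL"])` as the first line of the entrypoint means a misconfigured deployment fails immediately with a readable log line instead of crashing later with a stack trace.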

Deploy failure

Rare — the image was built but couldn’t be pulled or started on the compute node.

How to debug: Check the deployment status in the dashboard. If this happens repeatedly, it’s likely an infrastructure issue — contact support.

Manual rollbacks

To roll back to a previous version:

  1. Go to the Deployments tab on your service
  2. Find a previous successful deployment (status: Ready)
  3. Click Redeploy to restore that version

The redeploy goes through the same zero-downtime process: the version being restored is health-checked, and traffic is switched only after it’s confirmed working.

Deployment status reference

Status    | Meaning                                  | What to do
Queued    | Waiting for a build slot                 | Wait (max 4 concurrent builds)
Building  | Cloning repo, installing deps, building  | Wait (10 min timeout)
Deploying | Starting container, running health check | Wait (90 sec timeout)
Ready     | Live and serving traffic                 | Nothing, it’s working
Failed    | Build or health check failed             | Check build logs and service logs
Stopped   | Manually stopped                         | Redeploy to restart

What’s next?