Every Node.js Docker tutorial shows you a basic Dockerfile that works. Few of them show you the Dockerfile you actually need for production. The difference is about 20 lines and covers security, image size, startup reliability, and graceful shutdown.
This is the pattern I have converged on after running containerised Node.js services in production for several years.
## The Dockerfile

```dockerfile
# Stage 1: Install dependencies
FROM node:20-alpine AS deps
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci --omit=dev

# Stage 2: Build (if using TypeScript or a build step)
FROM node:20-alpine AS builder
WORKDIR /app
COPY package.json package-lock.json ./
RUN npm ci
COPY . .
RUN npm run build

# Stage 3: Production image
FROM node:20-alpine AS runner
WORKDIR /app
RUN addgroup --system --gid 1001 nodejs
RUN adduser --system --uid 1001 appuser
COPY --from=deps /app/node_modules ./node_modules
COPY --from=builder /app/dist ./dist
COPY --from=builder /app/package.json ./package.json
USER appuser
EXPOSE 3000
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1
CMD ["node", "dist/server.js"]
```

(`npm ci --only=production` still works but has been deprecated since npm 8; `--omit=dev` is the current flag.)
Here's why each of these matters.
## Why Alpine
Alpine Linux produces images around 50MB compared to 400MB+ for Debian-based images. The smaller image means faster pulls during deployment and a smaller attack surface.
The trade-off: Alpine uses musl libc instead of glibc. Some native Node.js modules (bcrypt, sharp, canvas) have issues with musl. If you use these modules, either:
- Install the Alpine-compatible versions (`npm install --platform=linuxmusl`)
- Use `node:20-slim` instead, which is Debian-based but stripped down (~80MB)
For most API services that don't use native image processing, Alpine works without issues.
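If a native module has no musl prebuilt binary, another option is compiling it from source inside the build stage. A sketch of the common pattern, installing the node-gyp toolchain on Alpine (package names are the standard Alpine ones):

```dockerfile
FROM node:20-alpine AS builder
WORKDIR /app
# Toolchain for node-gyp, needed when a module compiles from source on musl
RUN apk add --no-cache python3 make g++
COPY package.json package-lock.json ./
RUN npm ci
```

Because this happens in the builder stage, none of the toolchain ends up in the final image.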
## Multi-stage builds
The final image shouldn't contain build tools, dev dependencies, TypeScript source files, or test fixtures. Multi-stage builds achieve this by using separate stages for installation, building, and running.
Stage 1 installs only production dependencies. Stage 2 installs all dependencies and runs the build. Stage 3 copies only the compiled output and production node_modules into a clean image.
The result: a production image that contains only what is needed to run the application.
## Non-root user
By default, Docker runs processes as root. If an attacker exploits a vulnerability in your application, they have root access inside the container. With a non-root user, the damage is contained.
```dockerfile
RUN addgroup --system --gid 1001 nodejs
RUN adduser --system --uid 1001 appuser
USER appuser
```
The user is created with a system account (no home directory, no login shell). The `USER` directive switches all subsequent instructions and the `CMD` to run as this user.
If your application needs to bind to port 80 or 443, use a reverse proxy (nginx, traefik) in front of the container rather than running as root.
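One way to do that, sketched with Compose and an nginx container (the service names and the `nginx.conf` are illustrative; the config is assumed to proxy requests to `api:3000`):

```yaml
services:
  proxy:
    image: nginx:alpine
    ports:
      - "80:80"        # nginx binds the privileged port, not the Node process
    volumes:
      - ./nginx.conf:/etc/nginx/conf.d/default.conf:ro
  api:
    build: .
    expose:
      - "3000"         # reachable from the proxy over the Compose network only
```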
## .dockerignore

```
node_modules
npm-debug.log
.git
.gitignore
.env
.env.*
dist
coverage
tests
*.md
.vscode
.idea
```
Without a `.dockerignore`, `COPY . .` sends everything in the build context to the Docker daemon, including `node_modules` (which are reinstalled inside the container anyway), test files, and potentially sensitive files like `.env`.
## SIGTERM handling
When a container orchestrator (Kubernetes, ECS, Docker Compose) stops a container, it sends SIGTERM. The process has a grace period (default 10 seconds in Docker, 30 in Kubernetes) to shut down cleanly. After the grace period, it receives SIGKILL and is terminated immediately.
Node.js doesn't handle SIGTERM by default. Without explicit handling, the process dies immediately and in-flight requests are dropped.
```javascript
const server = app.listen(3000);

process.on('SIGTERM', () => {
  console.log('SIGTERM received, starting graceful shutdown');

  server.close(() => {
    console.log('HTTP server closed');
    // Close database connections, flush logs, etc.
    process.exit(0);
  });

  // Force exit after timeout
  setTimeout(() => {
    console.error('Forced shutdown after timeout');
    process.exit(1);
  }, 8000);
});
```
`server.close()` stops accepting new connections and waits for existing connections to finish. Note that idle keep-alive connections also hold the server open; on Node 18.2+, `server.closeIdleConnections()` can release them. The timeout is a safety net in case connections don't close within the grace period.
There's a subtle issue: the `node` process must be PID 1 in the container for SIGTERM to be delivered. If you use the shell form (`CMD node server.js`), the shell runs as PID 1 and doesn't forward signals. Use the exec form (`CMD ["node", "server.js"]`) so Node.js is PID 1 directly.
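If something else must run first (say, an entrypoint script), a minimal init such as tini can sit at PID 1, forward signals, and reap zombie processes. A sketch for the Alpine image above (`docker run --init` achieves the same thing without changing the Dockerfile):

```dockerfile
# tini forwards SIGTERM to its child and reaps zombies
RUN apk add --no-cache tini
ENTRYPOINT ["/sbin/tini", "--"]
CMD ["node", "dist/server.js"]
```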
## Health checks
```dockerfile
HEALTHCHECK --interval=30s --timeout=3s --start-period=10s --retries=3 \
  CMD wget --no-verbose --tries=1 --spider http://localhost:3000/health || exit 1
```
The health check tells the orchestrator whether the container is ready to receive traffic. Without it, the orchestrator assumes the container is healthy as soon as it starts, which may be before the application has connected to databases and loaded configuration. (Note that Kubernetes ignores Docker's `HEALTHCHECK` and uses its own liveness and readiness probes, but the same endpoint serves both.)

The application needs a `/health` endpoint:
```javascript
app.get('/health', (req, res) => {
  // Check database connectivity, cache connectivity, etc.
  const dbHealthy = checkDatabaseConnection();
  if (dbHealthy) {
    res.status(200).json({ status: 'ok' });
  } else {
    res.status(503).json({ status: 'unhealthy', reason: 'database' });
  }
});
```
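In practice the probe is usually async and can hang along with the dependency it checks. A hedged sketch that bounds the check with a timeout, so a stuck database cannot stall `/health` past the `HEALTHCHECK --timeout` window (`checkDatabaseConnection` is a hypothetical probe, stubbed here):

```javascript
// Race an async health probe against a timeout so a hung dependency
// turns into a fast 503 instead of a hung /health request.
function withTimeout(promise, ms) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error('health check timed out')), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Hypothetical probe; a real one might run `SELECT 1` against the pool.
async function checkDatabaseConnection() {
  return true;
}

async function healthHandler(req, res) {
  try {
    // Stay under the 3s HEALTHCHECK --timeout configured above
    await withTimeout(checkDatabaseConnection(), 2000);
    res.status(200).json({ status: 'ok' });
  } catch (err) {
    res.status(503).json({ status: 'unhealthy', reason: err.message });
  }
}
```

Wire it up with `app.get('/health', healthHandler)`.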
## Environment variables
Never bake secrets into the Docker image. No `.env` files copied into the image, no `ENV SECRET_KEY=...` in the Dockerfile.
Pass environment variables at runtime:
```bash
docker run -e DATABASE_URL="postgres://..." -e API_KEY="..." myapp
```
Or with docker-compose:
```yaml
services:
  api:
    build: .
    env_file: .env.production
    ports:
      - "3000:3000"
```
The `env_file` is read at container start time and isn't included in the image.
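Since configuration now arrives only at runtime, it's worth validating it at startup so a misconfigured container crash-loops immediately rather than erroring on the first request. A minimal sketch (the variable names are illustrative):

```javascript
// Fail fast if required configuration is missing. Call this before
// app.listen() so a misconfigured deploy is caught at container start.
function requireEnv(names) {
  const missing = names.filter((name) => !process.env[name]);
  if (missing.length > 0) {
    throw new Error(`Missing required environment variables: ${missing.join(', ')}`);
  }
  return Object.fromEntries(names.map((name) => [name, process.env[name]]));
}

// Example usage at startup:
// const config = requireEnv(['DATABASE_URL', 'API_KEY']);
```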
## Docker Compose for local development

```yaml
version: '3.8'

services:
  api:
    build:
      context: .
      target: builder
    command: npm run dev
    volumes:
      - .:/app
      - /app/node_modules
    ports:
      - "3000:3000"
    env_file: .env.local
    depends_on:
      - db
      - redis

  db:
    image: postgres:16-alpine
    environment:
      POSTGRES_DB: myapp
      POSTGRES_USER: postgres
      POSTGRES_PASSWORD: postgres
    ports:
      - "5432:5432"
    volumes:
      - pgdata:/var/lib/postgresql/data

  redis:
    image: redis:7-alpine
    ports:
      - "6379:6379"

volumes:
  pgdata:
```
The key detail: `target: builder` uses the build stage that has dev dependencies. The volume mount at `.:/app` enables hot reloading. The `/app/node_modules` anonymous volume prevents the host's `node_modules` from overwriting the container's.
This gives every developer an identical environment without installing PostgreSQL or Redis locally.
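The `command: npm run dev` above assumes a dev script exists in package.json; one minimal way to provide it, using Node's built-in watch mode (available since Node 18.11), would be:

```json
{
  "scripts": {
    "dev": "node --watch src/server.js",
    "build": "tsc",
    "start": "node dist/server.js"
  }
}
```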