How to Achieve Graceful Shutdown and Zero-Downtime Deployment in Spring Boot

2026-04-03

Written by: Zuko


About this article

A practical guide to combining Spring Boot's graceful shutdown (available since version 2.3) with Kubernetes preStop hooks, so that rolling updates complete without dropping in-flight requests.


Have you ever experienced errors being returned to some users during a Kubernetes rolling update?

The cause is almost always insufficient graceful shutdown configuration. By correctly configuring both the Spring Boot side and the Kubernetes side, you can deploy without dropping any in-flight requests.

Why Requests Fail During Deployment

When Kubernetes terminates a Pod during a rolling update, the shutdown sequence proceeds in the following order:

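1. The Pod enters the Terminating state, and Kubernetes begins removing it from Endpoints
2. The preStop hook executes (if configured), in parallel with the Endpoints update
3. SIGTERM is sent to the application after preStop completes
4. SIGKILL is sent if the process is still running when terminationGracePeriodSeconds, counted from the start of termination, elapses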

The key point is that removing a Pod from Endpoints does not take effect instantly: kube-proxy on each node needs a few seconds to apply the change. Without a preStop hook, traffic may still be arriving when SIGTERM is received.

Furthermore, Spring Boot versions before 3.4 stop the embedded web server immediately upon receiving SIGTERM by default, which forcibly terminates any in-flight requests (from 3.4 onward, graceful shutdown is the default). The combination of graceful shutdown and the preStop hook is what resolves this timing gap.
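
As a rough way to reason about this timing gap, the sketch below models it with simple arithmetic. The function name unprotected_window and the 3-second propagation lag are hypothetical values chosen for illustration. Real propagation delay varies by cluster and should be measured rather than assumed.

```shell
#!/bin/sh
# Simplified model (an assumption, not a Kubernetes API): for how many seconds
# can traffic still reach the Pod after SIGTERM has already been delivered?
#   $1 = endpoint propagation lag in seconds (hypothetical; measure your cluster)
#   $2 = preStop sleep in seconds (0 when no preStop hook is configured)
unprotected_window() {
  lag=$1
  prestop=$2
  window=$((lag - prestop))
  # A preStop sleep at least as long as the lag leaves no gap at all.
  if [ "$window" -lt 0 ]; then
    window=0
  fi
  echo "$window"
}

unprotected_window 3 0    # no preStop hook: prints 3
unprotected_window 3 10   # 10-second preStop sleep: prints 0
```

In this model, any non-zero result is time during which requests can arrive at an application that has already been told to stop.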

Enabling Graceful Shutdown in Spring Boot

If you are on Spring Boot 2.3 or later, a single line added to application.properties is all you need.

server.shutdown=graceful

This changes the behavior after receiving SIGTERM as follows:

1. Stop accepting new requests
2. Wait for in-flight requests to complete
3. Shut down once they have completed

This setting does not exist in Spring Boot 2.2 and earlier, so check your version first.
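
The version check can be scripted, for example as a guard in a build or deploy pipeline. The following is a minimal sketch: the function name supports_graceful_shutdown is hypothetical, and it expects the version string to be passed in by hand. How you extract that string from pom.xml or build.gradle depends on your build setup and is not shown.

```shell
#!/bin/sh
# Succeeds (exit status 0) when a Spring Boot version string such as "2.3.1"
# or "2.3.1.RELEASE" is 2.3 or later, i.e. server.shutdown is available.
supports_graceful_shutdown() {
  version=$1
  major=${version%%.*}   # text before the first dot
  rest=${version#*.}     # text after the first dot
  minor=${rest%%.*}      # text before the next dot
  if [ "$major" -gt 2 ]; then
    return 0
  fi
  [ "$major" -eq 2 ] && [ "$minor" -ge 3 ]
}

if supports_graceful_shutdown "2.2.13.RELEASE"; then
  echo "server.shutdown=graceful is available"
else
  echo "upgrade to Spring Boot 2.3 or later first"   # this branch runs
fi
```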

Configuring the Timeout

Graceful shutdown has a wait timeout. The default is 30 seconds.

spring.lifecycle.timeout-per-shutdown-phase=30s

Requests that do not complete within this time are forcibly terminated. Choose a value comfortably above the P99 response time of your slowest endpoint. For typical REST APIs, 30 seconds is sufficient, but consider increasing it if some requests trigger long-running work such as heavy batch processing.

Note that this timeout applies to each shutdown phase separately. If your application has beans in several lifecycle phases (for example, SmartLifecycle beans with different phase values), the total shutdown time may exceed this value.

Delaying SIGTERM with a Kubernetes preStop Hook

This is the most commonly overlooked point.

Kubernetes starts removing the Pod from Endpoints as soon as it enters the Terminating state, and the preStop hook runs before SIGTERM is sent. By sleeping in the hook, you delay SIGTERM until kube-proxy has had time to apply the change, and the application keeps serving any requests that arrive in the meantime.

lifecycle:
  preStop:
    exec:
      command: ["sh", "-c", "sleep 10"]

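A sleep of 5 to 10 seconds is the generally recommended range. Ten seconds provides a comfortable buffer for most cluster configurations. Note that this hook requires sh and sleep to be present in the container image, so minimal images without a shell need a different approach.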

Aligning Numbers with terminationGracePeriodSeconds

Once you have configured preStop and graceful shutdown, verify that terminationGracePeriodSeconds is greater than their combined total.

terminationGracePeriodSeconds > preStop sleep seconds + timeout-per-shutdown-phase seconds

With a 10-second sleep and Spring Boot’s 30-second timeout, the value must exceed 40 seconds, so 60 seconds is a safe choice that leaves some margin.

terminationGracePeriodSeconds: 60

If this value is too low, Kubernetes will forcibly terminate the process with SIGKILL.
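
The inequality above is easy to check mechanically. The sketch below is one way to do so. The function names are hypothetical, and the optional third argument (how many shutdown phases might each use their full timeout) is an assumption added to account for the per-phase nature of the Spring Boot timeout.

```shell
#!/bin/sh
# Seconds needed before SIGKILL: the preStop sleep plus the Spring Boot
# timeout for each shutdown phase that might use its full allowance.
#   $1 = preStop sleep seconds
#   $2 = spring.lifecycle.timeout-per-shutdown-phase in seconds
#   $3 = number of slow shutdown phases to budget for (hypothetical; default 1)
required_grace_seconds() {
  prestop=$1
  timeout=$2
  phases=${3:-1}
  echo $((prestop + timeout * phases))
}

# Succeeds when terminationGracePeriodSeconds ($1) is strictly greater than
# the required total computed from the remaining arguments.
grace_period_is_enough() {
  grace=$1
  shift
  [ "$grace" -gt "$(required_grace_seconds "$@")" ]
}

required_grace_seconds 10 30                                       # prints 40
grace_period_is_enough 60 10 30 && echo "60s is enough"
grace_period_is_enough 60 10 30 2 || echo "60s is too short for two slow phases"
```

With the values used in this article, 60 seconds passes the check. If two phases could each take the full 30 seconds, the requirement rises to 70 seconds and the same setting fails.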

Integrating Readiness Probe with Actuator

If you are using Spring Boot Actuator, you can also expose the application's shutdown state to Kubernetes.

When shutdown begins, Spring Boot internally transitions to ReadinessState.REFUSING_TRAFFIC, causing /actuator/health/readiness to automatically return OUT_OF_SERVICE. If you point your Kubernetes Readiness Probe at this endpoint, Kubernetes sees the Pod as not ready as soon as shutdown starts, and the same probe keeps traffic away from new Pods until they have finished starting.

management.endpoint.health.probes.enabled=true
management.health.readinessstate.enabled=true
management.health.livenessstate.enabled=true

readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
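
To see what the probe sees, you can query the readiness endpoint yourself and extract the status field. The helper below is a minimal sketch. The function name readiness_status is hypothetical, and it assumes the status key appears first in the JSON body, as in the canned responses shown. The localhost URL in the comment assumes the application is reachable locally, for example through a port-forward.

```shell
#!/bin/sh
# Reads an Actuator health response on stdin and prints its top-level status.
# Assumes "status" is the first key in the body, e.g. {"status":"UP"}.
readiness_status() {
  sed -n 's/^[^"]*"status"[[:space:]]*:[[:space:]]*"\([A-Z_]*\)".*/\1/p' | head -n 1
}

# Canned responses for illustration. Against a running application you might
# pipe in the output of
#   curl -s http://localhost:8080/actuator/health/readiness
echo '{"status":"UP"}' | readiness_status               # prints UP
echo '{"status":"OUT_OF_SERVICE"}' | readiness_status   # prints OUT_OF_SERVICE
```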

Configuration Summary

For container image build instructions, refer to the Docker Containerization Guide.

Here is the final application.properties:

# Graceful Shutdown
server.shutdown=graceful
spring.lifecycle.timeout-per-shutdown-phase=30s

# Actuator Probes
management.endpoint.health.probes.enabled=true
management.health.readinessstate.enabled=true
management.health.livenessstate.enabled=true

And the relevant section of the Kubernetes Deployment YAML:

spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: app
          image: your-app:latest
          lifecycle:
            preStop:
              exec:
                command: ["sh", "-c", "sleep 10"]
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10

For the basics of deploying Spring Boot to Kubernetes, see this article as well.

Verifying: Send Requests During a Rolling Update

The clearest way to verify that your configuration is working correctly is to continuously send requests to a business endpoint during a rolling update. Since /actuator/health only returns health status, hit an actual API endpoint and check for any HTTP error responses.

# In a separate terminal, keep sending requests to a business endpoint
while true; do
  STATUS=$(curl -s -o /dev/null -w "%{http_code}" http://your-service/api/hello)
  echo "$(date): $STATUS"
  if [ "$STATUS" != "200" ]; then
    echo "ERROR: Got $STATUS"
  fi
  sleep 0.5
done

Then run the update in another terminal:

kubectl rollout restart deployment/your-app
kubectl rollout status deployment/your-app

If all responses are 200, the configuration is working. If you see 502 or 503 mixed in, or 000 (which curl reports when it gets no HTTP response at all), try increasing the preStop sleep duration or revisiting terminationGracePeriodSeconds.
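
For longer runs, scanning the output by eye becomes unreliable. The sketch below summarizes a saved copy of the loop's output. The function name summarize_statuses is hypothetical, and it assumes each status line ends with the HTTP code, as produced by the loop above.

```shell
#!/bin/sh
# Summarizes the request loop's output read from stdin. Each status line is
# expected to look like "<date>: <http_code>". The extra "ERROR: Got ..."
# lines the loop prints are skipped so failures are not counted twice.
summarize_statuses() {
  awk '
    /^ERROR:/ { next }
    NF == 0   { next }
    { total++; if ($NF != "200") errors++ }
    END { printf "total=%d errors=%d\n", total, errors }
  '
}

# Sample input for illustration; prints "total=3 errors=1".
printf '%s\n' \
  'Fri Apr  3 10:00:00 UTC 2026: 200' \
  'Fri Apr  3 10:00:01 UTC 2026: 503' \
  'ERROR: Got 503' \
  'Fri Apr  3 10:00:02 UTC 2026: 200' | summarize_statuses
```

One way to use it is to pipe the loop through tee into a log file while the rollout runs, then feed that file to summarize_statuses afterwards. A result with errors=0 indicates that no request failed during the update.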

Common Pitfalls

Configuring graceful shutdown without a preStop hook is the most frequent mistake. Graceful shutdown protects in-flight processing within the application, but it cannot prevent traffic that arrives due to kube-proxy’s propagation lag. Both settings must work together.

In Spring Boot 2.2 and earlier environments, server.shutdown=graceful has no effect. Check your version in pom.xml or build.gradle.

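management.endpoint.health.probes.enabled=true is also easy to forget. Spring Boot enables the probe endpoints automatically only when it detects a Kubernetes environment, so without this setting /actuator/health/readiness returns 404 elsewhere, for example when testing locally.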

Summary

Each setting has a clear role, and omitting any one of them can cause request loss at a different point in the shutdown sequence.

Without server.shutdown=graceful, in-flight requests are forcibly terminated the moment SIGTERM is received
Without the preStop hook, traffic loss remains for the duration of kube-proxy’s propagation lag
If terminationGracePeriodSeconds is too short, SIGKILL will terminate the process before graceful shutdown can complete
Without the Actuator readiness probe configured, Kubernetes has no application-level signal for when a Pod can accept traffic, either at startup or once shutdown begins

Combining this with observability setup via Micrometer + Prometheus lets you monitor error rates in real time during deployments for added peace of mind.
