Have you ever experienced errors being returned to some users during a Kubernetes rolling update?
The cause is almost always insufficient graceful shutdown configuration. By correctly configuring both the Spring Boot side and the Kubernetes side, you can deploy without dropping any in-flight requests.
Why Requests Fail During Deployment
When Kubernetes terminates a Pod during a rolling update, the shutdown sequence proceeds in the following order:
- The Pod enters Terminating state and is removed from Endpoints
- The preStop hook executes (if configured)
- SIGTERM is sent to the application after preStop completes
- SIGKILL is sent after terminationGracePeriodSeconds elapses
The key thing to watch out for here is that even after a Pod is removed from Endpoints, it takes a few seconds for kube-proxy to reflect that change. Without a preStop hook, traffic may still be arriving when SIGTERM is received.
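Furthermore, unless graceful shutdown is enabled, Spring Boot stops serving immediately upon receiving SIGTERM, which forcibly terminates any in-flight requests. (Immediate shutdown was the default before Spring Boot 3.4; later versions default to graceful.) The combination of graceful shutdown and the preStop hook is what resolves this timing gap.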
Enabling Graceful Shutdown in Spring Boot
If you are on Spring Boot 2.3 or later, a single line added to application.properties is all you need.
server.shutdown=graceful
This changes the behavior after receiving SIGTERM as follows:
- Stop accepting new requests
- Wait for in-flight requests to complete
- Shut down once they have completed
This setting does not exist in Spring Boot 2.2 and earlier, so check your version first.
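As a quick way to apply the version check above, the following sketch compares a version string against the 2.3 minimum. The function name supports_graceful_shutdown and the sample version string are illustrative assumptions, not part of Spring Boot; in practice you would read the version from pom.xml or build.gradle.

```sh
# Returns 0 if the given Spring Boot version supports server.shutdown=graceful (2.3+).
# Only the major and minor numbers are compared; suffixes such as ".RELEASE" are ignored.
supports_graceful_shutdown() {
  version=$1
  major=${version%%.*}   # text before the first dot
  rest=${version#*.}     # text after the first dot
  minor=${rest%%.*}      # text before the next dot
  if [ "$major" -gt 2 ]; then
    return 0
  fi
  [ "$major" -eq 2 ] && [ "$minor" -ge 3 ]
}

# Example with a hypothetical version string
if supports_graceful_shutdown "2.2.13.RELEASE"; then
  echo "graceful shutdown available"
else
  echo "upgrade required"
fi
```

With the sample value, the script prints "upgrade required" because 2.2 is below the 2.3 minimum.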
Configuring the Timeout
Graceful shutdown has a wait timeout. The default is 30 seconds.
spring.lifecycle.timeout-per-shutdown-phase=30s
Requests that do not complete within this time are forcibly terminated. Base this value on your API’s P99 response time. For typical REST APIs, 30 seconds is sufficient, but consider increasing it if you have heavy batch processing.
Note that this is a timeout per shutdown phase. For applications with multiple Bean lifecycle phases, keep in mind that the total shutdown time may exceed this value.
Delaying SIGTERM with a Kubernetes preStop Hook
This is the most commonly overlooked point.
Kubernetes starts removing the Pod from Endpoints as soon as it enters the Terminating state, and the preStop hook runs before SIGTERM is sent. By inserting a sleep in the hook, you delay SIGTERM until kube-proxy has had time to propagate the change.
lifecycle:
  preStop:
    exec:
      command: ["sh", "-c", "sleep 10"]
5 to 10 seconds of sleep is the generally recommended range. Ten seconds provides a comfortable buffer for most cluster configurations.
Aligning Numbers with terminationGracePeriodSeconds
Once you have configured preStop and graceful shutdown, verify that terminationGracePeriodSeconds is greater than their combined total.
terminationGracePeriodSeconds > preStop sleep seconds + timeout-per-shutdown-phase seconds
With a 10-second sleep and Spring Boot’s 30-second timeout, the total is 40 seconds, so setting it to 60 seconds leaves a comfortable margin.
terminationGracePeriodSeconds: 60
If this value is too low, Kubernetes will forcibly terminate the process with SIGKILL.
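The inequality above can be checked with a few lines of shell arithmetic. This is a minimal sketch that assumes a single shutdown phase consumes the full timeout; the function name check_grace_budget is hypothetical, and the three arguments are the preStop sleep, timeout-per-shutdown-phase, and terminationGracePeriodSeconds, all in seconds.

```sh
# Prints the worst-case termination timeline and fails if the grace period is too short.
# Arguments: preStop sleep, timeout-per-shutdown-phase, terminationGracePeriodSeconds (seconds)
check_grace_budget() {
  prestop=$1
  timeout=$2
  grace=$3
  needed=$((prestop + timeout))
  echo "t=0s: Pod marked Terminating, preStop starts"
  echo "t=${prestop}s: preStop ends, SIGTERM sent"
  echo "t=${needed}s: graceful shutdown phase times out at the latest"
  echo "t=${grace}s: SIGKILL"
  if [ "$grace" -gt "$needed" ]; then
    echo "OK: $((grace - needed))s of headroom"
  else
    echo "TOO SHORT: needs more than ${needed}s"
    return 1
  fi
}

# Values used in this article
check_grace_budget 10 30 60
```

With the article's values, the last line reads "OK: 20s of headroom". If your application has several lifecycle phases that each wait, pass a correspondingly larger timeout.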
Integrating Readiness Probe with Actuator
If you are using Spring Boot Actuator, you can also expose the application's shutdown state to Kubernetes through the readiness probe.
When graceful shutdown begins, Spring Boot internally transitions to ReadinessState.REFUSING_TRAFFIC, causing /actuator/health/readiness to automatically return OUT_OF_SERVICE. If you point your Kubernetes Readiness Probe at this endpoint, traffic will be cut off automatically.
management.endpoint.health.probes.enabled=true
management.health.readinessstate.enabled=true
management.health.livenessstate.enabled=true
readinessProbe:
  httpGet:
    path: /actuator/health/readiness
    port: 8080
  initialDelaySeconds: 10
  periodSeconds: 5
livenessProbe:
  httpGet:
    path: /actuator/health/liveness
    port: 8080
  initialDelaySeconds: 30
  periodSeconds: 10
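To see what the probe sees, you can inspect the readiness response body yourself. The sketch below assumes the compact JSON form in which "status" is the first key, as in {"status":"UP"}; the helper name health_status is hypothetical, and the sample bodies are canned rather than fetched from a live application.

```sh
# Extracts the top-level "status" value from an Actuator health response on stdin.
# Assumes "status" is the first key of a compact JSON body.
health_status() {
  sed -n 's/^{"status":"\([A-Z_]*\)".*/\1/p'
}

# Canned examples: normal operation, then the state reported during graceful shutdown
echo '{"status":"UP"}' | health_status
echo '{"status":"OUT_OF_SERVICE"}' | health_status
```

Against a running application, the same helper could be fed from curl -s http://localhost:8080/actuator/health/readiness, assuming port 8080 is reachable from where you run it.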
Configuration Summary
For container image build instructions, refer to the Docker Containerization Guide.
Here is the final application.properties:
# Graceful Shutdown
server.shutdown=graceful
spring.lifecycle.timeout-per-shutdown-phase=30s
# Actuator Probes
management.endpoint.health.probes.enabled=true
management.health.readinessstate.enabled=true
management.health.livenessstate.enabled=true
And the relevant section of the Kubernetes Deployment YAML:
spec:
  template:
    spec:
      terminationGracePeriodSeconds: 60
      containers:
        - name: app
          image: your-app:latest
          lifecycle:
            preStop:
              exec:
                command: ["sh", "-c", "sleep 10"]
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 10
            periodSeconds: 5
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 30
            periodSeconds: 10
For the basics of deploying Spring Boot to Kubernetes, see this article as well.
Verifying: Send Requests During a Rolling Update
The clearest way to verify that your configuration is working correctly is to continuously send requests to a business endpoint during a rolling update. Since /actuator/health only returns health status, hit an actual API endpoint and check for any HTTP error responses.
# In a separate terminal, keep sending requests to a business endpoint
while true; do
  STATUS=$(curl -s -o /dev/null -w "%{http_code}" http://your-service/api/hello)
  echo "$(date): $STATUS"
  if [ "$STATUS" != "200" ]; then
    echo "ERROR: Got $STATUS"
  fi
  sleep 0.5
done
Then run the update in another terminal:
kubectl rollout restart deployment/your-app
kubectl rollout status deployment/your-app
If all responses are 200, the configuration is successful. If you see 502 or 503 mixed in, try increasing the preStop sleep duration or revisiting terminationGracePeriodSeconds.
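Scanning a long stream of status lines by eye is error-prone, so it can help to tally the results. The following sketch reads one HTTP status code per line and reports how many were not 200; the function name tally_statuses is hypothetical, and the example uses canned codes instead of live traffic.

```sh
# Reads one HTTP status code per line on stdin and prints a summary.
# The exit status is 0 only if every code was 200.
tally_statuses() {
  total=0
  errors=0
  while read -r code; do
    total=$((total + 1))
    if [ "$code" != "200" ]; then
      errors=$((errors + 1))
    fi
  done
  echo "total=$total errors=$errors"
  [ "$errors" -eq 0 ]
}

# Example with canned status codes
printf '200\n200\n503\n200\n' | tally_statuses || echo "errors detected during rollout"
```

To use it with real traffic, pipe a bounded loop of curl calls that print only the status code (one per line) into tally_statuses while the rollout runs.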
Common Pitfalls
Configuring graceful shutdown without a preStop hook is the most frequent mistake. Graceful shutdown protects in-flight processing within the application, but it cannot prevent traffic that arrives due to kube-proxy’s propagation lag. Both settings must work together.
In Spring Boot 2.2 and earlier environments, server.shutdown=graceful has no effect. Check your version in pom.xml or build.gradle.
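Omitting management.endpoint.health.probes.enabled=true is another easy mistake. Spring Boot enables the probe endpoints automatically when it detects a Kubernetes environment, but elsewhere (for example, when testing locally) /actuator/health/readiness returns 404 without this setting.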
Summary
Each configuration has a clear role, and omitting even one will cause request loss somewhere else.
- Without server.shutdown=graceful, in-flight requests are forcibly terminated the moment SIGTERM is received
- Without the preStop hook, traffic loss remains for the duration of kube-proxy’s propagation lag
- If terminationGracePeriodSeconds is too short, SIGKILL will terminate the process before graceful shutdown can complete
- Without the Actuator readiness probe configured, Kubernetes continues sending traffic after shutdown begins
Combining this with observability setup via Micrometer + Prometheus lets you monitor error rates in real time during deployments for added peace of mind.