Memory Management in Kubernetes

I have been disconnected from writing for a while now, since I was resetting myself and consuming content more than producing it.

I spent the first six months of 2025 interviewing, looking for a challenging role, and then landed as a Senior Performance Engineer at HDFC Bank. The journey has been great so far.

I delivered an AI product that went live with its MVP release, and my learning curve peaked there. The learning has never stopped since.

In this blog post I want to share my learnings about Kubernetes and the world of microservices, including a preconceived notion about GC that was completely broken in my project.

In Kubernetes, memory management is not merely about allocation — it is about isolation, predictability, and stability of workloads running at scale.

Kubernetes is a portable container orchestration platform that automates many of the manual processes involved in deploying large applications.

From a performance engineer's perspective, K8s is not only a container orchestrator; I perceive it as a dynamic workload and resource control plane that directly influences system scalability, latency, and stability under load.

Among other things, Kubernetes:

  • Schedules containers onto nodes
  • Manages resource allocation (CPU, memory)
  • Scales workloads automatically
  • Restarts unhealthy components
  • Routes traffic between services

Now, coming to the scope of this post: the fundamentals of memory management in Kubernetes.

Kubernetes Memory Management Fundamentals

At its core, Kubernetes does not manage memory directly. Instead, it relies on the Linux kernel’s control groups (cgroups) to enforce memory limits and isolation for containers.

Key Principles

Kubernetes memory management is based on:

  • Requests → used for scheduling
  • Limits → used for enforcement
  • Node capacity → physical boundary
  • Eviction policies → protection mechanism

Unlike CPU (which is compressible), memory is non-compressible. When memory is exhausted, something must be killed.

Memory Requests and Limits

Memory Requests

A memory request defines the minimum guaranteed memory for a container.

It is used for:

  • Scheduling decisions by the Kubernetes scheduler
  • Node placement
  • QoS classification

Example:

resources:
  requests:
    memory: "512Mi"

If a node does not have 512Mi available, the pod will not be scheduled there.
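
In a full manifest, the request sits under the container's resources block. A minimal sketch — the pod name and image here are illustrative, not from any real deployment:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-app            # illustrative name
spec:
  containers:
    - name: app
      image: demo/app:1.0   # illustrative image
      resources:
        requests:
          memory: "512Mi"   # scheduler guarantees at least this much
```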

Memory Limits

A memory limit defines the maximum memory a container can use.

Example:

resources:
  limits:
    memory: "1Gi"

If the container exceeds this:

  • Linux OOM killer terminates the container
  • Kubernetes marks container as OOMKilled

Important: Memory limits are enforced strictly.
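
In practice, requests and limits are usually declared together. A sketch of the common pattern — note that when requests equal limits the pod gets the Guaranteed QoS class, while requests below limits (as here) yield Burstable:

```yaml
resources:
  requests:
    memory: "512Mi"   # guaranteed at scheduling time
  limits:
    memory: "1Gi"     # hard ceiling enforced by cgroups
```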

Deep Dive: JAVA_OPTS and Memory Management in Kubernetes

While Kubernetes enforces container memory limits using cgroups, Java applications introduce an additional layer of complexity through JVM memory management. Misalignment between Kubernetes limits and JVM settings is one of the most common causes of OOMKills in production clusters.

This is where JAVA_OPTS becomes critically important.

What is JAVA_OPTS?

JAVA_OPTS is an environment variable used to pass JVM arguments at runtime. In Kubernetes deployments, it is typically defined inside the container spec to control heap sizing, garbage collection, and container awareness.

Example:

env:
  - name: JAVA_OPTS
    value: >
      -Xms512m
      -Xmx1024m
      -XX:+UseG1GC

These parameters directly influence how the JVM consumes memory within the container boundary.
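
Note that JAVA_OPTS is a convention, not something the JVM reads automatically: the container's entrypoint must expand it. A common pattern looks like the sketch below (the jar path is illustrative):

```dockerfile
# The shell form is required so that $JAVA_OPTS is actually
# substituted before the JVM starts.
ENTRYPOINT ["sh", "-c", "java $JAVA_OPTS -jar /app/app.jar"]
```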


Why JAVA_OPTS Matters in Kubernetes

In traditional VM environments, JVM heap sizing was tuned against machine memory. In Kubernetes, however, the JVM runs inside a container constrained by cgroups.

If not configured properly:

  • JVM may assume more memory than the container limit
  • Heap may grow too large
  • Container gets OOMKilled
  • Pod enters CrashLoopBackOff

Key insight: Kubernetes limits memory at the container level, but JVM manages memory internally. Both must be aligned.
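
A quick way to check that the two layers agree is to ask the JVM what limits it actually sees. A minimal sketch (the class name is mine; the numbers printed will depend on the container's cgroup limits and any -Xmx / -XX:MaxRAMPercentage settings):

```java
// Prints the memory and CPU resources the JVM believes it has.
// In a container-aware JVM (Java 10+), maxMemory() reflects the
// cgroup memory limit (scaled by MaxRAMPercentage), not host RAM.
public class JvmLimitsCheck {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("Max heap (MiB):   " + rt.maxMemory() / (1024 * 1024));
        System.out.println("Total heap (MiB): " + rt.totalMemory() / (1024 * 1024));
        System.out.println("CPUs visible:     " + rt.availableProcessors());
    }
}
```

Run it inside the pod (for example via kubectl exec) and compare "Max heap" against the container's memory limit: if the heap alone is close to the limit, non-heap memory will push the container over and an OOMKill is only a matter of load.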

Understanding JVM Memory Regions

Before tuning, it is important to understand where memory goes in a Java container.

Heap Memory

  • Controlled by -Xms and -Xmx
  • Stores application objects
  • Usually the largest consumer

Non-Heap Memory

Includes:

  • Metaspace
  • Thread stacks
  • Code cache
  • Direct buffers (very important in Netty apps)
  • GC structures

Container-Aware JVM (Modern Java)

Modern JVMs (Java 10+) are container-aware: they size themselves against the cgroup memory limit rather than host RAM. -XX:+UseContainerSupport is enabled by default on these versions, and the RAM-percentage flags take a percentage of the container limit as their value.

Important flags:

-XX:+UseContainerSupport
-XX:MaxRAMPercentage
-XX:InitialRAMPercentage
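
With percentage-based sizing, the heap scales with the container limit instead of being hard-coded per environment. A sketch — 75% is a common starting point that leaves roughly a quarter of the limit for non-heap memory, but the right value should be validated under realistic load:

```yaml
env:
  - name: JAVA_OPTS
    value: >
      -XX:InitialRAMPercentage=50.0
      -XX:MaxRAMPercentage=75.0
      -XX:+UseG1GC
```

If the container limit later moves from 1Gi to 2Gi, the heap ceiling follows automatically — one less place for Kubernetes and JVM settings to drift apart.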

Conclusion

In Kubernetes environments, memory stability of Java applications depends heavily on correct JVM tuning through JAVA_OPTS. Kubernetes enforces the outer boundary, but the JVM controls internal consumption. Production incidents often arise when these two layers operate with mismatched assumptions.

A well-engineered system aligns container limits, heap sizing, non-heap overhead, and garbage collection behavior. Teams that validate JVM memory behavior under realistic load — rather than relying on defaults — achieve significantly higher resilience and performance predictability in cloud-native platforms.
