I have been disconnected from writing for a while now. During that time, I was resetting myself and consuming content more than producing it.
I spent the first six months of 2025 interviewing, looking for a challenging role, and then landed as a Senior Performance Engineer at HDFC Bank. The journey has been great so far.
I delivered an AI product that went live with its MVP release, and my learning curve peaked there. The learning has not stopped since.
In this blog post I want to share what I learned about Kubernetes and the world of microservices, along with a pre-conceived notion about GC that my project completely broke.
In Kubernetes, memory management is not merely about allocation — it is about isolation, predictability, and stability of workloads running at scale.
Kubernetes is a portable container orchestration platform that automates many of the manual processes involved in deploying large applications.
From a performance engineer's perspective, K8s is not only a container orchestrator; I perceive it as a dynamic workload and resource control plane that directly influences system scalability, latency, and stability under load.
It:
- Schedules containers onto nodes
- Manages resource allocation (CPU, memory)
- Scales workloads automatically
- Restarts unhealthy components
- Routes traffic between services
Now, coming to the scope of this post: the fundamentals of memory management in Kubernetes.
Kubernetes Memory Management Fundamentals
At its core, Kubernetes does not manage memory directly. Instead, it relies on the Linux kernel’s control groups (cgroups) to enforce memory limits and isolation for containers.
Key Principles
Kubernetes memory management is based on:
- Requests → used for scheduling
- Limits → used for enforcement
- Node capacity → physical boundary
- Eviction policies → protection mechanism
Unlike CPU (which is compressible), memory is non-compressible. When memory is exhausted, something must be killed.
Memory Requests and Limits
Memory Requests
A memory request defines the minimum guaranteed memory for a container.
Used by:
- Kubernetes scheduler
- Node placement decisions
- QoS classification
Example:
```yaml
resources:
  requests:
    memory: "512Mi"
```
If a node does not have 512Mi available, the pod will not be scheduled there.
Memory Limits
A memory limit defines the maximum memory a container can use.
Example:
```yaml
resources:
  limits:
    memory: "1Gi"
```
If the container exceeds this:
- The Linux OOM killer terminates the container
- Kubernetes marks the container as OOMKilled
Important: Memory limits are enforced strictly.
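Putting requests and limits together, a container spec typically declares both in one resources block (the values here are illustrative, matching the examples above):

```yaml
resources:
  requests:
    memory: "512Mi"   # guaranteed minimum; used for scheduling and QoS
  limits:
    memory: "1Gi"     # hard ceiling; exceeding it triggers the OOM killer
```

With requests lower than limits, the pod falls into the Burstable QoS class; setting them equal would make it Guaranteed.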
Deep Dive: JAVA_OPTS and Memory Management in Kubernetes
While Kubernetes enforces container memory limits using cgroups, Java applications introduce an additional layer of complexity through JVM memory management. Misalignment between Kubernetes limits and JVM settings is one of the most common causes of OOMKills in production clusters.
This is where JAVA_OPTS becomes critically important.
What is JAVA_OPTS?
JAVA_OPTS is an environment variable used to pass JVM arguments at runtime. In Kubernetes deployments, it is typically defined inside the container spec to control heap sizing, garbage collection, and container awareness.
Example:
```yaml
env:
  - name: JAVA_OPTS
    value: >-
      -Xms512m -Xmx1024m -XX:+UseG1GC
```
These parameters directly influence how the JVM consumes memory within the container boundary.

Why JAVA_OPTS Matters in Kubernetes
In traditional VM environments, JVM heap sizing was tuned against machine memory. In Kubernetes, however, the JVM runs inside a container constrained by cgroups.
If not configured properly:
- JVM may assume more memory than the container limit
- Heap may grow too large
- Container gets OOMKilled
- Pod enters CrashLoopBackOff
Key insight: Kubernetes limits memory at the container level, but JVM manages memory internally. Both must be aligned.
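A minimal sketch of that alignment, assuming a 1Gi container limit and leaving roughly a quarter of it as headroom for non-heap memory (the 768m figure is illustrative, not a universal rule; the right ratio depends on the workload):

```yaml
resources:
  limits:
    memory: "1Gi"
env:
  - name: JAVA_OPTS
    # heap capped well below the 1Gi container limit,
    # so metaspace, thread stacks, and buffers still fit
    value: >-
      -Xmx768m
```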
Understanding JVM Memory Regions
Before tuning, it is important to understand where memory goes in a Java container.
Heap Memory
- Controlled by -Xms and -Xmx
- Stores application objects
- Usually the largest consumer
Non-Heap Memory
Includes:
- Metaspace
- Thread stacks
- Code cache
- Direct buffers (very important in Netty apps)
- GC structures
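To keep the total footprint under a container limit, each of these regions can be capped explicitly rather than left at JVM defaults. A sketch with illustrative sizes that would need validation under realistic load:

```yaml
env:
  - name: JAVA_OPTS
    # heap fixed at 512m; explicit caps on metaspace, direct
    # buffers, and per-thread stacks bound non-heap growth
    value: >-
      -Xms512m -Xmx512m
      -XX:MaxMetaspaceSize=128m
      -XX:MaxDirectMemorySize=128m
      -Xss512k
```

The sum of heap plus these non-heap caps should stay comfortably below the container memory limit.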
Container-Aware JVM (Modern Java)
Modern JVMs (Java 10+) are container-aware.
Important flags:
- -XX:+UseContainerSupport
- -XX:MaxRAMPercentage
- -XX:InitialRAMPercentage
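With container support enabled, the heap can be sized as a percentage of the container's memory limit instead of a fixed -Xmx, so the JVM adapts if the limit changes. The percentages below are common starting points, not recommendations:

```yaml
env:
  - name: JAVA_OPTS
    # heap scales with the cgroup memory limit:
    # max heap = 75% of the limit, initial heap = 50%
    value: >-
      -XX:+UseContainerSupport
      -XX:MaxRAMPercentage=75.0
      -XX:InitialRAMPercentage=50.0
```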
Conclusion
In Kubernetes environments, memory stability of Java applications depends heavily on correct JVM tuning through JAVA_OPTS. Kubernetes enforces the outer boundary, but the JVM controls internal consumption. Production incidents often arise when these two layers operate with mismatched assumptions.
A well-engineered system aligns container limits, heap sizing, non-heap overhead, and garbage collection behavior. Teams that validate JVM memory behavior under realistic load — rather than relying on defaults — achieve significantly higher resilience and performance predictability in cloud-native platforms.