Memory Management in Kubernetes

I have been disconnected from writing for a while now, since I was resetting myself and consuming content more than producing it.

I spent the first six months of 2025 interviewing, looking for a challenging role, and then landed as a Senior Performance Engineer at HDFC Bank. The journey has been great so far.

I delivered an AI product that went live with its MVP release, and my learning curve peaked there. The learning has never stopped since.

In this blog post I want to share my learnings about Kubernetes and the world of microservices, including a preconceived notion about GC that was completely broken in my project.

In Kubernetes, memory management is not merely about allocation — it is about isolation, predictability, and stability of workloads running at scale.

Kubernetes is a portable container orchestration platform that automates many of the manual processes involved in deploying large applications.

From a performance engineer's perspective, K8s is not only a container orchestrator; I perceive it as a dynamic workload and resource control plane that directly influences system scalability, latency, and stability under load.

Among other things, Kubernetes:

  • Schedules containers onto nodes
  • Manages resource allocation (CPU, memory)
  • Scales workloads automatically
  • Restarts unhealthy components
  • Routes traffic between services

Now, coming to the scope of this post: the fundamentals of memory management in Kubernetes.

Kubernetes Memory Management Fundamentals

At its core, Kubernetes does not manage memory directly. Instead, it relies on the Linux kernel’s control groups (cgroups) to enforce memory limits and isolation for containers.

Key Principles

Kubernetes memory management is based on:

  • Requests → used for scheduling
  • Limits → used for enforcement
  • Node capacity → physical boundary
  • Eviction policies → protection mechanism

Unlike CPU (which is compressible), memory is non-compressible. When memory is exhausted, something must be killed.

Memory Requests and Limits

Memory Requests

A memory request defines the minimum guaranteed memory for a container.

It is used for:

  • Scheduling decisions by the Kubernetes scheduler
  • Node placement
  • QoS classification

Example:

resources:
  requests:
    memory: "512Mi"

If a node does not have 512Mi available, the pod will not be scheduled there.
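
In a full manifest, the request sits under the container's resources block. A minimal sketch — the pod name and image here are illustrative, not from any real deployment:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: demo-app            # illustrative name
spec:
  containers:
    - name: app
      image: demo/app:1.0   # illustrative image
      resources:
        requests:
          memory: "512Mi"   # scheduler guarantees at least this much
```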

Memory Limits

A memory limit defines the maximum memory a container can use.

Example:

resources:
  limits:
    memory: "1Gi"

If the container exceeds this:

  • Linux OOM killer terminates the container
  • Kubernetes marks container as OOMKilled

Important: Memory limits are enforced strictly.
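
In practice, requests and limits are usually declared together. A sketch of the common pattern — note that when requests equal limits the pod gets the Guaranteed QoS class, while requests below limits (as here) yield Burstable:

```yaml
resources:
  requests:
    memory: "512Mi"   # guaranteed at scheduling time
  limits:
    memory: "1Gi"     # hard ceiling enforced by cgroups
```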

Deep Dive: JAVA_OPTS and Memory Management in Kubernetes

While Kubernetes enforces container memory limits using cgroups, Java applications introduce an additional layer of complexity through JVM memory management. Misalignment between Kubernetes limits and JVM settings is one of the most common causes of OOMKills in production clusters.

This is where JAVA_OPTS becomes critically important.

What is JAVA_OPTS?

JAVA_OPTS is an environment variable used to pass JVM arguments at runtime. In Kubernetes deployments, it is typically defined inside the container spec to control heap sizing, garbage collection, and container awareness.

Example:

env:
  - name: JAVA_OPTS
    value: >
      -Xms512m
      -Xmx1024m
      -XX:+UseG1GC

These parameters directly influence how the JVM consumes memory within the container boundary.
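
Note that JAVA_OPTS is a convention, not something the JVM reads automatically: the container's entrypoint must expand it. A common pattern looks like the sketch below (the jar path is illustrative):

```dockerfile
# The shell form is required so that $JAVA_OPTS is actually
# substituted before the JVM starts.
ENTRYPOINT ["sh", "-c", "java $JAVA_OPTS -jar /app/app.jar"]
```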


Why JAVA_OPTS Matters in Kubernetes

In traditional VM environments, JVM heap sizing was tuned against machine memory. In Kubernetes, however, the JVM runs inside a container constrained by cgroups.

If not configured properly:

  • JVM may assume more memory than the container limit
  • Heap may grow too large
  • Container gets OOMKilled
  • Pod enters CrashLoopBackOff

Key insight: Kubernetes limits memory at the container level, but JVM manages memory internally. Both must be aligned.
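
A quick way to check that the two layers agree is to ask the JVM what limits it actually sees. A minimal sketch (the class name is mine; the numbers printed will depend on the container's cgroup limits and any -Xmx / -XX:MaxRAMPercentage settings):

```java
// Prints the memory and CPU resources the JVM believes it has.
// In a container-aware JVM (Java 10+), maxMemory() reflects the
// cgroup memory limit (scaled by MaxRAMPercentage), not host RAM.
public class JvmLimitsCheck {
    public static void main(String[] args) {
        Runtime rt = Runtime.getRuntime();
        System.out.println("Max heap (MiB):   " + rt.maxMemory() / (1024 * 1024));
        System.out.println("Total heap (MiB): " + rt.totalMemory() / (1024 * 1024));
        System.out.println("CPUs visible:     " + rt.availableProcessors());
    }
}
```

Run it inside the pod (for example via kubectl exec) and compare "Max heap" against the container's memory limit: if the heap alone is close to the limit, non-heap memory will push the container over and an OOMKill is only a matter of load.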

Understanding JVM Memory Regions

Before tuning, it is important to understand where memory goes in a Java container.

Heap Memory

  • Controlled by -Xms and -Xmx
  • Stores application objects
  • Usually the largest consumer

Non-Heap Memory

Includes:

  • Metaspace
  • Thread stacks
  • Code cache
  • Direct buffers (very important in Netty apps)
  • GC structures

Container-Aware JVM (Modern Java)

Modern JVMs (Java 10+) are container-aware: they size themselves against the cgroup memory limit rather than host RAM. -XX:+UseContainerSupport is enabled by default on these versions, and the RAM-percentage flags take a percentage of the container limit as their value.

Important flags:

-XX:+UseContainerSupport
-XX:MaxRAMPercentage
-XX:InitialRAMPercentage
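
With percentage-based sizing, the heap scales with the container limit instead of being hard-coded per environment. A sketch — 75% is a common starting point that leaves roughly a quarter of the limit for non-heap memory, but the right value should be validated under realistic load:

```yaml
env:
  - name: JAVA_OPTS
    value: >
      -XX:InitialRAMPercentage=50.0
      -XX:MaxRAMPercentage=75.0
      -XX:+UseG1GC
```

If the container limit later moves from 1Gi to 2Gi, the heap ceiling follows automatically — one less place for Kubernetes and JVM settings to drift apart.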

Conclusion

In Kubernetes environments, memory stability of Java applications depends heavily on correct JVM tuning through JAVA_OPTS. Kubernetes enforces the outer boundary, but the JVM controls internal consumption. Production incidents often arise when these two layers operate with mismatched assumptions.

A well-engineered system aligns container limits, heap sizing, non-heap overhead, and garbage collection behavior. Teams that validate JVM memory behavior under realistic load — rather than relying on defaults — achieve significantly higher resilience and performance predictability in cloud-native platforms.
