Lab5 (Spring Boot/K8S): Understanding Kubernetes Resources Management

Welcome back to our series. In this post, we’ll explore how to manage resources within Kubernetes.

· Resource types
· Kubernetes Resources requests and limits
∘ requests
∘ limits
∘ Requests + Limits
· Understanding pod QoS classes
∘ Guaranteed
∘ Burstable
∘ BestEffort
· Namespace settings
∘ Limit Ranges
∘ ResourceQuota object
· Conclusion

This series of stories shows how to use Kubernetes in the Spring ecosystem. We work with a Spring Boot API and Minikube to have a lightweight and fast development environment similar to production.

So far we’ve created pods without caring how much CPU and memory they’re allowed to consume. The most common resources to specify are CPU and memory (RAM) but there are others.

Resource types

A resource type has a base unit. The resource types include CPU, memory, disk I/O, and network bandwidth.

  • CPU: It represents compute processing and is measured in cores (vCPUs) or millicores; 1000m (milliCPU) equals one vCPU.
  • Memory: The amount of RAM available on the nodes in the cluster. It is measured in bytes or power-of-two units like KiB, MiB, GiB, etc.
  • Storage or Disk I/O: The amount of persistent storage available in the cluster. It is measured in bytes or power-of-two units like KiB, MiB, GiB, etc.
  • Network Bandwidth: It allows you to allocate a certain amount of network bandwidth to containers and pods. It is measured in bits per second or power-of-ten units like Kbps, Mbps, Gbps, etc.
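
As an illustration (the values here are arbitrary), CPU and memory quantities can be written in several equivalent notations in a manifest:

```yaml
resources:
  requests:
    cpu: "0.5"       # fractional cores; equivalent to "500m" (500 millicores)
    memory: "128Mi"  # power-of-two unit (mebibytes); "128M" would instead mean 128 * 10^6 bytes
```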

Kubernetes Resources requests and limits

requests

Kubernetes defines requests as the minimum amount of a resource (CPU or memory) a pod requires to run.

When we create a Pod, the Kubernetes scheduler selects a node for the Pod to run on. Each node has a maximum capacity for each resource type: the amount of CPU and memory it can provide for Pods. The scheduler ensures that, for each resource type, the sum of the resource requests of the scheduled containers is less than the capacity of the node. If the amount of unallocated CPU or memory is less than what the pod requests, Kubernetes will not schedule the pod to that node, because the node can’t provide the minimum amount required by the pod.

Resource requests are defined in the spec.containers[].resources.requests field of the pod manifests.

resources:
  requests:
    cpu: "500m"     # 500 millicores, i.e. half a CPU core
    memory: "128Mi" # 128 MiB (mebibytes) of memory

limits

Kubernetes defines limits as the maximum amount of a resource (CPU or memory) to be used by a container.

When we specify a resource limit for a container, the kubelet enforces those limits so that the running container is not allowed to use more of that resource than the limit you set. When a process tries to allocate memory over its limit, the process is killed with an OOMKilled error (Out Of Memory). If the pod’s restart policy is set to Always or OnFailure, the process is restarted immediately, so we may not even notice it getting killed. But if it keeps going over the memory limit and getting killed, Kubernetes will begin restarting it with increasing delays between restarts.

Limits are set in a similar way to requests by using the spec.containers[].resources.limits field in the pod manifests:

resources:
  limits:
    cpu: "1"        # 1 CPU core
    memory: "256Mi" # 256 MiB (mebibytes) of memory

Requests + Limits

It’s common to use both resource requests and limits to prevent excessive consumption.

apiVersion: v1
kind: Pod
metadata:
  name: limit-request-example
  namespace: rs-demo
spec:
  containers:
  - name: nginx-demo
    image: nginx
    resources:
      requests:
        cpu: "500m"
        memory: "128Mi"
      limits:
        cpu: "1"
        memory: "256Mi"

The kubectl top command displays the current CPU and memory usage of pods and nodes. It relies on the metrics-server add-on, which you can enable in Minikube with minikube addons enable metrics-server.

The resource metrics are available in the Kubernetes Dashboard for the pods and containers in various ways, such as graphs, charts, or tables.

Understanding pod QoS classes

Kubernetes uses Quality of Service (QoS) classes to decide which Pods to evict when a Node runs out of resources. It categorizes Pods into three QoS classes: Guaranteed, Burstable, and BestEffort.

Guaranteed

Pods in the Guaranteed class have the strictest resource guarantees and are the least likely to be evicted. They will not be killed unless they exceed their limits, or there are no lower-priority Pods that can be preempted from the Node.

apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-pod
  namespace: rs-demo
spec:
  containers:
  - name: guaranteed-container-demo
    image: nginx
    resources:
      limits:
        memory: "200Mi"
        cpu: "700m"
      requests:
        memory: "200Mi"
        cpu: "700m"

For a pod’s class to be Guaranteed, three things need to be true:

  • Requests and limits need to be set for both CPU and memory.
  • They need to be set for each container.
  • They need to be equal (the limit must match the request for each resource in each container).
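
There is a shortcut to meet these criteria: if you set only limits and omit requests, Kubernetes copies the limits into the requests. A variant of the pod above (the name here is hypothetical) that specifies only limits therefore still lands in the Guaranteed class:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: guaranteed-limits-only  # hypothetical name
  namespace: rs-demo
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      limits:          # requests are omitted, so they default to these limits
        memory: "200Mi"
        cpu: "700m"
```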

View detailed information about the Pod:

$ kubectl get pod guaranteed-pod --namespace=rs-demo --output=yaml

The output shows that Kubernetes gave the Pod a qosClass of Guaranteed. The output also verifies that the Pod Container has a memory request that matches its memory limit and it has a CPU request that matches its CPU limit.

Burstable

Pods in the Burstable class have a lower-bound resource guarantee based on their requests but do not require limits. If a limit is not specified, it effectively defaults to the Node's capacity, allowing the Pod to use more resources when they are available.

apiVersion: v1
kind: Pod
metadata:
  name: burstable-pod
  namespace: rs-demo
spec:
  containers:
  - name: burstable-container-demo
    image: nginx
    resources:
      limits:
        memory: "200Mi"
      requests:
        memory: "100Mi"

A Pod is given a QoS class of Burstable if:

  • The Pod does not meet the criteria for the QoS class Guaranteed.
  • At least one Container in the Pod has a memory or CPU request or limit.
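
A single request is already enough to meet the second criterion. For example, this minimal pod (a hypothetical variant, not part of the lab) sets only a CPU request and is therefore classified as Burstable:

```yaml
apiVersion: v1
kind: Pod
metadata:
  name: burstable-minimal  # hypothetical name
  namespace: rs-demo
spec:
  containers:
  - name: nginx
    image: nginx
    resources:
      requests:
        cpu: "100m"  # one request, no limits: not Guaranteed, but not BestEffort either
```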

View detailed information about the Pod:

$ kubectl get pod burstable-pod --namespace=rs-demo --output=yaml

The output shows that Kubernetes gave the Pod a qosClass of Burstable.

BestEffort

Pods in the BestEffort QoS class can use node resources that aren’t specifically assigned to Pods in other QoS classes. It’s assigned to pods that don’t have any requests or limits set at all (in any of their containers).

apiVersion: v1
kind: Pod
metadata:
  name: best-effort-pod
  namespace: rs-demo
spec:
  containers:
  - name: best-effort-container-demo
    image: nginx

A Pod has a QoS class of BestEffort if it doesn’t meet the criteria for either Guaranteed or Burstable.

View detailed information about the Pod:

$ kubectl get pod best-effort-pod --namespace=rs-demo --output=yaml

The output shows that Kubernetes gave the Pod a qosClass of BestEffort.

Namespace settings

Limit Ranges

A LimitRange is a policy to constrain the resource allocations (limits and requests) that we can specify for each applicable object kind in a namespace. It is enforced in a particular namespace when there is a LimitRange object in that namespace.

LimitRange provides constraints that can:

  • Enforce minimum and maximum computing resource usage per pod or container in a namespace.
  • Enforce minimum and maximum storage requests per PersistentVolumeClaim in a namespace.
  • Enforce a ratio between request and limit for a resource in a namespace.
  • Set default request/limit for compute resources in a namespace and automatically inject them to Containers at runtime.

Let’s look at a full example of a LimitRange below:

apiVersion: v1
kind: LimitRange
metadata:
  name: limit-range
spec:
  limits:
  - type: Pod                # Limits that apply to the pod as a whole
    min:                     # Minimum CPU and memory all the pod's containers can request in total
      cpu: 60m
      memory: 6Mi
    max:                     # Maximum CPU and memory limits for all the pod's containers combined
      cpu: 1
      memory: 1Gi
  - type: Container          # Limits that apply to each individual container
    defaultRequest:          # Default requests applied to containers that don't specify them explicitly
      cpu: 100m
      memory: 10Mi
    default:                 # Default limits for containers that don't specify them
      cpu: 200m
      memory: 100Mi
    min:                     # Minimum CPU and memory a container can request
      cpu: 60m
      memory: 6Mi
    max:                     # Maximum CPU and memory limit a container can have
      cpu: 1
      memory: 1Gi
    maxLimitRequestRatio:    # Maximum ratio between the limit and the request for each resource
      cpu: 4
      memory: 10
  - type: PersistentVolumeClaim  # Limits on the amount of storage a PVC can request
    min:
      storage: 1Gi
    max:
      storage: 20Gi

Apply the LimitRange to the namespace:

$ kubectl apply -f k8s-limit.yaml --namespace=rs-demo
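
With the LimitRange active, the defaults are injected at admission time. If you create a container in this namespace with no resources section at all, the stored Pod would end up with something like the following (a sketch, assuming the LimitRange above is the only one in the namespace):

```yaml
resources:
  requests:        # injected from the LimitRange defaultRequest
    cpu: 100m
    memory: 10Mi
  limits:          # injected from the LimitRange default
    cpu: 200m
    memory: 100Mi
```

Note that a pod with injected requests and limits that differ is classified as Burstable, not Guaranteed.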

ResourceQuota object

A resource quota, defined by a ResourceQuota object, provides constraints that limit aggregate resource consumption per namespace. It can limit the number of objects that can be created in a namespace by type, as well as the total amount of compute resources that may be consumed by resources in that namespace.

Resource quotas work like this:

  • Different teams work in different namespaces. This can be enforced with RBAC.
  • The administrator creates one ResourceQuota for each namespace.
  • Users create resources (pods, services, etc.) in the namespace, and the quota system tracks usage to ensure it does not exceed hard resource limits defined in a ResourceQuota.
  • If creating or updating a resource violates a quota constraint, the request will fail with the HTTP status code 403 FORBIDDEN with a message explaining the constraint that would have been violated.
  • If the quota is enabled in a namespace for compute resources like cpu and memory, users must specify requests or limits for those values; otherwise, the quota system may reject pod creation. Hint: Use the LimitRanger admission controller to force defaults for pods that make no compute resource requirements.

Here is a manifest for an example ResourceQuota:

apiVersion: v1
kind: ResourceQuota
metadata:
  name: quota-demo
  namespace: rs-demo
spec:
  hard:
    requests.cpu: "1"
    requests.memory: 1Gi
    limits.cpu: "2"
    limits.memory: 2Gi

This quota enforces the following constraints in the namespace:

  • Total CPU requests across all pods cannot exceed 1 core.
  • Total memory requests across all pods cannot exceed 1 gibibyte.
  • The total CPU limit across all pods cannot exceed 2 cores.
  • The total memory limit across all pods cannot exceed 2 gibibytes.

You can inspect the quota and its current usage with kubectl describe quota --namespace=rs-demo.

When a quota for a specific resource (CPU or memory) is configured (request or limit), pods need to have the request or limit (respectively) set for that same resource; otherwise, the API server will not accept the pod.

A ResourceQuota object can also limit the amount of persistent storage that can be claimed in the namespace.
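
For instance, a quota like the following (hypothetical name and values) caps both the total storage requested by PersistentVolumeClaims and the number of claims in the namespace:

```yaml
apiVersion: v1
kind: ResourceQuota
metadata:
  name: storage-quota-demo       # hypothetical name
  namespace: rs-demo
spec:
  hard:
    requests.storage: "50Gi"     # total storage all PVCs in the namespace may request
    persistentvolumeclaims: "5"  # maximum number of PVCs in the namespace
```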

Conclusion

In this story, we learned how Kubernetes manages compute resources: resource requests and limits, pod QoS classes, LimitRanges, and ResourceQuotas.

The complete source code of this series is available on GitHub.

You can reach out to me and follow me on Medium, Twitter, GitHub, and LinkedIn.

Support me through GitHub Sponsors.

Thank you for reading! See you in the next story.
