Lab6 (Spring Boot/K8S): Persistent Volumes in Kubernetes

In this post, we’ll explore persistent volumes in Kubernetes.

· Understanding Persistent Volumes
∘ Types of Persistent Volumes
∘ Persistent Volume access modes
∘ Volume Mode
∘ Reclaim Policy
· What is a Persistent Volume Claim?
· Real-world examples
∘ Using a PersistentVolumeClaim in a pod
∘ Connecting to Postgres from the API
· Testing
∘ Endpoint testing
∘ Delete resources
· Conclusion
· References

This series of stories shows how to use Kubernetes in the Spring ecosystem. We work with a Spring Boot API and Minikube to have a lightweight and fast development environment similar to production.

Kubernetes supports many types of volumes, and a Pod can use any number of volume types simultaneously. Ephemeral volume types have the lifetime of a pod, whereas persistent volumes exist beyond it. When a pod ceases to exist, Kubernetes destroys its ephemeral volumes but does not destroy persistent volumes. For any kind of volume in a given pod, data is preserved across container restarts.

Understanding Persistent Volumes

A PersistentVolume (PV) is a piece of storage in the cluster that has been provisioned by an administrator or dynamically provisioned using Storage Classes. It is a resource in the cluster, just as a node is a cluster resource. PVs are volume plugins like Volumes, but they have a lifecycle independent of any individual Pod that uses the PV.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: mypv
spec:
  capacity:
    storage: 5Gi
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  persistentVolumeReclaimPolicy: Recycle
  mountOptions:
    - hard
    - nfsvers=4.1
  nfs:
    path: /tmp
    server: 172.17.0.2

Types of Persistent Volumes

PersistentVolume types are implemented as plugins. Kubernetes currently supports the following plugins:

  • csi – Allows integration with storage providers that support the Container Storage Interface (CSI) specification, such as the block storage services provided by cloud platforms.
  • fc – Fibre Channel (FC) storage
  • hostPath – HostPath volume (for single node testing only; WILL NOT WORK in a multi-node cluster; consider using local volume instead)
  • iscsi – iSCSI (SCSI over IP) storage attachments.
  • local – local storage devices mounted on nodes.
  • nfs – Used to access Network File System (NFS) mounts.

In addition, there are volume types such as awsElasticBlockStore, azureDisk, and gcePersistentDisk that provide built-in integrations with specific cloud providers. These in-tree types are deprecated in favor of CSI-based volumes, and recent Kubernetes releases have removed several of them, so check the documentation for your cluster version.

Persistent Volume access modes

A PV can be mounted on a host in any way supported by the resource provider. Kubernetes defines four access modes:

  • ReadWriteOnce (RWO): The volume is mounted with read-write access for a single Node in the cluster. Any of the Pods running on that Node can read and write the volume’s contents.
  • ReadOnlyMany (ROX): The volume can be mounted concurrently by any of the Nodes in the cluster, with read-only access for every Pod.
  • ReadWriteMany (RWX): The volume can be mounted by many Nodes at once, like ReadOnlyMany, but with read-write access.
  • ReadWriteOncePod (RWOP): This newer mode, introduced as alpha in Kubernetes v1.22, promoted to beta in v1.27, and stable since v1.29, restricts read-write access to a single Pod. No other Pod in the cluster can use the volume at the same time.

Volume Mode

Kubernetes supports two volumeModes of PersistentVolumes: Filesystem and Block. volumeMode is an optional API parameter. Filesystem is the default mode used when the volumeMode parameter is omitted.

Reclaim Policy

Current reclaim policies are:

  • Retain — manual reclamation
  • Recycle — basic scrub (rm -rf /thevolume/*)
  • Delete — delete the volume

As of Kubernetes 1.30, only the nfs and hostPath volume types support recycling, and the Recycle policy itself is deprecated in favor of dynamic provisioning.
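To make the three policies easier to compare, here is a minimal Java sketch of what happens to a PV once its claim is deleted. It is only an illustrative model, not Kubernetes code: the class, enum, and method names (ReclaimSketch, Outcome, afterClaimDeleted) are hypothetical and chosen for this post.

```java
// Illustrative model only; these names are hypothetical and not part of any Kubernetes API.
class ReclaimSketch {

    enum ReclaimPolicy { RETAIN, RECYCLE, DELETE }

    enum Outcome {
        RELEASED_DATA_KEPT,      // PV stays, data stays, an admin reclaims it manually
        SCRUBBED_AND_AVAILABLE,  // contents are wiped and the PV can be claimed again
        VOLUME_DELETED           // the PV object and its backing storage are removed
    }

    // Maps a reclaim policy to the state of the PV after its PVC is deleted.
    static Outcome afterClaimDeleted(ReclaimPolicy policy) {
        return switch (policy) {
            case RETAIN -> Outcome.RELEASED_DATA_KEPT;
            case RECYCLE -> Outcome.SCRUBBED_AND_AVAILABLE;
            case DELETE -> Outcome.VOLUME_DELETED;
        };
    }

    public static void main(String[] args) {
        for (ReclaimPolicy policy : ReclaimPolicy.values()) {
            System.out.println(policy + " -> " + afterClaimDeleted(policy));
        }
    }
}
```

In practice, Retain is the safer choice for database volumes, because deleting the claim by mistake does not delete the data.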

What is a Persistent Volume Claim?

A PersistentVolumeClaim (PVC) is a request for storage by a user. It is similar to a Pod: Pods consume node resources, and PVCs consume PV resources. Pods can request specific levels of resources (CPU and memory), while claims can request particular sizes and access modes.

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: myclaim
spec:
  volumeMode: Filesystem
  accessModes:
    - ReadWriteOnce
  resources:
    requests:
      storage: 1Gi

Real-world examples

We’ll implement a sample REST API that uses Spring Data R2DBC with PostgreSQL Database.

We’ll create a PV and a PVC for PostgreSQL data.

Let’s start by creating a PostgreSQL persistent volume.

apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-pv-volume
  labels:
    type: local
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce
  hostPath:
    path: "/mnt/data"

In this story, we use a hostPath PersistentVolume. Kubernetes supports hostPath for development and testing on a single-node cluster. A hostPath PersistentVolume uses a file or directory on the Node to emulate network-attached storage. In a production cluster, we would not use hostPath. Instead, a cluster administrator would provision a network resource like a Google Compute Engine persistent disk, an NFS share, or an Amazon Elastic Block Store volume.

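Then, we define a PVC for Postgres storage. Kubernetes binds the claim to a PersistentVolume that satisfies its request: either an existing PV or one provisioned dynamically by the cluster through a StorageClass.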

apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pv-claim # name of PVC essential for identifying the storage data
  labels:
    app: postgres
    tier: database
spec:
  accessModes:
    - ReadWriteOnce # This specifies the mode of the claim that we are trying to create.
  resources:
    requests:
      storage: 1Gi

The PVC requests a volume of at least 1 gibibyte that can provide read-write access for at most one Node at a time.
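The size and access-mode part of that request can be sketched in a few lines of Java. This is a deliberately simplified model, not Kubernetes source code: the real binding logic also considers the storage class, volume mode, and label selectors, and the names used here (ClaimMatching, Volume, Claim, satisfies) are hypothetical.

```java
import java.util.EnumSet;
import java.util.Set;

// Simplified, illustrative model of PV/PVC matching; all names are hypothetical.
class ClaimMatching {

    enum AccessMode { READ_WRITE_ONCE, READ_ONLY_MANY, READ_WRITE_MANY, READ_WRITE_ONCE_POD }

    record Volume(String name, long capacityBytes, Set<AccessMode> accessModes) {}

    record Claim(String name, long requestBytes, Set<AccessMode> accessModes) {}

    static final long GIB = 1024L * 1024L * 1024L;

    // A volume can back a claim when it is at least as large as the request
    // and offers every access mode the claim asks for.
    static boolean satisfies(Volume volume, Claim claim) {
        return volume.capacityBytes() >= claim.requestBytes()
                && volume.accessModes().containsAll(claim.accessModes());
    }

    public static void main(String[] args) {
        Volume pv = new Volume("postgres-pv-volume", GIB, EnumSet.of(AccessMode.READ_WRITE_ONCE));
        Claim pvc = new Claim("postgres-pv-claim", GIB, EnumSet.of(AccessMode.READ_WRITE_ONCE));
        System.out.println("PV satisfies PVC: " + satisfies(pv, pvc));
    }
}
```

With the manifests from this post, both sides declare 1Gi and ReadWriteOnce, so the check passes. A claim asking for more space than the volume offers, or for ReadWriteMany, would not.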

Using a PersistentVolumeClaim in a pod

Pods access storage by using the claim as a volume. Claims must exist in the same namespace as the Pod using the claim. The cluster finds the claim in the Pod’s namespace and uses it to get the PersistentVolume backing the claim.

apiVersion: v1
kind: Pod
metadata:
  name: postgres
spec:
  containers:
    - name: postgres
      image: postgres
      volumeMounts:
        - mountPath: /var/lib/postgresql/data
          name: postgres-persistence-storage
  volumes:
    - name: postgres-persistence-storage
      persistentVolumeClaim:
        claimName: postgres-pv-claim

Here is the full YAML configuration file (postgres-deployment.yml) with the Secret, ConfigMap, Service, PV, PVC, and Deployment for the Postgres database:

---
apiVersion: v1
kind: Secret
metadata:
  name: postgres-credentials
data:
  postgres_user: ZGItYWRtaW4=
  postgres_password: VWJrOXJuJHEhZA==
---
apiVersion: v1
kind: ConfigMap
metadata:
  name: postgres-conf
data:
  host: postgres
  port: '5432'
  name: bookdb
---
# Define a 'Service' To Expose postgres to Other Services
apiVersion: v1
kind: Service
metadata:
  name: postgres
  labels:
    app: postgres
    tier: database
spec:
  ports:
    - port: 5432
  selector:
    app: postgres
    tier: database
  clusterIP: None # Kubernetes does not assign an IP address.
---
apiVersion: v1
kind: PersistentVolume
metadata:
  name: postgres-pv-volume
  labels:
    type: local
spec:
  capacity:
    storage: 1Gi
  accessModes:
    - ReadWriteOnce # PV and PVC need to match
  hostPath:
    path: "/mnt/data" # Usage of hostPath is not recommended in production (for single node testing only)
---
# Define a 'Persistent Volume Claim'(PVC) for Postgres Storage, dynamically provisioned by cluster
apiVersion: v1
kind: PersistentVolumeClaim
metadata:
  name: postgres-pv-claim # name of PVC essential for identifying the storage data
  labels:
    app: postgres
    tier: database
spec:
  accessModes:
    - ReadWriteOnce # This specifies the mode of the claim that we are trying to create.
  resources:
    requests:
      storage: 1Gi

---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: postgres
  labels:
    app: postgres
    tier: database
spec:
  selector:
    matchLabels:
      app: postgres
  strategy:
    type: Recreate
  template:
    metadata:
      labels: # Must match 'Service' and 'Deployment' selectors
        app: postgres
        tier: database
    spec:
      containers:
        - name: postgres
          image: postgres
          imagePullPolicy: "IfNotPresent"
          env:
            - name: POSTGRES_USER
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: postgres_user
            - name: POSTGRES_PASSWORD
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: postgres_password
            - name: POSTGRES_DB # Setting Database Name from a 'ConfigMap'
              valueFrom:
                configMapKeyRef:
                  name: postgres-conf
                  key: name
          ports:
            - containerPort: 5432
              name: postgres
          volumeMounts:
            - mountPath: /var/lib/postgresql/data
              name: postgres-persistence-storage
      volumes:
        - name: postgres-persistence-storage
          persistentVolumeClaim:
            claimName: postgres-pv-claim
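Note that the values under data in the Secret are base64-encoded, not encrypted. As a small sketch of how such values can be produced or checked, the following Java snippet uses only the JDK's java.util.Base64. The class and method names (SecretValues, encode, decode) are hypothetical helpers for this post.

```java
import java.nio.charset.StandardCharsets;
import java.util.Base64;

// Hypothetical helper showing how Secret "data" values relate to their plain text.
class SecretValues {

    // Encodes a plain-text value the way it must appear under "data" in a Secret.
    static String encode(String plain) {
        return Base64.getEncoder().encodeToString(plain.getBytes(StandardCharsets.UTF_8));
    }

    // Decodes a value copied from a Secret manifest back to plain text.
    static String decode(String encoded) {
        return new String(Base64.getDecoder().decode(encoded), StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // postgres_user from the manifest above
        System.out.println(decode("ZGItYWRtaW4="));
    }
}
```

Decoding postgres_user this way yields db-admin, which is the value Kubernetes injects into the POSTGRES_USER environment variable. Because base64 is trivially reversible, real credentials should not be committed in such a manifest.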

Connecting to Postgres from the API

We will start by creating a simple Spring Boot project from start.spring.io, with the following dependencies: Spring Reactive Web, Spring Data R2DBC, Spring Data JPA, PostgreSQL Driver, R2DBC driver, and Lombok.

Here is the application.yaml file:

spring:
  application:
    name: spring-data-r2dbc
  r2dbc:
    url: r2dbc:postgresql://${DB_HOST}:${DB_PORT}/${DB_NAME}
    username: ${DB_USERNAME}
    password: ${DB_PWD}
  data:
    r2dbc:
      repositories:
        enabled: true

The DB_HOST, DB_PORT, DB_NAME, DB_USERNAME, and DB_PWD environment variables are populated from the Kubernetes ConfigMap and Secret defined above.
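To see what the resolved URL looks like, here is a plain-Java sketch that assembles it from a map of environment variables. It only mimics the placeholder substitution that Spring performs and is not how Spring implements it. The class and method names (R2dbcUrlSketch, buildUrl, require) are hypothetical, and the sample values come from the postgres-conf ConfigMap.

```java
import java.util.Map;

// Hypothetical illustration of how the r2dbc URL is assembled from environment variables.
class R2dbcUrlSketch {

    // Mirrors the pattern r2dbc:postgresql://${DB_HOST}:${DB_PORT}/${DB_NAME}.
    static String buildUrl(Map<String, String> env) {
        return "r2dbc:postgresql://" + require(env, "DB_HOST")
                + ":" + require(env, "DB_PORT")
                + "/" + require(env, "DB_NAME");
    }

    // Fails fast when a variable is missing, so a misconfigured pod is easy to spot.
    static String require(Map<String, String> env, String key) {
        String value = env.get(key);
        if (value == null || value.isBlank()) {
            throw new IllegalStateException("Missing environment variable: " + key);
        }
        return value;
    }

    public static void main(String[] args) {
        // Values taken from the postgres-conf ConfigMap
        Map<String, String> env = Map.of("DB_HOST", "postgres", "DB_PORT", "5432", "DB_NAME", "bookdb");
        System.out.println(buildUrl(env));
    }
}
```

The host postgres is the name of the headless Service, so the API pods reach the database through cluster DNS rather than through a fixed IP address.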

Entity class

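import jakarta.validation.constraints.NotBlank; // javax.validation on Spring Boot 2.x
import jakarta.validation.constraints.Size;
import lombok.AllArgsConstructor;
import lombok.Data;
import lombok.NoArgsConstructor;
import org.springframework.data.annotation.Id;
import org.springframework.data.relational.core.mapping.Table;

@Table
@Data
@AllArgsConstructor
@NoArgsConstructor
public class Book {

    @Id
    private Long id;

    @NotBlank
    @Size(max = 100)
    private String title;

    private int page;

    private String isbn;

    private String description;

    private double price;
}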

BookRepository.java

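import org.springframework.data.repository.reactive.ReactiveCrudRepository;
import org.springframework.stereotype.Repository;

@Repository
public interface BookRepository extends ReactiveCrudRepository<Book, Long> {
}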

BookController.java

import org.springframework.http.HttpStatus;
import org.springframework.http.ResponseEntity;
import org.springframework.web.bind.annotation.*;
import reactor.core.publisher.Flux;
import reactor.core.publisher.Mono;

@RestController
@RequestMapping("/api/book")
public class BookController {

    private final BookRepository bookRepository;

    public BookController(BookRepository bookRepository) {
        this.bookRepository = bookRepository;
    }

    @PostMapping
    @ResponseStatus(HttpStatus.CREATED)
    public Mono<Book> createBook(@RequestBody Book book) {
        return bookRepository.save(book);
    }

    @GetMapping
    public Flux<Book> getBooks() {
        return bookRepository.findAll();
    }

    @GetMapping("/{bookId}")
    public Mono<ResponseEntity<Book>> getBookById(@PathVariable long bookId) {
        return bookRepository.findById(bookId)
                .map(ResponseEntity::ok)
                .defaultIfEmpty(ResponseEntity.notFound().build());
    }

    @PutMapping("/{bookId}")
    public Mono<ResponseEntity<Book>> updateBook(@PathVariable long bookId, @RequestBody Mono<Book> bookMono) {
        return bookRepository.findById(bookId)
                // Copy the updatable fields from the request body onto the stored book
                .flatMap(book -> bookMono.map(u -> {
                    book.setTitle(u.getTitle());
                    book.setDescription(u.getDescription());
                    book.setIsbn(u.getIsbn());
                    book.setPrice(u.getPrice());
                    book.setPage(u.getPage());
                    return book;
                }))
                .flatMap(bookRepository::save)
                .map(ResponseEntity::ok)
                .defaultIfEmpty(ResponseEntity.notFound().build());
    }

    @DeleteMapping("/{bookId}")
    public Mono<ResponseEntity<Void>> deleteBook(@PathVariable long bookId) {
        return bookRepository.findById(bookId)
                .flatMap(s ->
                        bookRepository.delete(s)
                                .then(Mono.just(new ResponseEntity<Void>(HttpStatus.OK)))
                )
                .defaultIfEmpty(new ResponseEntity<>(HttpStatus.NOT_FOUND));
    }

}

Here is the complete api-deployment.yml configuration:

---
apiVersion: v1
kind: Service
metadata:
  name: book-api-service
spec:
  selector:
    app: backend
  ports:
    - protocol: TCP
      port: 8081
      targetPort: 8080
      # Optional field
      # By default and for convenience, the Kubernetes control plane
      # will allocate a port from a range (default: 30000-32767)
      nodePort: 30163
  type: NodePort
---
apiVersion: apps/v1
kind: Deployment
metadata:
  name: backend-book-api
spec:
  replicas: 2
  selector:
    matchLabels:
      app: backend
      environment: dev
  template:
    metadata:
      labels:
        app: backend
        environment: dev
    spec:
      containers:
        - name: book-api
          image: spring-data-r2dbc-postgres-k8s:latest
          ports:
            - containerPort: 8080
          imagePullPolicy: Never
          livenessProbe:
            httpGet:
              path: /actuator/health/liveness
              port: 8080
            initialDelaySeconds: 15
            periodSeconds: 5
            timeoutSeconds: 2
            failureThreshold: 1
          readinessProbe:
            httpGet:
              path: /actuator/health/readiness
              port: 8080
            initialDelaySeconds: 5
            periodSeconds: 5
            timeoutSeconds: 2
            failureThreshold: 1
          env: # array of environment variable definitions
            - name: DB_HOST
              valueFrom: # select individual keys in a ConfigMap
                configMapKeyRef:
                  name: postgres-conf # name of configMap
                  key: host
            - name: DB_PORT # Setting Database port from configMap
              valueFrom:
                configMapKeyRef:
                  name: postgres-conf
                  key: port
            - name: DB_NAME # Setting Database name from configMap
              valueFrom:
                configMapKeyRef:
                  name: postgres-conf
                  key: name
            - name: DB_USERNAME # Setting Database username from Secret
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: postgres_user
            - name: DB_PWD # Setting Database password from Secret
              valueFrom:
                secretKeyRef:
                  name: postgres-credentials
                  key: postgres_password

Testing

Now, we can apply the DB configuration by using the following command:

$ kubectl apply -f postgres-deployment.yml

secret/postgres-credentials created
configmap/postgres-conf created
service/postgres created
persistentvolume/postgres-pv-volume created
persistentvolumeclaim/postgres-pv-claim created
deployment.apps/postgres created

Verifying PV and PVC

Listing both objects, for example with kubectl get pv and kubectl get pvc, shows that the PersistentVolumeClaim is bound to the PersistentVolume postgres-pv-volume.

Let’s also apply api-deployment.yml so that the API can consume data from the database.

$ kubectl apply -f api-deployment.yml

service/book-api-service created
deployment.apps/backend-book-api created

Listing the pods, for example with kubectl get pods, shows that the database and API pods are running.

Endpoint testing

Retrieve the IP address of the Minikube cluster:

$ minikube ip
192.168.58.2

Then call the endpoints through the NodePort (30163):

POST http://192.168.58.2:30163/api/book
GET http://192.168.58.2:30163/api/book
GET http://192.168.58.2:30163/api/book/1
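If you prefer to exercise these endpoints from code, the requests can be sketched with the JDK's built-in java.net.http API (Java 11+). The snippet below only builds the requests. The class name BookApiRequests and the sample JSON payload are hypothetical, and the base URL assumes the minikube ip and nodePort shown in this post.

```java
import java.net.URI;
import java.net.http.HttpRequest;

// Hypothetical client-side sketch; adjust BASE_URL to your own `minikube ip` output.
class BookApiRequests {

    static final String BASE_URL = "http://192.168.58.2:30163/api/book";

    // POST /api/book with a JSON body describing the new book.
    static HttpRequest createBook(String json) {
        return HttpRequest.newBuilder(URI.create(BASE_URL))
                .header("Content-Type", "application/json")
                .POST(HttpRequest.BodyPublishers.ofString(json))
                .build();
    }

    // GET /api/book returns all books.
    static HttpRequest listBooks() {
        return HttpRequest.newBuilder(URI.create(BASE_URL)).GET().build();
    }

    // GET /api/book/{id} returns one book, or 404 when it does not exist.
    static HttpRequest getBook(long id) {
        return HttpRequest.newBuilder(URI.create(BASE_URL + "/" + id)).GET().build();
    }

    public static void main(String[] args) {
        // Made-up sample payload; the id is omitted so the database can assign it.
        String json = """
                {"title": "Sample Book", "page": 100, "isbn": "000-0000000000", "description": "Hypothetical sample", "price": 9.99}
                """;
        HttpRequest request = createBook(json);
        System.out.println(request.method() + " " + request.uri());
    }
}
```

To actually send a request, pass it to HttpClient.newHttpClient().send(request, HttpResponse.BodyHandlers.ofString()) while the cluster is running.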

Well done!

Delete resources

$ kubectl delete deploy postgres
$ kubectl delete svc postgres
$ kubectl delete secrets postgres-credentials
$ kubectl delete cm postgres-conf
$ kubectl delete pvc postgres-pv-claim
$ kubectl delete pv postgres-pv-volume
$ kubectl delete deploy backend-book-api
$ kubectl delete svc book-api-service

Conclusion

In this story, we took an in-depth look at persistent volumes. Kubernetes persistent storage gives applications a convenient way to request and consume storage resources, and understanding the available volume types, access modes, and reclaim policies helps you choose the best option for your application.

The complete source code of this series is available on GitHub.

You can reach out to me and follow me on Medium, Twitter, GitHub, and LinkedIn.

Support me through GitHub Sponsors.

Thank you for reading! See you in the next story.

References

👉 Link to Medium blog
