Docker: Managing data with Volumes (Part 1) | Manvi Sharma


Docker: Managing data with Volumes (Part 1)

Hello everyone,

In the last post, we learned to perform various operations with containers and images.

In this post, we will explore volumes, which work like state management for applications running in Docker. We will discuss the different kinds of data we encounter while working with Docker.

We will learn about different kinds of volumes, and explore anonymous and named volumes in detail by discussing their implementation in a node server application. We will create these volumes in our system.

Types of Data

There are three types of data we face when working with docker.

1. Application (Code + Environment): Written and provided by the developer and added to the image in the build phase. It is fixed and can’t be changed once the image is built. Read-only, hence stored in images.

2. Temporary App Data (e.g. entered user input): Fetched or produced in a running container and stored in memory or temporary files. Dynamic and changing, but cleared regularly. Read + write, temporary, hence stored in containers.

3. Permanent App Data (e.g. user accounts): Fetched or produced in a running container and stored in files or a database. It must not be lost if the container stops or restarts. Read + write, permanent, hence stored with containers and volumes.

For a better understanding, we will start with a Node server application that uses the kinds of data specified above.

// server.js
const fs = require('fs').promises;
const exists = require('fs').exists;
const path = require('path');

const express = require('express');
const bodyParser = require('body-parser');

const app = express();

app.use(bodyParser.urlencoded({ extended: false }));
app.use(express.static('public'));
app.use('/feedback', express.static('feedback'));

app.get('/', (req, res) => {
  const filePath = path.join(__dirname, 'pages', 'feedback.html');
  res.sendFile(filePath);
});

app.get('/exists', (req, res) => {
  const filePath = path.join(__dirname, 'pages', 'exists.html');
  res.sendFile(filePath);
});

app.post('/create', async (req, res) => {
  const title = req.body.title;
  const content = req.body.text;

  const adjTitle = title.toLowerCase();

  const tempFilePath = path.join(__dirname, 'temp', adjTitle + '.txt');
  const finalFilePath = path.join(__dirname, 'feedback', adjTitle + '.txt');

  await fs.writeFile(tempFilePath, content);
  exists(finalFilePath, async (exists) => {
    if (exists) {
      res.redirect('/exists');
    } else {
      await fs.rename(tempFilePath, finalFilePath);
      res.redirect('/');
    }
  });
});

app.listen(80);

The server above listens on port 80. We fill in the feedback form (title and text), and the server first writes a file named after the title into the temp folder. If no feedback with that title exists yet, the file is moved to the feedback folder, where we can view it in the browser at /feedback/<title>.txt. On removing the container we lose this data, which we do not want.

I will add code for other files here so that you can continue with me easily.

Make two folders pages and public:

/pages/exists.html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>This title exists already!</title>
    <link rel="stylesheet" href="styles.css" />
  </head>
  <body>
    <header>
      <h1><a href="/">MySite</a></h1>
    </header>
    <main>
      <section>
        <h2>This title exists already!</h2>
        <p>Please pick a different one.</p>
        <p><a href="/">Start again</a></p>
      </section>
    </main>
  </body>
</html>

/pages/feedback.html
<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Share some Feedback!</title>
    <link rel="stylesheet" href="styles.css" />
  </head>
  <body>
    <header>
      <h1><a href="/">MySite</a></h1>
    </header>
    <main>
      <section>
        <h2>Your Feedback</h2>
        <form action="/create" method="POST">
          <div class="form-control">
            <label for="title">Title</label>
            <input type="text" id="title" name="title" />
          </div>
          <div class="form-control">
            <label for="text">Document Text</label>
            <textarea name="text" id="text" rows="10"></textarea>
          </div>
          <button>Save</button>
        </form>
      </section>
    </main>
  </body>
</html>

/public/styles.css
* {
  box-sizing: border-box;
}

html {
  font-family: sans-serif;
}

body {
  margin: 0;
}

header {
  width: 100%;
  height: 5rem;
  display: flex;
  justify-content: center;
  align-items: center;
  background-color: #350035;
}

header h1 {
  margin: 0;
}

header a {
  color: white;
  text-decoration: none;
}

section {
  margin: 2rem auto;
  max-width: 30rem;
  padding: 1rem;
  box-shadow: 0 2px 8px rgba(0, 0, 0, 0.26);
  border-radius: 12px;
}

.form-control {
  margin: 0.5rem 0;
}

label {
  font-weight: bold;
  margin-bottom: 0.5rem;
  display: block;
}

input,
textarea {
  font: inherit;
  display: block;
  width: 100%;
  padding: 0.15rem;
  border: 1px solid #ccc;
}

input:focus,
textarea:focus {
  border-color: #350035;
  outline: none;
  background-color: #ffe6ff;
}

button {
  cursor: pointer;
  font: inherit;
  border: 1px solid #350035;
  background-color: #350035;
  color: white;
  padding: 0.5rem 1.5rem;
  border-radius: 30px;
}

Also, make two empty folders temp and feedback. These are the folders where our data or our feedback is saved. Shown below is the project structure of our application.

[Image: project structure]

Let's start by dockerizing the app.

Dockerfile
FROM node
WORKDIR /app
COPY package.json .
RUN npm install
COPY . .
EXPOSE 80
CMD [ "node", "server.js" ]

Then build the image:

docker build -t feedback-node .

Run the container:

docker run -p 3001:80 --rm -d --name feedback-app feedback-node

Now on submitting feedback:

[Images: screenshots of submitting feedback]

As you can see, our data is saved in the feedback folder, and the filename is the same as our feedback title. Now if we stop the container, we lose this data: when I requested this file again after stopping and re-running my container, the route was not found.

This is because I created my container with the --rm flag, which deletes the container as soon as it stops. If I do not use this flag and just stop and restart my container, I do not lose my data, which is the expected behavior.

Note: A Docker image is read-only and has its own internal file system. The container adds an extra read/write layer on top of the image.
So deleting the container and creating it again from the same image will not persist our data. Multiple containers based on the same image are totally isolated from each other.
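
The note above can be illustrated with a toy model. This is only a sketch to make the idea concrete, not how Docker is implemented (Docker uses a layered file system); the Image and Container classes and the file names are made up for this example.

```javascript
// Toy model only: the classes below are hypothetical and just mimic the idea
// of a read-only image with a private read/write layer per container.
class Image {
  constructor(files) {
    // Freeze the file map so it behaves as read-only after the "build".
    this.files = Object.freeze({ ...files });
  }
}

class Container {
  constructor(image) {
    this.image = image;
    // Private read/write layer; it disappears together with the container.
    this.writeLayer = new Map();
  }

  write(filePath, content) {
    this.writeLayer.set(filePath, content);
  }

  read(filePath) {
    // The container's own layer wins; otherwise fall back to the image.
    if (this.writeLayer.has(filePath)) {
      return this.writeLayer.get(filePath);
    }
    return this.image.files[filePath];
  }
}

const image = new Image({ '/app/server.js': '// server code' });

const first = new Container(image);
first.write('/app/feedback/awesome.txt', 'Great post!');

// A second container from the same image starts with an empty write layer.
const second = new Container(image);

console.log(first.read('/app/feedback/awesome.txt'));  // Great post!
console.log(second.read('/app/feedback/awesome.txt')); // undefined
console.log(second.read('/app/server.js'));            // // server code
```

Both containers see the files baked into the image, but the feedback written by the first container is invisible to the second one, which is the situation we run into after removing and re-creating our container.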

Hence, the issue here is how do I persist this data, even on container deletion?

So this is where Volumes come to save our day.

Volumes are a built-in feature of Docker. They are folders on our host machine's hard drive that are mounted (made available, or "mapped") into containers.

Volumes persist if a container shuts down. If a container (re-)starts and mounts a volume, any data inside of that volume is available in the container.

A container can write data into a volume and read data from it.

Now, there are two types of external data storage:

1. Volumes (Managed by Docker): There are two types of volumes, anonymous and named. Docker sets up a folder/path on our host machine; the exact location is unknown to us (the developers). Volumes are managed via the docker volume commands.

A defined path in the container is mapped to the created volume/mount, e.g. /some-path on our host machine is mapped to /app/data in the container. Great for data that should be persistent but which we don’t need to edit directly.

2. Bind Mounts (Managed by us): We define a folder/path on our host machine. Great for persistent data that we want to edit ourselves (e.g. source code).

Anonymous Volumes:

We will start by adding a volume to our Dockerfile.

FROM node
WORKDIR /app
COPY package.json .
RUN npm install
VOLUME [ "/app/feedback" ]
COPY . .
EXPOSE 80
CMD [ "node", "server.js" ]

Delete the old image, rebuild it, and then start a new container. Now submit feedback and stop the container.

Note: The server might crash while submitting feedback, which stops the container. We can run the container in attached mode to see the error. This is a Node issue: fs.rename might not work here, most likely because it cannot move a file across file systems, and the volume puts /app/feedback on a different one than /app/temp. Since Node issues are not relevant here, just use fs.copyFile instead (optionally followed by fs.unlink to delete the temp file) and proceed with the steps highlighted above.

On running a new container, our data is lost again. This is because the volume we created above is an anonymous volume, which is tied to a single container: since we started the container with --rm, the volume was removed together with it.
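
The swap described in the note can be sketched as a small helper. This is a minimal sketch, not the code from the original project: the name moveFile is hypothetical, and it assumes the rename fails with the error code EXDEV, which is what Node reports when the source and destination are on different file systems.

```javascript
const fs = require('fs').promises;
const os = require('os');
const path = require('path');

// Hypothetical helper: move a file, falling back to copy + delete when
// rename cannot cross a file-system boundary (error code EXDEV).
async function moveFile(source, destination) {
  try {
    await fs.rename(source, destination);
  } catch (err) {
    if (err.code !== 'EXDEV') {
      throw err; // any other error is a real problem, so re-throw it
    }
    await fs.copyFile(source, destination);
    await fs.unlink(source); // remove the temp file after a successful copy
  }
}

// Small demo in a throwaway directory (the paths are made up for this example).
async function demo() {
  const dir = await fs.mkdtemp(path.join(os.tmpdir(), 'feedback-'));
  const tempFilePath = path.join(dir, 'awesome.tmp.txt');
  const finalFilePath = path.join(dir, 'awesome.txt');

  await fs.writeFile(tempFilePath, 'Great post!');
  await moveFile(tempFilePath, finalFilePath);
  return fs.readFile(finalFilePath, 'utf8');
}

demo().then((text) => console.log(text)); // prints: Great post!
```

In server.js, the line await fs.rename(tempFilePath, finalFilePath) could then be replaced by a call to such a helper, or simply by fs.copyFile followed by fs.unlink.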

So what is the use of an anonymous volume, if it does not help us here? We will discuss this in the next post, together with bind mounts.

Named Volumes:

To persist data after container deletion, we will now create a named volume. Named volumes cannot be created in the Dockerfile, so first remove the VOLUME instruction we just added, then follow the same steps as above:

Rebuild the image with tag volumes:

docker build -t feedback-node:volumes .

Run the container (this is the step where we add our named volume using the -v flag):

docker run -d -p 3001:80 --rm --name feedback-app -v feedback:/app/feedback feedback-node:volumes

The -v feedback:/app/feedback option mounts a volume named feedback, which Docker stores on our host machine, at the /app/feedback folder inside the container.

Note: The value before the colon (:) is the volume name, and the value after it is the path inside the container.
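
To make the format of the -v value concrete, here is a small sketch that splits it the way the note describes. The function describeVolumeFlag is hypothetical and deliberately simplified: it assumes Linux-style paths, treats a value without a colon as an anonymous volume and a source starting with / as a bind mount, and ignores extra options as well as Windows or relative paths.

```javascript
// Hypothetical helper: classify the value passed to `docker run -v`.
// Simplified on purpose; it does not cover every form Docker accepts.
function describeVolumeFlag(value) {
  const [first, second] = value.split(':');

  // Only a container path: Docker creates an anonymous volume for it.
  if (second === undefined) {
    return { kind: 'anonymous volume', containerPath: first };
  }

  // An absolute host path before the colon means a bind mount;
  // a plain name means a named volume.
  const kind = first.startsWith('/') ? 'bind mount' : 'named volume';
  return { kind, source: first, containerPath: second };
}

console.log(describeVolumeFlag('feedback:/app/feedback'));
console.log(describeVolumeFlag('/app/feedback'));
```

For the value used in this post, the sketch reports a named volume with the source feedback and the container path /app/feedback.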

Now test the data persistence theory:

Once you stop the container, run the docker volume ls command:

[Image: output of docker volume ls]

The volume still exists, even though the container is deleted.

Now if we run the container again with the same -v option, we can see our previously saved feedback. So this works now: data survives the container removal.

Removing anonymous volumes

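Anonymous volumes are removed automatically together with a container that was started with --rm. If the container was started without --rm, or if it somehow crashed and stopped on its own, the anonymous volume can be left lying around. Unused volumes can be removed using the command: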

docker volume prune

This will remove anonymous local volumes not used by at least one container.

Removing named volumes

docker volume rm VOL_NAME

That is all about anonymous and named volumes. We will continue with bind mounts in the next post.