When using Docker, you often need to back up data stored in a volume or clone it to another volume for testing purposes.

While you might reach for the docker cp command, it only copies between a container's file system and the host. One side of the copy must always be a container, so it has no way to address a Docker-managed volume by name.

This article introduces the correct way to copy Docker volumes. Rather than just listing commands, we will focus on 'why' you should use this method and its underlying principles. By understanding this principle, you can apply it naturally whenever needed without memorizing commands.

The core principle is simple.

The safest and most 'Docker-like' way to manipulate Docker volumes is to use a 'temporary container' that can access the relevant volume as a bridge.


1. Copying an Existing Volume to a New Volume (Volume-to-Volume)



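This is the most common scenario. For example, you may want to clone a db_data volume that a database container is using into a test volume called db_data_test. If the database is actively writing to the volume, stop that container first, because a file-level copy of live data files may be inconsistent.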

🤔 WHY: Why Use a Temporary Container?

Docker volumes are stored somewhere in the host's file system (e.g., /var/lib/docker/volumes/...), but the Docker daemon manages this path. It's not recommended to directly access this path and use the cp command, as it can lead to permission issues or data consistency problems.
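To see where the daemon keeps a given volume, you can ask Docker rather than guess the path. The following is a minimal sketch; the function name volume_mountpoint is hypothetical and only wraps the docker volume inspect command.

```shell
# volume_mountpoint NAME: print the host path where Docker stores volume NAME.
# The function name is illustrative; only `docker volume inspect` is a real command.
volume_mountpoint() {
    docker volume inspect --format '{{ .Mountpoint }}' "$1"
}
```

For example, volume_mountpoint source_data prints the path for the source_data volume. On Docker Desktop (macOS or Windows) that path lives inside Docker's virtual machine, which is one more reason not to copy from it directly.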

Instead, we execute a temporary container that mounts two volumes simultaneously.

  1. The container mounts the source volume (source_volume) to the /from path.

  2. The container mounts the target volume (new_volume) to the /to path.

  3. The container immediately executes a simple Linux command (cp or tar) to copy all data from /from to /to.

  4. Once the command finishes, the container is automatically discarded (--rm option).

This container is used solely as a 'tool' for data copying.

🚀 HOW: Command Example

Here is an example of copying data from the source_data volume to the target_data volume.

  1. Create Test Volume (Optional)
docker volume create source_data
docker volume create target_data
# (Assuming that there is data in source_data.)
  2. Copy Using a Temporary Container
docker run --rm \
       -v source_data:/from \
       -v target_data:/to \
       alpine \
       sh -c "cp -a /from/. /to/"

💡 Command Explanation

  • docker run --rm: The --rm flag removes the container as soon as its command exits, so temporary tasks do not leave stopped containers behind.

  • -v source_data:/from: Mounts the source_data volume to the /from directory within the container.

  • -v target_data:/to: Mounts the target_data volume to the /to directory within the container.

  • alpine: Uses the alpine Linux image, which is very small yet includes basic utilities like cp and sh.

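  • sh -c "...": Runs the quoted string as a shell command when the alpine container starts. For a single cp the sh -c wrapper is optional, but it is required once the command uses shell features such as pipes or &&.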

  • cp -a /from/. /to/:

    • -a (archive): This flag is crucial. Instead of a simple copy (cp -r), it preserves all attributes such as ownership, permissions, and timestamps during the copy. This is very important when dealing with sensitive data like database files.

    • /from/.: Refers to all contents inside the /from directory (including hidden files) rather than the directory itself.
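One pitfall worth guarding against: docker run -v creates a named volume if it does not exist yet, so a typo in the source name would silently copy an empty volume. The sketch below wraps the copy command in a function that checks the source first. The function name copy_volume and its error message are hypothetical, and mounting the source with :ro is an extra precaution not used in the command above.

```shell
# copy_volume SRC DST: clone the named volume SRC into DST.
# Hypothetical helper; the name and error text are illustrative.
copy_volume() {
    # `docker run -v` would silently create a missing source volume,
    # so fail early if SRC does not exist.
    if ! docker volume inspect "$1" >/dev/null 2>&1; then
        echo "source volume '$1' not found" >&2
        return 1
    fi

    # Mount the source read-only; the target is created on demand.
    docker run --rm \
        -v "$1":/from:ro \
        -v "$2":/to \
        alpine \
        cp -a /from/. /to/
}
```

Usage: copy_volume source_data target_data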


2. Copy Volume Data to Host File System (Backup)

This is used when you want to back up the volume data to a specific directory on your local machine or server. (Tip 1 in section 3 shows how to pack it into a single .tar.gz file instead.)

🤔 WHY: How Does It Work?

The principle is the same as in section 1; only the target has changed from a Docker volume to a host directory.

Docker supports a bind mount feature that mounts a specific directory from the host into the container.

  1. The container mounts the source volume (source_volume) to the /data path.

  2. The container mounts a specific directory on the host ($(pwd)/backup) to the /backup path.

  3. The container copies (or compresses) the contents of /data to the /backup directory.

  4. This action happens within the container, but since /backup is linked to the actual host directory, the results remain on the host.

🚀 HOW: Command Example

The following command copies the data from the source_data volume to the backup directory in the current location ($(pwd)).

# Create the backup directory on the host (quotes keep paths with spaces intact)
mkdir -p "$(pwd)/backup"

docker run --rm \
       -v source_data:/data:ro \
       -v "$(pwd)/backup":/backup \
       alpine \
       cp -a /data/. /backup/

💡 Command Explanation

  • -v source_data:/data:ro: The :ro (Read-Only) flag has been added. Since backup only needs to read the data, it is advisable to mount it as read-only to prevent accidental modification of the source volume.

  • -v $(pwd)/backup:/backup: Unlike source_data (a volume name), a value that starts with / is treated as a path on the host. $(pwd) expands to the absolute path of the current directory, so this bind-mounts the host's backup directory into the container.


3. Key Copy Tips



Tip 1: Efficient Copy using tar (Compressed Backup)

Using tar instead of the cp command has two practical advantages: it can stream the data through a pipe, and it can pack a whole volume into a single archive file (.tar.gz), which is convenient for backups.

Volume-to-Volume (using tar)

This is sometimes faster than cp when there are many small files, though the difference depends on the data and the file system, so measure before relying on it.

docker run --rm \
       -v source_data:/from \
       -v target_data:/to \
       alpine \
       sh -c "cd /from && tar -cf - . | (cd /to && tar -xf -)"
  • cd /from && tar -cf - .: Moves to /from and creates (c) an archive of the current directory (.), writing it to standard output (-f -) instead of a file.

  • | (cd /to && tar -xf -): Receives the standard output through a pipe (|) and extracts (x) the tar file in the /to directory.
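The pipeline can be tried without Docker at all. The sketch below runs the same tar-to-tar pipe locally, with two temporary directories standing in for /from and /to; the file names are made up for the demonstration.

```shell
# Temporary directories stand in for the /from and /to mounts.
from=$(mktemp -d)
to=$(mktemp -d)

# One hidden file and one executable script (illustrative names).
echo "KEY=value" > "$from/.env"
printf '#!/bin/sh\necho ok\n' > "$from/run.sh"
chmod 755 "$from/run.sh"

# Same idea as the container command: pack on the left, unpack on the right.
(cd "$from" && tar -cf - .) | (cd "$to" && tar -xf -)

# Record what arrived, including hidden files, and whether the
# executable bit survived the round trip.
copied=$(ls -A "$to")
if [ -x "$to/run.sh" ]; then exec_kept=yes; else exec_kept=no; fi
echo "$copied"
echo "executable bit kept: $exec_kept"

rm -rf "$from" "$to"
```

The subshells around each cd keep the directory change from leaking into the rest of the script.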

Volume-to-Host (Compressed Backup)

This compresses the contents of the source_data volume into a backup.tar.gz file and saves it on the host.

docker run --rm \
       -v source_data:/data:ro \
       -v "$(pwd)":/backup \
       alpine \
       tar -czf /backup/backup.tar.gz -C /data .
  • tar -czf /backup/backup.tar.gz ...: c (create), z (gzip compress), f (file) creates the /backup/backup.tar.gz file.

  • -C /data .: This part is important. tar first changes to the /data directory and then archives everything inside it (.), which avoids including unnecessary /data paths in the tar file.
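The effect of -C is easy to check locally by listing an archive's contents. This is a sketch that uses a temporary directory as a stand-in for the /data mount; the file names are invented for the example.

```shell
# A temp directory plays the role of the /data mount.
work=$(mktemp -d)
mkdir -p "$work/data/sub"
echo "hello" > "$work/data/sub/file.txt"
echo "secret" > "$work/data/.hidden"

# Same tar invocation as in the backup command, pointed at the temp directory.
tar -czf "$work/backup.tar.gz" -C "$work/data" .

# List the archive: entries are relative (./...) and carry no data/ prefix.
listing=$(tar -tzf "$work/backup.tar.gz")
echo "$listing"

rm -rf "$work"
```

Because the entries are relative, the archive can later be unpacked into any directory without recreating the original mount path.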

Tip 2: Restoring Host Files to Volume

Restoration, the reverse of backup, uses the same principle: you just need to reverse the direction of cp.

# Restore data from the $(pwd)/backup directory to new_data volume
docker run --rm \
       -v new_data:/data \
       -v "$(pwd)/backup":/backup:ro \
       alpine \
       cp -a /backup/. /data/
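If the backup is the backup.tar.gz archive from Tip 1 rather than a plain directory, the same pattern applies with tar doing the extraction. The following is a minimal sketch; the function name restore_volume is hypothetical, and it assumes the archive is named backup.tar.gz and sits in the host directory you pass in.

```shell
# restore_volume ARCHIVE_DIR VOLUME: unpack ARCHIVE_DIR/backup.tar.gz into VOLUME.
# Hypothetical helper; assumes the archive was created with `-C /data .`.
restore_volume() {
    docker run --rm \
        -v "$2":/data \
        -v "$1":/backup:ro \
        alpine \
        tar -xzf /backup/backup.tar.gz -C /data
}
```

Usage: restore_volume "$(pwd)" new_data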

Summary

The key to handling Docker volumes is the mindset of "using containers as tools". Instead of directly manipulating volume files on the host, run a temporary container mounted with the necessary volumes and directories to safely execute standard Linux commands like cp or tar within it.

If you remember this principle, you'll be able to freely copy, back up, and restore volume data in any situation.