A current trend is to keep applications and data separate. Server-side applications that follow this approach can usually be scaled horizontally to increase performance, cloned in case of a failure, or removed when no longer needed. Their storage is decoupled from the application itself, and its lifecycle is independent. While many application components can be designed to be stateless, this is rarely true for an entire application. The primary example is a database. Other examples include logs, configuration, and user-generated content, which matter especially for applications that are not on the cutting edge of the stateless transformation but may still run in Docker containers. Docker as a technology therefore needs a way for applications to store and manage persistent data.
One way to add persistent data to an application is to use object storage such as S3. However, S3 is not ideal for every type of data: for example, a regular database needs to be placed on a file system and does not fit the object storage model.
To solve the problem stated above, Docker volumes were introduced. A Docker volume is a simple concept: it adds a data directory to an ephemeral container. Ephemerality here means that an application running in a Docker container should assume it can be removed or redeployed at any moment. Such applications should therefore not keep any data that is valuable in the long term (that is, beyond a single user session) outside of appropriate storage, and a Docker volume is one example of such storage.
In practice, a Docker volume is attached to a container at start-up, either with the -v option or through the VOLUME instruction in a Dockerfile. The volume is managed independently from the container: by default, removing a container does not delete a named volume attached to it. Another useful property is that a volume can be initialized with content, such as an empty database skeleton. Docker volumes are a native feature of Docker; however, the native implementation is very limited.
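As a rough illustration of the lifecycle described above, the following Python sketch assembles the Docker CLI commands for a named volume without running them. The image name example/db:latest, the volume name appdata, the container name db1, and the mount path /var/lib/data are placeholders chosen for this example, not values from a real deployment.

```python
from __future__ import annotations

import shlex


def volume_create_cmd(volume: str) -> list[str]:
    """argv that creates a named volume with Docker's built-in local driver."""
    return ["docker", "volume", "create", volume]


def run_with_volume_cmd(image: str, volume: str, mount_path: str, name: str) -> list[str]:
    """argv that starts a detached container with the volume mounted at mount_path."""
    return ["docker", "run", "-d", "--name", name, "-v", f"{volume}:{mount_path}", image]


def remove_container_cmd(name: str) -> list[str]:
    """argv that force-removes the container; the named volume is not touched."""
    return ["docker", "rm", "-f", name]


def lifecycle(image: str, volume: str, mount_path: str, name: str) -> list[list[str]]:
    """Create the volume, run a container on it, remove it, then start a replacement."""
    return [
        volume_create_cmd(volume),
        run_with_volume_cmd(image, volume, mount_path, name),
        remove_container_cmd(name),
        # The replacement container mounts the same volume and sees the same data.
        run_with_volume_cmd(image, volume, mount_path, name),
    ]


if __name__ == "__main__":
    # Placeholder names; print the commands instead of executing them.
    for argv in lifecycle("example/db:latest", "appdata", "/var/lib/data", "db1"):
        print(shlex.join(argv))
```

Passing each list to subprocess.run on a host with Docker installed would execute the same sequence. The point of the sketch is that the container is disposable while the volume name is the stable handle to the data.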
The most important limitation of Docker volumes is that, out of the box, Docker supports only local volumes, which are stored on the file system of the server where the container is deployed. This may be acceptable for development and testing environments, but it is a no-go for production environments because:
- The storage is coupled with the original host, so it will be lost if the server is rebuilt.
- Management and data protection strategies are hard to implement with local volumes.
- In a multi-server environment (with Swarm or Kubernetes), the application owner has little control over which particular server will be chosen to deploy a new container, so a rescheduled container may start on a server that does not hold its data.
Relying on storage that is only available locally is not a viable solution once these points are taken into consideration. Of course, attaching an NFS share as a mount point for a container may solve the problem for your Swarm/Kubernetes cluster for the time being, but when real production scaling begins, a reliable, manageable solution is required.
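For reference, the NFS stopgap mentioned above can be expressed with Docker's built-in local driver and its mount options. The Python sketch below only builds the command line. The server address 192.0.2.10, the export path /exports/appdata, and the volume name nfsdata are made-up values, and the helper name is hypothetical.

```python
from __future__ import annotations


def nfs_volume_create_cmd(volume: str, server: str, export_path: str,
                          options: str = "rw") -> list[str]:
    """argv that defines an NFS-backed volume through Docker's local driver."""
    if not export_path.startswith("/"):
        raise ValueError("export_path must be an absolute path on the NFS server")
    return [
        "docker", "volume", "create",
        "--driver", "local",
        "--opt", "type=nfs",
        "--opt", f"o=addr={server},{options}",   # mount options, including the server address
        "--opt", f"device=:{export_path}",       # exported directory on the server
        volume,
    ]


if __name__ == "__main__":
    # Illustrative values only; prints the command rather than running it.
    print(" ".join(nfs_volume_create_cmd("nfsdata", "192.0.2.10", "/exports/appdata")))
```

A volume defined this way is still a per-host definition, so it would have to be created on every node that may run the container, and the availability of the data then depends on the NFS server itself.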
Try Virtuozzo Storage for Docker
Virtuozzo Storage is a solution that turns local storage into highly available, distributed, software-defined storage. It was designed from scratch for cloud-ready applications. Below you’ll find the top features which showcase its applicability:
- Reliability and Availability: Virtuozzo Storage automatically builds and maintains the required levels of data redundancy. It detects failures and performs auto-recovery when required. The storage cluster adjusts to provide continuous data access for clients. There is no single point of failure in any Virtuozzo Storage component—data stored on Virtuozzo Storage will continue to be available when a server hosting pieces of data goes offline.
- Combined Storage and Compute Resources: Virtuozzo Storage does not depend on or require dedicated or highly available storage hardware; instead, it uses regular HDD and SSD drives. It also has very modest requirements for computing power. This allows you to combine compute and storage effectively and build a highly efficient solution that is designed to run applications and store their persistent data.
- Scalability: Linearly scalable to hundreds of nodes and petabytes of data. Both capacity and storage bandwidth scale out linearly with the storage cluster growth.
- Versatility: Block, file, iSCSI, NFS, and S3/Object storage all in one solution, covering the needs of most application developers to store their data in the most effective way. With Virtuozzo Storage you don't need to evaluate, mix and integrate different solutions for different purposes: you can have the same environment for both stateless and stateful applications.
Two other goals that we had in mind when creating Virtuozzo Storage were performance and cost-efficiency. The diagram below compares the performance of Virtuozzo Storage and a popular open source alternative on identical cluster hardware across different access patterns, such as sequential and random reads and writes. As you can see, Virtuozzo Storage outperforms the competitor in most data access patterns, sometimes by up to 15 times.
Why Virtuozzo Storage?
The success of our storage is based on two technologies that are used effectively in the solution. First, SSD caching and journaling give the storage cluster a significant performance boost with only 1 SSD per 4 HDDs. Second, automatic balancing of the cluster keeps "hot" data cached by moving it to less utilized disks, local disks, or SSDs. This ensures that even idle disks are utilized to increase performance.
The Docker volume plug-in technology has been available in Docker since version 1.8. It allows alternative storage back ends, beyond the native local volumes, to be implemented for Docker volumes. The plug-in approach adds much-desired flexibility in storage options for Docker users, who can pick the storage solution that is most appropriate for their needs.
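To make the plug-in mechanism more concrete, here is a minimal Python sketch of the requests a volume plug-in answers and the JSON shapes it returns, following Docker's documented volume plug-in protocol. This is not the Virtuozzo plug-in: the class name, the in-memory bookkeeping, and the mount root /mnt/example-volumes are hypothetical, and nothing is actually mounted. A real plug-in would serve these endpoints over HTTP, typically on a Unix socket, and attach real storage.

```python
from __future__ import annotations

import json
import posixpath


class InMemoryVolumeDriver:
    """Toy stand-in for a volume plug-in: tracks volume names in a dict."""

    PER_VOLUME = {
        "/VolumeDriver.Create", "/VolumeDriver.Remove", "/VolumeDriver.Mount",
        "/VolumeDriver.Path", "/VolumeDriver.Unmount", "/VolumeDriver.Get",
    }

    def __init__(self, root: str = "/mnt/example-volumes") -> None:
        self.root = root                      # hypothetical mount root
        self.volumes: dict[str, dict] = {}    # volume name -> creation options

    def _mountpoint(self, name: str) -> str:
        return posixpath.join(self.root, name)

    def handle(self, endpoint: str, body: dict) -> dict:
        """Map one plug-in request (endpoint plus decoded JSON body) to a response."""
        if endpoint == "/Plugin.Activate":
            return {"Implements": ["VolumeDriver"]}
        if endpoint == "/VolumeDriver.Capabilities":
            # "global" declares that a volume is reachable from every host.
            return {"Capabilities": {"Scope": "global"}}
        if endpoint == "/VolumeDriver.List":
            vols = [{"Name": n, "Mountpoint": self._mountpoint(n)}
                    for n in sorted(self.volumes)]
            return {"Volumes": vols, "Err": ""}
        if endpoint not in self.PER_VOLUME:
            return {"Err": f"unsupported endpoint: {endpoint}"}

        name = body.get("Name", "")
        if endpoint == "/VolumeDriver.Create":
            self.volumes.setdefault(name, dict(body.get("Opts") or {}))
            return {"Err": ""}
        if name not in self.volumes:
            return {"Err": f"volume not found: {name}"}
        if endpoint == "/VolumeDriver.Remove":
            del self.volumes[name]
            return {"Err": ""}
        if endpoint == "/VolumeDriver.Unmount":
            return {"Err": ""}
        if endpoint == "/VolumeDriver.Get":
            return {"Volume": {"Name": name, "Mountpoint": self._mountpoint(name)},
                    "Err": ""}
        # Remaining endpoints: /VolumeDriver.Mount and /VolumeDriver.Path.
        return {"Mountpoint": self._mountpoint(name), "Err": ""}


if __name__ == "__main__":
    driver = InMemoryVolumeDriver()
    for endpoint, body in [
        ("/Plugin.Activate", {}),
        ("/VolumeDriver.Create", {"Name": "appdata", "Opts": {}}),
        ("/VolumeDriver.Mount", {"Name": "appdata", "ID": "example-id"}),
    ]:
        print(endpoint, json.dumps(driver.handle(endpoint, body)))
```

Because Docker only sees this small request/response contract, the storage behind the mount point can be anything the plug-in is able to attach, which is what makes distributed back ends usable as ordinary Docker volumes.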
With a Docker volume plug-in implemented for Virtuozzo Storage, all the benefits of the solution become available to Docker deployments. With the Virtuozzo plug-in and Docker Swarm, users get a production-ready solution that combines storage and compute, eliminating the need for dedicated storage hardware. At the same time, storage capacity and performance grow linearly with the size of the deployment, which keeps storage from becoming a bottleneck in the container orchestration platform.
Getting Started with Virtuozzo Storage
Getting started with Virtuozzo Storage is easy — try your free evaluation of Virtuozzo Storage for Docker here. You’ll be provided with a trial license and all the required materials and instructions for setting up Virtuozzo Storage to support persistent storage for Docker. Your Docker cluster will be ready to store its persistent data in no time!