Cloud Storage Installation Requirements

Cloud Storage is a solution for storing containers and is optimized for large amounts of data. It provides replication, high availability, self-healing features for the data, and automatic container fail-over. This guide lists the specific requirements for the Cloud Storage installation.

The diagram below shows the Virtuozzo Application Platform installation requirements based on the Cloud Storage scenario.

[Figure: Virtuozzo Application Platform cloud storage]

Note: The main hardware requirements for the infrastructure and user hosts (which act as Clients in Cloud Storage) can be found in the Local Storage guide.

The basic element of Cloud Storage is a VStorage cluster. The Cloud Storage cluster consists of bare-metal servers with three roles assigned:

  • Metadata service (MDS) stores metadata about chunk servers and controls how the files that hold the contents of virtual machines and containers are split into chunks and where these chunks are located.
  • Chunk service (CS) stores all the data (including the contents of virtual machines and containers) on the server's local disks in the form of fixed-size chunks and provides access to them. Data replication for Cloud Storage should be set to 3:2 (or higher) to provide better I/O load balancing and to ensure fault tolerance and high availability if chunk servers fail.
  • User nodes as Storage Clients (CL) are servers where the VZ-based containers and virtual machines run and consume the storage from the Cloud Storage cluster chunk services.

[Figure: Cloud Storage cluster]

Note: Although Erasure Coding provides more allocatable space across the entire Cloud Storage cluster, it is not recommended as a production solution for storing container data. Erasure Coding is designed for low-load scenarios and is suitable for storing static data only.

For more details, please check the Virtuozzo Storage documentation. If you still have questions, please contact the Operations Team.

The following sections cover the Cloud Storage installation requirements in detail.

General Requirements

Besides infrastructure and user hosts, you need to provide at least 5 bare-metal servers (9 or more are recommended) for your Cloud Storage cluster. These bare-metal servers require a separate internal network, which is entirely dedicated to the Cloud Storage CS+MDS roles, and some specific configurations for disks and networking.

Note: Only bare-metal servers should be used for Cloud Storage. Virtual machines are not allowed due to poor performance.

Requirements for Cloud Storage hosts:

1. Chunk services should be run on at least 5 servers (9 or more recommended for better performance).

Minimum hardware requirements for the CS (+MDS) servers:

  • CPU: 10 Cores / 20 Threads; 1.8 GHz+
  • RAM: 16 GB+; dual channel is highly recommended
  • Disks: 300 GB+ for OS and MDS purposes; 4+ separate SSD drives for each CS+MDS server

2. MDS services should be run on 3 or 5 servers:

  • for clusters with up to 15 servers with chunk service roles, you can combine the MDS service role with chunk service and user nodes (i.e. some servers can be shared between MDS and chunk service roles)
  • for larger clusters, the MDS service role should be run on dedicated servers (i.e. with no roles combined)

3. A separate network for Cloud Storage traffic should be provided, meeting the following requirements:

  • the network should operate at least at 10 Gbps speed
  • the Ethernet switches should be non-blocking
  • check the network sizing guidelines below

4. If MDS services run on separate servers, the hardware specifications should be the same as for the CS servers.

5. For each server with MDS service, an extra 1 GB of RAM should be added per 100 TB of the total storage in the cluster.
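The numbered requirements above can be expressed as a quick pre-installation check. The following is a minimal Python sketch, not an official tool: the `ClusterPlan` fields and function names are hypothetical, and rounding the extra MDS RAM up to the next full 100 TB block is an assumption of this example.

```python
import math
from dataclasses import dataclass
from typing import List

BASE_RAM_GB = 16  # minimum RAM for a CS (+MDS) server, from the hardware list above


@dataclass
class ClusterPlan:
    cs_servers: int          # servers with the chunk service role
    mds_servers: int         # servers with the MDS role
    mds_dedicated: bool      # True if MDS servers carry no other role
    total_storage_tb: float  # total storage in the cluster, in TB
    network_gbps: float      # speed of the dedicated storage network


def mds_ram_gb(total_storage_tb: float) -> int:
    """Minimum RAM for an MDS server: base RAM plus 1 GB per 100 TB."""
    # Assumption: partial 100 TB blocks are rounded up.
    return BASE_RAM_GB + math.ceil(total_storage_tb / 100)


def check_plan(plan: ClusterPlan) -> List[str]:
    """Return a list of violated requirements (empty if the plan passes)."""
    problems = []
    if plan.cs_servers < 5:
        problems.append("chunk services need at least 5 servers")
    if plan.mds_servers not in (3, 5):
        problems.append("MDS services should run on 3 or 5 servers")
    if plan.cs_servers > 15 and not plan.mds_dedicated:
        problems.append("more than 15 chunk servers require dedicated MDS servers")
    if plan.network_gbps < 10:
        problems.append("storage network should operate at 10 Gbps or faster")
    return problems


plan = ClusterPlan(cs_servers=9, mds_servers=3, mds_dedicated=False,
                   total_storage_tb=250, network_gbps=25)
print(check_plan(plan))                   # empty list: no violations found
print(mds_ram_gb(plan.total_storage_tb))  # 16 GB base + 3 GB extra = 19
```

The check only covers the counts and sizes stated in this guide; disk layout and switch configuration still need to be verified separately.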

Note: It is highly recommended to split cluster roles per server and keep the CS+MDS roles of the cluster separate from the user nodes (Clients) to avoid performance contention. Cluster roles (CS/MDS) can be combined with the user node (CL) role on the same server only on Sandbox/PoC platforms.

Storage and Storage Devices Requirements

Each server in the cluster, regardless of which Cloud Storage role it is assigned, should have 100-150 GB of reliable RAID1 level storage (or similar in terms of reliability - 2 disks for Linux LVM mirror) for the Virtuozzo 7 operating system installation.

Servers with Chunk Services export local directories as an internal storage backend for the Cloud Storage cluster. These directories have separate ext4 file systems mounted and store data chunks. Each of these ext4 file systems should occupy exactly one disk, which should not be shared with other file systems. Do not use RAID1 / RAID5 / RAID6 / LVM / dmraid configurations for these file systems; however, RAID controllers can still be used to export individual disks as separate single-disk RAID 0 arrays or JBOD block devices.

Consider the following regarding RAID controller usage:

  • configure passthrough for each drive to be exported for chunk service
  • if it is not supported by the RAID controller, configure a separate RAID0 for every drive (do not include other drives in this RAID0)
  • make sure the RAID controller cache is battery-protected
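The one-file-system-per-disk rule for chunk service disks can be verified against the device tree reported by the operating system. The sketch below is an illustration only: it assumes the tree comes from `lsblk -J -o NAME,TYPE,FSTYPE` (a `blockdevices` list with nested `children`), that software RAID, LVM, and dmraid devices are reported with the types `raid*`, `lvm`, and `dmraid`, and the function names are hypothetical.

```python
from typing import Dict, Iterator, List


def walk(device: Dict) -> Iterator[Dict]:
    """Yield a device and all of its descendants (partitions, LVs, and so on)."""
    yield device
    for child in device.get("children", []):
        yield from walk(child)


def check_cs_disk(disk: Dict) -> List[str]:
    """Return problems found on one disk intended for the chunk service."""
    problems = []
    nodes = list(walk(disk))

    # Exactly one file system is expected on the disk, and it must be ext4.
    filesystems = [n for n in nodes if n.get("fstype")]
    if len(filesystems) != 1:
        problems.append(
            f"{disk['name']}: expected exactly one file system, "
            f"found {len(filesystems)}"
        )
    elif filesystems[0]["fstype"] != "ext4":
        problems.append(f"{disk['name']}: file system is not ext4")

    # Software RAID, LVM, and dmraid layers are not allowed on these disks.
    for node in nodes:
        dev_type = node.get("type", "")
        if dev_type in ("lvm", "dmraid") or dev_type.startswith("raid"):
            problems.append(f"{disk['name']}: {dev_type} device {node['name']} found")
    return problems


# Example input shaped like one entry of the "blockdevices" list.
good_disk = {
    "name": "sdb", "type": "disk", "fstype": None,
    "children": [{"name": "sdb1", "type": "part", "fstype": "ext4"}],
}
print(check_cs_disk(good_disk))  # empty list: the disk follows the rule
```

On a real server, the input could be obtained with `json.loads(...)` over the `lsblk` output and filtered down to the disks reserved for chunk services; the check cannot see hardware RAID arrays that the controller presents as a single disk.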

SSD drives considerations:

  • if you use low-end SSD drives for the Cloud Storage backend on chunk servers, additional fast SSD/NVMe drives are recommended for write journaling on chunk services
    • this might increase overall cluster performance up to 2 times
    • more than one journaling SSD drive might be needed, depending on the number and speed of the SSDs you have
    • for proper sizing, please contact the Operations Team
  • consider server-grade SSDs
  • consider SSD drives with support for data protection on power loss

Note: HDD drives are not recommended for production platforms due to poor IOPS performance.

Please contact the Operations Team if you have questions.

Network considerations:

  • use a separate network for the Cloud Storage traffic
  • the network should operate at least at 10 Gbps; 25 Gbps are recommended
  • non-blocking Ethernet switches are recommended
  • network bonding is recommended for better reliability

Storage sizing guidelines:

  • usually, each data chunk is stored in several replicas as a prerequisite for HA, which provides redundancy and I/O load balancing
  • to roughly calculate the usable data in the cluster, refer to Virtuozzo documentation
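As a first approximation of the effect of replication on capacity, raw disk space can be divided by the number of replicas. The Python sketch below is a simplification and not the official sizing formula: it assumes that the first number in the 3:2 replication setting is the number of replicas kept for each chunk, and the `reserve` parameter (a fraction of space held back for free-space headroom) is a hypothetical addition. Refer to the Virtuozzo documentation for exact figures.

```python
def usable_capacity_tb(raw_tb: float, replicas: int = 3, reserve: float = 0.0) -> float:
    """Rough usable capacity: raw capacity, minus a reserve, divided by replicas."""
    if replicas < 1:
        raise ValueError("replicas must be at least 1")
    if not 0 <= reserve < 1:
        raise ValueError("reserve must be a fraction in [0, 1)")
    return raw_tb * (1 - reserve) / replicas


# Example: 9 chunk servers with 4 data disks of 2 TB each.
raw_tb = 9 * 4 * 2                             # 72 TB of raw disk space
print(usable_capacity_tb(raw_tb, replicas=3))  # 24.0 TB with three replicas
```

The estimate ignores file system and metadata overhead, so the real usable space will be somewhat lower.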

VM Usage

Virtual machines are not allowed in the Cloud Storage scenario in any way due to a significant loss of overall performance and stability.
