
Cluster

Different node configurations

The cluster uses ONE shared /etc/pve. The content of this directory is synchronized across all cluster nodes. IMPACT:

  • The storage configuration is one for the whole cluster. To create node-local storage, the storage entry must be limited to its own node (see the example after the NOTE below).
  • Joining a pure Debian-based node (BTRFS filesystem) to the cluster will result in a "local-zfs" storage being visible on the Debian BTRFS node with a ? sign. To prevent the creation of an unusable storage, a storage entry can be limited to selected nodes. Edit the storage and limit it to its own node.
  • Custom storage configuration on a node can disappear, because only one shared configuration is used.
    • A storage entry removed due to cluster joining can be added back manually under another name and then restricted to a specific node.

NOTE: make a copy of /etc/pve/storage.cfg from each node before joining it to the cluster.
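
Restricting a storage entry to a single node can also be done from the CLI; the storage name "local-zfs" and the node name "pve1" below are examples only:

# limit the 'local-zfs' storage entry to node 'pve1'
pvesm set local-zfs --nodes pve1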

local-ZFS replication

It is possible to schedule replication of containers/VMs to another node. This creates periodic snapshots on the remote node's local storage. It provides data redundancy and reduces downtime when a container is moved between the local storages of the nodes.
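
A replication job can also be scheduled from the CLI with pvesr; the guest ID 100, job number 0, target node "pve2" and 15-minute schedule are example values:

# replicate guest 100 to node pve2 every 15 minutes
pvesr create-local-job 100-0 pve2 --schedule "*/15"
# list replication jobs and their state
pvesr status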

shared storage

  • It is recommended to use shared storage available from all nodes.
    • For file-based content types (ISO images, backups) the easiest option is an NFS/CIFS share (see the example below).
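
Adding such a share from the CLI could look like the sketch below; the storage name, server address and export path are assumptions:

# add an NFS share for ISO images and backups, available to all nodes
pvesm add nfs nas-nfs --server 192.168.28.150 --export /volume1/proxmox --content iso,backup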

Live migration

  • It is only possible if the VM disk resides on shared network storage (available from all nodes); see the example below.
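
A live migration can also be started from the CLI; the VM ID 100 and target node "pve2" are placeholders:

# live-migrate VM 100 to node pve2 (its disk must be on shared storage)
qm migrate 100 pve2 --online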

Preparation

A second network interface with a separate internal IP network is recommended for redundancy and for shared-storage bandwidth.
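
A sketch of such a dedicated interface in /etc/network/interfaces; the interface name and addressing below are assumptions:

# second NIC dedicated to cluster/storage traffic (example values)
auto eth1
iface eth1 inet static
        address 10.10.10.11/24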

Creation

From Proxmox 6.0 it is possible to use the GUI to create and join a cluster.

Datacenter –> Cluster –> Create cluster
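
The CLI equivalent on the first node; the cluster name "homelab" is an example:

# create a new cluster named 'homelab'
pvecm create homelab
# verify quorum / membership
pvecm status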

Joining

Joining other nodes
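
On each node that should join, the CLI equivalent points at an existing cluster member; the IP address below is a placeholder:

# run on the joining node, using the IP of a node already in the cluster
pvecm add 192.168.28.10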

The join can get stuck when there is a problem with an iSCSI connection from the newly added node. Check the journal on the joining node:

kernel: scsi host9: iSCSI Initiator over TCP/IP
pvestatd[2097]: command '/usr/bin/iscsiadm --mode node --targetname iqn.2020-04.com.zyxel:nas326-iscsi-pve1-isos-target.tjlintux --login' failed: exit code 24
iscsid[1060]: conn 0 login rejected: initiator failed authorization with target
iscsid[1060]: Connection135:0 to [target: iqn.2020-04.com.zyxel:nas326-iscsi-pve1-isos-target.tjlintux, portal: 192.168.28.150,3260] through [iface: default] is shutdown.

The joining process will wait indefinitely until the iSCSI device is connected. During this time it is possible to disable CHAP authentication on the remote iSCSI target, or to provide common credentials for the Target Portal Group.
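
If CHAP is disabled on the target, the initiator-side record on the joining node has to match; a sketch using the target and portal from the log above (verify your own values):

iscsiadm --mode node \
  --targetname iqn.2020-04.com.zyxel:nas326-iscsi-pve1-isos-target.tjlintux \
  --portal 192.168.28.150:3260 \
  --op update --name node.session.auth.authmethod --value None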

iSCSI issue with CHAP

When CHAP is used for iSCSI (manually configured from the Debian console), the joining node tries to connect to the same iSCSI target using its own initiator name and its own local configuration. Of course the NAS326 is configured with only one CHAP login and password, so joining is not possible.

It is possible to add a new ACL for the target on the NAS326 using the targetcli command, or to disable CHAP.
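
A sketch of adding an ACL for a second initiator with targetcli on the target side; the initiator IQN is a placeholder and the TPG number may differ:

targetcli /iscsi/iqn.2020-04.com.zyxel:nas326-iscsi-pve1-isos-target.tjlintux/tpg1/acls \
  create iqn.1993-08.org.debian:01:newnode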

CEPH

Prepare

Read the Proxmox Ceph requirements. Ceph requires at least one spare hard drive on each node. Topic for later.

Installation

  • On one of the nodes (CLI equivalents are sketched after this list):
    • Datacenter –> Ceph –> Install Ceph-nautilus
    • Configuration tab
    • First Ceph monitor - set it to the current node. NOTE: it is not possible to select other nodes yet, because Ceph is not installed on them.
  • Repeat the installation on each node. The configuration will be detected automatically.
  • On each node - add additional monitors:
    • Select node –> Ceph –> Monitor
      • "Create" button in the Monitor section, and select the available nodes.
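
The same steps have CLI equivalents in pveceph; the cluster network CIDR below is an assumption:

# on every node: install the Ceph packages
pveceph install
# only once, on the first node: initialise the Ceph configuration (example network)
pveceph init --network 10.10.10.0/24
# on each node that should run a monitor
pveceph mon create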

create OSD

Create Object Storage Daemon

On every node in the cluster (a CLI sketch follows after this list):

  • Select host node
  • Go to the menu Ceph –> OSD
  • Create: OSD
    • select spare hard disk
    • leave other defaults
    • press Create
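
The CLI equivalent; /dev/sdb is an example device name, use the spare disk of the given node:

# create an OSD on the spare disk (example device)
pveceph osd create /dev/sdb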

create pool

  • Size - number of replicas for pool
  • Min. Size - minimum number of replicas that must be available for the pool to continue serving I/O
  • Crush Rule - only possible to choose 'replicated_rule'
  • pg_num (Placement Groups) - use the Ceph PGs per Pool Calculator to calculate pg_num
    • NOTE: It's also important to know that the PG count can be increased, but NEVER decreased without destroying / recreating the pool. However, increasing the PG Count of a pool is one of the most impactful events in a Ceph Cluster, and should be avoided for production clusters if possible.
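
The pool can also be created with the plain Ceph tools; the pool name 'rbd', pg_num 128 and size/min_size 3/2 below are example values:

# create a replicated pool and set the replica counts (example values)
ceph osd pool create rbd 128 128 replicated
ceph osd pool set rbd size 3
ceph osd pool set rbd min_size 2
# mark the pool for RBD usage
ceph osd pool application enable rbd rbd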

pool benchmark

Benchmarks for the pool named 'rbd' with a duration of 10 seconds:

# Write benchmark
rados -p rbd bench 10 write --no-cleanup
 
# Read performance
rados -p rbd bench 10 seq
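
The write benchmark keeps its objects in the pool (--no-cleanup) so the read benchmark has data to work with; a random-read run and the cleanup of the benchmark objects can be added afterwards:

# Random read benchmark
rados -p rbd bench 10 rand

# Remove the objects left by the write benchmark
rados -p rbd cleanup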