An efficient Ceph cluster


Why ?

For years, I’ve used various technologies at home to host my family’s data (photos, videos, and other files related to our work).

At the very beginning I owned a single Linux host running Samba, then a Synology DS409slim NAS, and now a nice ZFS storage system.

I’ve always been searching for a perfect resilient storage and computing system for my VMs.

The hardware..

First, I searched for the best hardware for my needs:

1) low power, but still powerful

2) fast disk and network interfaces

3) cheap

4) reusable later for other purposes

I considered the following alternatives:

  • ARM
    • pros: energy efficient
    • cons: poor performance, including slow disk interfaces
  • Intel
    • pros: standard, powerful
    • cons: higher power consumption

I looked at several hardware boxes :

ODROID XU4 (octa-core, USB 3.0)

Then I found this hardware: an Intel Atom x5-Z8300 quad-core box with 2 GB of DDR3 RAM and 32 GB of storage, for just 99 - 2 = 97 euros:

https://www.amazon.fr/BoLv-x5-Z8300-Processor-Graphics-Windows10/dp/B01DFJH78U/ref=sr_1_1?ie=UTF8&qid=1477491551&sr=8-1&keywords=bolv+z83

Cheap, seems powerful and energy efficient…

 

The software :

Debian 8.6 jessie x86_64 (ISO multi arch)

Ceph “jewel”

Topology

Number of MONs

It’s recommended to run 3 MONs for resilience: with three monitors, the cluster keeps quorum even if one of them fails. As I have 4 nodes, I will put the first two MONs on two physical hosts and the last one on a virtualized host (this VM runs on the main server of my home “datacenter”).

Installing the cluster

Preparing the hardware and the OS

Requirements :

This blog does not cover the OS installation procedure. Before you continue, be sure to:

  • Use a correct DNS configuration, or manually configure /etc/hosts on each host (see the example after this list).
  • Use at least 3 nodes, plus an admin node (for cluster deployment, monitoring, etc.).
  • You MUST install NTP on all nodes.
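
If you go the /etc/hosts route, here is a minimal sketch matching my addressing plan below (adapt the names and addresses to yours), to be added on every node and on the admin host:

192.168.10.210 n0
192.168.10.211 n1
192.168.10.212 n2
192.168.10.213 n3
192.168.10.177 admin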

Now I’m assuming you have followed the installation procedure and the requirements above :). Here’s my configuration :

n0 : 192.168.10.210/24 (physical host : Z83)
n1 : 192.168.10.211/24 (physical host : Z83) 
n2 : 192.168.10.212/24 (physical host : Z83)
n3 : 192.168.10.213/24 (VM)
admin : 192.168.10.177/24 (VM)

Once you have installed jessie on the four nodes and the admin node, let’s configure them so that everything can be deployed from the admin node. At this point:

1) Don’t forget to change your apt repositories if you installed the OSes from local media. They should now point to a mirror for all updates (security and software); see the example sources.list below, after the reboot step.

2) Check that “sudo” is installed:

root@n0:~# apt-get install sudo -y

Note: at the time of writing, you should also install some additional packages to avoid the famous SSH hang when rebooting jessie (it’s systemd related).

On all nodes, you’ll have to do this:

root@n0:~# apt-get install libpam-systemd dbus -y
root@n0:~# reboot (will hang your ssh session for the last time ;) )
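
For point 1, a minimal /etc/apt/sources.list for jessie could look like this (the mirror is just an example, pick one close to you):

deb http://httpredir.debian.org/debian jessie main
deb http://httpredir.debian.org/debian jessie-updates main
deb http://security.debian.org/ jessie/updates main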

And remember, you MUST also set up NTP. To install it:

On each node

sudo apt-get -y install ntp ntpdate ntp-doc

sudo systemctl enable ntp

sudo timedatectl set-ntp true

Configure it and reboot. It’s safer and more efficient to have a time source close to your cluster; Wi-Fi APs and DSL routers often provide such a service. My configuration uses my ADSL router, which runs OpenWrt (you can set up ntpd on OpenWrt).
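
For illustration only (192.168.10.1 is an assumption for the router address, use your own local time source), the relevant lines of /etc/ntp.conf on each node could be:

# prefer the local router, keep the Debian pool as a fallback
server 192.168.10.1 iburst prefer
server 0.debian.pool.ntp.org iburst

sudo systemctl restart ntp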

Install Ceph

Create the ceph admin user on each node :

On each node, create a Ceph admin user (used for deployment tasks). It’s important to choose a user name other than “ceph”, which is used by the Ceph installer itself.

Note: you can omit the -s option of useradd; using bash is a personal choice.

root@n0:~# sudo useradd -d /home/cephadm -m cephadm -s /bin/bash
root@n0:~# sudo passwd cephadm
 Enter new UNIX password:
 Retype new UNIX password:
 passwd: password updated successfully
root@n0:~# echo "cephadm ALL = (root) NOPASSWD:ALL" | sudo tee /etc/sudoers.d/ceph
root@n0:~# chmod 444 /etc/sudoers.d/ceph

And do the same on the admin host and on nodes n1, n2 and n3.

Set up SSH authentication with cryptographic keys

On the admin node :

Create an SSH key for the user cephadm (DSA here; RSA or Ed25519 keys work just as well):

root@admin:~# su - cephadm
cephadm@admin:~$ ssh-keygen -t dsa
Generating public/private dsa key pair.
Enter file in which to save the key (/home/cephadm/.ssh/id_dsa):
Created directory ‘/home/cephadm/.ssh’.
Enter passphrase (empty for no passphrase):
Enter same passphrase again:
Your identification has been saved in /home/cephadm/.ssh/id_dsa.
Your public key has been saved in /home/cephadm/.ssh/id_dsa.pub.
The key fingerprint is:
ec:16:ad:b4:76:e4:32:c6:7c:14:45:bc:c3:78:5a:cf cephadm@admin
The key’s randomart image is:
+---[DSA 1024]----+
|           oo    |
|           ..    |
|          .o .   |
|       . ...*    |
|        S ++ +   |
|       = B.   E  |
|        % +      |
|       + =       |
|                 |
+-----------------+
cephadm@admin:~$

Then push it to the nodes of the cluster:

cephadm@admin:~$ ssh-copy-id cephadm@n0
cephadm@admin:~$ ssh-copy-id cephadm@n1
cephadm@admin:~$ ssh-copy-id cephadm@n2
cephadm@admin:~$ ssh-copy-id cephadm@n3
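
Note: here the local and remote user names match (cephadm on both sides), so plain host names are enough. If you ever deploy from a different local user, you can pin the remote user in ~/.ssh/config on the admin node, for example:

Host n0 n1 n2 n3
    User cephadm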

Install and configure dsh (distributed shell)

cephadm@admin:~$ sudo apt-get install dsh

…..

cephadm@admin:~$ cd
cephadm@admin:~$ mkdir .dsh
cephadm@admin:~$ cd .dsh
cephadm@admin:~/.dsh$ for i in {0..3} ; do echo "n$i" >> machines.list ; done

Test…

cephadm@admin:~$ dsh -aM uptime
n0:  17:52:09 up 0 min,  0 users,  load average: 0.00, 0.00, 0.00
n1:  17:52:08 up 0 min,  0 users,  load average: 0.00, 0.00, 0.00
n2:  17:52:09 up 0 min,  0 users,  load average: 0.00, 0.00, 0.00
n3:  17:52:09 up 0 min,  0 users,  load average: 0.00, 0.00, 0.00

cephadm@admin:~$ dsh -aM cat /proc/cpuinfo | grep model\ name
n0: model name    : Intel(R) Atom(TM) x5-Z8300  CPU @ 1.44GHz
n0: model name    : Intel(R) Atom(TM) x5-Z8300  CPU @ 1.44GHz
n0: model name    : Intel(R) Atom(TM) x5-Z8300  CPU @ 1.44GHz
n0: model name    : Intel(R) Atom(TM) x5-Z8300  CPU @ 1.44GHz
n1: model name    : Intel(R) Atom(TM) x5-Z8300  CPU @ 1.44GHz
n1: model name    : Intel(R) Atom(TM) x5-Z8300  CPU @ 1.44GHz
n1: model name    : Intel(R) Atom(TM) x5-Z8300  CPU @ 1.44GHz
n1: model name    : Intel(R) Atom(TM) x5-Z8300  CPU @ 1.44GHz
n2: model name    : Intel(R) Atom(TM) x5-Z8300  CPU @ 1.44GHz
n2: model name    : Intel(R) Atom(TM) x5-Z8300  CPU @ 1.44GHz
n2: model name    : Intel(R) Atom(TM) x5-Z8300  CPU @ 1.44GHz
n2: model name    : Intel(R) Atom(TM) x5-Z8300  CPU @ 1.44GHz
n3: model name    : Intel Core Processor (Broadwell)
n3: model name    : Intel Core Processor (Broadwell)
n3: model name    : Intel Core Processor (Broadwell)
n3: model name    : Intel Core Processor (Broadwell)

Good..

Now you’re ready to install your cluster with automated commands!

Install Ceph software

First, make sure each node is up to date, at the very beginning of this procedure.

Feel free to use dsh from the admin node for each task you would like to apply to the nodes 😉

cephadm@admin:~$ dsh -aM sudo apt-get update

cephadm@admin:~$ dsh -aM sudo apt-get -y upgrade

On the admin node only, configure the apt repositories and install the Ceph deployment tool (a Python script):

wget -q -O- 'https://download.ceph.com/keys/release.asc' | sudo apt-key add -
echo deb http://download.ceph.com/debian-jewel/ $(lsb_release -sc) main | sudo tee /etc/apt/sources.list.d/ceph.list
sudo apt-get -qqy update && sudo apt-get install -qqy ntp ceph-deploy

Now create a new directory in the home directory of the user cephadm. The configuration of the cluster will be stored there.

cephadm@admin:~$ mkdir cluster
cephadm@admin:~$ cd cluster

Create your three MONs (according to the topology choices above: two monitors on physical hosts, one on a virtualized node):

cephadm@admin:~/cluster$ ceph-deploy new n{0,1,3}
[ceph_deploy.conf][DEBUG ] found configuration file at: /home/cephadm/.cephdeploy.conf
[ceph_deploy.cli][INFO  ] Invoked (1.5.33): /usr/bin/ceph-deploy new n0 n1 n3
[ceph_deploy.cli][INFO  ] ceph-deploy options:
[ceph_deploy.cli][INFO  ]  username                      : None
[ceph_deploy.cli][INFO  ]  verbose                       : False
[ceph_deploy.cli][INFO  ]  overwrite_conf                : False
[ceph_deploy.cli][INFO  ]  quiet                         : False
[ceph_deploy.cli][INFO  ]  cd_conf                       : <ceph_deploy.conf.cephdeploy.Conf instance at 0x7f7971b61ab8>
[ceph_deploy.cli][INFO  ]  cluster                       : ceph
[ceph_deploy.cli][INFO  ]  ssh_copykey                   : True
[ceph_deploy.cli][INFO  ]  mon                           : ['n0', 'n1', 'n3']
[ceph_deploy.cli][INFO  ]  func                          : <function new at 0x7f7971b40500>
[ceph_deploy.cli][INFO  ]  public_network                : None
[ceph_deploy.cli][INFO  ]  ceph_conf                     : None
[ceph_deploy.cli][INFO  ]  cluster_network               : None
[ceph_deploy.cli][INFO  ]  default_release               : False
[ceph_deploy.cli][INFO  ]  fsid                          : None
[ceph_deploy.new][DEBUG ] Creating new cluster named ceph
[ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
[n0][DEBUG ] connected to host: admin
[n0][INFO  ] Running command: ssh -CT -o BatchMode=yes n0
[n0][DEBUG ] connection detected need for sudo
[n0][DEBUG ] connected to host: n0
[n0][DEBUG ] detect platform information from remote host
[n0][DEBUG ] detect machine type
[n0][DEBUG ] find the location of an executable
[n0][INFO  ] Running command: sudo /bin/ip link show
[n0][INFO  ] Running command: sudo /bin/ip addr show
[n0][DEBUG ] IP addresses found: ['192.168.10.210']
[ceph_deploy.new][DEBUG ] Resolving host n0
[ceph_deploy.new][DEBUG ] Monitor n0 at 192.168.10.210
[ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
[n1][DEBUG ] connected to host: admin
[n1][INFO  ] Running command: ssh -CT -o BatchMode=yes n1
[n1][DEBUG ] connection detected need for sudo
[n1][DEBUG ] connected to host: n1
[n1][DEBUG ] detect platform information from remote host
[n1][DEBUG ] detect machine type
[n1][DEBUG ] find the location of an executable
[n1][INFO  ] Running command: sudo /bin/ip link show
[n1][INFO  ] Running command: sudo /bin/ip addr show
[n1][DEBUG ] IP addresses found: ['192.168.10.211']
[ceph_deploy.new][DEBUG ] Resolving host n1
[ceph_deploy.new][DEBUG ] Monitor n1 at 192.168.10.211
[ceph_deploy.new][INFO  ] making sure passwordless SSH succeeds
[n3][DEBUG ] connected to host: admin
[n3][INFO  ] Running command: ssh -CT -o BatchMode=yes n3
[n3][DEBUG ] connection detected need for sudo
[n3][DEBUG ] connected to host: n3
[n3][DEBUG ] detect platform information from remote host
[n3][DEBUG ] detect machine type
[n3][DEBUG ] find the location of an executable
[n3][INFO  ] Running command: sudo /bin/ip link show
[n3][INFO  ] Running command: sudo /bin/ip addr show
[n3][DEBUG ] IP addresses found: ['192.168.10.213']
[ceph_deploy.new][DEBUG ] Resolving host n3
[ceph_deploy.new][DEBUG ] Monitor n3 at 192.168.10.213
[ceph_deploy.new][DEBUG ] Monitor initial members are ['n0', 'n1', 'n3']
[ceph_deploy.new][DEBUG ] Monitor addrs are ['192.168.10.210', '192.168.10.211', '192.168.10.213']
[ceph_deploy.new][DEBUG ] Creating a random mon key...
[ceph_deploy.new][DEBUG ] Writing monitor keyring to ceph.mon.keyring...
[ceph_deploy.new][DEBUG ] Writing initial config to ceph.conf...
cephadm@admin:~/cluster$

Now edit ceph.conf (in the “cluster” directory) and insert this line at the end:

osd pool default size = 3

The file ceph.conf should contain the following lines :

[global]
fsid = 74a80a50-b7f9-4588-baa4-bb242c3d4cf0
mon_initial_members = n0, n1, n3
mon_host = 192.168.10.210,192.168.10.211,192.168.10.213
auth_cluster_required = cephx
auth_service_required = cephx
auth_client_required = cephx
osd pool default size = 3
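
With 4 OSDs and a pool size of 3, every object will be stored on three different OSDs, so losing a single node does not lose any data. Optionally (I didn’t need it here since my nodes have a single interface; the subnet below is my own), you can also pin the public network in the [global] section so the daemons bind to the right interface:

public network = 192.168.10.0/24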

Now, install Ceph:

cephadm@admin:~/cluster$ for i in {0..3}; do ceph-deploy install --release jewel n$i; done

This command generates a lot of logs (downloads, debug messages, etc.) but should return without error. Otherwise, check the error and google it. You can often simply rerun ceph-deploy; depending on the error, it will work the second time 😉 (I’ve experienced transient problems accessing the Ceph repository, for example).

Now create the MONs:

cephadm@admin:~/cluster$ ceph-deploy mon create-initial

Again, a lot of logs, but there should be no error.

Now create the OSDs (storage units). You have to know which device will hold the data on each node. In my case, it’s like this:

n0: /dev/sda: 500 GB WD “Blue” 2.5″ hard drive

n1: /dev/sda: 1000 GB HGST (WD) 2.5″ hard drive

n2: /dev/sda: 1000 GB HGST (WD) 2.5″ hard drive

n3: /dev/sdb: 500 GB WD “Blue” 2.5″ hard drive (backed by a high-performance ZFS On Linux pool :), presented to the VM as /dev/zvol…)

So, issue the following commands depending on your devices. For me, it is:

cephadm@admin:~/cluster$ for i in {0..2}; do ceph-deploy osd create n$i:sda; done

cephadm@admin:~/cluster$ ceph-deploy osd create n3:sdb

Note: if you previously installed Ceph on a device, you MUST “zap” (wipe) it first, with a command such as “ceph-deploy disk zap n3:sdb”.
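
Side note, not used in my setup: ceph-deploy osd create also accepts an optional journal device in the form node:data-disk:journal-device, which is useful if a node has a small SSD to spare. A purely hypothetical example (there is no /dev/sdc on my nodes):

cephadm@admin:~/cluster$ ceph-deploy osd create n0:sda:/dev/sdc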

Now, deploy the ceph configuration to all storage nodes

cephadm@admin:~/cluster$ for i in {0..3}; do ceph-deploy admin n$i; done

And check the permissions. For some reason, the permissions are not correct:

cephadm@admin:~/cluster$ dsh -aM ls -l /etc/ceph/*key*
n0: -rw------- 1 root root 63 Oct 26 19:01 /etc/ceph/ceph.client.admin.keyring
n1: -rw------- 1 root root 63 Oct 26 19:01 /etc/ceph/ceph.client.admin.keyring
n2: -rw------- 1 root root 63 Oct 26 19:01 /etc/ceph/ceph.client.admin.keyring
n3: -rw------- 1 root root 63 Oct 26 19:01 /etc/ceph/ceph.client.admin.keyring

So issue the following command :

cephadm@admin:~/cluster$ dsh -aM sudo chmod +r /etc/ceph/ceph.client.admin.keyring

and check :

cephadm@admin:~/cluster$ dsh -aM ls -l /etc/ceph/*key*
n0: -rw-r--r-- 1 root root 63 Oct 26 19:01 /etc/ceph/ceph.client.admin.keyring
n1: -rw-r--r-- 1 root root 63 Oct 26 19:01 /etc/ceph/ceph.client.admin.keyring
n2: -rw-r--r-- 1 root root 63 Oct 26 19:01 /etc/ceph/ceph.client.admin.keyring
n3: -rw-r--r-- 1 root root 63 Oct 26 19:01 /etc/ceph/ceph.client.admin.keyring

I like dsh and ssh.. 😉

Finally, install the metadata servers

cephadm@admin:~/cluster$ ceph-deploy mds create n0 n1 n3

and the RADOS gateway:

cephadm@admin:~/cluster$ ceph-deploy rgw create n3

(on n3 for me)
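
You can quickly check that the gateway answers: with jewel, the embedded civetweb server listens on port 7480 by default, and an anonymous request to the root should return a small S3 ListAllMyBucketsResult XML document (adjust the host and port if you changed them):

cephadm@admin:~$ curl http://n3:7480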

Check the NTP status of the nodes (very important if you have several MONs):

cephadm@admin:~/cluster$ dsh -aM timedatectl|grep synchron
n0: NTP synchronized: yes
n1: NTP synchronized: yes
n2: NTP synchronized: yes
n3: NTP synchronized: yes

Check the cluster; on one node, type:

cephadm@n2:~$ ceph status
cluster 74a80a50-b7f9-4588-baa4-bb242c3d4cf0
     health HEALTH_OK
     monmap e2: 3 mons at {n0=192.168.10.210:6789/0,n1=192.168.10.211:6789/0,n3=192.168.10.213:6789/0}
            election epoch 32, quorum 0,1,2 n0,n1,n3
     osdmap e82: 4 osds: 4 up, 4 in
            flags sortbitwise
      pgmap v220: 112 pgs, 7 pools, 848 bytes data, 170 objects
            148 MB used, 2742 GB / 2742 GB avail
                 112 active+clean
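
A few more sanity checks you can run from any node holding the admin keyring: ceph osd tree shows the CRUSH hierarchy and confirms that the four OSDs are up under their respective hosts, ceph df shows per-pool usage, and you can verify that the replication factor set earlier was applied to the default rbd pool (it should report size: 3 on a fresh jewel install).

cephadm@n2:~$ ceph osd tree
cephadm@n2:~$ ceph df
cephadm@n2:~$ ceph osd pool get rbd size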

Done.

Installing Calamari

 

dsh -aM 'echo "deb http://repo.saltstack.com/apt/debian/8/amd64/latest jessie main" | sudo tee /etc/apt/sources.list.d/saltstack.list'
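
Next, import the SaltStack repository key and install salt-minion on the nodes before setting up the Calamari server itself (the key URL below follows the standard SaltStack layout for Debian 8; double-check it against the SaltStack documentation):

dsh -aM 'wget -qO - https://repo.saltstack.com/apt/debian/8/amd64/latest/SALTSTACK-GPG-KEY.pub | sudo apt-key add -'
dsh -aM 'sudo apt-get update && sudo apt-get -y install salt-minion'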

 
