Installing DataStax Enterprise on Raspberry Pi 2 with Ubuntu Core OS

 

So for something a little different I was tasked with creating a DSE cluster on some Raspberry Pi's.

The target was 2 datacenters with 3 Raspberry Pi's in each DC (3 nodes per DC, 2 DCs, running a pure Cassandra workload).

This turned out to be a really cool and fun project but wasn't straightforward by any means. I thought I would share my experiences and hope you enjoy the show. This is a how-to guide as much as anything else, so that anyone can put the same cluster together with as little hassle as possible.

Did I hit any challenges on the way? You bet ya!

One of the challenges I initially hit was the architecture of the Raspberry Pi's. They are ARM rather than i386/x64, so I wasn't sure what would be available in the ARM APT repos.

As I soon discovered these Pi guys are awesome: everything from Java to the usual OS tools such as ethtool, sysstat and Python 2.7 was available via APT.

The next question was which distribution would be the best candidate for running DSE on the Pi. The most popular release is Raspbian; it's essentially a Debian derivative but had some known quirks (different package manager, different tools).

I came across Ubuntu Core for Raspberry Pi, which is a cut-down version of Ubuntu on the ARM architecture. To me this was the best of both worlds, but as I got further along the build process there were some caveats to using Ubuntu Core as well.

In hindsight I would have just stuck with Raspbian. The differences were not an issue in the end, and Ubuntu Core is still quite new, so things like configuring wifi drivers and compiling certain things under Ubuntu Core were problematic.

There were other issues but I'll focus on the build and what I encountered. Setting up the Ubuntu Core image was pretty straightforward: just use the built-in `dd` command to write the .img file to the SD card. The Raspberry Pi only boots from the SD card.

The imaging and basic setup post imaging on the first Pi was straightforward, but… I noticed straight away that performance was going to be a pain point.

I continued with the installation and had to get the wifi dongle we had procured working for the management of the Pi's. This wasn't so straightforward. I had to compile and build the wifi drivers as they were not included in the base Ubuntu Core image, nor available via a package I could install. So GitHub to the rescue: after some fiddling I managed to get the driver compiled, inserted and persisting on each reboot.

Make sure you build as root rather than with sudo. With sudo the driver compiled and built and I could load the module, but for some reason it wasn't being saved in the correct location, so it wasn't persisted across restarts of the Pi's.

I then got wpa_supplicant (the wifi WPA support) working. I hit a few issues with its configuration file, but in the end I simplified the config based on blogs online and it worked.

The rest of the build as per the steps below, up until moving the filesystem to the USB stick, was easy and pretty pain free, i.e. installing Java etc.

The main DSE package was installed and configured via tarball (there is no ARM package available via APT for DSE). I configured the basics in DSE and got the node up and running, and all seemed good so far.

But… the performance was bugging me. For show and tell it was great, but I wanted to set this up as a proper cluster, not just a show toy. So after some reading I decided to move all data-related information to a USB stick, as the throughput was much higher compared to the SD cards we were using by default.

Essentially I moved everything except the boot partition to the USB stick (see below). Once I did that, performance was much, much better. I then completed bringing up the other nodes in the first DC with no hiccups or problems with bootstrapping. When it came time to run cassandra-stress, though, I noticed something.

As I added more Pi's to the ring I expected performance to improve, but that wasn't the case. I was getting write timeouts and encountered bizarre behaviour; `nodetool status`, for example, was taking a long time to return.

I continued the build process, adding the second DC with 3 nodes, updated replication and ran cassandra-stress again. This time it was worse: setting up replication and running repair was fine, but write performance, let alone read performance, sort of sucked.

After doing some reading and playing around I worked a few things out. Firstly, in DSE 5.1 there is a setting in `cassandra.yaml` called `disk_optimization_strategy` (described there as "The strategy for optimizing disk read"). By default it's set to `ssd`; the other option is `spinning` (for spinning disks). Changing this to `spinning` seemed to make a huge difference to both the read and write latencies. I also lowered a number of settings within DSE, such as concurrent reads/writes, and increased the general timeout settings to allow for the latency being seen on the Pi's. (Again, see below.)

Just for reference, we had been told to use OSS C* as nobody could get DSE to run on the Pi's, but with persistence we managed to get it all running fine.

Remember the Raspberry pi's only have a very limited number of resources so keep that in mind when playing with any perf related settings.

We also did some testing with docker on the pi's that was very interesting, and TBH it performed better. We will blog that at another time. Here are the raw steps so enjoy!!

Now for the install



Pre-requisites

  • SDCard – At least 32GB (Inc. DSE data)

  • Laptop with sdcard reader

  • Mouse / Keyboard / monitor (HDMI)

  • Download the Raspberry Pi OS image file – Ubuntu Core 16.04 for Raspberry Pi 2/3

  • Wifi dongle (TP Link TL-WN725N wireless adapter) (Raspberry Pi 2 only) (optional)

  • Network Switch (optional)

Note: I ended up using the onboard ethernet for node connectivity and used the wifi network for management of the cluster and attached devices.  

 Ubuntu SSO Account Link → https://developer.ubuntu.com/core/get-started/raspberry-pi-2-3



Steps to create SDCard for boot

  • Decompress the downloaded image file ubuntu-core-16-pi2.img.xz

$ cd [location of downloaded image file]

$ unxz ubuntu-core-16-pi2.img.xz

$ ls

ubuntu-core-16-pi2.img
  • Copy the image to the SDCard using the `dd` utility

Note: In this example the SDCard is located at `/dev/sdb`; change this to suit your device path.

e.g. USB devices and some SDCard readers show up as a `/dev/sdX` device:

$ sudo dd status=progress bs=1M if=ubuntu-core-16-pi2.img of=/dev/sdb

or

Most SDCard readers show up as a `/dev/mmcblkN` device:

$ sudo dd status=progress bs=1M if=ubuntu-core-16-pi2.img of=/dev/mmcblk0



Note: These instructions place the whole image on the SD card (using the SD card for boot and a USB stick for root is covered further down). Instead of `dd` you can also use an imaging tool of your choice, such as Etcher (https://etcher.io/).

Once the image has been copied to the SDCard, unmount and remove it, then insert it into the Raspberry Pi 2.

Also note that `dd` defaults to a block size of 512 bytes. It's been suggested to use a block size of 1 MB when imaging the SD card for performance reasons, which is why the examples above use `bs=1M`.
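Before pulling the card it can be worth checking that the write actually matches the image. The following is a minimal sketch, not part of the original build: `verify_image` is a hypothetical helper name, and it assumes GNU `stat` and `cmp` (as shipped on Ubuntu). It compares the image against the first bytes of the device, up to the image's own size.

```shell
#!/bin/sh
# Compare an image file against the first N bytes of a device (or file),
# where N is the size of the image. Returns 0 when they match, 1 when they
# differ. Assumes GNU coreutils/diffutils (stat -c, cmp -n).
verify_image() {
    img="$1"
    target="$2"
    size=$(stat -c %s "$img") || return 2
    cmp -n "$size" "$img" "$target"
}
```

From a root shell (reading the raw device needs root), something like `sync; verify_image ubuntu-core-16-pi2.img /dev/sdb` should return silently if the copy is good.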

  • Apply power to boot the Raspberry Pi (2) (This may take a couple of minutes on first boot to come up)

Note: Another suggestion is that with no ethernet loopback plug or an active LAN cable connected it can take up to 5 minutes to boot - Buy a loopback RJ45 plug or plug/terminate an ethernet cable into a switch to avoid a delayed startup

The system will boot then become ready to configure. The device will display the prompt:

Press Enter to Configure

Press enter then select “Start” to begin configuring your network and an administrator account. Follow the instructions on the screen; you will be asked to configure your network and enter your Ubuntu SSO credentials. (Complete the steps to finish the OS installation.)

Note: There is no default ubuntu user on these images, but you can run `sudo passwd <account name>` to set a password in case you need a local console login. Don't forget you also need an Ubuntu SSO account with SSH key details. Ubuntu SSO Account → https://developer.ubuntu.com/core/get-started/raspberry-pi-2-3



Driver installation

Wifi Driver install Steps

(For the TP Link TL-WN725N wireless adapter) Open a terminal and run the following (the driver has already been downloaded and is located in the home directory of the ubuntu user on the image):

$ sudo apt-get update
$ sudo apt-get install linux-headers-$(uname -r)
$ sudo apt-get update
$ sudo apt-get install build-essential
$ sudo apt-get install git
$ sudo apt-get install wpasupplicant
$ git clone https://github.com/lwfinger/rtl8188eu
$ cd rtl8188eu
$ sudo su
$ make all
$ make install
$ insmod 8188eu.ko



Wifi WPA security and general network configuration

Edit the `50-cloud-init.cfg` configuration file:

/etc/network/interfaces.d/50-cloud-init.cfg
  • Configure the onboard ethernet to use a static IP address and add the `wlan0` options to the file as per the example below. (This adds the wifi device interface to the network stack. The lines marked ← THIS are the ones to add or change; don't type the marker itself.)

$ sudo vi /etc/network/interfaces.d/50-cloud-init.cfg
auto lo
iface lo inet loopback
auto eth0
allow-hotplug eth0
iface eth0 inet static ← THIS
address 10.10.10.101 ← THIS
netmask 255.255.255.0 ← THIS
auto wlan0 ← THIS
allow-hotplug wlan0 ← THIS
iface wlan0 inet dhcp ← THIS
wpa-conf /etc/wpa_supplicant/wpa_supplicant.conf ← THIS
  • Edit `/etc/wpa_supplicant/wpa_supplicant.conf` so it looks similar to the example below for a standard wifi network running WPA2-PSK. These are your wifi security details. (The details below are an example only. The commented-out `#psk` line is the plain-text passphrase and the second `psk` line is its hashed form, as generated by `wpa_passphrase`.)

$ sudo vi /etc/wpa_supplicant/wpa_supplicant.conf

ctrl_interface=DIR=/var/run/
update_config=1
network={
ssid="AP"
#psk="12345abcde"
psk=4568b5b43754168001c8f1935955c814581cee05f91733054c407f5272ed4587
}

Note: More on how to configure wpa_supplicant, scanning local networks and configuring local AP scan options in the following link → https://shapeshed.com/linux-wifi/

  • Enable the wifi adapter (wlan0 is my wifi network adapter name, yours may differ)

$ sudo ifup wlan0

ifup: interface wlan0 is configured
  • Add the wifi driver to the modules list so it is loaded at boot time

  • Edit `/etc/modules`, add `8188eu` (the name of the module built above, `8188eu.ko`) to the bottom of the file and save.

$ sudo vi /etc/modules

# /etc/modules: kernel modules to load at boot time.
#
# This file contains the names of kernel modules that should be loaded
# at boot time, one per line. Lines beginning with "#" are ignored.
8188eu



Pre-requisites for DataStax Enterprise installation

Pre-requisites

● Oracle Java 8

● Credentials to download DataStax Enterprise (currently at version DSE 5.1.1 for this install)

● Python 2.7.x

Procedure for installing Oracle Java 8 via `apt`. See the following link for additional options → http://www.webupd8.org/2012/09/install-oracle-java-8-in-ubuntu-via-ppa.html

$ sudo add-apt-repository ppa:webupd8team/java

$ sudo apt-get update

$ sudo apt-get install oracle-java8-installer

               (Accept the Oracle Java 8 license agreement)

$ sudo apt-get install oracle-java8-set-default



Installing DataStax Enterprise 5.1.1

See the following doc link for installation instructions to download and install the DataStax Enterprise binary tarball: link → http://docs.datastax.com/en/dse/5.1/dse-admin/datastax_enterprise/instal...

Steps:

  • Download the binary tarball using curl

Note: You will need your DataStax Academy login credentials:

$ curl --user dsa_email_address:password -L https://downloads.datastax.com/enterprise/dse.tar.gz | tar xz

  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100   245  100   245    0     0    117      0  0:00:02  0:00:02 --:--:--   117
100  539M  100  539M    0     0  2603k      0  0:03:32  0:03:32 --:--:--  2879k



Configuration of Datastax Enterprise 5.1.1

  • Make the following directories:

$ sudo mkdir -p /var/lib/cassandra; sudo chown -R $USER:$GROUP /var/lib/cassandra

$ sudo mkdir -p /var/log/cassandra; sudo chown -R $USER:$GROUP /var/log/cassandra

$ sudo mkdir -p /var/lib/dsefs; sudo chown -R $USER:$GROUP /var/lib/dsefs

$ sudo mkdir -p /var/lib/spark; sudo chown -R $USER:$GROUP /var/lib/spark

$ sudo mkdir -p /var/log/spark; sudo chown -R $USER:$GROUP /var/log/spark
  • cd into the DataStax Enterprise 5.1.1 directory and configure DSE

  • Edit the `cassandra.yaml` (this is the minimal configuration required to get DSE running in a pure Cassandra environment)

e.g. My node configuration.

/home/ubuntu/dse-5.1.1 $ vi resources/cassandra/conf/cassandra.yaml

cluster_name: 'Datastax AP pie Cluster'

num_tokens: 16

listen_address: 10.10.10.101

rpc_address: 10.10.10.101

# under seed_provider → parameters
- seeds: "10.10.10.101,10.10.10.102"

Note: See additional tuning advice further down

  • Starting dse (cassandra mode)

/home/ubuntu/dse-5.1.1 $ bin/dse cassandra
  • Stopping dse (cassandra mode)

/home/ubuntu/dse-5.1.1 $ bin/dse cassandra-stop

Note: If creating a multi-DC environment, make sure you set the correct snitch (`endpoint_snitch`) in your cassandra.yaml file. This installation uses GossipingPropertyFileSnitch (GPFS).

Here is a link to our docs on how to add another DC when adding the final 3 nodes → https://docs.datastax.com/en/cassandra/3.0/cassandra/operations/opsAddDC...
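As a rough sketch of what the multi-DC step involves with GossipingPropertyFileSnitch: each node declares its datacenter and rack in `cassandra-rackdc.properties`, and the keyspaces are then switched to `NetworkTopologyStrategy` across both DCs. The helper names, the DC/rack names (`DC1`, `DC2`, `RAC1`) and the replication factor of 3 per DC are assumptions for illustration, not values taken from this build.

```shell
#!/bin/sh
# Write cassandra-rackdc.properties (read by GossipingPropertyFileSnitch)
# into the given conf directory. Hypothetical helper; arguments are
# conf directory, datacenter name, rack name.
write_rackdc() {
    conf_dir="$1"
    dc="$2"
    rack="$3"
    printf 'dc=%s\nrack=%s\n' "$dc" "$rack" > "$conf_dir/cassandra-rackdc.properties"
}

# Print the CQL that replicates a keyspace across two datacenters.
# Arguments: keyspace, first DC name and RF, second DC name and RF.
replication_cql() {
    printf "ALTER KEYSPACE %s WITH replication = {'class': 'NetworkTopologyStrategy', '%s': %s, '%s': %s};\n" \
        "$1" "$2" "$3" "$4" "$5"
}
```

On a node in the first DC this might be used as `write_rackdc resources/cassandra/conf DC1 RAC1`, and once the second DC is up, `bin/cqlsh 10.10.10.101 -e "$(replication_cql keyspace1 DC1 3 DC2 3)"` (`keyspace1` being the keyspace cassandra-stress creates by default).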



Moving the ROOT partition to USB disk:

(In this example the USB stick's partition is `/dev/sda1`.)

$ sudo apt-get update && sudo apt-get install rsync gdisk

$ sudo mke2fs -t ext4 -L rootfs /dev/sda1

$ sudo mount /dev/sda1 /mnt

$ sudo rsync -axv / /mnt

$ sudo cp /boot/firmware/cmdline.txt /boot/firmware/cmdline.txt.bkp

$ sudo vi /boot/firmware/cmdline.txt

Change `root=/dev/mmcblk0p2` to `root=/dev/sda1`

Save changes and reboot

(Please note the device IDs apparently changed after reboot, so re-compiling the wifi driver and loading it may be required after moving the partition)
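If you are repeating the `cmdline.txt` change on six Pi's, the backup-and-edit step can be scripted rather than done by hand in vi. This is a minimal sketch assuming GNU `sed`; `switch_root_device` is a hypothetical helper name, and it only rewrites the `root=` token, leaving the rest of the kernel command line alone.

```shell
#!/bin/sh
# Point the kernel at a new root device by rewriting the root=... token
# in a cmdline.txt file. A .bkp copy of the original is kept next to it.
# Assumes GNU sed (-i, -E).
switch_root_device() {
    cmdline="$1"
    new_root="$2"
    cp "$cmdline" "$cmdline.bkp" || return 1
    # Match root= only at the start of the line or after a space, so
    # options such as rootfstype= or rootwait are left untouched.
    sed -i -E "s#(^| )root=[^ ]*#\1root=${new_root}#" "$cmdline"
}
```

Run as root, for example `switch_root_device /boot/firmware/cmdline.txt /dev/sda1`, then check the file before rebooting.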



OS/DSE/Cassandra Tuning:



Operating system tweaks:

On the Raspberry Pi 2 / 3 you can modify some of the Pi-related settings to get a performance boost. We will discuss the Pi 2 tunings here.

  • Under the `/boot/firmware` directory there are two files, `cmdline.txt` (kernel command line) and `config.txt` (firmware settings). Edit `config.txt` and increase the CPU frequency (`arm_freq`) from the default of `700` to `850`, or up to `1000` (1GHz)

  • Change the default I/O scheduler to `noop`: edit `cmdline.txt` under `/boot/firmware` and modify `elevator=deadline` to `elevator=noop`

  • It's also worth setting the CPU governor to `performance` from `ondemand`.
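One way to switch the governor at runtime is through the cpufreq files in sysfs. The sketch below is an assumption about how you might script it rather than what was run on this cluster: `set_cpu_governor` is a hypothetical helper, the change does not survive a reboot, and writing to the real sysfs tree needs root.

```shell
#!/bin/sh
# Write a governor name to every CPU's scaling_governor file.
# The second argument defaults to the real sysfs tree; pointing it at a
# copy of the directory layout allows a dry run.
set_cpu_governor() {
    governor="$1"
    sys_root="${2:-/sys/devices/system/cpu}"
    for f in "$sys_root"/cpu[0-9]*/cpufreq/scaling_governor; do
        # Skip silently when cpufreq is not available for a CPU.
        [ -e "$f" ] || continue
        printf '%s\n' "$governor" > "$f"
    done
}
```

From a root shell, `set_cpu_governor performance` applies it to all cores; `cat /sys/devices/system/cpu/cpu0/cpufreq/scaling_governor` shows the current value.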

Here are some other cpu/gpu/overvoltage tweaks you can set but use at your own risk:

/boot/config.txt:
arm_freq=800

high hack:
/boot/config.txt:
arm_freq=900
gpu_freq=250
sdram_freq=500

extreme hack:
/boot/config.txt 
arm_freq=1000
core_freq=500
sdram_freq=500
over_voltage=6

 

About LXDM (Display Manager - Desktop GUI)

 

I installed the LXDM display manager (X Windows GUI) to simplify some of the setup. Once done, I suggest disabling the GUI to conserve resources by commenting out the lxdm display manager so it doesn't start when the OS starts.

Note: In my installation I downloaded LXDM (desktop GUI) via apt (it was a lightweight display manager available on the ARM arch)

$ sudo apt install lxdm

To shut down the xserver run:

$ sudo service lxdm stop

To start the xserver again, make sure you remove the comment in `/etc/X11/default-display-manager` for `/usr/bin/lxdm` (the opposite of the disable step below)

$ sudo service lxdm start



Disable the xserver from starting:

  • Edit `/etc/X11/default-display-manager` and comment out `/usr/bin/lxdm` so the lxdm display manager doesn't start when the OS starts.

$ sudo vi /etc/X11/default-display-manager

#/usr/bin/lxdm



Cassandra specific tunings:

Since IO is limited on the Raspberry Pi's, increasing the read/write timeouts was needed to allow reads and writes to complete; otherwise timeouts occurred frequently.

It was also found that reducing the various throughput settings, turning off compression etc. reduced load on the Pi's, allowing them to be more performant.

  • Modify the following in the `cassandra.yaml`

/home/ubuntu/dse-5.1.1 $ vi resources/cassandra/conf/cassandra.yaml

read/write concurrency:

concurrent_reads: 8
concurrent_writes: 8
concurrent_counter_writes: 8
concurrent_materialized_view_writes: 8

read/write timeouts:

read_request_timeout_in_ms: 50000
range_request_timeout_in_ms: 100000
write_request_timeout_in_ms: 20000
counter_write_request_timeout_in_ms: 50000
cas_contention_timeout_in_ms: 10000
truncate_request_timeout_in_ms: 600000
request_timeout_in_ms: 100000

throughputs:

compaction_throughput_mb_per_sec: 8
stream_throughput_outbound_megabits_per_sec: 50
inter_dc_stream_throughput_outbound_megabits_per_sec: 50
internode_compression: none

C*/DSE USB/SDcard tweak to help with slow IO (Change from SSD to spinning):

# The strategy for optimizing disk read
# Possible values are:
# ssd (for solid state disks, the default)
# spinning (for spinning disks)
disk_optimization_strategy: spinning
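With six nodes to keep in step, it may be easier to apply these values with a small script than to edit each `cassandra.yaml` by hand. The following is a minimal sketch assuming GNU `sed`: `set_yaml_key` and `apply_pi_tunings` are hypothetical helper names, only simple one-line top-level settings are handled, and only a handful of the values listed above are shown.

```shell
#!/bin/sh
# Set a top-level "key: value" line in a YAML file.
# If an active (uncommented) line for the key exists it is rewritten;
# otherwise a new line is appended. Values must not contain "|" or "&".
set_yaml_key() {
    file="$1"
    key="$2"
    value="$3"
    if grep -q "^${key}:" "$file"; then
        sed -i "s|^${key}:.*|${key}: ${value}|" "$file"
    else
        printf '%s: %s\n' "$key" "$value" >> "$file"
    fi
}

# Apply a subset of the Pi tunings from this post to a cassandra.yaml.
apply_pi_tunings() {
    yaml="$1"
    set_yaml_key "$yaml" concurrent_reads 8
    set_yaml_key "$yaml" concurrent_writes 8
    set_yaml_key "$yaml" write_request_timeout_in_ms 20000
    set_yaml_key "$yaml" compaction_throughput_mb_per_sec 8
    set_yaml_key "$yaml" disk_optimization_strategy spinning
}
```

For example, `apply_pi_tunings resources/cassandra/conf/cassandra.yaml` from the DSE directory, followed by a restart of the node. Keep a copy of the original file first.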



DSE specific tunings:

  • In the `dse.yaml` turn off the `cql_slow_log`.

  • Uncomment `cql_slow_log_options` and set `enabled` to `false`

cql_slow_log_options:
    enabled: false

Setting the tuning options above will help with performance and can be safely completed once the initial OS build and install of DSE has been achieved. This information was accurate at the time of writing and any referred information or supplied links aren't the responsibility of DataStax. I hope this blog helped anyone who wanted to bring up a Raspberry Pi cluster running C*/DSE.

A quick mention about the Raspberry Pi 3: the setup is a little different from what I experienced on the Pi 2, but the steps I provided should be mostly transferable to the Pi 3. Having onboard wifi with the drivers already bundled into the core OS speeds up the install process. That, and Raspbian seemed a little easier to configure and performed better than Ubuntu Core.