Cluster Guide
The following are user-defined cluster command functions that facilitate communication across the cluster network. Many thanks to Andrew and his guide for the idea.
All of these functions are defined in each Pi's ~/.bashrc file. Note that some of them depend on having the IP addresses mapped to each hostname in the /etc/hosts file.
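For reference, the relevant /etc/hosts entries might look like the following (the addresses match the usage examples later in this guide; substitute your own network's addresses and hostnames):
# cluster entries appended to /etc/hosts on every Pi (example addresses)
192.168.0.100   master
192.168.0.101   worker1
192.168.0.102   worker2
192.168.0.103   worker3
192.168.0.104   worker4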
Before we proceed, let's back up the current ~/.bashrc file on every Pi, just in case
cp ~/.bashrc ~/.bashrc.bak
If you mess up the file, you can always revert your changes using
cp ~/.bashrc.bak ~/.bashrc
source ~/.bashrc
~/.bashrc
Download our cluster_commands.sh script to a hidden ~/.scripts directory
mkdir -p ~/.scripts
cd ~/.scripts
wget https://raw.githubusercontent.com/gitluis/pi-cluster/master/scripts/cluster_commands.sh
Edit the ~/.bashrc file to include the cluster_commands.sh file
sudo nano ~/.bashrc
Append the following lines at the bottom
...
# including pi-cluster commands
source ~/.scripts/cluster_commands.sh
Source the file for the changes to take effect
source ~/.bashrc
Copy over the ~/.bashrc file to all workers
cluster-scp ~/.bashrc
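Note that the updated ~/.bashrc sources ~/.scripts/cluster_commands.sh, so the workers need that script as well. One way to copy it over from the master is with the cluster-scpr command explained later in this guide:
cluster-scpr ~/.scripts/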
Note: For these commands to work, you must have mapped each Raspberry Pi's IP address to its hostname as specified in The Pi Network.
cluster_commands.sh
# ------------------- #
# Pi-Cluster Commands #
# ------------------- #
# define cluster user
USER="pi"
# return the common user across all nodes in the cluster
function cluster-user {
echo $USER
}
# return the IP address and mapped hostname of each node in the cluster
function nodes {
grep "master" /etc/hosts | awk '{print $0}' && \
grep "worker" /etc/hosts | awk '{print $0}'
}
# return which nodes are currently up and running
function nodes-up {
echo "Pinging master..."
ping master -c 1 | grep "packet loss"
for worker in $(workers)
do
echo "Pinging $worker..."
ping $worker -c 1 | grep "packet loss"
done
}
# return only the hostname of each worker in the cluster
function workers {
grep "worker" /etc/hosts | awk '{print $2}'
}
# send a command to be executed by all workers in the cluster
function master-cmd {
for worker in $(workers)
do
echo "Executing '$@' in $worker as $USER..."
ssh $USER@$worker "$@"
done
}
# send a command to be executed by all nodes in the cluster
function cluster-cmd {
master-cmd $@
echo "Executing '$@' in master as $USER..."
$@
}
# reboot all nodes in the cluster at the same time
function cluster-reboot {
cluster-cmd sudo shutdown -r now
}
# shutdown all nodes in the cluster at the same time
function cluster-shutdown {
cluster-cmd sudo shutdown now
}
# transfer files from one node to all other nodes
function cluster-scp {
for worker in $(workers)
do
echo "Executing 'scp $1 $USER@$worker:$2' in $worker as $USER..."
scp $1 $USER@$worker:$2
done
}
# transfer directories from one node to all other nodes
function cluster-scpr {
for worker in $(workers)
do
echo "Executing 'scp -r $1 $USER@$worker:$2' in $worker as $USER..."
scp -r $1 $USER@$worker:$2
done
}
Define a common user across all nodes in the cluster.
# define cluster user
USER="pi"
cluster-user command explained
Return the common user across all nodes in the cluster.
Usage example
$ cluster-user
pi
nodes command explained
Return the IP address and mapped hostname of each node in the cluster.
Usage example
$ nodes
192.168.0.100 master
192.168.0.101 worker1
192.168.0.102 worker2
192.168.0.103 worker3
192.168.0.104 worker4
nodes-up command explained
Return which nodes are currently up and running.
Usage example
$ nodes-up
Pinging master...
1 packets transmitted, 1 received, 0% packet loss, time 0ms
Pinging worker1...
1 packets transmitted, 1 received, 0% packet loss, time 0ms
Pinging worker2...
1 packets transmitted, 1 received, 0% packet loss, time 0ms
Pinging worker3...
1 packets transmitted, 1 received, 0% packet loss, time 0ms
Pinging worker4...
1 packets transmitted, 1 received, 0% packet loss, time 0ms
A 0% packet loss indicates that the node is up and running. If a node is down, it will display 100% packet loss, meaning none of the packets sent were received.
workers command explained
Return only the hostname of each worker in the cluster.
Usage example
$ workers
worker1
worker2
worker3
worker4
master-cmd command explained
Send a command to be executed by all workers in the cluster.
Usage example
$ master-cmd date
Executing 'date' in worker1 as pi...
Wed 11 Mar 2020 08:31:47 PM AST
Executing 'date' in worker2 as pi...
Wed 11 Mar 2020 08:31:47 PM AST
Executing 'date' in worker3 as pi...
Wed 11 Mar 2020 08:31:48 PM AST
Executing 'date' in worker4 as pi...
Wed 11 Mar 2020 08:31:48 PM AST
cluster-cmd command explained
Send a command to be executed by all nodes in the cluster.
The difference between master-cmd and cluster-cmd is that sometimes you want the workers to do something but not the master; for that case, use master-cmd. When you want all nodes in the cluster to do something, use cluster-cmd instead.
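For instance, updating package lists is something you may want done on the workers only, or on every node including the master (the commands below are illustrative; any shell command works):
# update package lists on the workers only (illustrative example)
master-cmd "sudo apt update"
# update package lists on every node, including the master (illustrative example)
cluster-cmd "sudo apt update"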
Usage example
$ cluster-cmd date
Executing 'date' in worker1 as pi...
Wed 11 Mar 2020 10:38:56 PM AST
Executing 'date' in worker2 as pi...
Wed 11 Mar 2020 08:32:11 PM AST
Executing 'date' in worker3 as pi...
Wed 11 Mar 2020 06:22:40 PM AST
Executing 'date' in worker4 as pi...
Wed 11 Mar 2020 09:47:10 PM AST
Executing 'date' in master as pi...
Wed 11 Mar 2020 12:12:58 PM AST
Note: The reason $@ comes after master-cmd $@ is so that the command is executed on the workers first and then on the master node. If a reboot or shutdown command were executed on the master node first, the master would be halted before it could tell the workers what they were supposed to do, rebooting/shutting down the master but not the workers.
date sync
As you may have seen, each Pi's date and time may not be synchronized. It may not be a problem now, but for those who intend to create "crontab" jobs, which are nothing more than scheduled jobs or tasks, it could be a critical issue.
First, we need to make sure htpdate is installed on all of our Pi boards
cluster-cmd "sudo apt install htpdate -y > /dev/null 2>&1"
That will align the nodes within +/- 10 seconds.
A more reliable and robust way is to have them synchronized using a remote server such as time.nist.gov
cluster-cmd sudo htpdate -a -l time.nist.gov
This may take a few minutes to finish, but it will sync all clocks. Being off by less than 5 seconds is good enough.
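To confirm the clocks are now aligned, you can re-run the earlier example and compare the timestamps:
cluster-cmd date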
cluster-reboot command explained
Reboot all nodes in the cluster at the same time.
Usage example
cluster-reboot
cluster-shutdown command explained
Shutdown all nodes in the cluster at the same time.
Usage example
cluster-shutdown
cluster-scp command explained
Transfer (secure copy) files from one node to all other nodes.
Usage example
$ cluster-scp ~/.bashrc
Executing 'scp /home/pi/.bashrc pi@worker1:' in worker1 as pi...
.bashrc 100% 3594 2.3MB/s 00:00
Executing 'scp /home/pi/.bashrc pi@worker2:' in worker2 as pi...
.bashrc 100% 3594 2.1MB/s 00:00
Executing 'scp /home/pi/.bashrc pi@worker3:' in worker3 as pi...
.bashrc 100% 3594 2.5MB/s 00:00
Executing 'scp /home/pi/.bashrc pi@worker4:' in worker4 as pi...
.bashrc 100% 3594 2.2MB/s 00:00
This overwrites the /home/pi/.bashrc file at the same path on all workers.
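cluster-scp also accepts an optional second argument, a destination path on the workers; for example (the file and directory names below are purely illustrative):
# copy a file into a specific directory on every worker (illustrative paths;
# assumes /home/pi/data/ already exists on each worker)
cluster-scp ~/myfile.txt /home/pi/data/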
cluster-scpr command explained
Transfer (secure copy) directories from one node to all other nodes.
Usage example
$ cluster-scpr ~/.scripts/
Executing 'scp -r /home/pi/.scripts/ pi@worker1:' in worker1 as pi...
cluster_commands.sh 100% 1805 1.5MB/s 00:00
Executing 'scp -r /home/pi/.scripts/ pi@worker2:' in worker2 as pi...
cluster_commands.sh 100% 1805 1.7MB/s 00:00
Executing 'scp -r /home/pi/.scripts/ pi@worker3:' in worker3 as pi...
cluster_commands.sh 100% 1805 1.7MB/s 00:00
Executing 'scp -r /home/pi/.scripts/ pi@worker4:' in worker4 as pi...
cluster_commands.sh 100% 1805 1.1MB/s 00:00
This overwrites the /home/pi/.scripts directory and its contents at the same path on all workers.
Note: You must ensure that the trailing / is included for all of the directory's contents to be transferred over.