Creating Kubernetes Cluster on Bare Metal Servers

This guide, together with a ready-made script, sets up a basic Kubernetes cluster and connects it to NetBook.

Note: This guide is for Ubuntu-based systems.

Option 1: Quick guide

To set everything up with a single script, follow the steps below.

export IPS=('ip1' 'ip2' 'ip3')   # IP addresses of the nodes to include in the cluster

git clone -b nvidia-gpu-support https://github.com/netbookai/kubespray

cd kubespray

bash multi-gpu-node.sh

Option 2: Step-by-step guide

Prerequisites Client:

Note: Run all of these steps from one of the nodes that you want in your Kubernetes cluster.

  • Git

  • Python 3.8-3.10

sudo apt update
sudo apt install software-properties-common
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt install python3.8
Verify the installation:

python3 --version

  • Ansible

sudo apt install ansible
  • openssh-client

sudo apt-get install openssh-client
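
The client prerequisites above can be checked in one pass before moving on. This is a minimal sketch and not part of the NetBook scripts: the function names python_version_ok and check_prereqs are made up for illustration, and the tool list simply mirrors the bullets above.

```shell
# python_version_ok VERSION: succeed if VERSION (e.g. "3.9.7") is within 3.8-3.10.
python_version_ok() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # text before the next dot
    [ "$major" -eq 3 ] 2>/dev/null &&
        [ "$minor" -ge 8 ] 2>/dev/null &&
        [ "$minor" -le 10 ]
}

# check_prereqs: print one line per missing tool or unsupported Python version.
# Returns non-zero if any problem was found.
check_prereqs() {
    rc=0
    for tool in git python3 ansible-playbook ssh; do
        if ! command -v "$tool" >/dev/null 2>&1; then
            echo "missing: $tool"
            rc=1
        fi
    done
    if command -v python3 >/dev/null 2>&1; then
        # "python3 --version" prints e.g. "Python 3.8.10"; keep the second word.
        version=$(python3 --version 2>&1 | awk '{print $2}')
        if ! python_version_ok "$version"; then
            echo "unsupported Python version: $version (need 3.8-3.10)"
            rc=1
        fi
    fi
    return "$rc"
}
```

Run check_prereqs on the node you will run Ansible from. It prints nothing and returns zero when all four tools are present and python3 is within the supported range.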

Prerequisites Server:

  • openssh-server, openssh-client

Installing CUDA drivers on GPU nodes

Make sure SSH access is configured first

To configure, run

ssh-keygen

Copy the public key and add it to the authorized_keys file.

Print the key and copy the terminal output:

cat /home/***/.ssh/id_rsa.pub

Open the file below and paste the key into it:

vim /home/***/.ssh/authorized_keys
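
The copy-and-paste step can also be scripted so that running it twice does not duplicate the key. This is a sketch under assumptions: add_authorized_key is a hypothetical helper, and the permissions it sets (700 on the directory, 600 on the file) are the usual OpenSSH expectations rather than anything specific to this guide.

```shell
# add_authorized_key PUBKEY_FILE AUTH_FILE
# Append the public key to AUTH_FILE unless that exact line is already present.
add_authorized_key() {
    pubkey_file=$1
    auth_file=$2
    auth_dir=$(dirname "$auth_file")

    mkdir -p "$auth_dir"
    chmod 700 "$auth_dir"
    touch "$auth_file"

    # -F fixed string, -x whole-line match, -q quiet, -f read the pattern from a file
    if ! grep -Fxq -f "$pubkey_file" "$auth_file"; then
        cat "$pubkey_file" >> "$auth_file"
    fi
    chmod 600 "$auth_file"
}
```

Call it with the same two paths used above, for example add_authorized_key ~/.ssh/id_rsa.pub ~/.ssh/authorized_keys. For a remote node, ssh-copy-id <remote_user>@<node-ip> performs the equivalent step over SSH.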

Proceed with the installation steps

Note: If you are

git clone https://github.com/netbookai/kubespray.git -b nvidia-gpu-support
cd kubespray
pip3 install -r requirements-2.12.txt 
cp -rfp inventory/sample inventory/mycluster
declare -a IPS=(172.31.12.239 172.31.12.250 172.31.12.235) # Replace with the IPs of the nodes you want in the cluster (space-separated)
CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}
ansible-playbook -i inventory/mycluster/hosts.yaml  --become --become-user=root pre.yaml
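
The inventory builder receives each node address as a separate command-line argument, so a typo in the list only shows up later as a confusing Ansible error. The sketch below checks the list first. It is illustrative only: is_ipv4 and check_ips are hypothetical helpers, and they assume every entry is a plain IPv4 address (no ranges or other formats).

```shell
# is_ipv4 ADDRESS: succeed only if ADDRESS is a dotted-quad IPv4 address.
is_ipv4() {
    # Four dot-separated groups of one to three digits.
    printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || return 1
    # Split on dots and make sure every octet is at most 255.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    for octet in "$@"; do
        [ "$octet" -le 255 ] || return 1
    done
    return 0
}

# check_ips ADDRESS...: fail on the first argument that is not a plain IPv4 address.
check_ips() {
    if [ "$#" -eq 0 ]; then
        echo "no addresses given" >&2
        return 1
    fi
    for ip in "$@"; do
        if ! is_ipv4 "$ip"; then
            echo "invalid IPv4 address: $ip" >&2
            return 1
        fi
    done
    echo "$# address(es) look valid"
}
```

In bash, run check_ips "${IPS[@]}" right after declaring the array. A list that was accidentally joined with commas arrives as a single argument and is rejected.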

Setting up k8s cluster

cd kubespray
ansible-playbook -u <remote_user> -i inventory/mycluster/hosts.yaml  --become --become-user=root cluster.yml 

Post K8s setup

ansible-playbook -u <remote_user> -i inventory/mycluster/hosts.yaml  --become --become-user=root post.yaml

Getting Kubeconfig of cluster

SSH into one of the servers (a control-plane node). The kubeconfig is at /etc/kubernetes/admin.conf; copy its contents to ~/.kube/config.
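
When kubectl runs on the same server, the copy can be done with a small helper instead of pasting by hand. This is a sketch, not the guide's official method: install_kubeconfig is a hypothetical function, and keeping a .bak copy of any existing config is an added precaution.

```shell
# install_kubeconfig [SOURCE] [DEST]
# Copy the admin kubeconfig into place with owner-only permissions.
# Defaults follow the paths named in this guide.
install_kubeconfig() {
    src=${1:-/etc/kubernetes/admin.conf}
    dest=${2:-$HOME/.kube/config}

    if [ ! -r "$src" ]; then
        echo "cannot read $src (it is usually readable by root only)" >&2
        return 1
    fi

    mkdir -p "$(dirname "$dest")"
    # Keep a backup rather than silently overwriting an existing config.
    if [ -f "$dest" ]; then
        cp "$dest" "$dest.bak"
    fi
    cp "$src" "$dest"
    chmod 600 "$dest"
}
```

Because admin.conf is normally owned by root, you may need to copy it with sudo and then change the owner of ~/.kube/config back to your own user before kubectl can read it.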

Installing Nvidia Plugin

Prerequisites:

Helm: install it by following https://helm.sh/docs/intro/install/

kubectl create ns netbook
helm repo add nvdp https://nvidia.github.io/k8s-device-plugin && helm repo update
helm install --generate-name nvdp/nvidia-device-plugin -n netbook
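
Once the plugin pods are up, each GPU node should advertise the nvidia.com/gpu resource. The sketch below filters a two-column node listing down to the nodes that report at least one GPU. gpu_nodes is a hypothetical helper, and the kubectl command in the comment is one possible way to produce its input.

```shell
# gpu_nodes: read "NAME GPU_COUNT" lines on stdin and print the nodes that
# advertise at least one GPU. Returns non-zero if no node does.
#
# Possible input (nodes without the resource show "<none>"):
#   kubectl get nodes --no-headers \
#     -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu'
gpu_nodes() {
    awk '
        $2 ~ /^[0-9]+$/ && $2 > 0 { print $1 ": " $2 " GPU(s)"; found = 1 }
        END { exit (found ? 0 : 1) }
    '
}
```

Pipe the kubectl output into gpu_nodes. If it prints nothing and fails, the device plugin has not registered any GPUs yet, or the drivers are missing on the nodes.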

Installing DCGM and sending logs to NetBook

cd kubespray
kubectl apply -f gpu/

Post this, you should be able to see your node and cluster metrics on NetBook's platform.

To validate that everything is working, run the commands below.

nvidia-smi

The output should list your GPUs and their current usage.

To validate that Kubernetes is installed and the NetBook exporter is working, run:

watch kubectl get po -A

You should see the DCGM exporter pods running in the netbook namespace.
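
For a one-shot check instead of watching the list, the pod status can be filtered with a short script. This is a sketch: pods_not_running is a hypothetical helper, and it assumes the default column layout of kubectl get pods for a single namespace (NAME, READY, STATUS, ...).

```shell
# pods_not_running: read `kubectl get pods -n netbook --no-headers` output on
# stdin and print every pod whose STATUS is neither Running nor Completed.
# Returns non-zero if such a pod exists or if there are no pods at all.
pods_not_running() {
    awk '
        $3 != "Running" && $3 != "Completed" { print $1 " is " $3; bad = 1 }
        END {
            if (NR == 0) { print "no pods found"; bad = 1 }
            exit (bad ? 1 : 0)
        }
    '
}
```

Use it as kubectl get pods -n netbook --no-headers | pods_not_running. No output and a zero exit status mean every pod in the namespace is Running or Completed.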

For issues, reach out at support@netbook.ai
