Creating a Kubernetes Cluster on Bare Metal Servers
This guide, together with a ready-made script, sets up a basic Kubernetes cluster that connects to NetBook.
Note: This guide is for Ubuntu-based systems.
Option 1: Quick guide
If you want to do this with a simple script, follow the steps below:
export IPS=('ip1' 'ip2' 'ip3')   # IP addresses of the nodes on which you want to install the Kubernetes cluster
git clone https://github.com/netbookai/kubespray.git -b nvidia-gpu-support
cd kubespray
bash multi-gpu-node.sh
Option 2: Step-by-step guide
Prerequisites Client:
Note: All these steps should be run from one of the nodes that you want in your Kubernetes cluster.
Git
Python 3.8-3.10
sudo apt update
sudo apt install software-properties-common
sudo add-apt-repository ppa:deadsnakes/ppa
sudo apt install python3.8
Verify the installation:
python3 --version
Ansible: see the installation guide (kubespray/ansible.md at master · netbookai/kubespray). If this doesn't work, try:
sudo apt install ansible
openssh-client
sudo apt-get install openssh-client
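Because the client needs Python 3.8-3.10, it can help to check the interpreter version before continuing. The following is a minimal sketch; the function name python_version_supported is hypothetical and is not part of kubespray.

```shell
# Hypothetical helper: succeeds only for Python 3.8 through 3.10.
# $1 is a version string such as "3.9.7".
python_version_supported() {
    major=${1%%.*}      # text before the first dot
    rest=${1#*.}        # text after the first dot
    minor=${rest%%.*}   # text between the first and second dot
    [ "$major" = "3" ] &&
        [ "$minor" -ge 8 ] 2>/dev/null &&
        [ "$minor" -le 10 ] 2>/dev/null
}

# Example call (commented out so that sourcing this file has no side effects):
# python_version_supported "$(python3 --version | awk '{print $2}')" \
#     || echo "Unsupported Python version" >&2
```

The check compares only the major and minor numbers, so any patch release of a supported minor version passes.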
Prerequisites Server:
openssh-server, openssh-client
Installing CUDA drivers on GPU
Make sure SSH access is configured first.
To configure it, run:
ssh-keygen
Copy the public key and add it to the authorized_keys file:
cat /home/***/.ssh/id_rsa.pub   # copy the terminal output
Open the file below and paste the key:
vim /home/***/.ssh/authorized_keys
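The copy-and-paste step above can also be scripted. This is a minimal sketch, assuming the public key and the authorized_keys file are both reachable as local paths; the function name add_authorized_key is hypothetical. It appends the key only if it is not already present, so it is safe to run more than once.

```shell
# Hypothetical helper: append a public key to an authorized_keys file
# unless the exact same line is already there.
# $1 = path to the public key file, $2 = path to the authorized_keys file
add_authorized_key() {
    pubkey_file=$1
    auth_file=$2
    auth_dir=$(dirname "$auth_file")

    # sshd expects restrictive permissions on the directory and the file.
    mkdir -p "$auth_dir"
    chmod 700 "$auth_dir"
    touch "$auth_file"
    chmod 600 "$auth_file"

    key=$(cat "$pubkey_file")
    # -F: fixed string, -x: match the whole line, -q: no output
    if ! grep -qxF -- "$key" "$auth_file"; then
        printf '%s\n' "$key" >> "$auth_file"
    fi
}

# Example call (paths are placeholders):
# add_authorized_key "$HOME/.ssh/id_rsa.pub" "$HOME/.ssh/authorized_keys"
```

For the other nodes, the ssh-copy-id tool that ships with OpenSSH performs the same append over SSH.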
Proceed with the installation steps
Note: If you are
git clone https://github.com/netbookai/kubespray.git -b nvidia-gpu-support
cd kubespray
pip3 install -r requirements-2.12.txt
cp -rfp inventory/sample inventory/mycluster
declare -a IPS=(172.31.12.239 172.31.12.250 172.31.12.235)   # Replace with the IPs of the nodes you want in the cluster
CONFIG_FILE=inventory/mycluster/hosts.yaml python3 contrib/inventory_builder/inventory.py ${IPS[@]}
ansible-playbook -i inventory/mycluster/hosts.yaml --become --become-user=root pre.yaml
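Bash array elements are separated by whitespace, so a mistyped IP list (for example one joined by commas) ends up as a single malformed entry in the inventory. The following is a minimal sketch of a pre-check that could be run before the inventory builder; the function names is_ipv4 and validate_ips are hypothetical.

```shell
# Hypothetical helper: succeeds if $1 is a dotted-quad IPv4 address.
is_ipv4() {
    # Four dot-separated groups of one to three digits, nothing else.
    printf '%s\n' "$1" | grep -Eq '^([0-9]{1,3}\.){3}[0-9]{1,3}$' || return 1
    # Split on dots and check that every octet is at most 255.
    old_ifs=$IFS
    IFS=.
    set -- $1
    IFS=$old_ifs
    for octet in "$@"; do
        [ "$octet" -le 255 ] || return 1
    done
    return 0
}

# Hypothetical helper: report every argument that is not a valid IPv4 address.
# Returns non-zero if at least one argument is invalid.
validate_ips() {
    bad=0
    for ip in "$@"; do
        if ! is_ipv4 "$ip"; then
            echo "invalid IP address: $ip" >&2
            bad=1
        fi
    done
    return "$bad"
}

# Example call, using the IPS array declared earlier:
# validate_ips "${IPS[@]}" || exit 1
```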
Setting up the K8s cluster
cd kubespray
ansible-playbook -u <remote_user> -i inventory/mycluster/hosts.yaml --become --become-user=root cluster.yml
Post K8s setup
ansible-playbook -u <remote_user> -i inventory/mycluster/hosts.yaml --become --become-user=root post.yaml
Getting the kubeconfig of the cluster
SSH into one of the servers; the config is located at /etc/kubernetes/admin.conf.
Copy its contents to ~/.kube/config.
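The copy step can be scripted as follows. This is a minimal sketch, assuming the admin config has already been fetched to a path your user can read (on the server the file is usually readable only by root, so sudo may be needed there). The function name install_kubeconfig is hypothetical. An existing config is kept as a .bak file rather than silently overwritten.

```shell
# Hypothetical helper: install a kubeconfig file for the current user.
# $1 = source file, $2 = destination (defaults to ~/.kube/config)
install_kubeconfig() {
    src=$1
    dest=${2:-$HOME/.kube/config}

    mkdir -p "$(dirname "$dest")"
    # Keep a backup of any existing config before replacing it.
    if [ -f "$dest" ]; then
        cp "$dest" "$dest.bak"
    fi
    cp "$src" "$dest"
    # The file contains cluster credentials, so restrict access to the owner.
    chmod 600 "$dest"
}

# Example call (the source path is a placeholder):
# install_kubeconfig /tmp/admin.conf
```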
Installing the NVIDIA device plugin
Prerequisites:
Helm: install Helm by following https://helm.sh/docs/intro/install/
kubectl create ns netbook
helm repo add nvdp https://nvidia.github.io/k8s-device-plugin && helm repo update
helm install --generate-name nvdp/nvidia-device-plugin -n netbook
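Once the device plugin is running, GPU nodes normally advertise an nvidia.com/gpu resource. The following is a minimal sketch for checking this from the output of kubectl describe node; the function name gpu_count_from_describe is hypothetical, and the parsing assumes the usual "nvidia.com/gpu:  N" line under Capacity.

```shell
# Hypothetical helper: read `kubectl describe node` output on stdin and
# print the first nvidia.com/gpu count found, or 0 if the node has none.
gpu_count_from_describe() {
    awk '
        $1 == "nvidia.com/gpu:" { print $2; found = 1; exit }
        END { if (!found) print 0 }
    '
}

# Example call (<node_name> is a placeholder):
# kubectl describe node <node_name> | gpu_count_from_describe
```

A result of 0 on a GPU node suggests that the plugin pod is not running there yet or that the drivers are not installed.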
Installing DCGM and sending logs to NetBook
cd kubespray
kubectl apply -f gpu/
After this, you should be able to see your node and cluster metrics on NetBook's platform.
To validate that everything is working, run the command below:
nvidia-smi
You should see GPU usage in the output.
To validate that Kubernetes is installed and the NetBook exporter is working, run:
watch kubectl get po -A
You should see the DCGM exporter pods running in the netbook namespace.
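Instead of watching the pod list by eye, the check can be scripted. This is a minimal sketch, assuming the default column layout of kubectl get po with --no-headers (NAME, READY, STATUS, ...); the function name pods_not_running is hypothetical, and the pod names in the comment are made up.

```shell
# Hypothetical helper: read `kubectl get po -n <namespace> --no-headers`
# output on stdin and print the name of every pod whose STATUS is neither
# Running nor Completed. Prints nothing if all pods are healthy.
pods_not_running() {
    awk 'NF >= 3 && $3 != "Running" && $3 != "Completed" { print $1 }'
}

# Example call:
# kubectl get po -n netbook --no-headers | pods_not_running
#
# Example input line (made-up pod name):
# dcgm-exporter-abc12   1/1   Running   0   5m
```

Empty output means every pod in the namespace reported Running or Completed at the time of the call.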
For issues, reach out at support@netbook.ai