To create a Kubernetes master on Google Cloud and add on-premise instances with GPUs as nodes, follow these steps:
Step 1: Set Up the Kubernetes Master on Google Cloud
1. Create a Google Kubernetes Engine (GKE) Cluster:
- Go to the Google Cloud Console.
- Navigate to the Kubernetes Engine section and click on Clusters.
- Click Create and configure your cluster:
- Choose the Standard cluster type.
- Select a region or zone close to your on-premise instances.
- Configure the master and node settings as needed.
- Click Create to launch the cluster.
2. Obtain Cluster Credentials:
- After the cluster is created, click on the cluster name to view details.
- Click on the Connect button to get the command for obtaining the cluster credentials.
- Run the provided gcloud command in your terminal to authenticate your local environment with the GKE cluster (use --zone <zone> instead of --region <region> for a zonal cluster):
    gcloud container clusters get-credentials <cluster-name> --region <region>
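After fetching credentials, it can help to confirm that kubectl is pointing at the new cluster. The sketch below assumes the gke_<project>_<location>_<cluster> naming that gcloud uses for kubeconfig contexts. The function names is_gke_context and check_current_context are hypothetical helpers, not part of gcloud or kubectl.

```shell
# Succeed if the given kubeconfig context name follows the
# gke_<project>_<location>_<cluster> pattern written by gcloud.
is_gke_context() {
  case "$1" in
    gke_*_*_*) return 0 ;;
    *) return 1 ;;
  esac
}

# Print the active context and report whether it looks like a GKE one.
check_current_context() {
  ctx="$(kubectl config current-context)" || return 1
  if is_gke_context "$ctx"; then
    echo "OK: using GKE context $ctx"
  else
    echo "WARNING: current context '$ctx' does not look like a GKE context"
    return 1
  fi
}
```

Calling check_current_context right after the gcloud command gives a quick sanity check before any kubectl apply is run against the wrong cluster.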
Step 2: Set Up On-Premise Nodes with GPU
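1. Install Kubernetes Components on On-Premise Instances:
- On each on-premise instance, install Docker, kubeadm, kubelet, and kubectl. The legacy apt.kubernetes.io repository has been retired, so the commands use the pkgs.k8s.io repository. Replace v1.30 with the minor version that matches your control plane:
    sudo apt-get update
    sudo apt-get install -y docker.io apt-transport-https ca-certificates curl gpg
    sudo mkdir -p -m 755 /etc/apt/keyrings
    curl -fsSL https://pkgs.k8s.io/core:/stable:/v1.30/deb/Release.key | sudo gpg --dearmor -o /etc/apt/keyrings/kubernetes-apt-keyring.gpg
    echo 'deb [signed-by=/etc/apt/keyrings/kubernetes-apt-keyring.gpg] https://pkgs.k8s.io/core:/stable:/v1.30/deb/ /' | sudo tee /etc/apt/sources.list.d/kubernetes.list
    sudo apt-get update
    sudo apt-get install -y kubelet kubeadm kubectl
    sudo apt-mark hold kubelet kubeadm kubectl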
2. Install NVIDIA Drivers and GPU Support:
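- Install the NVIDIA drivers and CUDA Toolkit (replace 450 with a driver branch that supports your GPU):
    sudo apt-get install -y nvidia-driver-450
    sudo apt-get install -y nvidia-cuda-toolkit
- Install the NVIDIA Kubernetes device plugin. The plugin also requires the NVIDIA Container Toolkit on each GPU node. Run this from a machine with kubectl access to the cluster, replacing <version> with a release tag such as v0.14.1:
    kubectl apply -f https://raw.githubusercontent.com/NVIDIA/k8s-device-plugin/<version>/deployments/static/nvidia-device-plugin.yml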
3. Join the On-Premise Nodes to the GKE Cluster:
- On the control-plane node, create a token to allow nodes to join the cluster. GKE manages its control plane and does not provide shell access to it, so this command only works on a control plane that you initialized yourself with kubeadm init (for example, on a Compute Engine VM):
    kubeadm token create --print-join-command
- The output is a join command of the following form. Run it with root privileges on each of your on-premise instances to join them to the cluster:
    sudo kubeadm join <master-ip>:<port> --token <token> --discovery-token-ca-cert-hash sha256:<hash>
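Before running kubeadm join, a quick preflight check can catch a component that failed to install. This is a minimal sketch, and missing_tools is a hypothetical helper name introduced only for illustration.

```shell
# Print the name of every listed command that is not found on PATH.
# Prints nothing when all of them are available.
missing_tools() {
  for tool in "$@"; do
    command -v "$tool" >/dev/null 2>&1 || echo "$tool"
  done
}
```

On an on-premise GPU node, running missing_tools kubeadm kubelet kubectl nvidia-smi should print nothing. Any name it prints points at an installation step to repeat.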
Step 3: Verify and Configure the Cluster
1. Verify Nodes in the Cluster:
- Run the following command from a machine with kubectl access to the cluster to see all nodes:
    kubectl get nodes
- You should see both the cloud nodes and the on-premise nodes listed with a status of Ready.
2. Configure Workloads to Use GPUs:
- To use the GPUs on your on-premise nodes, create a Kubernetes deployment that requests GPU resources and save it as gpu-deployment.yaml:
    apiVersion: apps/v1
    kind: Deployment
    metadata:
      name: gpu-workload
    spec:
      replicas: 1
      selector:
        matchLabels:
          app: gpu-workload # Must match the pod template labels
      template:
        metadata:
          labels:
            app: gpu-workload
        spec:
          containers:
          - name: gpu-container
            image: <your-gpu-enabled-image>
            resources:
              limits:
                nvidia.com/gpu: 1 # Request one GPU
- Deploy this configuration with:
    kubectl apply -f gpu-deployment.yaml
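A pod that requests nvidia.com/gpu stays Pending unless at least one node advertises that resource. One way to check, sketched here under the assumption that the NVIDIA device plugin is already running, is to list the nodes whose allocatable resources include a GPU. gpu_nodes is a hypothetical helper name.

```shell
# List the names of nodes that advertise at least one allocatable
# nvidia.com/gpu. Nodes without the resource show "<none>" in the
# GPU column and are skipped.
gpu_nodes() {
  kubectl get nodes --no-headers \
    -o custom-columns='NAME:.metadata.name,GPU:.status.allocatable.nvidia\.com/gpu' |
    awk '$2 != "<none>" && $2 + 0 > 0 { print $1 }'
}
```

If gpu_nodes prints nothing, the device plugin pods and the driver installation on the on-premise nodes are the first things to inspect.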
Step 4: Monitor and Manage the Cluster
- Use Kubernetes tools like kubectl, Kubernetes Dashboard, or other monitoring tools to manage your cluster.
- Regularly update and patch both your GKE cluster and on-premise nodes to ensure compatibility and security.
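For routine monitoring with kubectl alone, a small check like the following can flag nodes that need attention. It is only a sketch: not_ready_nodes is a hypothetical helper, and it relies on the STATUS value being the second column of the default kubectl get nodes output.

```shell
# Print the names of nodes whose STATUS is anything other than exactly
# "Ready". This includes NotReady nodes and cordoned nodes, which show
# "Ready,SchedulingDisabled".
not_ready_nodes() {
  kubectl get nodes --no-headers | awk '$2 != "Ready" { print $1 }'
}
```

Run from cron or a monitoring script, an empty result means every node is healthy and schedulable.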
Important Notes
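- Ensure network connectivity between your Google Cloud network and on-premise instances. This usually requires a VPN or another secure link, such as Cloud VPN or Cloud Interconnect.
- Properly configure firewall rules to allow traffic between the master and on-premise nodes, in particular to the Kubernetes API server (TCP 6443 by default) and to the kubelet on each node (TCP 10250).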
- Regularly update GPU drivers and Kubernetes versions on both GKE and on-premise nodes.
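Connectivity and firewall problems are easier to diagnose with a direct port test. The sketch below assumes netcat (nc) is installed and that the cluster uses the kubeadm default ports, 6443 for the API server and 10250 for the kubelet. check_port and report_port are hypothetical helper names.

```shell
# Succeed if a TCP connection to host ($1) on port ($2) opens
# within 3 seconds.
check_port() {
  nc -z -w 3 "$1" "$2"
}

# Print a one-line, human-readable result for a host and port.
report_port() {
  if check_port "$1" "$2"; then
    echo "reachable: $1:$2"
  else
    echo "blocked or down: $1:$2"
  fi
}
```

From an on-premise node, report_port <master-ip> 6443 tests the path to the API server. From the master, report_port <node-ip> 10250 tests the path back to a node's kubelet.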
This setup will allow you to have a Kubernetes cluster spanning both Google Cloud and your on-premise infrastructure with GPU capabilities.
Conclusion
To create a Kubernetes master on Google Cloud and add on-premise GPU instances as nodes, follow the steps above in the order described. For the setup to run smoothly, make sure your on-premise machines are capable dedicated GPU servers.