Posts

Managing AI Workloads in Kubernetes and OpenShift with Modern GPUs [H100/H200 Nvidia]

AI workloads demand significant computational resources, especially for training large models or performing real-time inference. Modern GPUs like NVIDIA's H100 and H200 are designed to handle these demands effectively, but maximizing their utilization requires careful management. This article explores strategies for managing AI workloads in Kubernetes and OpenShift with GPUs, focusing on features like MIG (Multi-Instance GPU), time slicing, MPS (Multi-Process Service), and vGPU (Virtual GPU). Practical examples are included to make these concepts approachable and actionable.

1. Why GPUs for AI Workloads?

GPUs are ideal for AI workloads due to their massive parallelism and ability to perform complex computations faster than CPUs. However, these resources are expensive, so efficient utilization is crucial. Modern GPUs like NVIDIA H100/H200 come with features like:

· MIG (Multi-Instance GPU): Partitioning a single GPU into smaller instances.
· Time slicing: Efficiently sharing GPU res...

Recent posts

Choosing the Right OpenShift Service: Service Mesh, Submariner, or Service Interconnect?

In today’s digital world, businesses increasingly rely on interconnected applications and services, so integrating software and data across different environments is essential. Achieving smooth connectivity is hard, however, because differing application designs and the mix of on-premises and cloud systems often lead to inconsistencies. Handling these issues takes careful management of operations, risk, team skills, and security. This article looks at three Red Hat technologies in simple terms: Red Hat OpenShift Service Mesh, Red Hat Service Interconnect, and Submariner. It aims to help you decide which solution is best for your needs.

OPENSHIFT

Feature | Service Mesh (Istio) | Service Interconnect | Submariner
Purpose | Manages service-to-service communication within a single cluster. | Enables ...

What's New in Red Hat OpenShift 4.17

Release Overview:
· Kubernetes Version: OpenShift 4.17 is based on Kubernetes 1.30, bringing enhancements and new capabilities.

Notable Beta Features:
1. User Namespaces in Pods: Enhances security by allowing pods to run with distinct user IDs while mapping to different IDs on the host.
2. Structured Authentication Configuration: Provides a more organized approach to managing authentication settings.
3. Node Memory Swap Support: Introduces support for memory swapping on nodes, enhancing resource management.
4. LoadBalancer Behavior Awareness: Kubernetes can now better understand and manage LoadBalancer behaviors.
5. CRD Validation Enhancements: Improves Custom Resource Definition (CRD) validation processes.

Stable Features:
1. Pod Scheduling Readiness: Ensures that...

Effortless Management: A Guide to Registering and Unregistering Linux Machines with Red Hat Subscription Manager using CLI with Key benefits

To register your Linux machine with Red Hat Subscription Manager, first log in to the official Red Hat site, where licenses are managed. Access the site at https://access.redhat.com in a browser and provide your credentials:

UserName: ee.ibraraziz@gmail.com
Password: **************

Follow these steps to register your system using the CLI. In a restricted environment, make sure the address https://access.redhat.com/* is whitelisted so that the machine can communicate with the official Red Hat site.

1. Register the system using the following command:
   >>subscription-manager register
   You will be prompted to enter your Red Hat account credentials.
     UserName: ee.ibraraziz@gmail.com
     Password: **************

After successfully logging in, you may also need to provide the Pool ID associated with your subscription. Follow t...

Install Tanzu CLI on Linux for vSphere with Tanzu

The VMware Tanzu command-line interface (Tanzu CLI) is a command-line tool that connects you to Tanzu, much like the kubectl CLI does for Kubernetes. For example, you can use the Tanzu CLI to:

· Create and manage management clusters.
· Create and manage workload clusters.
· Manage Kubernetes releases.

First, make sure the Tanzu CLI version is compatible with your Kubernetes cluster version, just as you would when installing the kubectl client. The Tanzu CLI is especially useful for managing packages in a Tanzu Kubernetes Cluster (TKC).

Installation

To obtain the binary corresponding to your cluster version, use the link below. To download the binary, you must have access to the VMware portal and select the highlighted version:

https://customerconnect.vmware.com/en/downloads/details?downloadGroup=TKG-161&productId=988&rPId=99512

After downloading the files, transfer the binaries to the jump h...