HPE Private Cloud AI Bootcamp

Course ID

H54GRS

Duration

5 days

Format

ILT/VILT

Overview

This course builds on the Basic AI Applications and Workloads Training (H54FSS) course, advancing into HPE Private Cloud AI. Covering enterprise-scale AI design, deployment, and optimization, participants explore sophisticated workload architectures, multi-modal AI, advanced fine-tuning strategies, retrieval-augmented generation (RAG) at scale, and orchestration of multi-agent systems. The course emphasizes automation, MLOps, performance tuning, and NVIDIA NeMo, preparing learners to manage and future-proof AI in enterprise environments.


Through a blend of instructor-led modules and hands-on labs, you gain practical skills to design, deploy, and govern advanced AI workloads. This training concludes with strategic guidance for innovation and building AI adoption roadmaps in large organizations.


This course includes an initial discussion to understand the audience's background, set up prerequisites, and tailor the examples and exercises to the audience's level.


  • Audience

    This course is ideal for:

    • Data and AI professionals
    • Software and DevOps professionals
    • IT administrators and security professionals
    • Team leaders and technical managers
    • AI enthusiasts
  • Prerequisites

    Before attending this course, you should have the following:


    • Intermediate Python programming knowledge:
      • Familiarity with Python syntax, functions, and loops
      • Classes, list comprehension, and JSON parsing
      • Numpy, Pandas, and data visualization
    • Intermediate machine learning experience
      • Exposure to machine learning and/or deep learning frameworks
      • Exposure to libraries such as Keras with either a TensorFlow or PyTorch backend
    • Containers
      • A basic understanding of container technologies (Docker and Kubernetes) is optional but helpful
  • Objectives

    After completing this course, you should be able to:

    • Design and implement advanced AI workload architectures, including distributed and hybrid models
    • Build and manage large-scale data pipelines with feature stores and multimodal data integration
    • Develop and deploy multi-modal AI applications combining text, image, and structured data
    • Apply advanced fine-tuning and transfer learning methods such as LoRA, QLoRA, and adapters
    • Design and optimize enterprise-scale Retrieval Augmented Generation (RAG) pipelines
    • Orchestrate and manage multi-agent AI systems with memory, planning, and collaboration
    • Understand hybrid cloud orchestration and AIOps capabilities with HPE OpsRamp
    • Describe hybrid cloud automation and MLOps with HPE Morpheus Enterprise
    • Automate end-to-end AI lifecycle management using CI/CD, registries, and drift detection
    • Build agentic AI systems with NVIDIA NeMo for performance, reliability, and cost efficiency
  • Course outline

Module 1: Advanced AI Workload Architectures

  • The private cloud enterprise—the bare metal, VMs, and containers
  • The private cloud enterprise—from DevOps to analytics
  • GreenLake for Private Cloud Enterprise
  • Multi-node training and distributed inference
  • Hybrid AI deployment models (on-prem + cloud)
  • Advanced workload classification (fine-tuning, RAG+, multi-modal AI)
  • Designing AI pipelines for enterprise scale

Module 2: Advanced Data Engineering for AI


  • Feature engineering at enterprise scale
  • Data versioning and lineage
  • Handling images, audio, and video
  • Feature stores and real-time data ingestion
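The feature store and data versioning topics above can be illustrated with a minimal sketch. The class below is a hypothetical in-memory store written for this outline, not a real product API; production feature stores add schemas, point-in-time joins, and online/offline serving.

```python
from datetime import datetime, timezone

class FeatureStore:
    """Minimal in-memory feature store with per-feature version history."""

    def __init__(self):
        self._store = {}  # (entity_id, feature_name) -> list of versions

    def write(self, entity_id, feature_name, value):
        # Each write appends a new immutable version (a simple lineage record).
        key = (entity_id, feature_name)
        version = {
            "value": value,
            "version": len(self._store.get(key, [])) + 1,
            "written_at": datetime.now(timezone.utc).isoformat(),
        }
        self._store.setdefault(key, []).append(version)
        return version["version"]

    def read(self, entity_id, feature_name, version=None):
        # Default: latest version; pass a version number for point-in-time reads.
        history = self._store[(entity_id, feature_name)]
        return history[-1 if version is None else version - 1]["value"]

store = FeatureStore()
store.write("user_42", "avg_basket_value", 31.5)
store.write("user_42", "avg_basket_value", 34.2)  # new version; old one is kept
print(store.read("user_42", "avg_basket_value"))             # latest -> 34.2
print(store.read("user_42", "avg_basket_value", version=1))  # original -> 31.5
```

Keeping every version rather than overwriting is what makes reproducible training and lineage audits possible.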

Module 3: Multi-Modal AI Applications

  • Fundamentals of multi-modal architectures
  • Multi-modal transformers and embeddings
  • Use cases: document intelligence, vision-language models
  • Integration with enterprise data sources
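One common multi-modal pattern covered in this module is late fusion of per-modality embeddings. The sketch below uses hand-picked stand-in vectors (no real encoders) purely to show the shape of the idea:

```python
import math

def normalize(vec):
    """L2-normalize so each modality contributes on a comparable scale."""
    norm = math.sqrt(sum(x * x for x in vec)) or 1.0
    return [x / norm for x in vec]

def late_fusion(text_emb, image_emb, struct_feats):
    # Concatenate normalized embeddings into one joint representation,
    # which a downstream classifier or retriever can consume.
    return normalize(text_emb) + normalize(image_emb) + normalize(struct_feats)

# Stand-in vectors; in practice these come from text and vision encoders.
joint = late_fusion([0.3, 0.4], [1.0, 0.0, 0.0], [2.0, 2.0])
print(len(joint))  # 7 dimensions: 2 + 3 + 2
```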

Module 4: Fine-Tuning and Transfer Learning


  • Review of pre-trained model adaptation
  • LoRA, QLoRA, and adapters
  • Fine-tuning vs prompt-tuning vs distillation
  • Case studies for customizing enterprise models
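The core idea behind LoRA, replacing a full weight update with a low-rank product, can be sketched in plain Python. This is illustrative only; real fine-tuning workflows use libraries such as Hugging Face PEFT or NeMo's parameter-efficient recipes, and the dimensions below are toy values:

```python
import random

def matmul(A, B):
    """Naive matrix multiply for the sketch (no NumPy dependency)."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]

def lora_delta(d_out, d_in, rank, scale=1.0):
    # LoRA trains two small matrices B (d_out x r) and A (r x d_in);
    # the weight update is scale * (B @ A), with far fewer trainable
    # parameters than the d_out * d_in of a full update.
    B = [[random.gauss(0, 0.02) for _ in range(rank)] for _ in range(d_out)]
    A = [[0.0] * d_in for _ in range(rank)]  # A starts at zero, so the delta starts at zero
    delta = matmul(B, A)
    return [[scale * x for x in row] for row in delta]

delta = lora_delta(d_out=8, d_in=8, rank=2)
full_params = 8 * 8          # parameters in a full weight update
lora_params = 8 * 2 + 2 * 8  # parameters LoRA actually trains
print(full_params, lora_params)  # 64 32
```

The savings grow with matrix size: for a 4096x4096 projection at rank 16, LoRA trains roughly 130K parameters instead of nearly 17M.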

Module 5: Retrieval-Augmented Generation at Scale

  • Hierarchical retrieval and multi-vector indexing
  • Large-scale vector databases
  • Query expansion and relevance tuning
  • Hybrid retrieval (dense + sparse)
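The hybrid retrieval item above can be sketched as a weighted fusion of a dense (semantic) score and a sparse (lexical) score. The sparse scorer here is a toy term-overlap stand-in for BM25, and the vectors are assumed pre-normalized:

```python
def dense_score(query_vec, doc_vec):
    """Dot product between pre-normalized embeddings (cosine similarity)."""
    return sum(q * d for q, d in zip(query_vec, doc_vec))

def sparse_score(query_terms, doc_terms):
    """Keyword overlap as a toy stand-in for BM25-style sparse scoring."""
    overlap = set(query_terms) & set(doc_terms)
    return len(overlap) / len(set(query_terms))

def hybrid_score(query_vec, doc_vec, query_terms, doc_terms, alpha=0.5):
    # alpha blends semantic (dense) and lexical (sparse) relevance.
    return (alpha * dense_score(query_vec, doc_vec)
            + (1 - alpha) * sparse_score(query_terms, doc_terms))

score = hybrid_score(
    query_vec=[0.6, 0.8], doc_vec=[0.6, 0.8],
    query_terms=["gpu", "memory"], doc_terms=["gpu", "cache"],
    alpha=0.5,
)
print(round(score, 2))  # dense = 1.0, sparse = 0.5 -> 0.75
```

Tuning alpha per workload is one of the relevance-tuning levers this module explores.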

Module 6: Orchestration and AIOps with HPE OpsRamp

  • Multi-agent collaboration patterns
  • Orchestration frameworks (LangGraph, AutoGen)
  • Memory, planning, and tool use in agents
  • Enterprise scenarios: incident response, alert triage, compliance enforcement, resource optimization with HPE OpsRamp
  • Proactive capacity planning and performance forecasting using HPE OpsRamp
  • Automated SLA/SLO monitoring and service health scoring
  • AIOps-driven resource scaling and optimization recommendations
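The multi-agent patterns listed above can be reduced to a toy sketch: agents hold memory and call tools, and one agent's output feeds another's plan. Frameworks named in the outline (LangGraph, AutoGen) provide this machinery for real; the names and rules below are hypothetical.

```python
class Agent:
    """Toy agent with a memory list and a registry of callable tools."""

    def __init__(self, name, tools):
        self.name = name
        self.tools = tools  # tool_name -> callable
        self.memory = []    # running record of observations for later steps

    def act(self, tool_name, *args):
        result = self.tools[tool_name](*args)
        self.memory.append((tool_name, args, result))  # remember what happened
        return result

# Two agents collaborating on an incident-response scenario:
# one triages an alert, the other plans remediation from the triage result.
triage = Agent("triage", {
    "classify": lambda msg: "critical" if "down" in msg else "info",
})
planner = Agent("planner", {
    "plan": lambda sev: ["page on-call", "restart service"] if sev == "critical" else ["log only"],
})

severity = triage.act("classify", "db-cluster is down")
steps = planner.act("plan", severity)
print(severity, steps)  # critical ['page on-call', 'restart service']
```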

Module 7: Advanced Automation and MLOps


  • CI/CD for AI pipelines
  • Model registries and governance
  • Continuous validation and drift detection
  • Automation frameworks for lifecycle management
  • Governance and access controls with HPE Morpheus Software role-based orchestration
  • Cost, quota, and lifecycle automation through HPE Morpheus Software policies
  • HPE Morpheus Software analytics for hybrid-cloud observability
  • Blueprint-based deployment of consistent ML environments across clouds and on-prem
  • API-driven automation for provisioning, deprovisioning, and updating model-serving environments
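Continuous validation and drift detection can be illustrated with a toy threshold rule: flag drift when a live feature's mean strays too far from the training-time baseline. This is a sketch only; production systems use proper statistical tests (e.g. Kolmogorov-Smirnov or population stability index) and the sample values below are made up.

```python
def mean(xs):
    return sum(xs) / len(xs)

def std(xs):
    m = mean(xs)
    return (sum((x - m) ** 2 for x in xs) / len(xs)) ** 0.5

def drift_detected(baseline, live, threshold=2.0):
    # Flag drift when the live mean moves more than `threshold`
    # baseline standard deviations away from the baseline mean.
    shift = abs(mean(live) - mean(baseline))
    return shift > threshold * std(baseline)

baseline = [10.0, 10.5, 9.5, 10.2, 9.8]  # training-time feature distribution
stable   = [10.1, 9.9, 10.3]             # recent values, in range
drifted  = [14.0, 14.5, 13.8]            # recent values, clearly shifted

print(drift_detected(baseline, stable))   # False
print(drift_detected(baseline, drifted))  # True
```

In a CI/CD pipeline, a check like this runs on a schedule and can trigger retraining or a rollback automatically.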

Module 8: NVIDIA NIM Foundations and Enterprise Model Deployment

  • NVIDIA inference stack: Triton, TensorRT-LLM, NeMo, CUDA
  • NIM microservices: LLM, RAG, multimodal, embeddings, ASR/TTS
  • Deploying NIM containers via Docker or Helm charts
  • GPU optimization and performance tuning for NIM services
  • Enterprise integration using REST and gRPC endpoints
  • Monitoring and observability: DCGM, Prometheus, NVML
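Enterprise integration over REST typically means an OpenAI-compatible chat payload posted to the deployed service. The sketch below only builds the request body; the host, port, and model name are placeholders for whatever your NIM deployment exposes.

```python
import json

def build_chat_request(model, prompt, max_tokens=128):
    """Build an OpenAI-compatible chat payload of the kind NIM LLM services accept."""
    return {
        "model": model,
        "messages": [{"role": "user", "content": prompt}],
        "max_tokens": max_tokens,
    }

# Hypothetical local deployment; endpoint and model name are placeholders.
url = "http://localhost:8000/v1/chat/completions"
payload = build_chat_request("meta/llama-3.1-8b-instruct",
                             "Summarize GPU utilization best practices.")
body = json.dumps(payload)

# Sending would look like:
# requests.post(url, data=body, headers={"Content-Type": "application/json"})
print(payload["messages"][0]["role"])  # user
```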

Module 9: Building Enterprise RAG Pipelines with NVIDIA NIM

  • NVIDIA RAG reference architecture
  • NIM embedding and reranker microservices
  • Hybrid retrieval design: dense + sparse + vector fusion
  • GPU-accelerated vector search using RAFT and cuVS
  • Integration with Triton Inference Server
  • Large-scale RAG orchestration patterns for enterprise workloads

Module 10: NVIDIA NeMo for Fine-Tuning and Model Customization

  • NeMo architecture: collections, configs, Lightning integration
  • Distributed training with Megatron-LM (tensor, pipeline, and data parallelism)
  • Dataset preparation, tokenization, and pre-processing pipelines
  • Parameter-efficient fine-tuning workflows
  • Model checkpointing, evaluation, and export

Module 11: NeMo Guardrails for Enterprise AI Governance

  • Overview of NeMo Guardrails and RailSpec syntax
  • Building rule sets for enterprise applications
  • Grounding, safety, and privacy rule enforcement
  • Logging, evaluation, and continuous guardrail optimization
  • Integration with LLM apps, RAG pipelines, and agents
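A flavor of what building a rule set looks like: guardrail flows in NeMo Guardrails are expressed in its Colang rule language. The intent names and utterances below are hypothetical examples, not rules from a shipped configuration.

```
# Illustrative Colang sketch: block requests for credentials.
define user ask about credentials
  "What is the admin password?"
  "Share the database credentials"

define bot refuse credentials
  "I can't share credentials or secrets."

define flow block credential requests
  user ask about credentials
  bot refuse credentials
```

A rails configuration pairs files like this with a model definition, and the runtime matches incoming messages against the defined user intents before the LLM responds.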

5 reasons to choose HPE as your training partner

  1. Learn HPE and in-demand IT industry technologies from expert instructors.
  2. Build career-advancing power skills.
  3. Enjoy personalized learning journeys aligned to your company’s needs.
  4. Choose how you learn: in-person, virtually, or online, anytime, anywhere.
  5. Sharpen your skills with access to real environments in virtual labs.

Explore our simplified purchase options, including HPE Education Services – Learning Credits.

  • Lab outline

Lab 1: Designing Advanced AI Architectures

  • Create a high-level architecture for distributed and hybrid AI workloads

Lab 2: Building a Feature Store

  • Implement a feature store for managing and reusing engineered features

Lab 3: Multi-Modal Data Ingestion

  • Ingest and preprocess text, image, and tabular data for a multi-modal pipeline

Lab 4: Building a Multi-Modal Model

  • Train or adapt a model that combines vision and text inputs

Lab 5: Fine-Tuning with LoRA

  • Apply LoRA fine-tuning to a pre-trained model for a domain-specific task

Lab 6: Prompt-Tuning vs Full Fine-Tuning

  • Compare the results of parameter-efficient tuning vs full fine-tuning

Lab 7: Implementing Enterprise RAG

  • Build a large-scale RAG pipeline with hybrid retrieval (dense + sparse)

Lab 8: Query Expansion in RAG

  • Implement query expansion techniques to improve retrieval relevance

Lab 9: Multi-Agent Orchestration

  • Design a multi-agent workflow using an orchestration framework

Lab 10: Agent Memory and Planning

  • Implement agent memory and reasoning for contextual task completion

Lab 11: CI/CD for AI Pipelines

  • Automate deployment of models with versioning and rollback strategies

Lab 12: Fine-Tuning Models Using NVIDIA NeMo

  • Fine-tune an open-source foundation model using NeMo and evaluate model improvements

Lab 13: Building a GPU-Accelerated RAG Pipeline with NeMo

  • Build a RAG workflow using NeMo for embeddings, a vector database, and Inference Server for model serving

Lab 14: Implementing Guardrails and Governance with NeMo Guardrails

  • Apply RailSpec rules for safety, grounding, and compliance and integrate guardrails into an LLM application

Lab 15: Multi-Agent AI Workflows Using NeMo Tools

  • Build a multi-agent system where agents use NeMo models, embeddings, and guardrails to collaboratively solve tasks
