
HPE Private Cloud AI: HPE AI Essentials for Data Engineering

H46BGS

Table of Contents

    Course ID

    H46BGS

    Duration

    1 day

    Format

    ILT/VILT

    Overview

    This course demonstrates authoring and monitoring workflows and data pipelines using HPE AI Essentials Software. It also introduces EzPresto, a SQL query engine, and covers building data visualizations and dashboards with Apache Superset. In addition, you learn about data pipeline management using Apache Airflow and the Ray framework on HPE AI Essentials.


    The course comprises 30% lectures and 70% practical hands-on labs.

    Audience

    This course is ideal for system administrators, integrators, data engineers, and learners who want to implement the HPE AI Essentials solution.

    Prerequisites

    Before attending this course, you should have:

    • An understanding of Kubernetes or any container orchestration software
    • A basic understanding of big data open-source tools and frameworks
    • A basic understanding of HPE GreenLake for File Storage

    Objectives

    After completing this course, you should be able to:

    • Describe the features and capabilities of HPE AI Essentials
    • Demonstrate running federated queries across various data sources using HPE AI Essentials
    • Demonstrate authoring and monitoring workflows and data pipelines using HPE AI Essentials
    • Use data visualizations and dashboards with HPE AI Essentials

    Course outline

    Module 1: Introduction to HPE AI Essentials Software


    • Introduction to HPE AI Essentials:
      • Features and capabilities
      • Data engineering components
      • Navigation
      • How to get started

    Module 2: Introduction to EzPresto


    • Understand what EzPresto is
    • Discuss EzPresto key features
    • Recognize EzPresto architecture
    • Connect data sources using HPE AI Essentials
    • Describe steps to connect to external applications through JDBC using HPE AI Essentials
    • Use Spark to query EzPresto on HPE AI Essentials
    • Discuss connectivity to EzPresto via Python client using HPE AI Essentials
    • Define data caching
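
    Since EzPresto is Presto-based, the Python-client connectivity mentioned above can be sketched with the open-source presto-python-client (`prestodb`). The hostname, catalog names, table names, and join key below are illustrative placeholders, not values from the course environment.

```python
def federated_join_sql(left, right, key):
    """Build a federated join across two catalogs, e.g. joining a
    Hive table to a MySQL table in one query ('catalog.schema.table')."""
    return (f"SELECT l.*, r.* FROM {left} AS l "
            f"JOIN {right} AS r ON l.{key} = r.{key}")

def run_query(sql, host="ezpresto.example.local", port=8080,
              catalog="hive", schema="default"):
    """Submit a query via presto-python-client; the endpoint is a placeholder."""
    import prestodb  # presto-python-client
    conn = prestodb.dbapi.connect(host=host, port=port, user="demo",
                                  catalog=catalog, schema=schema)
    cur = conn.cursor()
    cur.execute(sql)
    return cur.fetchall()
```

    A federated query built this way might join, for example, `hive.sales.orders` against `mysql.crm.customers` in a single statement, which is the core capability the federated-query labs exercise.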

    Module 3: Introduction to Workflows


    • Define Airflow functionality
    • Recognize Airflow architecture and its components
    • Configure Airflow DAGs Git repository using HPE AI Essentials
    • Demonstrate Airflow configuration using HPE AI Essentials
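
    The DAG configuration covered in this module can be illustrated with a minimal Airflow DAG file; once committed to the configured Git repository, the scheduler would pick it up. This is a hedged sketch only: the DAG id, schedule, and task callables are placeholders, not part of the course materials, and the `schedule` argument assumes Airflow 2.4 or later.

```python
# Minimal, hypothetical Airflow DAG sketch; dag_id, schedule, and tasks
# are placeholders, not taken from the HPE AI Essentials labs.
from datetime import datetime

from airflow import DAG
from airflow.operators.python import PythonOperator

def extract():
    print("pull source data")        # placeholder extract step

def transform():
    print("clean and reshape data")  # placeholder transform step

with DAG(
    dag_id="demo_etl",
    start_date=datetime(2024, 1, 1),
    schedule="@daily",
    catchup=False,
) as dag:
    t1 = PythonOperator(task_id="extract", python_callable=extract)
    t2 = PythonOperator(task_id="transform", python_callable=transform)
    t1 >> t2  # extract runs before transform
```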

    Module 4: Superset Overview


    • Define Superset
    • Demonstrate BI reporting using Superset on HPE AI Essentials
    • Demonstrate retail store analysis dashboard using Superset on HPE AI Essentials

    5 reasons to choose HPE as your training partner

    1. Learn HPE and in-demand IT industry technologies from expert instructors.
    2. Build career-advancing power skills.
    3. Enjoy personalized learning journeys aligned to your company’s needs.
    4. Choose how you learn: in person, virtually, or online, anytime, anywhere.
    5. Sharpen your skills with access to real environments in virtual labs.

    Explore our simplified purchase options, including HPE Education Learning Credits.

    Lab outline

    Lab 1: Getting started with HPE AI Essentials

    • Task 1: Log in to HPE AI Essentials
    • Task 2: Add user

    Lab 2: Running Federated Queries Across Various Data Sources


    • Task 1: Exploring Data Sources
    • Task 2: Connecting External Applications to EzPresto via JDBC
    • Task 3: Submitting Presto Queries from Notebook

    Lab 3: Authoring and Monitoring Workflows and Data Pipelines

    • Task 1: Configuring the Airflow DAGs Git repository
    • Task 2: Configuring Airflow

    Lab 4: Batch and Stream ETL with Apache Spark


    • Task 1: Batch ETL with Apache Spark
    • Task 2: View output
    • Task 3: Stream ETL with Apache Spark
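
    A batch ETL job like the one in Task 1 can be sketched in PySpark; the file paths, column names, and lookup values are hypothetical stand-ins for the lab's data. The row-level transformation is kept as a plain Python function so it can be wrapped as a Spark UDF.

```python
# Hypothetical PySpark batch-ETL sketch; paths and columns are placeholders.
def normalize_country(code):
    """Pure per-row mapping (usable as a Spark UDF)."""
    return {"US": "United States", "DE": "Germany"}.get(code, "Unknown")

def run_batch_etl(spark, in_path, out_path):
    """Read CSV, derive a column, write Parquet: a classic batch ETL pass."""
    from pyspark.sql import functions as F
    from pyspark.sql.types import StringType

    norm = F.udf(normalize_country, StringType())
    (spark.read.option("header", True).csv(in_path)
         .withColumn("country_name", norm(F.col("country_code")))
         .write.mode("overwrite").parquet(out_path))
```

    This would be invoked with an active session, e.g. `run_batch_etl(SparkSession.builder.getOrCreate(), "/data/in.csv", "/data/out")`; for the streaming task, the same shape applies with `readStream`/`writeStream` in place of `read`/`write`.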

    Lab 5: Orchestrating Spark Applications


    • Task 1: Preparing the environment
    • Task 2: Running DAG for Spark Application
    • Task 3: Verify DAG Output

    Lab 6: ETL and Visualization of Data from a Spark Application


    • Task 1: Getting the environment ready
    • Task 2: Running DAG for Spark Application
    • Task 3: ETL and Visualization in Jupyter

    Lab 7: ETL Using Custom Data Sources


    • Task 1: Pull and install custom MySQL image
    • Task 2: Accessing the database and listing tables
