Understanding the HPE Ezmeral Data Fabric Database OJAI Connector for Spark

The HPE Ezmeral Data Fabric Database OJAI Connector for Spark enables you to build real-time and batch pipelines between your data and HPE Ezmeral Data Fabric Database JSON. Before getting started, make sure you understand Spark terminology and workflow, system requirements and support, and the features of the OJAI connector and its APIs.

The HPE Ezmeral Data Fabric Database OJAI connector includes a set of APIs that enable you to write applications that consume HPE Ezmeral Data Fabric Database JSON tables and use them in Spark. The HPE Ezmeral Data Fabric Database OJAI Connector for Apache Spark is a companion to the HPE Ezmeral Data Fabric Database Binary Connector for Apache Spark, which provides the equivalent functionality for HPE Ezmeral Data Fabric Database Binary tables.

HPE Ezmeral Data Fabric Database OJAI Connector with Spark Workflow

You can use the HPE Ezmeral Data Fabric Database OJAI Connector to extract data from HPE Ezmeral Data Fabric Database or the file system, transform that data using either Spark or Spark SQL, and then load it into HPE Ezmeral Data Fabric Database JSON.
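The extract, transform, and load steps in this workflow can be sketched in plain Python. This is only an illustration of the shape of the pipeline, not the connector's API: the in-memory dictionaries stand in for JSON tables, and the names `source_table`, `extract`, `transform`, `load`, and the `visits` and `frequent` fields are hypothetical. The sketch also assumes each JSON document is identified by an `_id` field. In a real application, the connector and Spark would perform the reads, transformations, and writes.

```python
import json

# Hypothetical in-memory stand-in for a JSON table: maps each document's
# _id to the document. A real pipeline would go through the connector.
source_table = {
    "u1": {"_id": "u1", "name": "Ada", "visits": 3},
    "u2": {"_id": "u2", "name": "Lin", "visits": 12},
    "u3": {"_id": "u3", "name": "Sam", "visits": 7},
}


def extract(table):
    """Extract: return every document in the source table as a list."""
    return list(table.values())


def transform(documents, min_visits):
    """Transform: keep documents with enough visits and add a derived field."""
    return [
        {**doc, "frequent": True}
        for doc in documents
        if doc["visits"] >= min_visits
    ]


def load(documents, table):
    """Load: insert or replace each document in the target, keyed by _id."""
    for doc in documents:
        table[doc["_id"]] = doc
    return table


# Run the three stages in order, writing into an empty target table.
target_table = load(transform(extract(source_table), min_visits=5), {})
print(json.dumps(target_table, indent=2, sort_keys=True))
```

Keeping the three stages as separate functions mirrors the workflow: the transform step does not need to know where the documents came from or where they are going, so the source can be a table or a file without changing the transformation logic.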

HPE Ezmeral Data Fabric Database OJAI Connector for Apache Spark Features

Principal features of the HPE Ezmeral Data Fabric Database OJAI Connector for Apache Spark include the following:

The following features are not supported:

Supported Product Versions and System Requirements

To use the HPE Ezmeral Data Fabric Database OJAI Connector for Apache Spark, you must have the following minimum software versions:

Support for DataFrames and Datasets is available starting in the MEP 4.0 release.

OJAI API

The HPE Ezmeral Data Fabric Database OJAI Connector for Apache Spark uses the OJAI API internally to access HPE Ezmeral Data Fabric Database JSON tables.