Integrate Spark with HBase or HPE Ezmeral Data Fabric Database when you want to run Spark jobs on HBase or HPE Ezmeral Data Fabric Database tables.
Configure the HBase version in the /opt/mapr/spark/spark-<version>/mapr-util/compatibility.version file:

hbase_versions=<version>

The HBase version depends on the current MEP and MapR version that you are running.

Add the following property to the hbase-site.xml file:
<property>
  <name>hbase.table.sanity.checks</name>
  <value>false</value>
</property>
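For context, the property belongs inside the top-level <configuration> element of hbase-site.xml, alongside any properties already present. A minimal sketch of the file (other entries omitted):

```xml
<configuration>
  <!-- Disables table sanity checks, as required by the step above -->
  <property>
    <name>hbase.table.sanity.checks</name>
    <value>false</value>
  </property>
</configuration>
```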
Copy the hbase-site.xml file to the {SPARK_HOME}/conf/ directory. (When you run configure.sh, it copies the hbase-site.xml file to the Spark directory automatically.)

Specify the hbase-site.xml file in the SPARK_HOME/conf/spark-defaults.conf file:
spark.yarn.dist.files SPARK_HOME/conf/hbase-site.xml
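Note that spark-defaults.conf is a plain properties file and does not expand shell variables, so SPARK_HOME above stands for the literal installation path. With an illustrative Spark version (2.1.0 is an assumption; substitute your installed version), the entry would read:

```
spark.yarn.dist.files /opt/mapr/spark/spark-2.1.0/conf/hbase-site.xml
```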
To test the integration, create an HBase table from the HBase shell:

create '<table_name>', '<column_family>'

Run the following command as the mapr user or as a user that mapr impersonates:
/opt/mapr/spark/spark-<spark_version>/bin/spark-submit --master <master> [--deploy-mode <deploy-mode>] --class org.apache.hadoop.hbase.spark.example.rdd.HBaseBulkPutExample /opt/mapr/hbase/hbase-<hbase_version>/lib/hbase-spark-<hbase_version>-mapr.jar <table_name> <column_family>

The master URL for the cluster is either spark://<host>:7077, yarn, or local (without deploy-mode). The deploy-mode is either client or cluster.

Verify that the example wrote rows to the table by scanning it from the HBase shell:

hbase(main):001:0> scan '<table_name>'
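For orientation, HBaseBulkPutExample writes a handful of sample rows through the connector's HBaseContext.bulkPut API. The Scala sketch below paraphrases that pattern; it is not the shipped source (names, row keys, and values here are illustrative), so consult the hbase-spark module in your HBase distribution for the authoritative example:

```scala
// Illustrative paraphrase of the bulk-put pattern used by HBaseBulkPutExample.
import org.apache.hadoop.hbase.{HBaseConfiguration, TableName}
import org.apache.hadoop.hbase.client.Put
import org.apache.hadoop.hbase.spark.HBaseContext
import org.apache.hadoop.hbase.util.Bytes
import org.apache.spark.{SparkConf, SparkContext}

object BulkPutSketch {
  def main(args: Array[String]): Unit = {
    val Array(tableName, columnFamily) = args
    val sc = new SparkContext(new SparkConf().setAppName("BulkPutSketch"))

    // Five sample records: (rowKey, Seq((family, qualifier, value)))
    val rdd = sc.parallelize((1 to 5).map { i =>
      (Bytes.toBytes(i.toString),
       Seq((Bytes.toBytes(columnFamily), Bytes.toBytes("q"), Bytes.toBytes(s"value$i"))))
    })

    // HBaseContext reads cluster connection details from hbase-site.xml
    val hbaseContext = new HBaseContext(sc, HBaseConfiguration.create())

    // bulkPut turns each record into a Put and streams it into the table
    hbaseContext.bulkPut[(Array[Byte], Seq[(Array[Byte], Array[Byte], Array[Byte])])](
      rdd,
      TableName.valueOf(tableName),
      record => {
        val put = new Put(record._1)
        record._2.foreach { case (cf, q, v) => put.addColumn(cf, q, v) }
        put
      })

    sc.stop()
  }
}
```

Running the spark-submit command above with your table and column family as arguments should produce rows that the subsequent scan can display.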