This section describes the Hive connector for HPE Ezmeral Data Fabric Database JSON table.
The Hive connector supports the creation of HPE Ezmeral Data Fabric Database based Hive tables. You can create a JSON table on HPE Ezmeral Data Fabric Database and load CSV data and/or JSON files to HPE Ezmeral Data Fabric Database using the connector. HPE Ezmeral Data Fabric Database based Hive tables can be:
The following table lists the Hive data type and the corresponding (supported) HPE Ezmeral Data Fabric Database OJAI type:
| Hive Type | HPE Ezmeral Data Fabric Database OJAI Type |
|---|---|
| BOOLEAN | BOOLEAN |
| BINARY | BINARY |
| TINYINT | BYTE |
| DATE | DATE |
| DOUBLE | DOUBLE |
| FLOAT | FLOAT |
| INT | INT |
| BIGINT | LONG |
| SMALLINT | SHORT |
| STRING | STRING |
| TIMESTAMP | TIMESTAMP |
CREATE TABLE primitive_types (
id string,
bo boolean,
d double,
da date,
f double,
i int,
s string,
ts timestamp)
STORED BY 'org.apache.hadoop.hive.maprdb.json.MapRDBJsonStorageHandler'
TBLPROPERTIES("maprdb.table.name" = "/tbl","maprdb.column.id" = "id"); Here:maprdb.table.name,
maprdb.column.id and STORED BY
'org.apache.hadoop.hive.maprdb.json.MapRDBJsonStorageHandler'
are mandatory properties.maprdb.column.id column should be of
type string or binary.To create a Hive table that exists on HPE Ezmeral Data Fabric Database, specify EXTERNAL in
the table DDL. If the table created is EXTERNAL, when the table is
dropped, only its metadata is deleted; the underlying HPE Ezmeral Data Fabric Database data remains intact. On the other
hand, if the table is not EXTERNAL, dropping the table deletes both the
metadata associated with the table and the underlying HPE Ezmeral Data Fabric Database data.
For example, suppose a
JSON table named /apps/my_users with the following
values:
{"_id":"001","first_name":"John","last_name":"Doe","age":34}
{"_id":"002","first_name":"Jack","last_name":"Smith","age":26}
To create a Hive table over existing HPE Ezmeral Data Fabric Database
JSON table:
CREATE EXTERNAL TABLE primitive_types (
user_id string,
first_name string,
last_name string,
age int)
STORED BY 'org.apache.hadoop.hive.maprdb.json.MapRDBJsonStorageHandler'
TBLPROPERTIES("maprdb.table.name" = "/apps/my_users","maprdb.column.id" = "user_id"); Now,
because table primitive_types points to HPE Ezmeral Data Fabric Database table, you can perform ETL query similar
to filesystem based Hive
tables:SELECT COUNT(*) FROM test_external;
SELECT MAX(age) AS label FROM test_external;
...