Enabling High Availability for Spark Thrift Server

With MEPs 5.0.4 or 6.3.0 and later, you can enable high availability for the Spark Thrift Server.
To enable high availability, use the following steps:
  1. Install Spark Thrift Server on all the cluster nodes where it is needed:
    On Ubuntu:
    apt-get install mapr-spark-thriftserver
    On Red Hat / CentOS:
    yum install mapr-spark-thriftserver
    On SUSE:
    zypper install mapr-spark-thriftserver
  2. Add the following properties to the /opt/mapr/spark/spark-<spark_version>/conf/hive-site.xml file on all the nodes where the Spark Thrift Server is installed:
    <property>
    <name>hive.zookeeper.quorum</name>
    <value><zk_host_1>,<zk_host_2>,…,<zk_host_n></value>
    </property>
    
    <property>
    <name>hive.zookeeper.client.port</name>
    <value><zk_port></value>
    </property>
    
    <property>
    <name>hive.server2.support.dynamic.service.discovery</name>
    <value>true</value>
    </property>
    
    <property>
    <name>hive.server2.zookeeper.namespace</name>
    <value><zk_namespace></value>
    </property>
    For example:
    <property>
    <name>hive.zookeeper.quorum</name>
    <value>node1.cluster.com,node2.cluster.com,node3.cluster.com</value>
    </property>
    
    <property>
    <name>hive.zookeeper.client.port</name>
    <value>5181</value>
    </property>
    
    <property>
    <name>hive.server2.support.dynamic.service.discovery</name>
    <value>true</value>
    </property>
    
    <property>
    <name>hive.server2.zookeeper.namespace</name>
    <value>ts2-ts2</value>
    </property>
    Note: The value of the hive.server2.zookeeper.namespace property in the hive-site.xml file in the Spark directory should differ from the value in the hive-site.xml file in the Hive directory.
  3. Restart the Spark Thrift Server on all configured nodes to apply the changes, either by using the scripts in the sbin directory at /opt/mapr/spark/spark-<spark_version>/ or by running a maprcli command:
    ./sbin/stop-thriftserver.sh
    ./sbin/start-thriftserver.sh 
    or
    maprcli node services -nodes <host_1>,<host_2>,<host_n> -name spark-thriftserver -action restart
  4. Launch the ZooKeeper command line interface, and check the Spark Thrift Server znode by running the following commands:
    /opt/mapr/zookeeper/zookeeper-<version>/bin/zkCli.sh -server <ip:port of zookeeper instance>
    ls /<hive.server2.zookeeper.namespace>
    For example:
    /opt/mapr/zookeeper/zookeeper-3.4.11/bin/zkCli.sh -server node1.cluster.com:5181
    ls /ts2-ts2
    [serverUri=node1.cluster.com:2304;version=;sequence=0000000000]
  5. Connect to the Spark Thrift Server from Beeline by using the following connection string:
    beeline> !connect jdbc:hive2://<hostname -f>:5181/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=<hive.server2.zookeeper.namespace>;
    For example:
    ./bin/beeline
    Warning: Unable to determine $DRILL_HOME
    Beeline version 1.2.0-mapr-spark-MEP-6.0.0-1912 by Apache Hive
    beeline> !connect jdbc:hive2://node1.cluster.com:5181/default;ssl=true;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=ts2-ts2;auth=maprsasl;
    Connecting to jdbc:hive2://node1.cluster.com:5181/default;ssl=true;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=ts2-ts2;auth=maprsasl;
    20/03/29 21:38:19 WARN MaprSaslClient: SASL Server qopProperty: auth-confis different from Client: auth-conf,auth-int,auth.Using Server one
    Connected to: Spark SQL (version 2.4.4.0-mapr-630)
    Driver: Hive JDBC (version 1.2.0-mapr-spark-MEP-6.0.0-1912)
    Transaction isolation: TRANSACTION_REPEATABLE_READ
    1: jdbc:hive2://node1.cluster.com:5181/defaul> show databases;
    +---------------+
    | databaseName  |
    +---------------+
    | default       |
    +---------------+
    1 row selected (0.11 seconds)
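The znode entry from step 4 and the connection string from step 5 use the same ZooKeeper address and namespace, so both can be scripted. The following is a minimal shell sketch, not part of the MapR tooling: the function names build_jdbc_url and server_uri_from_znode are hypothetical, and the sketch assumes that extra session parameters such as ssl=true or auth=maprsasl can be appended to the end of the string.

```shell
# Hypothetical helpers for illustration; they are not shipped with Spark, Hive, or MapR.

# Build a Beeline connection string that uses ZooKeeper service discovery.
# $1 = ZooKeeper host, $2 = ZooKeeper client port, $3 = namespace,
# $4 = optional extra session parameters, for example "ssl=true;auth=maprsasl"
build_jdbc_url() {
  local url="jdbc:hive2://$1:$2/default;serviceDiscoveryMode=zooKeeper;zooKeeperNamespace=$3;"
  if [ -n "${4:-}" ]; then
    url="${url}$4;"   # assumption: extra parameters are appended after the namespace
  fi
  printf '%s\n' "$url"
}

# Extract "host:port" from a znode child entry as listed by zkCli.sh, for example
# [serverUri=node1.cluster.com:2304;version=;sequence=0000000000]
server_uri_from_znode() {
  local entry="$1"
  entry="${entry#\[}"           # drop a leading "[" if present
  entry="${entry%\]}"           # drop a trailing "]" if present
  entry="${entry#serverUri=}"   # drop the key
  printf '%s\n' "${entry%%;*}"  # keep everything before the first ";"
}

# Example with the values used in this procedure.
build_jdbc_url node1.cluster.com 5181 ts2-ts2
server_uri_from_znode '[serverUri=node1.cluster.com:2304;version=;sequence=0000000000]'
```

The first call prints a string that can be pasted after !connect as in step 5. The second prints the address of the Spark Thrift Server instance that registered itself under the namespace.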
Note: High availability for the Spark Thrift Server can be used in conjunction with HiveServer2 high availability. For more information about HiveServer2 high availability, see Enabling High Availability for Hive.
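When both services register in ZooKeeper, the namespace values in the two hive-site.xml files should differ, as noted in step 2. A minimal shell sketch of that check follows. The helper names get_property and check_namespaces_differ are hypothetical, and the parser only handles the simple <name>/<value> layout shown in step 2; it is not a general XML parser.

```shell
# Hypothetical helpers for illustration; they are not shipped with Spark, Hive, or MapR.

# Print the <value> that follows <name>$2</name> in the hive-site.xml file $1.
# Assumes each <name> and <value> element fits on a single line.
get_property() {
  awk -v key="$2" '
    index($0, "<name>" key "</name>") { found = 1 }
    found && /<value>/ {
      sub(/.*<value>/, "")      # strip everything up to the opening tag
      sub(/<\/value>.*/, "")    # strip the closing tag and anything after it
      print
      exit
    }
  ' "$1"
}

# Compare the ZooKeeper namespace in two hive-site.xml files.
# $1 = Spark hive-site.xml, $2 = Hive hive-site.xml
# Returns 1 if both files define the same non-empty namespace.
check_namespaces_differ() {
  local spark_ns hive_ns
  spark_ns="$(get_property "$1" hive.server2.zookeeper.namespace)"
  hive_ns="$(get_property "$2" hive.server2.zookeeper.namespace)"
  if [ -n "$spark_ns" ] && [ "$spark_ns" = "$hive_ns" ]; then
    echo "conflict: both files use namespace '$spark_ns'" >&2
    return 1
  fi
  echo "ok: spark='$spark_ns' hive='$hive_ns'"
}
```

A possible invocation is check_namespaces_differ /opt/mapr/spark/spark-<spark_version>/conf/hive-site.xml /opt/mapr/hive/hive-<hive_version>/conf/hive-site.xml, where the Hive path is an assumption about a typical installation and should be adjusted to the actual Hive directory.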