Multi-Tenancy with HPE Ezmeral Data Fabric

Out-of-the-box multi-tenancy support for managing distinct data sets, user groups, and applications in the same cluster.

Solution Overview

Introduction

Organizations seek to share IT resources cost-effectively and securely among multiple data sets, user groups, and applications. Platforms that support this architecture are commonly known as multi-tenant technologies.

Big Data platforms are increasingly expected to support multi-tenancy out-of-the-box, as non-native approaches are impractical. For example, maintaining separate clusters for separate tenants requires excessive administrative overhead and introduces significant processing latency. Also, partial solutions such as Apache Hadoop YARN or other resource managers handle only one definition of tenant (such as tasks or applications) and ignore the most critical component of a multi-tenant environment: the data.

The key to multi-tenancy is the ability to ensure strict isolation of the distinct tenants while also allowing some level of sharing when necessary. The metaphor of an apartment building applies well, where some components allow isolation (residents in distinct, private units) and other components allow sharing (elevators, hallways, fitness center, pool, and such).

  • Use case overview

    Multi-tenancy is useful and critical in a range of cases.

    Three common use cases are:

    Enterprise data lake: Often, organizations start using HPE Ezmeral Data Fabric in a specific area, for example, data warehouse optimization for the marketing department. Soon, the customer service team also wants to run an app for churn prediction. Ultimately, questions arise as to who has access to which data and for what purposes, also taking into account regulatory issues (such as the Sarbanes–Oxley Act in banking or data protection legislation throughout the industry). With HPE Ezmeral Data Fabric, separate departments can share the same cluster, with some data shared across departments and some data kept private within each department. The multi-tenancy from HPE Ezmeral Data Fabric helps ensure the users and user groups can only access the data for which they are authorized.

    Software/Platform/Infrastructure as a service: Some organizations provide IT services—such as Big Data as a service—to internal or external customers. A basic requirement of such service providers is to isolate customers while achieving guaranteed service-level agreements (SLAs), be it in terms of availability or latency. In this scenario, one customer accessing another customer’s data could be disastrous for the business. In addition, a service provider may require the flexibility to be able to run parts of the multi-tenancy deployment in a hybrid cloud setup, for example, to benefit from the elasticity of public clouds.

    Data lifecycle management: Data often undergoes many transformations to suit the business needs of various departments and user groups. Multi-tenancy is a key enabler of data lifecycle management, where each group of users is necessarily responsible for different stages of the data. For example, stages may include raw ingested data, data scientist transformations, BI analyst-ready data, and archived data. Data in each stage can be stored as a separate tenant, to make sure only authorized users can access it. Also, with multi-tenancy, administrators can introduce constraints around how much compute and storage each stage is allowed to use.
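
    As a rough illustration of the lifecycle use case above, the following Python sketch treats each data stage as its own tenant with a set of authorized groups and a storage limit. It is a toy model, not the HPE Ezmeral Data Fabric API: the stage names come from the text, while the class, the group names, and the limits are hypothetical and chosen only for the example.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class StageTenant:
    """One lifecycle stage modeled as a tenant (hypothetical, for illustration)."""
    name: str
    allowed_groups: frozenset
    storage_limit_gb: int


# Stage names follow the text; groups and limits are invented assumptions.
STAGES = {
    "raw": StageTenant("raw", frozenset({"ingest"}), 5000),
    "transformed": StageTenant("transformed", frozenset({"data-science"}), 2000),
    "bi-ready": StageTenant(
        "bi-ready", frozenset({"data-science", "bi-analysts"}), 500
    ),
    "archive": StageTenant("archive", frozenset({"compliance"}), 10000),
}


def can_access(user_groups, stage_name):
    """Return True if at least one of the user's groups is authorized."""
    stage = STAGES[stage_name]
    return bool(stage.allowed_groups & set(user_groups))


def accessible_stages(user_groups):
    """List, in alphabetical order, the stages a user may read."""
    return sorted(name for name in STAGES if can_access(user_groups, name))
```

    In this sketch a BI analyst can reach only the analyst-ready stage, while a data scientist can reach both the transformed and analyst-ready stages. Isolation is the default, and sharing happens only where a group is listed explicitly.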

    HPE Ezmeral Data Fabric offers built-in multi-tenancy capabilities to support both isolation and sharing in a single cluster. Volumes are the foundation of multi-tenancy. They are logical partitions you create in the cluster upon which you can set specific policies. Volumes group together related directories, files, database tables, and streams as a cohesive unit to help simplify Big Data management. You can use volumes to enforce disk usage limits, set replication levels, establish ownership and accountability, and measure the costs of different projects or departments. A single cluster can have many volumes, up to hundreds of thousands. Volumes automatically grow across multiple nodes in the cluster as you insert more data in them, unlike static Linux® partitions.
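
    To make the role of volumes more concrete, here is a minimal Python sketch of the policies listed above: a disk usage limit, a replication level, an accountable owner, and per-owner cost reporting. It does not call the product. The `Volume` class, its field names, and the choice to apply the quota to logical size before replication are all assumptions of this sketch, so the product documentation remains the reference for real behavior.

```python
from collections import defaultdict
from dataclasses import dataclass


class QuotaExceeded(Exception):
    """Raised when a write would push a volume past its quota."""


@dataclass
class Volume:
    """Toy stand-in for a volume; not the Data Fabric API."""
    name: str
    mount_path: str
    accountable_entity: str  # department or project charged for usage
    quota_gb: float          # assumed to limit logical (pre-replication) size
    replication: int = 3     # copies kept of the data; default is arbitrary here
    used_gb: float = 0.0

    def write(self, size_gb):
        """Record a write, refusing it if the quota would be exceeded."""
        if self.used_gb + size_gb > self.quota_gb:
            raise QuotaExceeded(
                f"{self.name}: {self.used_gb + size_gb} GB > {self.quota_gb} GB"
            )
        self.used_gb += size_gb

    @property
    def raw_gb(self):
        """Physical space consumed once replication is counted."""
        return self.used_gb * self.replication


def usage_by_entity(volumes):
    """Sum physical usage per accountable entity, e.g. for chargeback."""
    totals = defaultdict(float)
    for volume in volumes:
        totals[volume.accountable_entity] += volume.raw_gb
    return dict(totals)
```

    A short usage example: creating `Volume("marketing", "/dept/marketing", "marketing", quota_gb=100)` and writing 40 GB succeeds, while a further 61 GB write raises `QuotaExceeded`. Passing several volumes to `usage_by_entity` gives the per-department totals an administrator might use to measure project costs.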


Figure 1. HPE Ezmeral Data Fabric

  • Conclusion

    Enterprise architectures that include data from many sources often require multi-tenancy capabilities to provide data isolation as well as data sharing.

    HPE Ezmeral Data Fabric offers multi-tenancy support out-of-the-box to manage distinct data sets, user groups, and applications in the same cluster. This capability is achieved through volumes, which enable isolation for both compute and storage and include security and reporting.


Linux is the registered trademark of Linus Torvalds in the U.S. and other countries. All third-party marks are property of their respective owners.


© Copyright 2020 Hewlett Packard Enterprise Development LP. The information contained herein is subject to change without notice. The only warranties for Hewlett Packard Enterprise products and services are set forth in the express warranty statements accompanying such products and services. Nothing herein should be construed as constituting an additional warranty. Hewlett Packard Enterprise shall not be liable for technical or editorial errors or omissions contained herein.