HPE Ezmeral Data Fabric Global Namespace
A key feature of the HPE Ezmeral Data Fabric is the concept of the Global Namespace. This technical white paper will describe this feature in more detail.
Technical white paper
The ability to see a logical view of all the globally distributed data that a company possesses, regardless of where it physically resides, is essential to extracting the full value from that data.
The global namespace from HPE Ezmeral Data Fabric provides a consolidated view into files that can reside in separate clusters across multiple edge, on-premises, and cloud environments, with a single security management and audit plane. This enables the development of new location-independent applications that move smoothly across bare metal, private cloud, public cloud, or the edge.
The global namespace from HPE Ezmeral Data Fabric aggregates information about multiple data types into a unified structure in which physical servers, share locations, and private and public cloud instances are part of a single name. Imagine the impact on your developers, analysts, and data scientists when applications and workloads can directly access files, tables, event streams, and objects as if they were local. It simplifies the conceptual design of large systems and allows multiple applications to work together on the same data sets.
End users get a unified view of files, tables, and event streams, allowing them access without having to be aware of the physical data location. This allows for easy data management, minimal overhead, and distributed scale, since the data can now be spread across on-premises, cloud, and edge environments.
HPE Ezmeral Data Fabric can be deployed in different locations and linked together to create a global data fabric. Edge clusters, core clusters, and stretch clusters can all be connected to create a data fabric that can span data centers and even geographies (see Figure 1).
For smaller locations, the data fabric can be deployed as an edge cluster with as few as three nodes. Compute hosts can be added to the HPE Ezmeral Runtime so that compute can be co-located with data, or compute resources can be separated from the data, for the most efficient and localized analytics workloads.
Data fabric edge is a small footprint edition of the global data platform that you can use to capture, process, and analyze IoT data close to the source.
Data fabric edge is a fully functional data fabric cluster that can run on small form-factor commodity hardware, such as Intel® Next Unit of Computing (NUC) devices. Edge clusters are supported in 3- to 5-node configurations.
Each cluster supports the full capabilities of the data fabric global data platform, including the capacity for files, tables, and streams, along with related data management and protection capabilities such as security, snapshots, mirroring, replication, and compression. The added ability to run popular Apache open-source projects allows outstanding application portability at the edge.
Generally, the data fabric nodes in a single cluster are co-located, with the topology configurable across numerous racks to enhance resiliency and redundancy using a combination of replication and/or erasure coding.
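The trade-off between replication and erasure coding can be illustrated with simple capacity arithmetic. The following is a minimal sketch, not product behavior: the raw capacity, the three-copy replication factor, and the 4+2 erasure coding scheme are illustrative assumptions, and the function names are hypothetical.

```python
def usable_with_replication(raw_tb: float, copies: int) -> float:
    """Usable capacity when every block is stored `copies` times."""
    if copies < 1:
        raise ValueError("copies must be at least 1")
    return raw_tb / copies


def usable_with_erasure_coding(raw_tb: float, data: int, parity: int) -> float:
    """Usable capacity for a scheme with `data` data and `parity` parity fragments."""
    if data < 1 or parity < 0:
        raise ValueError("data must be >= 1 and parity must be >= 0")
    return raw_tb * data / (data + parity)


if __name__ == "__main__":
    raw = 120.0  # hypothetical raw capacity in TB
    print(usable_with_replication(raw, 3))        # 40.0
    print(usable_with_erasure_coding(raw, 4, 2))  # 80.0
```

Under these assumed settings, erasure coding yields twice the usable capacity of three-way replication from the same raw storage, which is one reason the two techniques are often combined.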
Some storage resources in other locations can be configured to be part of the same data fabric cluster by using a stretch cluster. However, these kinds of deployments are very sensitive to latency and must be planned carefully, with direct connections such as dark fiber between nodes in the data fabric cluster.
Configure the global namespace
The global namespace is configured when data fabric clusters are configured. It is important to plan the global namespace architecture before beginning the configuration of the clusters.
When creating a global namespace, you should consider the following factors:
- Location of the data fabric in relation to your application’s access needs
- Data fabric name for each location
- Volume names as they relate to each location and your application needs
- Any business continuity and disaster recovery requirements, including additional capacity for cross-cluster replication of data
By creating a plan for your data, you can more easily apply security controls that match user and application needs and protect your data.
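The planning factors above can be captured in a small data structure and checked before any cluster is configured. This is a minimal sketch under stated assumptions: the `ClusterPlan` class, its fields, and `validate_plan` are hypothetical names introduced for illustration and are not part of any HPE tool; the locations and volume names are invented. The 3- to 5-node rule for edge clusters is taken from this paper.

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class ClusterPlan:
    name: str       # data fabric name for this location
    location: str   # where the cluster runs (hypothetical values below)
    kind: str       # "edge", "core", or "stretch"
    nodes: int
    volumes: List[str] = field(default_factory=list)
    dr_target: Optional[str] = None  # cluster that receives replicated data


def validate_plan(clusters: List[ClusterPlan]) -> List[str]:
    """Return a list of human-readable problems; an empty list means none found."""
    problems = []
    names = [c.name for c in clusters]
    for name in sorted(set(names)):
        if names.count(name) > 1:
            problems.append(f"duplicate cluster name: {name}")
    for c in clusters:
        if c.kind == "edge" and not 3 <= c.nodes <= 5:
            problems.append(f"{c.name}: edge clusters need 3 to 5 nodes, got {c.nodes}")
        if c.dr_target is not None:
            if c.dr_target == c.name:
                problems.append(f"{c.name}: DR target must be a different cluster")
            elif c.dr_target not in names:
                problems.append(f"{c.name}: DR target {c.dr_target} is not a planned cluster")
    return problems


if __name__ == "__main__":
    plan = [
        ClusterPlan("m2-dfse-us", "core data center", "core", 6, ["analytics"]),
        ClusterPlan("m2-dfse-edge1", "factory floor", "edge", 3, ["sensors"],
                    dr_target="m2-dfse-us"),
    ]
    print(validate_plan(plan))  # []
```

Keeping the plan as data makes it easy to review naming and disaster recovery choices with the teams that own each location before the cross-cluster connections are created.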
The data fabric clusters in different locations must be linked together using a cross-cluster connection.
To do this, use the configure-crosscluster.sh utility. Examples for configuring cross-cluster features are available in the HPE Ezmeral Data Fabric documentation.
Note: Cross-cluster connections cannot be configured from the management control system (MCS) UI in a web browser.
The following screenshot from an HPE Ezmeral Data Fabric MCS UI shows two data fabrics that have been connected: one is the core cluster “m2-dfse-us” and the other is an edge cluster “m2-dfse-edge1.” These two clusters are joined with a cross-cluster connection in the same global namespace “mip.storage.hpecorp.net.” Both clusters can be configured from the same MCS UI session.
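To illustrate what a single name means for an application, the sketch below builds paths for the two clusters named above. It assumes that each cluster appears as a directory named after the cluster under a common mount point, taken here to be /mapr; that mount point, the `global_path` helper, and the volume path `apps/sensors` are assumptions for illustration and should be checked against the actual deployment.

```python
from pathlib import PurePosixPath

# Assumed mount point under which every linked cluster is visible.
MOUNT_ROOT = PurePosixPath("/mapr")


def global_path(cluster: str, volume_path: str) -> PurePosixPath:
    """Build a location-independent path of the form <root>/<cluster>/<volume path>."""
    if not cluster or "/" in cluster:
        raise ValueError("cluster must be a non-empty name without '/'")
    # Strip slashes so an absolute volume path does not replace the prefix.
    return MOUNT_ROOT / cluster / volume_path.strip("/")


if __name__ == "__main__":
    for cluster in ("m2-dfse-us", "m2-dfse-edge1"):
        print(global_path(cluster, "/apps/sensors"))
    # /mapr/m2-dfse-us/apps/sensors
    # /mapr/m2-dfse-edge1/apps/sensors
```

With this layout, the same application code can read from the core or the edge cluster by changing only the cluster component of the path.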
The global namespace of HPE Ezmeral Data Fabric aggregates file information into a unified structure where physical servers in multiple locations are part of a single name.
This allows developers, analysts, and data scientists to develop, deploy, and run applications, along with workloads that can directly access files, tables, event streams, objects, and metadata as if they were local. It simplifies the design of large integrated systems and allows multiple applications to work together on the same data sets.