Simplify External Data Sharing with Quobyte’s S3 Gateway

Quobyte
Feb 8, 2024
[Figure: Data Sharing]

Researchers at different universities, often in different countries, have collaborated for decades. Sharing data sets and other files in these collaborations is often a challenge. Typical cases include researchers from another organization who need access to data stored in home or project directories on an HPC compute cluster, and collaborating institutions such as hospitals that upload patient data like MRI images.

Accessing high-performance file systems directly using native protocols or NFS is often not possible due to security and administrative concerns:

  • Opening the network to external users and institutions is a security risk and involves a significant administrative effort to set up VPN connections.
  • The clients connecting to the file system are untrusted: they may be managed by external organizations, internal users may have root access on them, or they may be private computers used at home.
  • External users aren’t registered in the local AD/LDAP directory, and the credentials they use on their own machines come from their home organization.

Object storage accessed through the S3 API is well suited to sharing data over the internet. Clients communicate with the storage system over HTTPS, so traffic is encrypted in transit. S3 gateways can be placed behind load balancers and firewalls for external data sharing, and users do not need a VPN into the internal network.
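As a rough illustration of what this looks like from an external collaborator's machine, the sketch below prepares an S3 client for a gateway reachable over HTTPS. It assumes the third-party boto3 library; the endpoint `https://s3.example.org`, the keys, and the helper names are hypothetical placeholders, not values from Quobyte's documentation.

```python
from urllib.parse import urlparse


def s3_client_kwargs(endpoint_url, access_key, secret_key):
    """Build keyword arguments for an S3 client, refusing non-HTTPS endpoints."""
    parsed = urlparse(endpoint_url)
    if parsed.scheme != "https" or not parsed.netloc:
        raise ValueError("endpoint must be an https:// URL")
    return {
        "endpoint_url": endpoint_url,         # hypothetical gateway address
        "aws_access_key_id": access_key,      # placeholder credentials
        "aws_secret_access_key": secret_key,
    }


def make_s3_client(endpoint_url, access_key, secret_key):
    """Create a boto3 S3 client that talks to the given gateway endpoint."""
    import boto3  # third-party; imported lazily so the helper above works without it
    return boto3.client("s3", **s3_client_kwargs(endpoint_url, access_key, secret_key))


def download(client, bucket, key, local_path):
    """Fetch one object over HTTPS; no VPN or file system mount is involved."""
    client.download_file(bucket, key, local_path)
```

A collaborator would call `make_s3_client("https://s3.example.org", access_key, secret_key)` once and then `download(client, "some-bucket", "path/to/file", "file")`. Rejecting plain `http://` endpoints up front is a design choice of this sketch, so that credentials and data are never sent unencrypted by accident.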

[Figure: External Collaboration]

However, it is not ideal to create a separate storage or cloud system solely for sharing data. This is because users would need to manually transfer data between these systems. The result is multiple copies that are out of sync, wasted space, and frustrated users.

Share your High-Performance Scale-out File System via S3

With Quobyte’s S3 gateways, you can seamlessly share files between the file system and S3. We have built our gateways on custom code instead of “slapping on” random open-source software. The result is tightly integrated S3 support, where data sharing is seamless for users and applications regardless of the access protocol.

[Figure: Accessing data from different interfaces]

By sharing the high-performance parallel Quobyte file system externally via S3, your users can access the same files from the GPU cluster, the HPC system, their workstations, and any external computer. Just like the file system layer, Quobyte’s S3 access is designed for high-performance, parallel access.

The Quobyte S3 gateways map all S3 operations to file system operations, so Quobyte volumes are concurrently accessible as S3 buckets. This also extends to access control through unified ACLs, which can be modified from the file system client as well as via S3. Since Quobyte keeps only one set of ACLs for all interfaces (Linux, NFS, Windows, S3…), it is easy to keep track of who is allowed to access files and to enforce access restrictions properly.
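To make the volume-to-bucket idea concrete, here is a minimal sketch of how a file path could be translated into an S3 bucket and key. It assumes a hypothetical layout in which each volume appears as a directory directly under a mount point such as `/quobyte`, and in which the path inside the volume becomes the object key; the function name and paths are illustrative, and the actual naming rules of the gateway may differ.

```python
from pathlib import PurePosixPath
from typing import Tuple


def to_s3_location(mount_point: str, file_path: str) -> Tuple[str, str]:
    """Translate a file path under the mount point into (bucket, key).

    Assumption: the first path component below the mount point is the
    volume name, which doubles as the bucket name.
    """
    # relative_to raises ValueError if file_path is not below mount_point
    relative = PurePosixPath(file_path).relative_to(PurePosixPath(mount_point))
    parts = relative.parts
    if len(parts) < 2:
        raise ValueError("path must point to a file inside a volume")
    bucket = parts[0]              # volume name, used as the bucket
    key = "/".join(parts[1:])      # remaining path, used as the object key
    return bucket, key


if __name__ == "__main__":
    print(to_s3_location("/quobyte", "/quobyte/genomics/run1/sample.bam"))
```

Under these assumptions, a file that a cluster job writes to `/quobyte/genomics/run1/sample.bam` would be the object `run1/sample.bam` in the bucket `genomics`, with no copy step in between.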

In addition, Quobyte’s S3 access offers the following benefits:

  • Multi-tenancy support for file system and S3 access
  • Self-service for users to manage access keys from the Quobyte webconsole
  • Scalable S3 performance with an unlimited number of S3 gateways
  • Full integration with Quobyte’s policy engine for data management and automated tiering
  • Low latency operations on flash and cost-effective storage on HDD in one system

Originally posted on Quobyte’s blog on February 02, 2024.

Quobyte

Quobyte empowers customers by providing real software storage so that they can keep up with the ever-increasing amounts of data in today’s data-driven world.