External Tiering Can Be Extremely Tiring (For Your Users and Admins)
External Tiering: A Useful Tool, But Use it Wisely
Have you ever felt like you’re reaching for the wrong tool for the job? In the world of data storage, external tiering can be that hammer that seems to solve every problem. But just like a hammer wouldn’t be your first choice for delicate woodwork, external tiering isn’t always the best solution.
Why Tier Data?
There are two main reasons for tiering data:
- Data Temperature: Data comes in varying degrees of “hotness.” Frequently accessed data (hot) needs to be readily available, while less frequently used data (warm) can tolerate a slight delay. Very rarely accessed data (cold) can be moved to a more cost-effective storage tier.
- Storage Media Costs: Different storage media offer different performance and cost characteristics. Flash storage is ideal for hot data due to its speed, but it’s expensive. Hard Disk Drives (HDDs) are more economical for cold data, but slower to access.
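To make the idea of data temperature concrete, here is a minimal sketch that labels a file as hot, warm, or cold based on how long ago it was last accessed. The 7-day and 90-day thresholds and the function name are assumptions chosen for illustration; sensible cut-offs depend on the workload, and real tiering systems may also weigh access frequency or file type.

```python
from datetime import datetime, timedelta

# Hypothetical thresholds for illustration only; tune them per workload.
HOT_WINDOW = timedelta(days=7)
WARM_WINDOW = timedelta(days=90)


def classify_temperature(last_access: datetime, now: datetime) -> str:
    """Return 'hot', 'warm', or 'cold' based on time since last access."""
    age = now - last_access
    if age <= HOT_WINDOW:
        return "hot"
    if age <= WARM_WINDOW:
        return "warm"
    return "cold"


if __name__ == "__main__":
    now = datetime(2024, 4, 8)
    for days in (1, 30, 365):
        label = classify_temperature(now - timedelta(days=days), now)
        print(f"last accessed {days} days ago: {label}")
```

A classification like this is only the input to a tiering decision; where each class ends up (flash, HDD, or an external archive) is the subject of the rest of this post.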
Internal vs. External Tiering
- Internal Tiering: This approach keeps data within the same storage system, seamlessly moving it between different media types (e.g., flash to HDD) based on access patterns. Benefits include:
- Transparency: Users access data under the same path, regardless of its physical location.
- Near-instantaneous Access: Data remains readily available, with access times adjusted for the storage media (e.g., faster on flash, slower on HDD).
- High Performance: Systems like Quobyte with parallel I/O and striping/erasure coding deliver high performance even on the HDD tier.
- Simplified Management: Admins only need to manage a single storage system.
- External Tiering: This method moves data to a completely separate storage system, typically for very cold data. Common examples include tape libraries or cloud storage like AWS Glacier.
Benefits include:
- Reduced Costs: External storage can be very cost-effective for rarely accessed data.
However, external tiering comes with drawbacks:
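- Slow Access: Retrieving data from an external system can be significantly slower than reading it from an internal tier. Tape recalls and archive-class cloud retrievals can take minutes to hours rather than milliseconds.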
- Management Complexity: Admins need to manage both the primary storage and the external system.
- Unpredictable Costs: External services like AWS Glacier can have high retrieval fees, leading to unexpected costs.
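The cost unpredictability is easy to see with a little arithmetic. The sketch below models a monthly bill as storage plus retrieval plus request fees. Every price in it is a made-up placeholder, not an actual rate from AWS or any other provider, and the function and parameter names are hypothetical.

```python
def monthly_cost(
    stored_tb: float,
    retrieved_tb: float,
    requests: int,
    storage_price_per_tb: float,
    retrieval_price_per_tb: float,
    price_per_1000_requests: float,
) -> float:
    """Estimate a monthly bill for an external tier (all prices hypothetical)."""
    storage = stored_tb * storage_price_per_tb
    retrieval = retrieved_tb * retrieval_price_per_tb
    request_fees = requests / 1000 * price_per_1000_requests
    return storage + retrieval + request_fees


if __name__ == "__main__":
    # Placeholder prices, chosen only to show the shape of the bill.
    prices = dict(
        storage_price_per_tb=4.0,
        retrieval_price_per_tb=20.0,
        price_per_1000_requests=0.05,
    )
    quiet = monthly_cost(stored_tb=100, retrieved_tb=0, requests=0, **prices)
    busy = monthly_cost(stored_tb=100, retrieved_tb=50, requests=1_000_000, **prices)
    print(f"month with no recalls:   {quiet:.2f}")
    print(f"month with heavy recalls: {busy:.2f}")
```

With these placeholder numbers, the month with heavy recalls costs several times as much as the quiet month even though the amount of stored data is identical. The storage line is predictable; the retrieval and request lines depend entirely on user behavior.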
The “Hammer” Analogy
External tiering excels for storing large amounts of very cold data, such as experimental results or legal archives. It provides a cost-effective and energy-efficient solution. It can also be beneficial for organizations with extensive data and the expertise to manage their own tape library (typically for 100+ PB of data).
However, external tiering becomes a poor choice when it is used to compensate for an undersized or expensive hot storage tier. This leads to:
- Increased Costs: Maintaining both internal and external storage systems is expensive. With external services like S3, per-operation fees add to the bill, and retrieval is limited by internet bandwidth, so frequent recalls are both slow and costly.
- User Frustration: Users experience delays as data is “paged in” from the external system.
- Constant Tiering: A small hot tier leads to users competing for space, causing data to constantly move between internal and external storage, impacting performance.
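The constant-tiering effect can be illustrated with a small simulation. The sketch below treats the hot tier as a least-recently-used (LRU) cache and counts how often a file has to be recalled from the external tier. It assumes all files are the same size, so capacity is counted in files, and LRU is an assumed eviction policy for illustration rather than a description of any particular product.

```python
from collections import OrderedDict


def count_recalls(accesses, hot_capacity):
    """Count recalls from the external tier, modelling the hot tier as an LRU cache.

    Assumes equally sized files, so hot_capacity is a number of files.
    """
    hot = OrderedDict()
    recalls = 0
    for name in accesses:
        if name in hot:
            # Already on the hot tier: mark as most recently used.
            hot.move_to_end(name)
        else:
            # Not on the hot tier: recall it from the external system.
            recalls += 1
            hot[name] = True
            if len(hot) > hot_capacity:
                # Hot tier is full: push the least recently used file out.
                hot.popitem(last=False)
    return recalls


if __name__ == "__main__":
    # Three files read in rotation, four rounds (12 accesses in total).
    workload = ["a", "b", "c"] * 4
    print("hot tier fits 3 files:", count_recalls(workload, 3), "recalls")
    print("hot tier fits 2 files:", count_recalls(workload, 2), "recalls")
```

When the hot tier holds the whole working set, each file is recalled once. Shrink it by a single file and, for this rotating workload, every one of the 12 accesses triggers a recall, which is the thrashing behavior described above.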
The Quobyte Advantage
Quobyte offers a robust internal tiering solution that addresses the limitations of external tiering:
- Native Support: Quobyte seamlessly integrates NVMe for low latency and high throughput alongside HDDs for cost-effective storage.
- Multi-Tier Flexibility: Create as many tiers as needed (e.g., high-end flash, read-optimized flash, HDD, high-density archive HDDs).
- Granular Control: The Quobyte policy engine empowers admins to define tiering rules for specific file types at any time.
- Transparent Movement: Files are moved between tiers without impacting user access.
- Intelligent Optimization: Quobyte goes beyond basic tiering by intelligently switching between storage media within the same file, optimizing performance for mixed workloads with large and small data files (common in AI and life sciences).
By leveraging Quobyte’s internal tiering capabilities, you can ensure your data storage is efficient, cost-effective, and delivers the performance your users need. Stop using external tiering as a one-size-fits-all solution and choose the right tool for the job.
Originally posted on Quobyte’s blog on April 08, 2024.