Welcome to 2024 and all the innovation that the year will bring! Throughout 2023, Google Cloud worked diligently to expand its storage offerings and deliver key feature enhancements to existing services. The storage options, and the ways to apply and use them, are broad and varied. In this blog post, we will explore the key Google Cloud storage product offerings, how to use each service, and the key feature enhancements that Google delivered in 2023 across the storage product line.
We have consolidated all this information so that you won’t have to spend your day searching for it. Instead, you can focus on leveraging the technology to improve your day-to-day operations! Let’s begin. This will be fairly thorough, so be advised, it will be quite the read!
Below are the links to explore the storage products based on your specific interests:
- Google Cloud Storage
- Google Persistent Disk
- Google Filestore
- Google Cloud Backup and DR Service
- Google Cloud Hyperdisk
- Google Parallelstore
- Google Cloud NetApp Volumes
Google Cloud Storage
Google Cloud Storage (GCS) is Google Cloud’s most widely used and versatile storage offering. It falls under the industry category of Object Storage, and quite frankly, this product offering needs no introduction.
Leverage GCS for a wide range of object storage solutions. The use cases are broad and varied, including hosting documents and files of any type for content delivery, storing document relationships, and serving as a reliable solution for backup, archival, artificial intelligence, machine learning, and more.
The 2023 updates
Increased egress bandwidth quotas: In late September 2023, Google Cloud Storage changed its approach to enforcing egress bandwidth quotas. Instead of a single default value, quotas now depend on individual project history and billing status. This means most projects will see unchanged or increased quotas, allowing for smoother data transfer.
Cloud Storage FUSE (link): Released in 2023, FUSE allows Linux or UNIX hosts to mount and access a cloud storage bucket as a local filesystem. FUSE underwent a wide variety of enhancements, including:
- Log rotation configuration: Cloud Storage FUSE now supports the ability to configure log rotation (link).
- Mounting behavior control: You can control the mounting behavior of Cloud Storage FUSE by using a configuration file instead of global options.
- ARM64 Machine Support: Cloud Storage FUSE is now available for ARM64 machines.
Cloud Storage FUSE is ideal for AI/ML and Google Kubernetes Engine (GKE) workloads that need direct file-system access to a bucket from the server OS, while applications keep object-level control through ‘gsutil’ or the Storage API.
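To make the "bucket as a local filesystem" idea concrete, here is a minimal sketch of what application code might look like once a bucket is mounted. It assumes the bucket is already mounted at a hypothetical path such as /mnt/my-bucket (the path and the function name are ours, not from the Cloud Storage FUSE documentation). The point is that only ordinary file APIs are involved.

```python
from pathlib import Path


def summarize_mount(mount_point: str) -> dict[str, int]:
    """Return {relative_path: size_in_bytes} for every file under mount_point.

    mount_point is assumed to be a directory where a bucket has already been
    mounted (for example a hypothetical /mnt/my-bucket). Nothing here is
    specific to Cloud Storage: the code only uses standard filesystem calls.
    """
    root = Path(mount_point)
    return {
        p.relative_to(root).as_posix(): p.stat().st_size
        for p in sorted(root.rglob("*"))
        if p.is_file()
    }
```

Calling summarize_mount("/mnt/my-bucket") would walk the mounted bucket exactly as it would walk any local directory, which is why existing tools and training scripts can often run unchanged.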
Pub/Sub to Cloud Storage subscriptions (link): The Pub/Sub integration allows for a simplified streaming data (object) ingestion into your object storage buckets. The export subscription writes messages to an existing Cloud Storage bucket as they are received without the need to configure a separate subscriber client.
Anywhere Cache: Provides an elastically scalable zonal SSD read cache to minimize egress bandwidth costs with predictable low latencies.
gRPC API (link): The introduction of gRPC API in Cloud Endpoints as a new Cloud Storage API option provides more efficient routing for analytics workloads, resulting in a reduction of overall execution time. The gRPC API enables direct calls to methods on a server application located on a different machine, simulating the interaction as if it were a local object.
Improved Hadoop connector (link): Enhanced write performance for Hadoop/Spark workloads on Cloud Storage via parallelization and disk buffering.
Enhanced data transfer and management:
- Cloud Storage client library transfer manager: Improved read/write performance in client libraries by parallelizing uploads and downloads.
- Transfer for HDFS: This enables the use of Storage Transfer Service to easily migrate petabytes of Hadoop/Spark workloads to Google Cloud.
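The parallelization idea behind the transfer manager can be sketched with the Python standard library alone. This is not the client library's API. It is a toy illustration in which fetch_range is a hypothetical callable standing in for a ranged read of an object, and the chunk size and worker count are arbitrary assumptions.

```python
from concurrent.futures import ThreadPoolExecutor
from typing import Callable


def chunk_ranges(total_size: int, chunk_size: int) -> list[tuple[int, int]]:
    """Split [0, total_size) into half-open (start, end) byte ranges."""
    if chunk_size <= 0:
        raise ValueError("chunk_size must be positive")
    return [
        (start, min(start + chunk_size, total_size))
        for start in range(0, total_size, chunk_size)
    ]


def parallel_download(
    fetch_range: Callable[[int, int], bytes],
    total_size: int,
    chunk_size: int,
    workers: int = 4,
) -> bytes:
    """Fetch every chunk concurrently and reassemble the chunks in order."""
    ranges = chunk_ranges(total_size, chunk_size)
    with ThreadPoolExecutor(max_workers=workers) as pool:
        # Executor.map yields results in input order, so the chunks line up
        # correctly even when they finish out of order.
        parts = pool.map(lambda r: fetch_range(r[0], r[1]), ranges)
        return b"".join(parts)
```

Because each chunk is independent, throughput scales with the number of workers until the network or the disk becomes the bottleneck. The same reasoning applies to uploads.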
Improved security and monitoring:
- Object Lifecycle Management (OLM) enhancements: This includes support for custom time-based rules and improved integration with Cloud Functions for automated actions based on lifecycle events.
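As a rough illustration of custom time-based rules, the sketch below builds a lifecycle configuration as a Python dictionary and renders it as JSON. The condition names daysSinceCustomTime and customTimeBefore are lifecycle conditions as we understand them. The thresholds, the function names, and the exact wrapper expected by a given tool are assumptions, so check the Object Lifecycle Management documentation before applying anything like this to a real bucket.

```python
import json


def build_lifecycle_config(archive_after_days: int, delete_before: str) -> dict:
    """Build a lifecycle configuration with two custom-time rules.

    archive_after_days: days after an object's custom time before it moves
        to the ARCHIVE storage class (illustrative threshold).
    delete_before: a date string such as "2020-01-01". Objects whose custom
        time is earlier than this date are deleted (illustrative value).
    """
    return {
        "lifecycle": {
            "rule": [
                {
                    "action": {"type": "SetStorageClass", "storageClass": "ARCHIVE"},
                    "condition": {"daysSinceCustomTime": archive_after_days},
                },
                {
                    "action": {"type": "Delete"},
                    "condition": {"customTimeBefore": delete_before},
                },
            ]
        }
    }


def render(config: dict) -> str:
    """Serialize the configuration as indented JSON."""
    return json.dumps(config, indent=2)
```

Keeping the rules in code makes it easy to review them and to generate per-bucket variants before they are applied.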
Overall, 2023 provided a great deal of enhancements to Google Cloud Storage to provide more versatility and performance across a variety of different workloads from AI/ML through high-performance computing. All of the updates to GCS can be seen here. You can also find all historical updates to GCS here.
Google Persistent Disk
The Persistent Disk (PD) offering is, along with Google Cloud Storage, the most commonly used storage tool in Google Cloud. Google Persistent Disk is a disk attached directly to Google Compute Engine (GCE) instances. Persistent Disk is available as Standard Persistent Disk (HDD or SSD) or as Balanced Persistent Disk for a blend of performance and cost management. Persistent Disk is deployed in either a zonal or a regional configuration, depending on the high availability needs of your tech stack. Google Persistent Disk falls under the storage category of Block Storage.
Persistent Disk is leveraged by Compute Engine VMs for a wide variety of workloads. Think of this as the drive or mount point on your server. You could be building VMs to host containers, IT infrastructure tooling, databases, stand-alone file servers and more. Leveraging Persistent Disk should be a balance of performance, capacity, cost and availability features for your server.
The 2023 updates
- SSD Persistent Disk expansion: The maximum capacity for standard SSD Persistent Disks was increased to 32 TiB in June 2023, providing more options for workloads requiring high IOPS and low latency.
- Regional Persistent Disk Availability: Certain Persistent Disk types became available in more Google Cloud regions (Seoul, Mumbai, Osaka, Jakarta, and Salt Lake City), offering users greater choice and control over their storage locations.
- Replica recovery checkpoint of a regional Persistent Disk for crash consistency: You can use the checkpoint to create disk snapshots from an incomplete zonal replica.
- Use of a regional Persistent Disk as a VM boot disk (link).
- Persistent Disk Asynchronous Replication (link): This is a big deal! This capability provides the ability to have a replicated copy in another region in standby mode. This will give users aggressive RPO capabilities inline to the host.
- For all historical updates on Persistent Disk, click here.
Google Filestore
Google Cloud Filestore is Google Cloud’s native Network Attached Storage solution. Filestore continues to serve NFSv3 protocol storage for the needs of your Linux- and Unix-based GCE instances in Google Cloud.
Filestore is a general purpose storage product that allows multiple servers to mount a common file system where they can jointly collaborate. File storage can be leveraged for a wide range of applications from simple file sharing capabilities to hosting IO intensive applications and databases. Filestore is ideal for scenarios where the scale-out of hosts is required, and there’s a need for consistent data sharing across all hosts without the necessity for multiple copies.
The 2023 updates
New tiers and options:
- Filestore is now available in the following regions: Berlin, Dammam, Doha, and Turin.
- Support for Google Cloud VMware Engine (GCVE) as an NFS Datastore (link).
- Enterprise Key Management with customer-managed encryption keys (link).
- Announced in September 2023, this new tier offers regional instances with 1–10 TiB of storage, catering to high-performance and data-intensive needs. It includes features like non-disruptive upgrades, snapshots, and backups.
- GKE now supports extensions for single-share backup and restore for GKE clusters (link), along with multi-share access for GKE. This includes support for up to 80 shares across a single enterprise-tier instance (link).
- Enterprise tier backups of Filestore (link)
High Scale SSD tier (beta):
- Released in December 2023, this tier provides exceptional performance for demanding workloads like AI/ML and high-performance computing. It boasts capacities from 60–320 TB and allows scaling between those ranges.
- The ability to revert (roll-back) snapshots (link)
- Filestore multi-shares for GKE (preview): This feature offers smaller NFS-mounted Persistent Volumes from a single Filestore instance, making it more cost-efficient for GKE users with varied storage needs.
- IP-based access control (beta): This feature allows granular control over file share access by specifying permitted client IP addresses, improving security.
- Private services access (beta): This feature enables creating Filestore instances on Shared VPC networks in service projects, enhancing security and network control.
- Increased maximum capacity: The Standard SSD tier now allows up to 200 TiB of storage, accommodating larger datasets.
- Performance optimizations: Ongoing improvements have been made to enhance IOPS and throughput across various tiers.
- Expanded regional availability: Filestore is now available in more Google Cloud regions, providing wider accessibility.
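The IP-based access control item in the list above boils down to matching a client address against a set of permitted ranges. The sketch below shows that check with Python's standard ipaddress module. The rule layout (a CIDR range paired with an access mode, first match wins) and the mode names are our own assumptions for illustration, not Filestore's configuration schema.

```python
import ipaddress
from typing import Optional


def resolve_access(client_ip: str, rules: list[tuple[str, str]]) -> Optional[str]:
    """Return the access mode of the first rule whose range contains client_ip.

    rules is an ordered list of (cidr_range, access_mode) pairs. Returns None
    when no rule matches, meaning the client is not permitted at all.
    """
    addr = ipaddress.ip_address(client_ip)
    for cidr, mode in rules:
        if addr in ipaddress.ip_network(cidr):
            return mode
    return None
```

For example, with rules [("10.0.0.0/24", "READ_WRITE"), ("10.0.0.0/16", "READ_ONLY")], a client at 10.0.0.5 gets read-write access, a client at 10.0.1.5 gets read-only access, and any address outside 10.0.0.0/16 is rejected.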
Filestore brought many new features and capabilities that provide a higher degree of security, performance, and scalability. The strides in multi-share capabilities for GKE give Kubernetes-based workloads the ability to consume storage in smaller increments while providing detached persistent volume claims at scale. The changes continue to provide Filestore with key features and capabilities to help users build more robust infrastructure. All product updates can be found here.
Google Cloud Backup and DR Service
The Google Cloud Backup and DR Service is a managed service that provides backups for Google Compute Engine instances and VMware VMs directly from the Google Cloud console. Backups can be stored in Google Cloud Storage as well as archived. Recovery is flexible, occurring within a project, across multiple projects, and supporting multi-region recovery. The Google Cloud Backup and DR Service provides features such as ‘incremental-forever’ backups (which save significant storage usage in the long run), instant mount and recovery for low Recovery Time Objectives (RTOs), application-consistent backups, and more.
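To see why ‘incremental-forever’ backups save storage over time, here is a back-of-the-envelope sketch. The model is deliberately simple and the numbers are illustrative assumptions: every retained restore point after the first stores only the changed data, the daily change rate is constant, and deduplication and compression are ignored.

```python
def full_backup_storage(size_gib: float, retained_copies: int) -> float:
    """Storage needed when every retained restore point is a full copy."""
    return size_gib * retained_copies


def incremental_forever_storage(
    size_gib: float, daily_change_rate: float, retained_copies: int
) -> float:
    """Storage needed for one full baseline plus changed data only.

    daily_change_rate is the fraction of the data set that changes between
    consecutive restore points (for example 0.02 for 2%).
    """
    if retained_copies < 1:
        return 0.0
    return size_gib + size_gib * daily_change_rate * (retained_copies - 1)
```

With a 1,000 GiB data set, a 2% change rate, and 30 retained restore points, the full-copy approach needs 30,000 GiB while the incremental-forever approach needs roughly 1,580 GiB under these assumptions.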
Leverage Google Cloud Backup and DR Service to safeguard against ransomware attacks, unplanned outages, and enable recovery in remote regions of Google Cloud for Compute Engine VMs and VMware Engine VMs. Ensuring the protection and backup of critical databases running on any GCE VM or GCVE VM is paramount. This includes databases such as Microsoft SQL, Oracle, SAP, IBM Db2, MySQL, as well as file systems, and more.
The 2023 updates
- Archive snapshots for Compute Engine instance backups: This allows moving older backups to lower-cost storage tiers for long-term archival and compliance purposes (link).
- Simplified appliance update experience: The management console offered a streamlined process for updating backup/recovery appliances, facilitating easier maintenance.
- Agent support for additional operating systems: The backup and recovery agent gained support for RHEL 8.6, 8.7, and 9.0, as well as Oracle Enterprise Linux 8.7 and 9.0, expanding compatibility.
- Improved metrics reporting: The management console and backup/recovery appliance received enhanced metrics reporting to aid in monitoring and troubleshooting.
- Project cleanup guidance: Google added specific guidance for deleting or disabling backup and DR components within a project, ensuring proper cleanup procedures.
- Support for Audit Logging, Cloud Logging, and Cloud Monitoring.
- Management console running highly available within a deployed region as well as management console integration with the Google Cloud CLI.
- Security Command Center Premium integration: Event Threat Detection, a premium service, released new rules specifically designed to monitor and detect potential threats within Google Cloud Backup and DR Service, enhancing overall security posture (link).
Leveraging and deploying backups is now easier than ever on the Google Cloud console. We cannot overstate the importance of protecting your data, exercising routine recovery tests, and maturing your Business Continuity practice. For all Google Cloud Backup and DR Service 2023 updates, click here.
Google Cloud Hyperdisk
This is the news! Welcome to the juicy part of storage, where Google introduces the concept of Storage Area Network-style Block Storage to the cloud. It’s a game-changer with the General Availability release of Hyperdisk Throughput and the Preview of Hyperdisk Balanced. This innovation is especially significant for structured workloads running on select N2, N2D, and T2D GCE instances. But why is this so important? Let’s delve into ‘The how.’
Hyperdisk emerges as a powerful storage option, available for selection when provisioning specific N2, N2D, and T2D GCE instances. Check out the full list of supported instance types here. Opt for Hyperdisk in scenarios where your VMs handle structured workloads demanding high throughput, high IOPS, or a combination of both – especially suitable for database-centric workloads like Oracle or SQL. While Persistent Disk may come to mind for your performance needs, it comes with caveats. Persistent Disk’s bandwidth is capped at 2,400 MB/s and 100K IOPS for sustained Read or Write, tethered to the VM vCPU count.
Users could explore Local SSD. Local SSD already delivers up to 900K read IOPS and 800K write IOPS, and NVMe Local SSD delivers up to 3.2M read IOPS and 1.6M write IOPS. It’s true! However, keep in mind that the maximum provisioned capacity is limited to 9 TiB (up to 24 disks) or 12 TiB (up to 32 disks). Additionally, note that there are no data persistence guarantees (here) or backup capabilities associated with Local SSD.
Enter Hyperdisk! Boasting speeds of up to 10,000 MBps and supporting a maximum of 500K Read/Write IOPS, Hyperdisk unlocks the gateway to high-performance workloads on GCE!
Leverage Hyperdisk for databases, AI/ML, and High-Performance Compute (HPC) workloads to provide dedicated low latency, high IOPS, and high-throughput needs.
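The limits quoted above can be turned into a quick selection check. The sketch below hard-codes the ceilings cited in this post (100K IOPS and 2,400 MB/s for Persistent Disk, 500K IOPS and 10,000 MBps for Hyperdisk) purely for illustration. Real limits depend on disk type, disk size, and VM shape, so treat the table and the function names as assumptions rather than a sizing tool.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class DiskOption:
    name: str
    max_iops: int
    max_throughput_mbps: int


# Ceilings as quoted in this post, listed in order of preference.
OPTIONS = [
    DiskOption("Persistent Disk", max_iops=100_000, max_throughput_mbps=2_400),
    DiskOption("Hyperdisk", max_iops=500_000, max_throughput_mbps=10_000),
]


def pick_disk(required_iops: int, required_mbps: int) -> Optional[str]:
    """Return the first option whose quoted ceilings cover both requirements.

    Returns None when neither option is sufficient, which is the cue to look
    at Local SSD (if losing data on VM stop is acceptable) or to re-architect.
    """
    for option in OPTIONS:
        if (
            required_iops <= option.max_iops
            and required_mbps <= option.max_throughput_mbps
        ):
            return option.name
    return None
```

A workload needing 50K IOPS at 1,000 MB/s fits Persistent Disk, while one needing 300K IOPS falls through to Hyperdisk.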
The 2023 updates
- Tiered options: The release of Hyperdisk Extreme and Hyperdisk Throughput. Currently, Hyperdisk Balanced is in preview mode.
- Google Cloud Hyperdisk Documentation: [https://cloud.google.com/compute/docs/disks/hyperdisks]
Google Parallelstore
Continuing with the new! Let’s delve into Parallelstore, which is currently in preview. It’s a high-performance, managed parallel file service built on Intel DAOS. Parallelstore boasts a POSIX-compliant file system that scales massively and delivers low-latency access from host to storage. Sounds fast? Well, it is!
Parallelstore falls under the category of filesystems. What sets it apart from traditional NAS is its scalability, its data distribution, and the number of concurrent users that can access data simultaneously. While Google has not yet published specific numbers for performance benchmarks, throughput benchmarks, or protocol support, our initial speculation is that NFSv4 is likely supported.
When will Parallelstore fit your infrastructure’s needs? The moment Filestore falls short of meeting your performance expectations or capacity demands. That’s when it’s time to harness the power of Parallelstore for your HPC requirements, AI/ML intensive training workloads, or Computer-Aided Engineering tasks such as rendering workloads, complex modeling, and more. To say this solution will be fast, durable, and scalable is an understatement! We can’t wait to get our hands on this.
The 2023 updates
- Google Cloud’s blog post: https://cloud.google.com/architecture/parallel-file-systems-for-hpc
- InfoQ article: https://www.infoq.com/news/2023/08/google-cloud-new-storage-options/
- SiliconANGLE article (link)
Google Cloud NetApp Volumes
Last but not least, Google Cloud and NetApp have joined forces to create a fully managed first-party solution called Google Cloud NetApp Volumes. This solution expands Google Cloud’s Network Attached Storage (NAS) offerings. Users can now leverage the technology directly from the Google Cloud Console. Google Cloud NetApp Volumes rounds out the storage solutions portfolio for Google. This provides Google Cloud customers with a robust solution that is feature-rich and protocol-ready. Google Cloud NetApp Volumes can serve a volume over various versions of NFS, SMB, or a mix of both protocols.
Let’s outline some of the features:
- Provision volumes from 100 GiB to 100 TB
- Data protection galore: Automated Snapshots, integrated backup, cross region volume replication, rapid cloning.
- Security features: LDAP, Active Directory integration
- Kubernetes CSI integration
- Here is a list of all of the product capabilities. Brace yourself, there are a lot!
Google Cloud NetApp Volumes serves a wide range of use cases, starting with Windows-based file shares:
- Windows File Shares: Leveraging this solution will reduce complexities of managing or rolling your own Windows File Server. The same applies for a Linux/Unix based file system.
- Google Cloud VMware Engine Datastore: Additionally, you can also leverage this solution for additional Datastore capacity for Google Cloud VMware Engine.
- Database Storage: Database solutions running on Google Compute Engine and VMs in Google Cloud VMware Engine benefit from decoupling the storage from the compute node by having the database tables, logs, etc run on Google Cloud NetApp Volumes.
- Kubernetes: Kubernetes-based workloads can also leverage this solution for Persistent Volume Claims.
The greatest strength of this solution is inline asynchronous replication to secondary regions with very aggressive RPOs and RTOs. Failover and failback between regions also happen inline, and only the delta changes made since the last site failover are copied.
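The "copy only the delta" idea can be illustrated with a toy block-level sync. This is not how NetApp implements replication. It is a minimal sketch that models a volume as a list of fixed-size blocks, compares block hashes, and transfers only the blocks that differ. All names here are hypothetical.

```python
import hashlib


def sync_delta(source: list[bytes], replica: list[bytes]) -> int:
    """Bring replica in line with source and return the blocks transferred.

    Only blocks whose SHA-256 digest differs (or that are missing on the
    replica) are copied. Unchanged blocks are left untouched.
    """
    transferred = 0
    for index, block in enumerate(source):
        if index >= len(replica):
            # The source grew: append the new block to the replica.
            replica.append(block)
            transferred += 1
        elif hashlib.sha256(block).digest() != hashlib.sha256(replica[index]).digest():
            # The block changed since the last sync: overwrite it.
            replica[index] = block
            transferred += 1
    # The source shrank: drop trailing blocks that no longer exist.
    del replica[len(source):]
    return transferred
```

After an initial full copy, a later sync of a volume where one block in a thousand changed would move a single block, which is what makes short replication intervals, and therefore aggressive RPOs, practical.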
Security is top of mind: this solution suits workloads that demand strict access controls, data encryption, and temporary on-demand capacity.
Finally, keep in mind that this solution comes from the NetApp family of products. Explore NetApp Global File Cache. Global File Cache allows your users to be geographically located in different regions of Google Cloud, and through this software-based solution, you can quickly and securely cache datasets closer to the end user, thus reducing or eliminating latency associated with file access for Windows File Serving needs. Global File Cache is a great tool to use with Google Cloud NetApp Volumes.