What are the best debut albums of all time? Who are the greatest Impressionist artists? To find out, check Ranker, the world’s leading publisher of fan-powered ratings, featuring polls on just about everything, including entertainment, brands, sports, food, and culture. Ranker attracts over 30 million monthly unique visitors worldwide and manages one of the largest databases of people-powered opinions.
Founded in 2010, Ranker has amassed over one billion votes and counting, which power Ranker Insights, a treasure trove of deep psychographic correlation data that delivers personalized consumer recommendations and audience insights to third-party organizations seeking a deeper understanding of consumer tastes and preferences. Ranker was listed on Deloitte’s 2019 Technology Fast 500 and is ranked No. 5 on Fast Company’s list of the 10 most innovative companies in data science for 2021.
Recently, the data science company launched Watchworthy, a statistically relevant crowdsourced TV and movie recommendation app, powered by its proprietary machine learning platform and algorithms, for consumers to build personal watchlists from over 200 networks and streaming services.
Since Skylar Wolf, Ranker’s Chief Architect, joined ten years ago, the company’s database has grown to over one billion rows of data. “Ranker is very much a big data company,” says Wolf. “We’ve rapidly iterated and created so many models that the biggest problem we needed to solve was managing fifty different models in production, training them, deploying them, keeping all this up at scale, and serving real-time results.”
Keeping all those models up and running is important to the success of Ranker. “Any downtime results in a loss of revenue,” says Ramesh Nori, CTO at Ranker. “Previously, there were instances where we had a significant spike in traffic because of some event that had happened within the music community or in pop culture. The burst in traffic was causing latency, and we were existentially dependent on our hosted CSP to resolve such issues.”
To reduce downtime, the company decided to transition from its previous cloud solution provider (CSP), built on virtual machines (VMs), to Google Cloud Platform (GCP) and Google Kubernetes Engine (GKE).
“We wanted to switch from a VM-based, old-school architecture to a Kubernetes containerized environment. Unsurprisingly, Google Cloud has the best implementation of Kubernetes because they created it. In addition, we were impressed with Google Cloud’s managed services offerings.”– Skylar Wolf | Chief Architect at Ranker
Ranker’s models, particularly with Watchworthy, are very large, so they require a lot of RAM and CPU. With so many different models, each using varying levels of compute resources, specifying the size of nodes to exactly match the size of each model was challenging, often resulting in excess capacity on nodes.
“Our models typically don’t work in a serverless environment, as there are size limits on the functions you can have, and our models exceed those limits,” says Wolf. “Then, when we started using GKE Standard clusters, everything was great, but we started noticing that we had unused memory and processor resources. Allocating those resources in Kubernetes had become less efficient.”
Google Cloud and SADA, a Google Cloud Premier Partner and three-time Google Cloud Reseller Partner of the Year, went all-in to help Ranker’s engineering team achieve its goals of saving money and reducing complexity while increasing the stability and redundancy of its infrastructure. When Google Cloud introduced GKE Autopilot, SADA encouraged and guided Ranker to migrate to this new, easier and more cost-effective mode of operating a Kubernetes cluster.
“GKE Autopilot is now at the center of our ML systems. All our ML models are deployed through GKE Autopilot,” says Wolf. “From there, they’re serving recommendations and scores in real-time, scaling automatically, and saving us money at the same time. In our application service layer we are reading and writing to Cloud SQL (MySQL) for our transactional data. At the entry points into our network, we use Memorystore (Redis) to power our concept of global redirects, as well as provide sub-millisecond level access to data managed by our APIs.”
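The serving setup Wolf describes can be sketched as a Kubernetes Deployment. The article doesn’t publish Ranker’s actual manifests, so the names, image, and resource figures below are purely illustrative, expressed here as a Python dict for clarity.

```python
# Hedged sketch of a model-serving Deployment on GKE Autopilot.
# All names, the image, and the resource figures are hypothetical.
model_serving_deployment = {
    "apiVersion": "apps/v1",
    "kind": "Deployment",
    "metadata": {"name": "recommendation-model"},  # hypothetical name
    "spec": {
        "replicas": 2,
        "selector": {"matchLabels": {"app": "recommendation-model"}},
        "template": {
            "metadata": {"labels": {"app": "recommendation-model"}},
            "spec": {
                "containers": [{
                    "name": "model-server",
                    "image": "gcr.io/example-project/model-server:latest",  # illustrative
                    "resources": {
                        # On Autopilot you are billed for what you request,
                        # not for node capacity, so right-sizing these
                        # requests per model is the whole cost story.
                        "requests": {"cpu": "2", "memory": "8Gi"},
                    },
                }],
            },
        },
    },
}
```

Because Autopilot provisions capacity to match each pod’s requests, large models and small models can coexist in one cluster without leaving stranded headroom on oversized nodes.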
Eventually, all the data ends up in BigQuery: data is periodically copied there from Ranker’s transactional databases, and for certain especially valuable data, such as poll votes, Ranker uses BigQuery’s streaming API to ingest it in real time.
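A streaming ingest along these lines might look like the following sketch, which uses the `insert_rows_json` method of the BigQuery Python client. The table and field names are assumptions for illustration, not Ranker’s actual schema.

```python
# Hedged sketch: streaming poll votes into BigQuery in real time.
# Field and table names are illustrative, not Ranker's actual schema.
from datetime import datetime, timezone


def vote_to_row(list_id, item_id, direction):
    """Serialize one poll vote as a JSON-compatible row."""
    return {
        "list_id": list_id,
        "item_id": item_id,
        "direction": direction,  # e.g. "up" or "down"
        "voted_at": datetime.now(timezone.utc).isoformat(),
    }


def stream_votes(client, table_id, votes):
    """Insert vote rows via the streaming API.

    `client` is assumed to be a google.cloud.bigquery.Client;
    insert_rows_json returns a list of per-row errors (empty on success).
    """
    rows = [vote_to_row(*vote) for vote in votes]
    return client.insert_rows_json(table_id, rows)
```

Rows inserted this way are queryable within seconds, which is what lets dashboards built on BigQuery refresh multiple times a day without waiting for the periodic batch copies.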
“We have hundreds of dashboards featuring thousands of metrics, which are automatically updated multiple times a day with the data coming from BigQuery,” says Wolf. “This simply wouldn’t be possible or cost-effective with any other technology.”
As a result of working with SADA to implement GKE Autopilot, Ranker was able to simplify the management of its Kubernetes cluster infrastructure, control plane, and nodes, reducing the need to learn the nitty-gritty details of cluster configuration. GKE Autopilot automatically applies industry best practices and can reduce or eliminate node management operations, maximizing Ranker’s cluster efficiency.
“With GKE Autopilot, you’re only charged for the resources requested; there’s no concept of unused capacity on nodes, no worrying about node sizes. That’s been a massive advantage and allowed us to reduce our costs by 30% to 50%, just by switching from GKE Standard to GKE Autopilot.”– Skylar Wolf | Chief Architect at Ranker
Ranker also started utilizing Google Cloud’s new Spot Pods to run dev/test cluster workloads, which can tolerate some capacity disruption, alongside its production architecture. “Cumulatively, the cost savings that we’ve seen just from using these new products is pretty immense, in excess of 60%,” says Wolf. “I think it’s a great testament to the innovation Google Cloud is putting into GCP and its related services.”
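On GKE Autopilot, a workload opts into Spot capacity with a `nodeSelector` on the `cloud.google.com/gke-spot` label. The sketch below shows the shape of such a pod spec; the workload name and image are hypothetical.

```python
# Hedged sketch: opting a disruption-tolerant workload into Spot Pods
# on GKE Autopilot. Names and image are illustrative.
spot_pod_template = {
    "apiVersion": "v1",
    "kind": "Pod",
    "metadata": {"name": "devtest-worker"},  # hypothetical name
    "spec": {
        # Schedules this pod onto Spot capacity, which is cheaper but can
        # be preempted, so it suits dev/test and other interruptible jobs.
        "nodeSelector": {"cloud.google.com/gke-spot": "true"},
        # Preemption notice is short, so shut down quickly and cleanly.
        "terminationGracePeriodSeconds": 25,
        "containers": [{
            "name": "worker",
            "image": "gcr.io/example-project/devtest-worker:latest",  # illustrative
        }],
    },
}
```

The trade-off is explicit: workloads that can restart after a preemption run at a steep discount, while latency-sensitive production serving stays on regular capacity.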
Migration to GCP has allowed Ranker to spend less time monitoring architecture performance and more time focusing on innovation, building feature-rich systems, and delivering value to its users. Overall, SADA helped Ranker:
- Migrate infrastructure from a legacy CSP to GCP
- Shift their full production ML stack to GKE Autopilot
- Scale infrastructure to address traffic spikes
- Reduce latency of image resizing systems
- Improve stability and performance of Ranker services
Running a Kubernetes cluster is not necessarily supposed to be entertaining. However, with GKE Autopilot, Ranker found a more efficient way to deliver additional fun for their users and increase profitability for themselves.