FinOps practitioners and finance professionals often begin their FinOps journeys by reporting on cloud spend using the cloud providers’ billing consoles, invoices, or billing data exports. Depending on the scenario, they might also employ a third-party cost management tool such as Cloudability or Ternary that offers consolidated reporting across clouds along with enhanced reporting and optimization recommendations.
In either case, practitioners are usually only able to generate reports on cloud cost data and some limited infrastructure performance data using these methods. There are many scenarios, however, where practitioners may find that these data sets are insufficient to meet their needs and additional data sets need to be integrated with the cloud billing data to improve reporting. In this article, we examine three case studies that illustrate FinOps challenges that our clients faced and how integrating additional data sets with the cloud billing data solved their problems.
Case #1: Analyzing and charging back spend on containerized workloads
Our financial services client had migrated a number of workloads to Kubernetes clusters over a period of several months. As more containerized workloads came online, the compute costs for the clusters grew, and their finance team began to ask for reports showing the costs that each of the client’s products was generating within the clusters.
Around the same time, the client noticed a troubling growth in non-production workload costs in the clusters, specifically in accounts where customer demonstration and proof-of-concept instances were running. When they began to investigate whether some workloads could be turned off because the customers had moved past the demonstration or proof-of-concept phase, they realized that they had no way of determining which customers’ instances were running on which compute instances. This meant that none of them could safely be turned off until the client could verify that no “live” customer data was still in use on them. Both problems (an inability to report cost by product and by customer) had the same cause: insufficient information in the cloud billing data. The client was billed for the cost of each cluster node’s runtime, with no breakdown of the cost of the containers running on the node.
The technical operators of the clusters, however, had access to additional tags not found in the billing data. Specifically, compute instances in Kubernetes clusters have infrastructure tags that are separate from the cost-oriented tags that appear in the billing data, and these tags could reveal which containers were associated with which customers and products.
How dataset integration solved the problem
We obtained an export of what is known as a “metering data” report from the Kubernetes clusters and imported it as a table into BigQuery. We then drafted a set of queries that enriched the billing data with labels from the metering export, which made it possible to report on the cost of containers within each compute node. From there we were able to help the client develop an approach for remediating both their labels and their cluster architecture so that the same reports could be generated from the costing data alone moving forward. The client realized significant cost savings once they were able to determine which non-production workloads were no longer needed and could be shut down, and their finance team was able to conduct much better customer and product profitability analysis.
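The enrichment described above can be sketched in a few lines of Python. This is a minimal illustration rather than the queries we ran in BigQuery: the node names, labels, and figures are hypothetical, and splitting each node's cost in proportion to CPU core-hours is an assumption made for the example, not necessarily the allocation rule used for the client.

```python
from collections import defaultdict

# Hypothetical billing rows: one billed cost per cluster node (invented values).
node_costs = {"node-a": 100.0, "node-b": 60.0}

# Hypothetical metering rows: (node, customer label, product label, CPU core-hours).
metering = [
    ("node-a", "customer-1", "product-x", 30.0),
    ("node-a", "customer-2", "product-x", 10.0),
    ("node-b", "customer-2", "product-y", 20.0),
]


def allocate_node_costs(node_costs, metering):
    """Split each node's billed cost across workloads in proportion to usage.

    Assumption: CPU core-hours are the allocation key. Memory or a blended
    metric could be substituted without changing the structure.
    """
    # Total usage recorded on each node, used as the denominator of the share.
    usage_per_node = defaultdict(float)
    for node, _customer, _product, core_hours in metering:
        usage_per_node[node] += core_hours

    # Allocate each node's cost to the (customer, product) labels found on it.
    allocated = defaultdict(float)
    for node, customer, product, core_hours in metering:
        share = core_hours / usage_per_node[node]
        allocated[(customer, product)] += node_costs[node] * share
    return dict(allocated)


costs = allocate_node_costs(node_costs, metering)
for (customer, product), cost in sorted(costs.items()):
    print(f"{customer} / {product}: {cost:.2f}")
```

Because every node's cost is fully distributed across the workloads metered on it, the allocated amounts sum back to the billed total, which makes the result easy to reconcile against the invoice.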
Case #2: Departmental chargeback
Each month, our retail client’s Finance team had to spend hours calculating the chargeback of their cloud costs to various departments using a large spreadsheet. Their workbook included a sheet that “mapped” their cloud accounts to departments in a many-to-one fashion, such as “accounts X, Y and Z are charged to department 1.” We have observed such “account mappings” at just about every client we have worked with, and using them for chargeback is time-consuming and error-prone. Worse, it limits visibility into the organization’s cloud economics to those who have access to the accounting team’s financial reports. It is far better for technical and product owners in an organization to have real-time visibility into their cloud spend so that they can optimize their consumption and quickly recognize anomalies.
In the case of this particular client, the account mapping had between 10 and 15 columns of information defining the relationship between their cloud accounts and a number of accounting-related dimensions such as general ledger codes, profit centers, and products. Their month-end “closing” process was very manual and the results had to be carefully audited, because the spreadsheet relied on lookup formulas between the cost reports they exported from the cloud provider’s console and the account “mapping.”
How dataset integration solved the problem
We imported the cloud provider’s billing export into BigQuery and created an “external table” out of their “mapping” spreadsheet. The “External Table” feature of BigQuery is very convenient: it allows BigQuery to access a flat file in Cloud Storage and treat the data in the file as a table in the database. This greatly eases maintenance of such data, since the file can simply be downloaded, edited, and re-uploaded, resulting in an instant update to BigQuery.
Using this method, the maintainer of the account mapping does not need to know SQL. We wrote queries that enriched this client’s billing data with all of the fields in the “account mapping” sheet and exposed the result as a BigQuery “View.” Their teams could then simply point a Data Studio report at the new View and instantly report on any of the dimensions in the “account mapping” spreadsheet. Monthly chargeback went from an arduous process buried in a large spreadsheet to a few clicks in Data Studio by anyone in the organization, who could instantly view and export reports segmented by any field in the account mapping.
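The lookup that the View performs can be sketched in Python as a simple join between billing rows and the mapping file. This is an illustration only, not the BigQuery View itself: the column names (`account_id`, `department`, `gl_code`), account names, and costs are all hypothetical.

```python
import csv
import io
from collections import defaultdict

# Hypothetical contents of the account-mapping file; column names are assumptions.
MAPPING_CSV = """account_id,department,gl_code
acct-x,department-1,6100
acct-y,department-1,6100
acct-z,department-2,6200
"""

# Hypothetical billing rows: (account_id, cost). "acct-q" is deliberately unmapped.
billing_rows = [
    ("acct-x", 120.0),
    ("acct-y", 80.0),
    ("acct-z", 50.0),
    ("acct-q", 10.0),
]


def load_mapping(text):
    """Index the mapping rows by account so each billing row can be enriched."""
    return {row["account_id"]: row for row in csv.DictReader(io.StringIO(text))}


def chargeback_by_department(billing_rows, mapping):
    """Sum cost per department, the many-to-one rollup the spreadsheet performed."""
    totals = defaultdict(float)
    for account_id, cost in billing_rows:
        # Accounts missing from the mapping are surfaced, not silently dropped.
        department = mapping.get(account_id, {}).get("department", "UNMAPPED")
        totals[department] += cost
    return dict(totals)


mapping = load_mapping(MAPPING_CSV)
totals = chargeback_by_department(billing_rows, mapping)
for department, cost in sorted(totals.items()):
    print(f"{department}: {cost:.2f}")
```

Keeping an explicit "UNMAPPED" bucket is a design choice for this sketch: it makes new accounts that have not yet been added to the mapping visible in the report instead of letting their costs disappear from the chargeback.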
Case #3: Isolating the cost of software licenses
We noticed a considerable amount of spending on compute instances with commercial Linux operating system licenses at our multi-cloud SaaS client. As a best practice, we recommend that our clients audit their technical teams periodically to verify that such commercial operating systems are necessary and that free and open source operating systems cannot be substituted.
Before approaching the technical teams, our client wanted to be sure that the potential savings would be worth the cost of conducting the technical audits. There was a problem calculating the cost, however. The commercial operating systems were being consumed on a cloud other than Google Cloud, and due to the way the cloud provider charged for compute instances, our client had no way of knowing what the commercial licenses were costing them. Specifically, unlike Google Cloud, the cloud provider did not break out commercial operating systems’ license fees from the fees for the instances’ run times; the client was simply charged a flat rate in the form of “one hour of runtime for instance type X running operating system Y.”
How dataset integration solved the problem
We downloaded the cloud vendor’s price list for compute and imported it as a table into BigQuery. We then drafted queries that looked up two prices for each instance type the client was running with a commercial operating system: the price with the commercial operating system, and the price of the same SKU with a free operating system. Subtracting the second price from the first revealed the license fee itself.
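The price comparison can be sketched in Python as follows. This is a minimal illustration, not our BigQuery queries: the instance types, operating system names, and hourly prices are invented placeholders and do not reflect any vendor's actual price list.

```python
# Hypothetical hourly price list keyed by (instance_type, operating_system).
# Every value here is invented for illustration; none is a real vendor price.
price_list = {
    ("small", "free-linux"): 0.02,
    ("small", "commercial-linux"): 0.08,
    ("large", "free-linux"): 0.40,
    ("large", "commercial-linux"): 0.50,
}


def license_fee(price_list, instance_type, commercial_os, free_os="free-linux"):
    """Return (hourly license fee, fee as a share of the commercial price).

    The fee is inferred by subtracting the free-OS price of the same
    instance type from the commercial-OS price.
    """
    commercial = price_list[(instance_type, commercial_os)]
    free = price_list[(instance_type, free_os)]
    fee = commercial - free
    return fee, fee / commercial


for instance_type in ("small", "large"):
    fee, share = license_fee(price_list, instance_type, "commercial-linux")
    print(f"{instance_type}: license fee {fee:.2f}/hour, {share:.0%} of the charge")
```

Multiplying the inferred hourly fee by the billed runtime hours for each instance gives the total license spend, which can then be grouped by technical owner.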
The results were startling: the license cost of the operating system, as a percentage of the entire charge for the instance, rose dramatically as instance size decreased. This actually made sense to us, since the commercial license fee was in effect a charge for technical support by the cloud vendor, and the level of effort they needed to exert (and so the price they should charge) would naturally reach some minimum level no matter how small a compute instance was.
With this discovery, we generated a report detailing the technical owners who were spending the most on commercial operating system instances, and in particular those who tended to use smaller instances. We then approached these technical owners with our findings. They were startled to see how much they were spending on the software licenses, and soon realized substantial savings by moving many of their instances to free and open source operating systems.
Summing it all up
There are countless more examples of the usefulness of dataset integration to FinOps. We believe that every FinOps team should adopt the best practice of ingesting their cloud billing data into a scalable database service such as BigQuery so that the integration of such datasets can be effected with minimal lead time when needed. As a general practice, we believe that commercial cloud cost management tools such as Ternary or Cloudability can provide technical owners such as Engineering or Product leads with visibility into their cloud spend, while FinOps practitioners use the database and dataset integration to conduct deeper analysis of their cloud spend as needed.
Want to go deeper?
Read our white paper on unlocking the value of cloud FinOps.