In today’s constantly shifting business landscape, more companies are looking to harness the power of data to drive decision-making and achieve a competitive edge. Unfortunately, many businesses struggle to work with data at scale because of the amount of data engineering and wrangling required to get datasets ready for use.
Crux Informatics addresses these challenges by helping businesses alleviate the burdens of data management. Crux simplifies data delivery by helping companies reliably get the data they need, how and where they need it.
Crux supplies their business customers with data—mostly as files. Their job is to make all that data look the same and supply it wherever the customer wants. “The data appears on our platform in the same way,” said Mark Etherington, CTO of Crux Informatics. “It’s accessible and deliverable in the same way, and it’s consistent across suppliers for the consumer. We give our customers ubiquity.”
In essence, Crux cleans up data for customers, so their employees can focus on the work they’ve spent their lives training for. Crux’s customers don’t want their PhD data scientists doing the dirty dishes of data clean up. And they don’t want their infrastructure engineers taking a stab at data manipulation. “These experts are very difficult to hire,” said Etherington. “Our customers want them to add value in their fields of specialization and not spend time on repetitive data normalization tasks.”
In a few short years, Crux has already managed more than 800 million data files while working with more than 125 data supplier partners. But with over 3.5 petabytes of data, Crux began to worry about cost issues that could arise if they had to duplicate that data.
And when the company began to design their platform for the future, they realized that their current data warehouse solution wouldn’t scale cost effectively as the amount of data increased. “We didn’t want a lot of carry costs,” said Etherington. “Because we deliver data, we’re carrying storage, database load and computation of queries on the database. Those numbers quickly ramp up the more data you put in—and we were putting a lot of data in.”
In addition, after increasing their customer base tenfold over the past 12 months, Crux required more flexibility from their platform to deliver data for all the different use cases in which their customers required it. “We need to help every data consumer at each of our customers access data in exactly the way she or he wants it, even if that means supplying it in a spreadsheet,” said Etherington.
Seeking a better cost structure for their growing business brought Crux full circle to where the company was born: Google Cloud Platform (GCP). Because Crux’s platform is already hosted on GCP, BigQuery, the enterprise data warehouse built on top of Google Cloud, was a natural fit. “BigQuery simply has great economics,” said Etherington. “They’re supplier-biased economics. I wouldn’t have to pay for load, yet I would get load performance that’s significantly better.”
And because Crux is a “super supplier” of data, according to Etherington, not being charged to load data would vastly improve their business fundamentals. “Moving to BigQuery gave us a great deal. SADA and Google were invested in our success, working with us to develop a cost structure to meet our needs,” said Etherington. In addition to an improved cost structure, Etherington believes that the security integration with tooling in BigQuery is second to none.
Integration extends to making Crux a part of the BigQuery ecosystem. This enables them to offer data in more ways, including as spreadsheets. “All these innovations for delivering data require that Crux and BigQuery have tight integration on security, identity authentication and connection to the ODBC drivers for SQL servers,” said Etherington.
To implement BigQuery for Crux, Google Cloud worked with their partner company SADA, and Etherington is glad they did. He’s impressed with how the SADA team responded as soon as Crux made the call, helping them enhance BigQuery’s performance to better meet their needs. “Support is very easy to dismiss, but when we needed help, the SADA team was on it,” said Etherington. “Less mature companies don’t understand that support and engineering are an important connection with clients.”
Crux also appreciates how user-friendly BigQuery is. Because BigQuery is fully managed, users do not have to provision resources such as disks or virtual machines; they simply move their data into the platform and let it handle the hard work. Users can control access to both the project and their data based on business needs, such as giving others the ability to view or query their data. “We had trouble testing BigQuery’s provisioning because it just worked,” said Etherington. “I couldn’t get it to ask me for a password because if you have a Google account, all the tools automatically work.”
With a great commercial deal, better runtime economics, ease of use and the right support, Crux flipped the switch to BigQuery.
Crux has seen many positive changes since implementing BigQuery. By using the BigQuery data warehouse solution built on top of Google Cloud, Crux achieves load times 10 times faster than it could with separate solutions.
“Because we’re already on GCP and leveraging Google Cloud Storage, we’re loading into the data warehouse that’s sitting on top of its native storage cluster—on top of Colossus,” said Ryan Haggerty, Head of Infrastructure and Operations at Crux. “Being able to very quickly and efficiently load our data into BigQuery allows us to build more product offerings, makes us more efficient and allows us to offer more value-added services. Having BigQuery as part of our toolkit enables us to think up more products that help solve our customers’ challenges.”
In addition, by moving to BigQuery on Google Cloud, Crux no longer needs to replicate databases to different locations. If Crux wanted to set up a different solution, they would need to copy their database. “If you try to do that with other solutions, you have to set up replications,” said Haggerty. “For example, I would have to make an explicit copy to get my data in the EU. With BigQuery, you just select the region and the data is there. That’s the difference between running on top of the cloud provider and being a cloud.”
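For readers who want to see what region selection and loading from Cloud Storage can look like in practice, here is a minimal sketch in Python. It only builds the JSON request bodies used by the BigQuery REST API (a dataset with a `location`, and a load job whose `sourceUris` point at Cloud Storage) and sends nothing over the network. The helper names and the project, dataset, table and bucket names are hypothetical, chosen for illustration; this is not a description of Crux’s actual pipeline.

```python
import json


def dataset_body(project_id, dataset_id, location):
    """Build a dataset resource pinned to one location, for example "EU".

    All identifiers passed in are hypothetical placeholders.
    """
    return {
        "datasetReference": {"projectId": project_id, "datasetId": dataset_id},
        "location": location,
    }


def load_job_body(project_id, dataset_id, table_id, source_uri, location):
    """Build a load job that reads CSV files from Cloud Storage.

    The job is tagged with the same location as the destination dataset.
    """
    return {
        "jobReference": {"projectId": project_id, "location": location},
        "configuration": {
            "load": {
                "sourceUris": [source_uri],
                "destinationTable": {
                    "projectId": project_id,
                    "datasetId": dataset_id,
                    "tableId": table_id,
                },
                "sourceFormat": "CSV",
                "autodetect": True,  # let BigQuery infer the schema
                "writeDisposition": "WRITE_APPEND",
            }
        },
    }


if __name__ == "__main__":
    # Hypothetical names, used only to show the shape of the requests.
    dataset = dataset_body("example-project", "supplier_data", "EU")
    job = load_job_body(
        "example-project",
        "supplier_data",
        "daily_prices",
        "gs://example-bucket/prices/*.csv",
        "EU",
    )
    print(json.dumps(dataset, indent=2))
    print(json.dumps(job, indent=2))
```

In this sketch the region is a single field on the dataset, which mirrors Haggerty’s point that no explicit replication step has to be set up before the data is available in the EU.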
Most importantly, BigQuery enables Crux to supply their customers with the required data how, where and when they want it without having to perform a series of difficult technical tasks. “If you’re a Power BI user talking to the Google Cloud backend, BigQuery works out of the box, plain and simple,” said Etherington. “It doesn’t require you to switch off the OS and install some fuzzy drivers and so on. BigQuery just works.”
In summary, the benefits that Crux gained by working with SADA to move to BigQuery on Google Cloud include:
- Better economics and improved cost structure
- Integrated security and deeper integration with tools across the board
- Load times that are 10 times faster
- Enhanced query performance
“Without a doubt, BigQuery and SADA are enabling our business,” said Etherington.