The drive for location analytics has increased the need for geospatial analytics at scale. Connected devices, digital twin tracking, and sustainability initiatives have significantly increased the volume and velocity of captured geolocation data, stressing geospatial processing systems and their storage subsystems. There are many ways to analyze large geospatial datasets; unfortunately, many are slow or fail outright when the dataset is too large. Comparing technologies is difficult because the geospatial space has no commonly accepted benchmarking framework (there is no “TPC-H for GIS,” so to speak). The Defence Science and Technology Laboratory (DSTL) in the United Kingdom commissioned and published a benchmark whitepaper, “Benchmarking Technologies with Geospatial Big Data,” that compares many commonly used technologies, including GeoSpark, Postgres, and MongoDB. It proves to be a solid place to start for creating common tests across GIS technologies.
At SADA, with our expertise in Google Cloud Platform (GCP), we wanted to test these challenging analytics requirements using the geospatial capabilities of BigQuery by replicating the whitepaper's data generation, data loading, and querying benchmarks as closely as we could. Not only were we successful, but BigQuery also proved to be highly responsive and an excellent option for geospatial analytics challenges at scale.