Data Optimisation for Autonomous Driving


Your Trusted Data Partner

As the Trusted Data Partner of the transformation programme, Anaeko performed data optimisation services, beginning with a discovery phase, and rapidly deployed teams across data optimisation, hybrid cloud integration, integrated analytics and multicloud DevOps. Each team applied an iterative approach to understanding, analysing and refining data optimisation solutions, using the results to communicate progress, performance and technical trade-offs.

4 Month Project

5 Billion Video Frames

800 Sensor Files per second

60% Cost Savings

The Business Problem

Processing video and sensor data faster than your competitors is critical to becoming the world leader in autonomous driving. An automotive company had a growing fleet of self-driving cars generating huge volumes of data, and the cost and effort of managing that data were spiralling. To maintain a market-leading position, the company needed data optimisation to accelerate its analytical research.

The Technical Problem

The infrastructure team needed to optimise data processing, reducing the infrastructure costs of storing and processing data and the operational costs of managing it, so that the analytics team could monitor more vehicles. The infrastructure and analytics teams were under pressure to meet aggressive deadlines, so they needed a partner who could work with their existing platforms and around their busy schedules.

The Data Optimisation Challenge

The automotive company generated hundreds of petabytes of high-definition video files and high-frequency streaming sensor data. Each test vehicle captured video from 8 cameras and data from 15 sensors. The test fleet transmitted 800 objects per second and captured over 5 billion video frames in a short trial. The company needed to search the camera and sensor data efficiently to identify common events and outlier incidents, and to do this it needed to intelligently tag and curate all of the data.


Data Optimisation Services

Data Discovery using deep inspection to classify research data.

Data Ingestion to increase the volume of data that could be processed.

Data Migration to eliminate bottlenecks in the research pipeline.

Metadata Management using machine learning to enrich analytics.

Metadata Querying to reduce data management overhead.

Data Storage across hot and warm file and object storage.

Data Processing to maximise network and compute usage.

The Data Optimisation Solution

In 4 months Anaeko delivered a data optimisation solution that integrated existing infrastructure and applications with elastic cloud services using a flexible microservice architecture. We provided comprehensive test results and benchmarks from performance, integrity and automated regression test suites so that in-house teams could maintain the solution going forward. We produced configuration and operational user guides, in line with existing processes, for operational teams to manage the solution.

The Data Optimisation Platform

After considering the technical and operational requirements, and leveraging our experience of optimising petabyte-scale storage platforms, Anaeko developed high-performance Python agents running on a scale-out Kubernetes container platform. Anaeko optimised the system using parallel processing patterns and took advantage of multiple hybrid cloud services, including containers, machine learning, and object storage, to accelerate development. In our asynchronous processing design, sensor and camera data were first pushed onto object storage; software agents then downloaded and deep-inspected the files and published the generated metadata to a metadata catalogue.
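
This asynchronous pattern can be sketched as below. All names are illustrative and the object store and metadata catalogue are replaced with in-memory stand-ins; the real deployment used S3-compatible object storage and Kubernetes-scheduled agents, whose specifics are not shown here.

```python
import queue
import threading

# In-memory stand-ins for the object store and metadata catalogue
# (hypothetical keys and payloads, for illustration only).
object_store = {
    "run-042/cam3/frame-000017.bin": b"\x00" * 64,
    "run-042/lidar/scan-000017.bin": b"\x01" * 128,
}
metadata_catalogue = []
catalogue_lock = threading.Lock()

def deep_inspect(key: str, payload: bytes) -> dict:
    """Extract lightweight metadata from one sensor/camera object."""
    sensor = key.split("/")[1]          # e.g. "cam3" or "lidar"
    return {"key": key, "sensor": sensor, "size_bytes": len(payload)}

def agent_worker(work: "queue.Queue[str]") -> None:
    """Pull object keys, download, inspect, and publish metadata."""
    while True:
        try:
            key = work.get_nowait()
        except queue.Empty:
            return
        payload = object_store[key]     # "download" from the store
        record = deep_inspect(key, payload)
        with catalogue_lock:
            metadata_catalogue.append(record)
        work.task_done()

# Producers push keys; a pool of agents drains the queue concurrently.
work = queue.Queue()
for key in object_store:
    work.put(key)

threads = [threading.Thread(target=agent_worker, args=(work,)) for _ in range(4)]
for t in threads:
    t.start()
for t in threads:
    t.join()

print(sorted(r["sensor"] for r in metadata_catalogue))  # ['cam3', 'lidar']
```

Decoupling ingestion (pushing objects) from inspection (agents pulling at their own rate) is what lets the agent pool scale out independently of the vehicles producing data.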

A metadata search user interface enabled downstream analysts to search metadata and locate matching files for further processing. We applied a bespoke algorithm for training the machine learning models.

To manage cost, files were stored in a cost-effective warm-tier object store when not being actively processed; when the model was being trained, files were transferred from the warm tier to a more performant hot tier backed by faster Network File System (NFS) storage. The warm and hot tiers were connected by a 10 Gbps line that we held at 70-80% utilisation while maximising utilisation of compute resources.
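
The pacing needed to hold a link in that utilisation band can be sketched as a simple rate budget. The constants mirror the figures above, but `throttled_copy` and the elided copy call are illustrative, not the project's actual code.

```python
import time

LINK_CAPACITY_BPS = 10 * 10**9 / 8      # 10 Gbps line, in bytes per second
TARGET_UTILISATION = 0.75               # aim for the 70-80% band

def throttled_copy(file_sizes, budget_bps=LINK_CAPACITY_BPS * TARGET_UTILISATION):
    """Copy files warm -> hot, sleeping as needed to hold the target rate.

    `file_sizes` is a list of file sizes in bytes; the actual byte copy
    is elided (hypothetical) -- only the pacing logic is shown.
    """
    started = time.monotonic()
    sent = 0
    for size in file_sizes:
        sent += size                    # copy_warm_to_hot(...) would go here
        # If we are ahead of the budget, wait until the budget catches up.
        ahead = sent / budget_bps - (time.monotonic() - started)
        if ahead > 0:
            time.sleep(ahead)
    elapsed = time.monotonic() - started
    return sent / elapsed               # achieved bytes per second

rate = throttled_copy([50 * 10**6] * 20)   # twenty 50 MB files, ~1 s
```

Budgeting against elapsed time rather than per-file sleeps keeps the average rate near the target even when file sizes vary.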

We developed deep-inspect algorithms to extract metadata and redact sensitive data including Vehicle Identification Numbers. Our agents enabled analysts to define search tags and patterns that were used to search for matching files and report back file identifiers and locations. To efficiently process millions of files, parallel processing was implemented both at a container level and through multithreading. Our solution maximised utilisation of available compute infrastructure, and monitored network throughput, throttling processing where necessary to avoid network backlog.
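
The thread-level half of that parallelism, combined with VIN redaction, might look like the following. The regex is a common approximation of the VIN format (17 characters, excluding I, O and Q), not the project's actual redaction rule, and the sample records are invented.

```python
import re
from concurrent.futures import ThreadPoolExecutor

# Approximate VIN pattern: 17 alphanumerics, excluding I, O and Q.
VIN_RE = re.compile(r"\b[A-HJ-NPR-Z0-9]{17}\b")

def inspect_and_redact(record: str):
    """Return (redacted_record, vin_found) for one sensor log record."""
    redacted, hits = VIN_RE.subn("[REDACTED-VIN]", record)
    return redacted, hits > 0

records = [
    "frame=17 vin=1HGCM82633A004352 speed=42.0",
    "frame=18 speed=41.7",
]

# Thread-level parallelism inside one container; the same worker image
# would be scaled out across containers for coarse-grained parallelism.
with ThreadPoolExecutor(max_workers=8) as pool:
    results = list(pool.map(inspect_and_redact, records))

print(results[0][0])   # frame=17 vin=[REDACTED-VIN] speed=42.0
```

Returning a flag alongside the redacted record lets downstream agents tag which files contained sensitive identifiers without re-scanning them.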

The Technical Benefit

Using our solution the automotive company was able to process 800 files per second while maximising the efficiency of the available network and compute infrastructure.

The extensible architecture enabled custom search and scan agents to be rapidly developed, and processing of additional file types to be added, to meet other data analytics pipeline needs. The fully tagged and categorised object store acted as an efficient storage platform that scaled with continued use. The automated test suites and DevOps pipelines acted as a best-practice framework for future projects.

The Business Benefit

The data optimisation delivered an estimated 60% reduction in total cost of ownership (TCO). Combining hot- and warm-tier storage within a software-defined storage architecture reduced the storage cost per file: not only was the underlying hardware cheaper for object storage than for NFS, the object store also required less storage-administration effort to manage. The end-to-end solution maximised utilisation of the available network and compute infrastructure, optimising processing cost. The in-house teams had clear visibility of progress throughout the project and were left with an extensible framework of open-standard and open-source services, integrated and automated within an efficient multicloud DevOps environment.
