Data Privacy in Autonomous Vehicles – Can Anonymization Solve The Problem?

An average AV can generate around 19 terabytes of data per hour, earning such vehicles the title “data centers on wheels”. What is the status quo regarding data privacy? How can these vast amounts of data be protected without hindering the advance of AV technology?
Cars of the future will be fully connected, always online, and reliant on artificial intelligence to drive autonomously, disrupting the way we think about mobility forever. Although most autonomous vehicle companies working at SAE level 4 and above, such as Zoox and Waymo, currently operate on the Mobility-as-a-Service (MaaS) model, the speed at which self-driving technology is developing means the prospect of privately owned self-driving cars is closer than ever. According to Navigant Research, 94.7 million vehicles with self-driving capabilities will be sold annually by 2035, with a large share of them privately owned. This creates a unique problem: a flood of generated data and the consequent need for a framework that safeguards data privacy.

For instance, a study by Intel suggests that just one autonomous vehicle will generate about 4,000 GB (4 terabytes) of data every day if only AV-specific sensors are taken into account. The annual amount of data generated is even more staggering! According to AAA, an average American car could produce between 380 TB and 5,100 TB of data in just one year while driving 17,600 minutes on average.
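To put the annual range in perspective, it can be converted into an hourly rate. The short Python sketch below only rearranges the figures quoted above (380 TB to 5,100 TB per year, 17,600 minutes of driving). The variable and function names are illustrative, and it assumes the annual totals accrue only while the car is being driven.

```python
# Figures quoted in the text; the names are illustrative.
MINUTES_DRIVEN_PER_YEAR = 17_600
ANNUAL_DATA_TB_LOW = 380
ANNUAL_DATA_TB_HIGH = 5_100


def hourly_rate_tb(annual_tb: float, minutes_per_year: float) -> float:
    """Average data rate in TB per hour of driving."""
    hours = minutes_per_year / 60
    return annual_tb / hours


low = hourly_rate_tb(ANNUAL_DATA_TB_LOW, MINUTES_DRIVEN_PER_YEAR)
high = hourly_rate_tb(ANNUAL_DATA_TB_HIGH, MINUTES_DRIVEN_PER_YEAR)
print(f"{low:.1f} to {high:.1f} TB per hour of driving")
```

Under that assumption, the range works out to roughly 1.3 to 17.4 TB per hour of driving, so the upper end is of the same order as the 19 terabytes per hour mentioned at the top of this article.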

[Figure: Amount of data generated by an autonomous vehicle. Autonomous vehicles: data centers on wheels]

What data is collected by Autonomous Vehicles?

The data generated by an AV ranges from basic navigation requests to more complex information such as driving conditions, traffic reports, and drivers’ habits. These data can be classified under four main categories:

1. Passenger and Owner Information: Autonomous vehicles may collect information about the owner or passengers of the vehicle, such as contacts synced from a phone and addresses stored in the navigation system. Other identifying information, such as fingerprints or facial recognition data, would be needed to enable features like keyless entry or automatic customization of safety, comfort, and entertainment preferences, for example your favorite music artists or Netflix series.

There are even proposals to install cameras in the vehicle’s interior to detect whether the driver is tired or asleep and to ensure that hands are always on the steering wheel (a legal necessity for current Level 2+ vehicles). Driver activity detection could be an essential addition to automated driving and will probably be enforced by law until we arrive at Level 3 and Level 4 autonomous vehicles.

2. Location Data: Collected through GPS and data sent through the sensors on the AV, location data can be used for navigation purposes, such as real-time traffic updates, route calibration, and suggesting any points of interest on the planned route.

3. Sensor Data: Autonomous vehicles have dozens of sensors that enable them to gather information about their surroundings and operate safely. These include front, rear, and side cameras, LiDARs, radar, and thermal imaging devices. All these sensors work in unison to identify objects on the road and make predictions about the surroundings. The graphic below shows some of the sensor data collected in an AV and how it is used to get the vehicle into action.

4. V2X Data: Vehicle-to-X (Vehicle-to-Everything) communication is the real-time data exchange between an automobile and other vehicles, the infrastructure, and the cloud. AVs will be the prime example of connected cars, offering increased comfort and making driving safer and more efficient. V2X will allow cars to exchange real-time traffic info, including any accidents, traffic jams, or hazards on the route ahead, thus improving driver assistance and eventually autonomous driving functions.

Should You Be Concerned About Data Privacy in Autonomous Vehicles?

Data generation and collection in an automobile is not a new phenomenon. Technologies such as Event Data Recorders and onboard diagnostics have been part of cars for decades and have been helpful in the analysis of crashes or technical failures. But the technological advances of the past few years have led to an explosion in the volume and variety of data collected. Cars are no longer merely mechanical boxes designed to take you from point A to B. Instead, they are much like your computer or smartphone, where your data is just as important as the machine itself.

Another example of increasing data collection within the vehicle is a new EU law that will make it mandatory for new cars to be fitted with a black box from 2024. The devices are supposed to reduce the number of traffic accidents by anonymously storing accident data, such as when the airbag is deployed, along with the crucial moments right before it. This would include the speed, the use of the brakes, driver alertness, and other measured values from the vehicle, much like a black box on an airplane.
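The idea of keeping only the moments around an accident can be illustrated with a ring buffer. The following is a minimal Python sketch, not the design required by the EU law: the class name `EventDataRecorder`, the fields of `Sample`, and the buffer size are all hypothetical.

```python
from collections import deque
from dataclasses import dataclass
from typing import Deque, List, Optional


@dataclass(frozen=True)
class Sample:
    """One hypothetical measurement; it carries no driver or vehicle ID."""
    speed_kmh: float
    brake_applied: bool
    airbag_deployed: bool


class EventDataRecorder:
    """Keeps only the most recent samples and freezes them on a trigger."""

    def __init__(self, capacity: int = 5) -> None:
        # A deque with maxlen drops the oldest sample once it is full.
        self._buffer: Deque[Sample] = deque(maxlen=capacity)
        self.frozen_event: Optional[List[Sample]] = None

    def record(self, sample: Sample) -> None:
        if self.frozen_event is not None:
            return  # an event is already stored; do not overwrite it
        self._buffer.append(sample)
        if sample.airbag_deployed:
            # Snapshot the moments leading up to and including the trigger.
            self.frozen_event = list(self._buffer)


recorder = EventDataRecorder(capacity=3)
for speed in (80, 78, 75, 60):
    recorder.record(Sample(speed, brake_applied=speed < 78, airbag_deployed=False))
recorder.record(Sample(20, brake_applied=True, airbag_deployed=True))
print([s.speed_kmh for s in recorder.frozen_event])  # [75, 60, 20]
```

Everything older than the last few samples is discarded continuously, so normal driving leaves no lasting trace in this sketch; only the short window around the trigger is retained.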

Although there are concerns that the data will attract insurance companies and car manufacturers, the EU law clearly states that the data is not actually about individual accidents. The data are to be used to improve road safety, and according to the text, “the stored data does not allow the user or owner of a particular vehicle to be identified.” This way, the drivers’ data privacy is protected while data for improving road safety is collected.

Data privacy concerns are not limited to the automotive industry. Cars are just another technology in the world of the Internet of Things. Consequently, legislation and privacy management measures are already in place to mitigate users’ concerns.

For instance, in Europe, the General Data Protection Regulation (GDPR) applies to the automotive sector just as it does to most technologies that handle personal data. In the US, there isn’t any all-encompassing law like GDPR, but the Federal Trade Commission (FTC) is one agency that vows to protect consumers against deceptive or unfair trade practices.

Keeping up with the technology is undoubtedly a challenge for lawmakers, and it will be tough to pin down a specific set of rules. A Government Accountability Office (GAO) report found that automobile companies limit data collection, use, and sharing in line with data privacy best practices. Still, their privacy notices are arcane and difficult for consumers to understand, so many problems related to data and its usage remain.

Are People Concerned About Data Privacy in AVs?

Concerns about data privacy are not mere speculation; around 96% of new passenger vehicles sold in the US have some form of event data recorder, which records the actions taken in the seconds before and after a crash. The National Highway Traffic Safety Administration (NHTSA) has been mulling over mandating these event data recorders for all new vehicles under 8,500 lbs.

Actions like these will hamper the adoption of self-driving vehicles equipped with more complex event data recorders. According to research, 54% of participants, who were residents of cities with and without Uber autonomous vehicle fleets, said they would rather spend five minutes opting out of identifiable data collection than give away their information. They also expressed high discomfort with the recognition, identification, and tracking of individuals and vehicles, and understandably so!

How to Mitigate the Privacy Concerns?

A multi-pronged strategy has to be adopted to protect the data collected by AVs, increase the general acceptance of AVs, and alleviate the concerns around the technology.

Effective Legislation

As mentioned above, several data protection laws already govern the use of personal data. These include GDPR in Europe, PIPEDA in Canada, CCPA in California, and APPI in Japan. When integrated with existing road traffic laws through proactive legislation, these can help ease the concerns of AV adopters.

Legislative bodies need to discuss and address critical questions: Who should own the vehicle’s data? What types of data may be stored? With whom can these data be shared? How will such data be made available? And for what purposes will these datasets be used?

Some heartening steps in this regard have already been taken. For example, in Germany, the Autonomous Vehicle Bill, 2017 modified the existing Road Traffic Act. It defined the requirements for fully-automated vehicles and addressed the driver’s rights.

Similarly, the US Driver Privacy Act of 2015 declares that the information collected about the driver belongs to the vehicle owner and restricts data retrieval, apart from certain exceptions such as court orders or vehicle safety research.

Integrate Privacy by Design

Following the GDPR’s guidelines might be the best course of action in this regard. The GDPR advocates the data minimization principle, i.e., limiting the collection of personal data to what is strictly necessary while preserving full functionality.
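In practice, data minimization can be as simple as an allowlist applied before any record leaves the vehicle. The Python sketch below is one possible illustration under stated assumptions: the purposes, the field names, and the `minimize` helper are all hypothetical and are not taken from the GDPR or from any manufacturer’s system.

```python
from typing import Any, Dict, FrozenSet

# Hypothetical allowlists: the fields each purpose actually needs.
ALLOWED_FIELDS: Dict[str, FrozenSet[str]] = {
    "traffic_update": frozenset({"road_segment", "speed_kmh", "timestamp"}),
    "cabin_comfort": frozenset({"seat_position", "temperature_c"}),
}


def minimize(record: Dict[str, Any], purpose: str) -> Dict[str, Any]:
    """Return only the fields allowed for the given purpose."""
    # An unknown purpose gets an empty allowlist, so nothing is shared.
    allowed = ALLOWED_FIELDS.get(purpose, frozenset())
    return {key: value for key, value in record.items() if key in allowed}


raw_record = {
    "driver_name": "Jane Doe",
    "road_segment": "A9-km112",
    "speed_kmh": 97,
    "timestamp": "2022-09-27T08:15:00Z",
    "seat_position": 4,
}

print(minimize(raw_record, "traffic_update"))
```

The design choice here is to default to sharing nothing: a field is transmitted only if a stated purpose explicitly lists it, so the driver’s name never appears in the traffic update.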

In addition, data privacy can be compromised by security breaches. Manufacturers therefore have to prioritize integrating data security features into the vehicle design instead of leaving them as a task for manufacturing, where they become more of an afterthought and are never very well integrated into the vehicle’s ecosystem as a whole. According to a study from IBM, finding a security error in the design phase costs less than one-sixth of what it costs to find it in the implementation phase, about one-fifteenth of the cost in the testing phase, and one-hundredth of the cost when it surfaces as an afterthought in product maintenance (usually when the company is sued).

Data Anonymization

Some of the data collected from AVs, such as driving patterns, is needed by third parties for various purposes. For instance, Tesla has been using ‘shadow mode’ driving to train its self-driving algorithms through imitation learning and to improve its database. Revoking access to that data for the sake of privacy would be a setback to Tesla’s development of an autonomous driving system.

AV data storage and dissemination can also help transportation network managers and designers. The data could facilitate policy development, such as traffic congestion pricing schemes based on location and time of day, or a shift from a gas tax to a Vehicle-Miles-Travelled (VMT) fee. Traffic signal systems could be programmed more efficiently, and the data could assist transportation planners in evaluating future improvements and making more effective investments and transportation policies.

Collecting data in automobiles is indispensable to making autonomous driving a reality. That makes it imperative to protect data privacy by anonymizing the data. This would bring it in line with the GDPR, specifically Recital 26, which describes anonymous data as personal data “rendered anonymous in such a manner that the data subject is not or no longer identifiable.”

Anonymizing the name of the driver or owner before sending out driving behavior or usage statistics, or blurring the faces of people captured by the vehicle’s cameras, are examples of easy yet effective steps towards tackling privacy concerns. Such steps would go a long way towards mitigating most privacy concerns, which can be a significant hurdle to adoption if ignored. This way, no one’s privacy is compromised while valuable data is collected to make traffic systems more efficient.
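The first of these steps can be sketched in a few lines of Python. This is a hedged illustration rather than a complete anonymization pipeline: the field names, the `prepare_for_sharing` helper, and the choice of HMAC-SHA256 are assumptions made for the example. A keyed hash of this kind is usually classed as pseudonymization rather than full anonymization, since whoever holds the key can re-link the records. Face blurring would additionally need a vision model and is not shown.

```python
import hashlib
import hmac
from typing import Any, Dict


def pseudonymize(identifier: str, secret_key: bytes) -> str:
    """Replace an identifier with a keyed hash (HMAC-SHA256)."""
    return hmac.new(secret_key, identifier.encode("utf-8"), hashlib.sha256).hexdigest()


def prepare_for_sharing(record: Dict[str, Any], secret_key: bytes) -> Dict[str, Any]:
    """Build the record that leaves the vehicle (hypothetical field names)."""
    return {
        # The driver's name is replaced by a token and never copied over.
        "driver_token": pseudonymize(record["driver_name"], secret_key),
        # Rounding to two decimals coarsens the position to about a kilometre.
        "lat": round(record["lat"], 2),
        "lon": round(record["lon"], 2),
        "avg_speed_kmh": record["avg_speed_kmh"],
    }


key = b"example-key-kept-off-the-shared-dataset"  # illustrative only
trip = {"driver_name": "Jane Doe", "lat": 48.13743, "lon": 11.57549, "avg_speed_kmh": 42.5}
shared = prepare_for_sharing(trip, key)
print(shared["lat"], shared["lon"], len(shared["driver_token"]))
```

The same driver always maps to the same token, so usage statistics can still be aggregated per driver without the name ever appearing in the shared dataset.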

Privacy Concerns Should Not Hinder AVs’ Progress

Autonomous vehicle technology is a reality, and improving people’s perception and acceptance of it will only speed up its development and adoption. Known as data centers on wheels, AVs generate, collect, and analyze immense and complex amounts of data, which makes the privacy and security of that data a natural cause for concern among users. Through apt legislation, privacy safeguards integrated into the design, and anonymization of the data, these concerns can be addressed, facilitating the adoption of AVs in the near future!
