Machine learning (ML) and artificial intelligence (AI) have been influencing information security and governance for many years. These technologies are expected to become more integral and widespread: this industry report by Vantage Market Research estimates that by 2028, the global AI cybersecurity market will reach $35 billion. Though AI has many benefits, including early detection of cyber events and faster data cleaning, it comes with its own threat landscape. The latest threat, “data poisoning,” is creating a cybersecurity crisis. Here’s how to handle the new trend.

What is Data Poisoning?

Many companies use AI to catch malicious software by feeding their detection systems a large amount of correctly labeled data, allowing the machine to “train” itself to identify that kind of data and take certain actions when it does.

Bad actors poison training data by slipping “bad,” or inaccurate, data into the input set. The AI accepts this data as “good” and “useful” and incorporates it into what it learns. In reality, the data is intended to confuse the AI and reduce its accuracy.

Data poisoning via mislabeled data is a common occurrence at organizations that use open-source data to train their systems. And there doesn’t need to be much bad data in those sets. According to two researchers who presented at the HITCon security conference in Taipei last year, this use of mislabeled code “could fully bypass defenses by poisoning less than 0.7 percent of the data submitted to the machine-learning system.”
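To put that figure in perspective, the short sketch below works out how few samples 0.7 percent represents. The dataset sizes and the function name `max_poisoned_samples` are illustrative assumptions, not numbers from the researchers’ presentation.

```python
import math
from fractions import Fraction


def max_poisoned_samples(total_samples, percent):
    """Upper bound on how many samples make up `percent` of a dataset.

    `percent` is passed as a string (for example "0.7") so the
    arithmetic is exact rather than subject to float rounding.
    """
    share = Fraction(percent) / 100
    return math.floor(total_samples * share)


# Hypothetical training-set sizes, chosen only for illustration.
for size in (10_000, 100_000, 1_000_000):
    bound = max_poisoned_samples(size, "0.7")
    print(f"{size:>9,} samples -> at most {bound:,} poisoned")
```

For a hypothetical training set of one million samples, the bound works out to 7,000 mislabeled entries, a quantity that is easy to overlook in a manual review.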

How does such a small bit of bad data poison a whole system? Per a Techgenix article, “AI doesn’t forget or automatically place data into hierarchies. Instead, it remembers everything exactly as it was given. It also won’t discriminate between different data or discard anomalies unless you tell it [to].” According to the Washington Post, this “offers a virtually untraceable method to get around AI-powered defenses.”

How to Protect Your Organization

1. Carefully Curate Your Data Inputs

Whether your AI service provider is in-house or a third party, make sure the provider monitors and inspects the labels of all data inputs to confirm that the data being used is accurately labeled. Your organization should consider using only datasets that it has collected itself and avoiding open-source or publicly available datasets for training AI. Another option is to create a filter that all data entering your systems must pass through, checking for accurate labeling before the data is used to teach an AI.
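A labeling filter of this kind could take many forms. The following is a minimal sketch, assuming your organization holds a small set of samples whose labels it has verified itself: each incoming sample is compared with its nearest trusted neighbors and held for human review when its label disagrees with theirs. The function names, the two-number feature vectors, and the “benign”/“malicious” labels are all hypothetical, and a nearest-neighbor vote is only one possible check, not a complete defense.

```python
import math
from collections import Counter


def nearest_labels(sample, trusted, k):
    """Return the labels of the k trusted samples closest to `sample`."""
    ranked = sorted(trusted, key=lambda item: math.dist(sample, item[0]))
    return [label for _, label in ranked[:k]]


def filter_suspect_labels(incoming, trusted, k=3):
    """Split incoming (features, label) pairs into accepted and flagged.

    A pair is flagged when its label differs from the majority label
    of its k nearest trusted neighbors.
    """
    accepted, flagged = [], []
    for features, label in incoming:
        votes = Counter(nearest_labels(features, trusted, k))
        majority_label, _ = votes.most_common(1)[0]
        if label == majority_label:
            accepted.append((features, label))
        else:
            flagged.append((features, label))
    return accepted, flagged


# Hypothetical, hand-verified reference data.
trusted = [
    ((0.10, 0.20), "benign"),
    ((0.20, 0.10), "benign"),
    ((0.15, 0.25), "benign"),
    ((0.90, 0.80), "malicious"),
    ((0.80, 0.90), "malicious"),
    ((0.85, 0.95), "malicious"),
]

# Incoming data from an outside source; the second entry is mislabeled.
incoming = [
    ((0.12, 0.18), "benign"),
    ((0.88, 0.86), "benign"),
    ((0.82, 0.91), "malicious"),
]

accepted, flagged = filter_suspect_labels(incoming, trusted)
print("accepted:", accepted)
print("flagged for review:", flagged)
```

Flagged samples are set aside rather than deleted, so a person can decide whether each one is a poisoning attempt or a legitimate new pattern.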

2. Keep Your AI Updated

Once you have created a system for monitoring inputs, you will need to update your AI systems continually so they recognize new and previously unrecognizable threats, and to teach the systems how your organization wants them to respond. Regardless of the approach your organization uses to verify data, training your AI will require a substantial amount of oversight from the monitoring party. According to this report, “the solutions to AI poisoning are stunningly similar to how we train children.”
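Part of that oversight can be automated. As a rough sketch, and assuming your organization keeps a hand-verified holdout set that never enters training, a retrained model could be promoted only if it performs about as well on that set as the model it replaces. The names `accuracy` and `should_promote`, the 0.02 tolerance, and the toy scoring models are all assumptions made for illustration.

```python
def accuracy(predict, holdout):
    """Fraction of holdout (features, label) pairs that `predict` gets right."""
    correct = sum(1 for features, label in holdout if predict(features) == label)
    return correct / len(holdout)


def should_promote(candidate_predict, current_predict, holdout, max_drop=0.02):
    """Allow a retrained model only if accuracy on trusted data holds up."""
    candidate_score = accuracy(candidate_predict, holdout)
    current_score = accuracy(current_predict, holdout)
    return candidate_score >= current_score - max_drop


# Hypothetical holdout set: a single risk score per sample, labeled by hand.
holdout = [
    (0.10, "benign"),
    (0.30, "benign"),
    (0.60, "malicious"),
    (0.70, "malicious"),
    (0.95, "malicious"),
]


def current_model(score):
    """Toy stand-in for the model in production."""
    return "malicious" if score > 0.5 else "benign"


def retrained_model(score):
    """Toy stand-in for a model whose threshold drifted after bad training data."""
    return "malicious" if score > 0.9 else "benign"


print("promote retrained model:", should_promote(retrained_model, current_model, holdout))
```

A check like this does not prove a model is clean, but a sudden drop on trusted data is a useful signal that recent training inputs deserve a closer look.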

3. Train Your Employees

Threat actors often rely on employees’ lack of awareness of data poisoning, and on their general lack of cybersecurity training, to infiltrate and compromise your information systems. As such, your organization should ensure that those who manage data inputs or your overall information systems are educated in data poisoning tactics.

