AI and generative AI are being used in more and more business contexts, but what does that mean for data protection?
The data fed into AI models is where data protection is needed most, yet the very structure of these models makes that hard to manage. Plus, in a system that literally relies on being fed data, how feasible is it to limit and “protect” that data?
Data protection hinges on concepts such as transparency, which can be very tricky to achieve with generative AI models that don’t leave a clear query trace. And when it comes to AI bias, it is often hard to trace exactly where a biased output originated.
So, with all that in mind, what can you, as a business and an organisation, do to improve AI and data protection?
Data feeds data – be cautious
Machine learning doesn’t just need data, it thrives on it: data drives its pattern recognition, decision making, and predictions.
Recent AI developments are connected to increased processing power: now that more data can be processed at a faster rate, the models themselves are better and faster.
If you allow people to input their own data into generative AI models, how is that data protected or policed?
- Are you asking people to input data into AI? You need a policy for that.
- Have you got any disclaimers? Disclose how data is being used.
- You must ensure that once personal data is in the system, it cannot be extracted.
- Data minimisation is another important concept – only collect what you have to.
- Delete what you don’t need, or better yet, don’t save it in the first place! (Thankfully, a lot of data processing can be done in real time without the need to save much at all.)
You can have policies that stipulate what sort of data can be inputted into the model, and you can configure it to reject things like email addresses, which come in standard formats (a rough sketch of this kind of screening follows below).
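As a rough illustration of that kind of screening, here is a minimal sketch in Python. The `screen_prompt` helper and the pattern list are hypothetical examples rather than features of any particular AI product, and a production system would need far more robust PII detection.

```python
import re

# Illustrative patterns for obvious personal identifiers; real PII detection
# would use a dedicated library or service and cover far more cases.
PII_PATTERNS = {
    "email address": re.compile(r"[\w.+-]+@[\w-]+\.[\w.-]+"),
    # more patterns (phone numbers, account IDs, ...) can be added here
}

def screen_prompt(prompt: str) -> str:
    """Block prompts containing obvious personal data before they are sent
    to a generative AI model."""
    for label, pattern in PII_PATTERNS.items():
        if pattern.search(prompt):
            raise ValueError(f"Prompt rejected: possible {label} detected.")
    return prompt

# screen_prompt("Summarise this email from jane.doe@example.com")  -> rejected
# screen_prompt("Summarise the attached quarterly report")         -> passes through
```

Blocking at the point of input is far simpler than trying to remove personal data from a model after the fact.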
💡Tip: Chatbot policies may provide you with some important examples and context.
Transparency is key to data protection, yet AI complicates transparency
One of the key challenges with AI and data protection is how to be transparent when the technology itself is still so new and so opaque.
- A generative AI output floats free, with no bibliography or cited sources, which can make tracing data origins incredibly difficult. Can you really “trace” a request through an AI model? Is it possible to give people a jargon-free explanation of what data the model has used, and how?
- Who can we hold accountable for bias in AI, and how do we address it? Human review is a key way to deal with AI bias (though human reviewers can introduce biases of their own).
- User consent is one thing you can get right. Complex systems can be hard to describe, but you can still ask for and obtain consent, and tell users how you plan to use their data.
User consent is the bare minimum, though, and relying on AI models can make it harder for businesses to keep control of their own data protection.
AI and personalisation – anonymise and encrypt
AI can be used to personalise content, but you need to interrogate whether this is being done in a safe way.
- Anonymise data wherever possible. Anonymised and encrypted data will still allow you to provide a great customer experience, but safely (see the sketch after this list).
- What data do you collect? What is done with it and how can it be traced? Get on top of your data ecosystem.
- Encrypt data wherever possible. This includes data at rest and in transit.
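To make the anonymisation point a little more concrete, here is a minimal sketch of pseudonymising identifiers with a keyed hash before records reach an AI-driven personalisation step. The field names and key handling are assumptions for illustration; encryption at rest and in transit (disk encryption, TLS) sits alongside this rather than replacing it.

```python
import hashlib
import hmac
import os

# Hypothetical secret key; in practice this would come from a secrets manager.
PSEUDONYM_KEY = os.environ.get("PSEUDONYM_KEY", "replace-me").encode()

def pseudonymise(value: str) -> str:
    """Replace a direct identifier with a stable keyed hash, so records can
    still be linked for personalisation without exposing the raw value."""
    return hmac.new(PSEUDONYM_KEY, value.encode(), hashlib.sha256).hexdigest()

def prepare_record(record: dict) -> dict:
    """Strip or pseudonymise direct identifiers before a record is used for
    AI-driven personalisation (data minimisation in practice)."""
    safe = dict(record)
    if "email" in safe:
        safe["customer_ref"] = pseudonymise(safe.pop("email"))
    safe.pop("full_name", None)  # drop what the model doesn't need
    return safe

# prepare_record({"email": "jane@example.com", "full_name": "Jane Doe", "segment": "B2B"})
# -> {"segment": "B2B", "customer_ref": "<stable 64-character hash>"}
```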
Legislation and requirements: going beyond the here and now
GDPR principles and responsible AI principles are very similar, but there is a sense that legislation is lagging behind the technology, as tends to happen when new tech becomes more widely available.
There is an onus on businesses to operate in a data-safe way, without necessarily having a clear guideline or policy to follow (yet). But just because something is yet to be encoded in law doesn’t mean it isn’t something we should want to do.
- Implementing CCPA, GDPR and other data privacy laws (depending on your markets) is possible with AI tools and models, but you may have to work harder to interpret the guidelines.
- Data subject requests, a core part of the GDPR, can be tricky when it comes to AI input. An AI model is not a traditional database, so it is hard to “pull” data out when somebody makes a request. Keep this in mind and don’t allow personal or identifiable data into the model.
- As stated before, consent and telling the user how and why their data is being handled is paramount. Having robust policies will help manage and set expectations when it comes to privacy, retrieval etc.
Safer AI usage
Don’t be put off by some of the data protection challenges outlined here; just commit to learning how to use AI in a safe way.
- Keep people’s data on lockdown and tell them how and when you are using it.
- Encourage people to take responsibility for using AI safely, too.
- Differential privacy is a very important concept for AI and data protection. Rather than exposing individual records, it releases aggregate results with carefully calibrated noise added, so that no single person’s data can be identified from the output (a small sketch follows this list).
- Homomorphic encryption allows computations to be carried out directly on encrypted data, so results can be produced without the underlying values ever being decrypted.
- Federated learning, or collaborative learning, trains a shared model across multiple devices or servers that each keep their data locally, exchanging model updates rather than raw data, which helps combat centralisation, bias, etc. (see the second sketch below).
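To make the differential privacy idea concrete, below is a minimal sketch of the classic Laplace mechanism: noise scaled to the query’s sensitivity and a privacy budget (epsilon) is added to an aggregate count, so no single person’s inclusion can be inferred from the published figure. The epsilon value and the example count are illustrative assumptions, not recommendations.

```python
import random

def laplace_noise(scale: float) -> float:
    """Sample Laplace(0, scale) noise as the difference of two exponentials."""
    return random.expovariate(1.0 / scale) - random.expovariate(1.0 / scale)

def dp_count(true_count: int, sensitivity: float = 1.0, epsilon: float = 0.5) -> float:
    """Release a count with Laplace noise calibrated to sensitivity / epsilon.
    A smaller epsilon means stronger privacy and a noisier answer."""
    return true_count + laplace_noise(sensitivity / epsilon)

# e.g. reporting how many users clicked a personalised offer:
# dp_count(1204)  -> a value close to 1204, but never exact enough to reveal
#                    whether any one individual was included.
```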
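And as a toy sketch of federated learning, the snippet below shows the heart of federated averaging: each client fits a tiny model on data that never leaves it, and only the fitted coefficient travels to the server to be averaged. Real frameworks add weighting by dataset size, many training rounds and secure aggregation; the single-coefficient linear model here is purely illustrative.

```python
from statistics import mean

def local_update(local_data: list[tuple[float, float]]) -> float:
    """Fit a tiny model (a least-squares slope through the origin) on data
    that never leaves this client."""
    num = sum(x * y for x, y in local_data)
    den = sum(x * x for x, _ in local_data)
    return num / den

def federated_average(client_datasets: list[list[tuple[float, float]]]) -> float:
    """One round of federated averaging: clients share only their fitted
    coefficients, never their raw records, and the server averages them."""
    return mean(local_update(data) for data in client_datasets)

# Three clients, three private datasets that stay "on device":
clients = [
    [(1.0, 2.1), (2.0, 3.9)],
    [(1.0, 1.8), (3.0, 6.3)],
    [(2.0, 4.2), (4.0, 7.9)],
]
global_model = federated_average(clients)  # a shared coefficient of roughly 2.0
```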
Here is our guide to information security, cybersecurity and AI to help you leverage AI in a safe way.
Final words
With AI, just like anything, there’s ideal usage, and then there’s the reality. The reality is that a lot of us are grappling with something very new, doing our best to stay on top. Very few businesses plan to fall foul of data privacy laws or regulations, but it can be tricky to operate in a constantly evolving tech landscape.
- Your policies should consider the ideal and the reality. How can you safeguard your business and its clients? Accountability is key.
- Like anything, it’s about what you do with it. When harnessing AI, ensure you leave enough space for regulatory work and checks too.
- Remember: you can operate in a data-protection-friendly way anywhere and with anything!
Keen to learn more about AI? Check out our blog on how AI is revolutionising publishing or peruse our TimeAI products for ideas on how publishers can benefit from AI.