The security implications of Microsoft Copilot

Microsoft Copilot offers many benefits to your organization, but it also brings risk. Learn how to harness the power of Copilot while keeping your data secure and protecting your customers' privacy.

Written by Adam Roberts

Published: July 29, 2024


Since the generative AI market exploded in 2022 with the arrival of OpenAI’s ChatGPT, a chatbot built on a large language model (LLM), professionals have become increasingly reliant on the technology to get their work done. Meanwhile, the technology industry has raced to reorient itself around AI.

In March 2023, Microsoft, an investor in OpenAI, unveiled its offering: Copilot, which it said combined the power of an LLM with a user’s data and Microsoft 365 apps. Copilot is embedded into Microsoft 365 apps such as Word, Excel, and Teams, and can be used in many of the standard ways GenAI apps are used: to generate text, answer questions about data, and make suggestions.

But Copilot also works across all of a user’s Microsoft 365 apps and data, in a mode called “Business Chat”, allowing users to update information in one app based on activity in another. To take an example from Microsoft’s Copilot launch announcement, you could ask Copilot, “Tell my team how we updated the product strategy,” and it will generate a status update based on the morning’s meetings, emails and chat threads.

As Copilot is built into Microsoft 365, the model has access to all of the data a user does, resulting in a lower level of friction and, the company argues, an improved user experience. Rather than keeping ChatGPT or Claude open in a window and dealing with the hassle of copying and pasting text between apps—not to mention the security and privacy implications of using customer data in free AI platforms—employees can have GenAI tightly integrated into their work tooling and experience.  

Copilot differs from ChatGPT and similar tools in the level of access it has to data. With standalone GenAI platforms, users mainly interact through a chat window and manually input a limited amount of data along with a given prompt. The user is generally more aware of the data they are providing and can take steps to generalize or anonymize it. Copilot, on the other hand, pairs an existing LLM with company data that is retrieved and supplied as additional context at query time, a technique known as retrieval-augmented generation (RAG). In theory, this offers a more relevant and targeted solution, grounded in your own context and content.
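To make the RAG idea concrete, here is a minimal sketch in Python. It is not how Copilot is implemented: the `Document` class, the word-overlap scoring, and the prompt layout are all assumptions chosen for illustration, and a production system would use a proper search index and an actual LLM call. The point is only the shape of the flow: retrieve relevant company content first, then hand it to the model as context alongside the question.

```python
from dataclasses import dataclass


@dataclass
class Document:
    title: str
    text: str


def tokenize(text: str) -> set[str]:
    """Lowercase the text and split it into a set of words without punctuation."""
    return {word.strip(".,?!:;\"'()").lower() for word in text.split()} - {""}


def retrieve(query: str, documents: list[Document], top_k: int = 2) -> list[Document]:
    """Rank documents by how many query words they share and keep the best top_k."""
    query_words = tokenize(query)
    scored = [(len(query_words & tokenize(doc.text)), doc) for doc in documents]
    # Drop documents with no overlap, then sort by score, highest first.
    matches = [(score, doc) for score, doc in scored if score > 0]
    matches.sort(key=lambda pair: pair[0], reverse=True)
    return [doc for _, doc in matches[:top_k]]


def build_prompt(query: str, context: list[Document]) -> str:
    """Assemble the prompt: retrieved passages first, then the user's question."""
    passages = "\n".join(f"[{doc.title}] {doc.text}" for doc in context)
    return f"Context:\n{passages}\n\nQuestion: {query}"
```

In this sketch, whatever `retrieve` returns ends up in the prompt, so the quality and sensitivity of the underlying documents directly shape what the model can say.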

Key Copilot use cases

While Copilot can perform many of the same productivity and creativity tasks as other GenAI platforms, its built-in nature makes it particularly suited to document creation and editing. Because it draws on your company knowledge, it has a massive amount of context to work with, leading, at least in theory, to more tailored content and more useful suggestions. The platform also offers a “super spellcheck/grammar check”, and guidance for tasks like entering Excel formulae.

Copilot can also retrieve data much more easily than a human can, making it easier to cite or check internal or external resources when preparing reports or making recommendations.

According to Gartner, 55% of organizations have implemented or are implementing generative AI, with Copilot the obvious starting point given its simplified implementation. But the platform does come with risk.  

Risks and concerns

While Microsoft Copilot offers “commercial data protection”, the US House of Representatives has banned its use by congressional staff, due to the risk of data leaking to non-approved services, and Gartner also urges caution. There are a number of risks associated with the platform. They can be grouped into:

  • Data compliance risks
  • Data quality risks
  • Amplification of existing weaknesses in security

Copilot poses data safety and quality risks

As with all GenAI platforms, Copilot is only as good, and as compliant, as the data it draws on. As far as possible, avoid feeding it sensitive customer data: under many regulations, you need a person’s consent before using their data for a purpose such as an AI model.

Data compliance

You must also ensure any sensitive and confidential information is classified appropriately.

Consider a scenario where a business or financial analyst asks Copilot to generate a quarterly financial report. If data is improperly classified, the model may incorporate sensitive and confidential data into the report without labeling it as such. The report could then be shared with external parties such as potential investors, breaching privacy and confidentiality policies and potentially harming the company’s value.

Data quality

Then there is redundant, obsolete, and trivial (ROT) data. There is often much more ROT than there is legitimate data: think test files or drafts. The presence of ROT can skew Copilot’s output and lead the model to provide poor-quality or incorrect information based on irrelevant content. A simple example: you may ask Copilot how much Personally Identifiable Information (PII) your organization holds and get an incorrect number back, which may lead you to take the wrong action.
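As a rough illustration of what hunting for ROT can look like, the Python sketch below flags files as redundant (identical content to a file already seen), obsolete (not modified for a long time), or trivial (very small). The `FileRecord` class, the three-year age limit, and the 20-byte size threshold are all hypothetical choices for this example; real thresholds depend on your retention rules and the kind of content you hold.

```python
import hashlib
from dataclasses import dataclass
from datetime import date, timedelta


@dataclass
class FileRecord:
    path: str
    content: bytes
    last_modified: date


def flag_rot(files, today, max_age_days=365 * 3, min_bytes=20):
    """Return {path: reason} for files that look redundant, obsolete, or trivial."""
    flagged = {}
    seen_hashes = {}  # content hash -> path of the first file with that content
    for record in files:
        digest = hashlib.sha256(record.content).hexdigest()
        if digest in seen_hashes:
            flagged[record.path] = f"redundant (duplicate of {seen_hashes[digest]})"
            continue
        seen_hashes[digest] = record.path
        if today - record.last_modified > timedelta(days=max_age_days):
            flagged[record.path] = "obsolete (not modified within the age limit)"
        elif len(record.content) < min_bytes:
            flagged[record.path] = "trivial (below the minimum size)"
    return flagged
```

A list like this is a starting point for human review, not an automatic deletion queue: an old file may still be subject to a legal retention requirement.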

Copilot can amplify existing security weaknesses

Because of the way Copilot is integrated into your business data, the service can act as an accelerant, making any existing poor data security and data governance practices more dangerous.

Since Copilot’s access to data is based on a given user’s permissions, the platform can reach any sensitive content that user can. The problem is that, in many cases, users have far too much access to far too much sensitive data. A study by Concentric found that 15% of business-critical resources were at risk of oversharing, meaning they could be seen by internal and external users who should not have access, and that 90% of business-critical documents were shared outside the C-suite. When did you last audit your employees’ access?

Bringing Copilot into the organization without proper preparation could result in a series of “own goals”: an over-permissioned user could simply ask Copilot for the CEO’s salary, or request sensitive employee performance records, breaching privacy policies and leading to internal chaos.  

An attacker who has gained access to such a user’s account can ask for much more, triggering a major data breach. And reports or documents generated by Copilot could inadvertently contain sensitive or confidential company information, leading to a privacy breach when shared externally.
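The mechanism behind this amplification can be shown with a toy model of permission-scoped search. The Python sketch below is an assumption-laden simplification, not Copilot's actual retrieval logic: `Doc`, `allowed_users`, and `search_as` are hypothetical names, and matching is a plain substring test. What it illustrates is that the assistant faithfully honors whatever permissions exist, so an overly broad grant surfaces sensitive content as readily as a legitimate one.

```python
from dataclasses import dataclass, field


@dataclass
class Doc:
    title: str
    text: str
    allowed_users: set = field(default_factory=set)


def search_as(user, query, docs):
    """Return titles of documents that mention the query AND that the user may open."""
    return [
        doc.title
        for doc in docs
        if user in doc.allowed_users and query.lower() in doc.text.lower()
    ]
```

If an intern has mistakenly been granted access to an executive compensation file, a search for "salary" run as that intern returns the file; the same search run as a correctly scoped colleague does not.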

How to use Copilot safely

To implement good AI practices in your organization, you need to invest in strong data governance and keep customer privacy at the forefront of your strategy.

Remove ROT, along with any data you are not entitled to hold under applicable privacy laws. Classify the remaining data according to sensitivity, so you understand your data risk before deploying Copilot.

The next stage is to ensure your data is locked down to just the people who need access. Perform an audit of who can access what data, and ensure you remove the risk of over-permissioned users. Overall, make sure you can accurately track data provenance: the origins, ownership, and lineage of data throughout its lifecycle.
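One simple way to frame an access audit is to compare what each user has been granted against what their role actually needs. The Python sketch below assumes you can export three mappings from your own systems (grants per user, role per user, and required resources per role); the function name and data shapes are hypothetical, and real environments will also need to account for group membership and sharing links.

```python
def find_excess_access(grants, user_roles, role_needs):
    """Return {user: sorted resources the user can access but their role does not need}."""
    report = {}
    for user, granted in grants.items():
        # Users with no known role are treated as needing nothing.
        needed = role_needs.get(user_roles.get(user), set())
        excess = set(granted) - set(needed)
        if excess:
            report[user] = sorted(excess)
    return report
```

Every entry in the resulting report is a candidate for revocation, or at least for a conversation with the data owner.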

When it comes to deploying Copilot, pick a few narrow use cases and deploy in stages, rather than letting your organization loose on the technology (or the reverse). Have subject matter experts vet results to determine whether they contain sensitive data or are influenced by ROT and, as always with GenAI, treat the results as a starting point, not an end result.

The RecordPoint solution

RecordPoint provides solutions to help you safely and securely deploy Copilot, by enabling you to know what your data estate looks like and where the risk lies.

Data discovery

RecordPoint connects to all of your essential business systems, including Microsoft 365, allowing you to understand the data you have and where it lives through continuous inventory and cataloging, so you are ready to act on it. You can view every data asset you possess in one central place, including dark data, without having to move a thing.

Data understanding

Once you have discovered your data, RecordPoint’s machine learning and security models help you identify and locate sensitive data, like PII and Payment Card Industry (PCI) data, so you can manage access, and take action to protect it.  

The platform’s Intelligence Signaling feature scans incoming data and records for PII—from critical PII like social security numbers, tax file numbers, driver’s license numbers, and passport details, to less sensitive PII like names, email addresses, and phone numbers—as well as PCI data.
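For readers unfamiliar with PII scanning, the general idea can be sketched with a few regular expressions in Python. This is not RecordPoint's implementation, which the article describes as using machine learning and security models; the patterns below are deliberately simplified assumptions (a generic email shape, the US social security number format, and a ten-digit phone format) and would miss many real-world variants.

```python
import re

# Simplified, illustrative patterns; real detectors are far more thorough.
PII_PATTERNS = {
    "email": re.compile(r"\b[\w.+-]+@[\w-]+(?:\.[\w-]+)+\b"),
    "us_ssn": re.compile(r"\b\d{3}-\d{2}-\d{4}\b"),
    "phone": re.compile(r"\b\d{3}[-.\s]\d{3}[-.\s]\d{4}\b"),
}


def scan_for_pii(text):
    """Return {pii_type: [matches]} for each pattern that matches at least once."""
    found = {}
    for label, pattern in PII_PATTERNS.items():
        matches = pattern.findall(text)
        if matches:
            found[label] = matches
    return found
```

Pattern matching like this tends to produce both false positives and false negatives, which is one reason dedicated tooling layers context and trained models on top.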

RecordPoint’s Classification Intelligence feature allows users to train a machine learning model to automatically categorize data based on content and context. These models can be built through a simple interface and include key features like prediction probability scores.
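To show what a classifier with prediction probability scores looks like in principle, here is a small naive Bayes text classifier in plain Python. It is a teaching sketch only and says nothing about the models RecordPoint uses; the function names, the whitespace tokenization, and the training examples in the usage note are all assumptions.

```python
import math
from collections import Counter, defaultdict


def train(examples):
    """Count words per label from an iterable of (text, label) pairs."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for text, label in examples:
        label_counts[label] += 1
        word_counts[label].update(text.lower().split())
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary


def predict_proba(text, model):
    """Return {label: probability} for the text; the probabilities sum to 1."""
    word_counts, label_counts, vocabulary = model
    total_docs = sum(label_counts.values())
    log_scores = {}
    for label, doc_count in label_counts.items():
        # Laplace (add-one) smoothing so unseen words never give zero probability.
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        score = math.log(doc_count / total_docs)
        for word in text.lower().split():
            if word in vocabulary:  # ignore words never seen in training
                score += math.log((word_counts[label][word] + 1) / denominator)
        log_scores[label] = score
    # Convert log scores to normalized probabilities.
    highest = max(log_scores.values())
    weights = {label: math.exp(score - highest) for label, score in log_scores.items()}
    norm = sum(weights.values())
    return {label: weight / norm for label, weight in weights.items()}
```

Trained on a handful of labeled snippets such as ("invoice payment due", "finance") and ("employee performance review", "hr"), `predict_proba` returns a score per category, and a low top score is a useful signal that a document should go to a human for review rather than being filed automatically.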

Data minimization  

There is no exposure risk for data you do not have. RecordPoint allows you to manage retention and disposal of all your data, in line with regulations like the General Data Protection Regulation (GDPR), to ensure that you store the minimum amount of data you need. If you manage your data minimization correctly, there will be less sensitive data for Copilot to leak, and less ROT to skew your results.
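The core of a retention schedule is a simple date calculation: each category of record has a retention period, and a record becomes eligible for disposal once that period has elapsed. The Python sketch below illustrates this under stated assumptions; the `Record` class and the retention periods in `RETENTION_YEARS` are hypothetical examples, not legal guidance, and real schedules depend on your jurisdiction and obligations.

```python
from dataclasses import dataclass
from datetime import date


@dataclass
class Record:
    name: str
    category: str
    created: date


# Hypothetical retention periods in years; real values come from your legal obligations.
RETENTION_YEARS = {"invoice": 7, "job_application": 1}


def add_years(start, years):
    """Return the same calendar day `years` later, mapping 29 February to 28 if needed."""
    try:
        return start.replace(year=start.year + years)
    except ValueError:  # 29 February in a non-leap target year
        return start.replace(year=start.year + years, day=28)


def due_for_disposal(records, today):
    """Return names of records whose retention period has fully elapsed."""
    due = []
    for record in records:
        years = RETENTION_YEARS.get(record.category)
        if years is None:
            continue  # Unknown category: keep the record and review it manually.
        if today >= add_years(record.created, years):
            due.append(record.name)
    return due
```

Records in categories without a defined period are deliberately skipped rather than disposed of, which errs on the side of keeping data until someone has classified it.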

Data provenance and explainable AI (XAI)

RecordPoint allows you to enhance decision-making and data trustworthiness by tracking the origins, ownership, and lineage of data throughout its lifecycle with data provenance.

With data provenance in place, you can more easily trace AI models back to their data sources, which makes it possible to give stakeholders and regulatory bodies clear explanations of AI decisions: the goal of explainable AI (XAI).
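Lineage tracking can be pictured as a graph in which each data asset records what it was derived from. The Python sketch below walks that graph to list everything upstream of a given asset, such as an AI-generated report. The `Asset` class, the `derived_from` field, and the catalog layout are hypothetical and do not describe RecordPoint's data model.

```python
from dataclasses import dataclass, field


@dataclass
class Asset:
    name: str
    owner: str
    derived_from: list = field(default_factory=list)  # names of parent assets


def trace_lineage(name, catalog):
    """Return every upstream asset of `name`, nearest ancestors first, without duplicates."""
    lineage = []
    queue = list(catalog[name].derived_from)
    seen = {name}  # guards against cycles and repeated ancestors
    while queue:
        parent = queue.pop(0)
        if parent in seen:
            continue
        seen.add(parent)
        lineage.append(parent)
        queue.extend(catalog[parent].derived_from)
    return lineage
```

Given a catalog in which a report was generated from a quarterly summary, which in turn was built from a CRM export and a finance ledger, `trace_lineage` returns the summary first and then both source systems, which is the kind of trail an explanation to a regulator would need.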

Discover a better platform

Whatever your business, if you are considering deploying Copilot and want to ensure you do so safely, RecordPoint can help. Explore the platform now, or book a demo for a full walk-through.
