How harmful content protection works in Microsoft 365 Copilot Chat and agents

Organizations can now have more control over how Copilot handles sensitive or potentially offensive content in users' conversations. The harmful content protection setting in Microsoft 365 Copilot is available through a new policy. This policy is designed to enable access to sensitive material in Copilot Chat and Microsoft agentic experiences for people whose work might demand it.

Example scenarios include:

  • Content moderation
  • Investigations
  • Law enforcement
  • Legal review

This article describes the harmful content protection setting, how it works, and things to keep in mind.

Important

  • For example, if harmful content protection is turned off, you might see sensitive or offensive content in Copilot Chat responses.
  • Responsible AI protections, such as prompt injection defense, copyright safeguards, and image protections remain active, even if harmful content protection is turned off.
  • Harmful content protection settings don't affect images.
  • You can only turn off harmful content protection if your administrator has enabled the feature for you. 

What is harmful content protection?

Harmful content protection is a built-in safeguard for Copilot Chat and agentic experiences. It filters out content related to:

  • Sexual material
  • Violence
  • Hate speech
  • Fairness concerns
  • Self-harm

Harmful content protection helps ensure that Copilot responses remain safe and appropriate for general use.

The new harmful content protection toggle allows you to turn off this protection within a conversation in Copilot Chat and Microsoft agents. This feature allows you to summarize or analyze sensitive or potentially offensive information as part of your workflow.

By default, the harmful content protection toggle is turned on in each new conversation in Copilot Chat and agentic experiences. If you need to turn it off, you can turn it off for a conversation.

How do I turn off harmful content protection?

  1. In your Copilot Chat or agent window, select the More menu, and look for the Harmful content protection toggle, as shown in the following screenshot:

    Screenshot sohwing the Harmful content protection toggle in Microsoft 365 Copilot Chat

  2. Toggle Harmful content protection to Off to turn it off for the remainder of your conversation.

What happens when I turn off harmful content protection?

When you turn off harmful content protection, it applies only to your current conversation. Here's what to expect:

  • Copilot is able to process sensitive information more fully. You'll be able to analyze or summarize potentially harmful or offensive content that was previously blocked.
  • Depending on your queries and the materials you reference, you could see sensitive or offensive content in Copilot Chat or agent responses.
  • Images aren't affected when harmful content protection is turned off.
  • Once harmful content protection is turned off in a conversation, you can't re-enable it in the same conversation. It stays off for the rest of the conversation. However, harmful content protection is turned on in every new conversation in Copilot Chat and agentic experiences.

Important

Responsible AI protections, such as prompt injection defense, copyright safeguards, and image protections remain in place and can't be turned off.

Why don't I see the harmful content protection toggle?

If you don't see Harmful content protection in your Copilot Chat or agent menu, it's most likely because your administrator hasn't turned on this feature for you.

More ways to work with Copilot Chat

​​​​​​​