Organizations can now have more control over how Copilot handles sensitive or potentially offensive content in users' conversations. The harmful content protection setting in Microsoft 365 Copilot Chat is available through a new policy. This policy is designed to enable access to sensitive material in Copilot Chat for people whose work might demand it. Example scenarios include:
-
Content moderation
-
Investigations
-
Law enforcement
-
Legal review
This article describes the harmful content protection setting, how it works, and things to keep in mind.
Important:Â
-
If harmful content protection is disabled, you might see sensitive or offensive content in Copilot Chat.
-
Responsible AI protections, such as prompt injection defense, copyright safeguards, and image protections remain active, even if harmful content protection is disabled.
-
Harmful content protection settings don't affect images or agents.
-
You can only disable harmful content protection if your administrator has enabled the feature for you.Â
What is harmful content protection?
Harmful content protection is a built-in safeguard for Copilot Chat. It filters out content related to:
-
Sexual material
-
Violence
-
Hate speech
-
Fairness concerns
-
Self harm
Harmful content protection helps ensure that Copilot responses remain safe and appropriate for general use.
The new harmful content protection toggle enables you to turn off this protection within a conversation in Copilot Chat. This feature allows you to summarize or analyze sensitive or potentially offensive information as part of your workflow.
By default, the harmful content protection toggle is enabled in each new conversation in Copilot Chat. If you need to turn it off, you can disable it for a conversation.
How do I disable harmful content protection?
-
In your Copilot Chat window, select the More menu, and look for the Harmful content protection toggle, as shown in the following screenshot:
-
Toggle Harmful content protection to off to disable it for the remainder of your conversation.
What happens when I disable harmful content protection?
When you disable harmful content protection, it applies only to your current conversation. Here's what to expect:
-
Copilot is able to process sensitive information more fully. You'll be able to analyze or summarize potentially harmful or offensive content that was previously blocked.
-
Depending on your queries and the materials you reference, you could see sensitive or offensive content in Copilot Chat.
-
Images and agents are not affected when harmful content protection is disabled.
-
Once harmful content protection is disabled in a conversation, you can't re-enable it in the same conversation. It stays off for the rest of the conversation. However, harmful content protection is enabled in every new conversation in Copilot Chat.
Important:Â Responsible AI protections, such as prompt injection defense, copyright safeguards, and image protections remain in place and cannot be turned off.
Why don't I see the harmful content protection toggle?
If you don't see Harmful content protection in your Copilot Chat menu, it's most likely because your administrator hasn't enabled this feature for you.
Related articles
​​​​​​​