Sensitive content filter

To help keep AI use safe, Jarvis has a content filter. Basically, we don't want Jarvis writing anything that would make your grandma blush.

The filter aims to detect generated text that could be sensitive or unsafe. It is less about individual words and more about the intent of the writing: content is flagged based on what its projected intent could be.

The filter will make mistakes, but it is built to err on the side of caution. When something in your input does not pass the content filter, a warning message appears in the app.

What to do once you get the warning message

First, look at your paragraph as a whole and try to understand what potential intent is being flagged. Then, try replacing the word or phrase that is causing the flag. If you do not know which word it is, look for words that could read as "unsafe" or "sensitive" and rephrase them.

Example: This special event would release some magic bullet to eliminate all of my relationship difficulties, triggering everlasting love and joy in my life.

  • This was flagged because the phrase "release some magic bullet" reads as unsafe: the filter picks up on the idea of releasing bullets.
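If you want to check a draft yourself before pasting it into Jarvis, the "look for potentially unsafe words" step can be roughly sketched in Python. This is only an illustration: the watch list and the function name are made up for this example, Jarvis's actual filter criteria are not published, and the real filter judges intent rather than matching single words. A hit here only suggests where to start rephrasing.

```python
import re

# Hypothetical watch list (an assumption for this sketch).
# The real filter is intent-based and does not publish a word list.
WATCH_WORDS = {"bullet", "bullets", "eliminate", "triggering"}


def find_candidate_words(text, watch_words=WATCH_WORDS):
    """Return watch-list words found in text, in order of first appearance."""
    found = []
    # Lowercase the text and split it into simple word tokens.
    for word in re.findall(r"[a-z']+", text.lower()):
        if word in watch_words and word not in found:
            found.append(word)
    return found


example = (
    "This special event would release some magic bullet to eliminate "
    "all of my relationship difficulties, triggering everlasting love "
    "and joy in my life."
)

print(find_candidate_words(example))
```

Any word this returns is a candidate for rewording (for example, "magic bullet" could become "perfect solution"). An empty result does not guarantee the text will pass.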

Enabling a more flexible content filter

Some accounts run into the content filter for topics that we'd consider safe. These are flagged in error. After review, we will move these accounts to a more flexible filter.

If you believe your account is being filtered in error, send us an email at [email protected]. Please be patient while we get back to you.

We review each use case that may be hitting a false positive in the content filter and will respond once we have.

In your message, please include the input that was flagged. Please note that some use cases are blocked by the content filter for valid reasons. Not every input that triggers the warning was flagged by mistake.
