Moderation

This module reviews content entered by end-users and content generated by the LLM within an application. It provides two extension points.

Extension Points

  • app.moderation.input - Extension point for reviewing end-user input content

    • Reviews the variable values passed in by end-users and, in conversational applications, the dialogue input.

  • app.moderation.output - Extension point for reviewing LLM output content

    • Used to review the content output by the LLM.

    • When the LLM output is streamed, the content is sent to the API in 100-character segments, so that longer outputs can be reviewed without delay.
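To make the 100-character segmentation concrete, here is a minimal sketch of how a long output would break into segments. The function name `segmentText` is hypothetical, and counting characters as UTF-16 code units is an assumption; the text above does not say how Dify counts characters or how it buffers a stream internally.

```typescript
// Illustrative only: split text into fixed-size segments.
// Assumption: a "character" is one UTF-16 code unit.
function segmentText(text: string, size: number = 100): string[] {
  if (size <= 0) {
    throw new RangeError("size must be positive");
  }
  const segments: string[] = [];
  for (let i = 0; i < text.length; i += size) {
    segments.push(text.slice(i, i + size));
  }
  return segments;
}

// Build a 250-character sample string.
let sample = "";
for (let i = 0; i < 25; i++) {
  sample += "0123456789";
}

const segments = segmentText(sample);
console.log(segments.map((s) => s.length)); // [ 100, 100, 50 ]
```

Under this sketch, a 250-character output would produce three review requests, the last one carrying the 50-character remainder.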

app.moderation.input Extension Point

When Content Moderation > Review Input Content is enabled in applications like Chatflow, Agent, or Chat Assistant, Dify will send the following HTTP POST request to the corresponding API extension:

Request Body

{
    "point": "app.moderation.input", // Extension point type, fixed as app.moderation.input here
    "params": {
        "app_id": string,  // Application ID
        "inputs": {  // Variable values passed in by end-users, key is the variable name, value is the variable value
            "var_1": "value_1",
            "var_2": "value_2",
            ...
        },
        "query": string | null  // Current dialogue input content from the end-user, fixed parameter for conversational applications.
    }
}
  • Example
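As a rough illustration, the request body described above could be typed and validated on the receiving side as follows. The sample values, the helper name `isModerationInputRequest`, and the assumption that every variable value is a string are all hypothetical, not taken from the specification.

```typescript
// Shape of the app.moderation.input request body, following the fields listed above.
interface ModerationInputRequest {
  point: "app.moderation.input";
  params: {
    app_id: string;
    inputs: Record<string, string>; // assumption: all variable values are strings
    query: string | null;
  };
}

// Hypothetical sample payload; every value is made up for illustration.
const sampleRequest: ModerationInputRequest = {
  point: "app.moderation.input",
  params: {
    app_id: "example-app-id",
    inputs: { var_1: "value_1", var_2: "value_2" },
    query: "Hello, how are you?",
  },
};

// Narrowing helper (hypothetical name) for untyped JSON received over HTTP.
function isModerationInputRequest(body: unknown): body is ModerationInputRequest {
  if (typeof body !== "object" || body === null) return false;
  const b = body as { point?: unknown; params?: unknown };
  if (b.point !== "app.moderation.input") return false;
  if (typeof b.params !== "object" || b.params === null) return false;
  const p = b.params as { app_id?: unknown; inputs?: unknown; query?: unknown };
  return (
    typeof p.app_id === "string" &&
    typeof p.inputs === "object" &&
    p.inputs !== null &&
    (typeof p.query === "string" || p.query === null)
  );
}
```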

API Response Specifications

  • Example

    • action=direct_output

    • action=overridden
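The two actions listed above could be modeled roughly as follows. The text here names only the `action` values; the other field names (`flagged`, `preset_response`, `inputs`, `query`) and the reading of each action (reply with a preset message versus continue with rewritten content) are assumptions in this sketch and should be checked against the API response specification. The helper names are hypothetical.

```typescript
// Assumed response shapes; only the `action` values come from the text above.
type InputModerationResponse =
  | { flagged: boolean; action: "direct_output"; preset_response: string }
  | {
      flagged: boolean;
      action: "overridden";
      inputs: Record<string, string>;
      query: string | null;
    };

// Assumed meaning: answer the end-user directly with a preset message.
function directOutput(presetResponse: string): InputModerationResponse {
  return { flagged: true, action: "direct_output", preset_response: presetResponse };
}

// Assumed meaning: replace the original inputs and query with rewritten values.
function overridden(
  inputs: Record<string, string>,
  query: string | null
): InputModerationResponse {
  return { flagged: true, action: "overridden", inputs, query };
}

// Small helper showing how a caller could branch on the action.
function summarize(res: InputModerationResponse): string {
  if (res.action === "direct_output") {
    return "reply with preset: " + res.preset_response;
  }
  return "continue with rewritten query: " + String(res.query);
}

console.log(summarize(directOutput("Your content violates our usage policy.")));
console.log(summarize(overridden({ var_1: "***" }, "rewritten question")));
```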

app.moderation.output Extension Point

When Content Moderation > Review Output Content is enabled in applications like Chatflow, Agent, or Chat Assistant, Dify will send the following HTTP POST request to the corresponding API extension:

Request Body

  • Example

API Response Specifications

  • Example

    • action=direct_output

    • action=overridden

Code Example

Below is an example src/index.ts that can be deployed on Cloudflare. (For complete instructions on using Cloudflare, please refer to this document.)

The code uses keyword matching to filter input (user-entered content) and output (content returned by the LLM). You can modify the matching logic to suit your needs.
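The keyword-matching idea can be sketched as a pure function, shown below. This is not the original src/index.ts: the keyword list, the function names, the `text` field on output requests, and the response fields `flagged` and `preset_response` are all assumptions made for illustration. In a Cloudflare deployment, a function like `moderate` would be called from the Worker's request handler after parsing the JSON body.

```typescript
// Hypothetical keyword list; replace with your own terms.
const BLOCKED_KEYWORDS: string[] = ["forbidden", "blocked-term"];

// Case-insensitive substring match against the keyword list.
function containsBlockedKeyword(
  text: string,
  keywords: string[] = BLOCKED_KEYWORDS
): boolean {
  const lower = text.toLowerCase();
  return keywords.some((k) => lower.indexOf(k.toLowerCase()) !== -1);
}

interface ModerationRequest {
  point: string;
  params: {
    inputs?: Record<string, string>;
    query?: string | null;
    text?: string; // assumed field name for LLM output
  };
}

interface ModerationResult {
  flagged: boolean;
  action: "direct_output";
  preset_response: string; // assumed field name
}

function moderate(req: ModerationRequest): ModerationResult {
  const texts: string[] = [];
  if (req.point === "app.moderation.input") {
    // Check every variable value plus the dialogue input, if any.
    const inputs = req.params.inputs || {};
    for (const key of Object.keys(inputs)) {
      const value = inputs[key];
      if (typeof value === "string") texts.push(value);
    }
    if (req.params.query) texts.push(req.params.query);
  } else if (req.point === "app.moderation.output") {
    if (req.params.text) texts.push(req.params.text);
  }
  const flagged = texts.some((t) => containsBlockedKeyword(t));
  return {
    flagged,
    action: "direct_output",
    preset_response: flagged ? "Your content violates our usage policy." : "",
  };
}
```

Substring matching is deliberately simple; it will also flag keywords embedded in longer words, so stricter logic (word boundaries, allow-lists) may be needed for production use.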
