
3. Select the Indexing Method and Retrieval Setting


After selecting the chunking mode, the next step is to define the indexing method for structured content.

Setting the Indexing Method

Just as search engines use efficient indexing algorithms to surface the results most relevant to user queries, the indexing method you select directly affects how efficiently the LLM can retrieve knowledge base content and how accurate its responses are.

The knowledge base offers two indexing methods: High-Quality and Economical, each with different retrieval setting options:

Note: The original Q&A mode (available only in the Community Edition) is now an optional feature under the High-Quality indexing method.

High Quality

In High-Quality Mode, the Embedding model converts text chunks into numerical vectors, enabling efficient compression and storage of large volumes of textual information. This allows for more precise matching between user queries and text.
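To illustrate the principle, here is a toy sketch, not Dify's implementation; the embed function below is a deterministic stand-in for whichever Embedding model you configure:

```python
import hashlib
import numpy as np

def embed(text: str) -> np.ndarray:
    """Toy stand-in for a real Embedding model call."""
    seed = int(hashlib.md5(text.encode()).hexdigest(), 16) % (2**32)
    v = np.random.default_rng(seed).normal(size=384)  # toy 384-dim vector
    return v / np.linalg.norm(v)  # unit length, so dot product = cosine similarity

# Index phase: every text chunk becomes a stored vector.
chunks = ["Dify supports hybrid search.", "Embedding models vectorize text chunks."]
index = np.stack([embed(c) for c in chunks])

# Query phase: embed the query, then take the chunk with the highest cosine similarity.
# (Scores are meaningless with the toy embedder; a real model makes them semantic.)
query_vec = embed("How are text chunks turned into vectors?")
best = int(np.argmax(index @ query_vec))
print(chunks[best])
```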

Once the text chunks are vectorized and stored in the database, an effective retrieval method is required to fetch the chunks that match user queries. The High-Quality indexing method offers three retrieval settings: Vector Search, Full-Text Search, and Hybrid Search. For more details, see Setting the Retrieval Setting below.

After selecting High Quality mode, the indexing method for the knowledge base cannot be downgraded to Economical mode later. If you need to switch, it is recommended to create a new knowledge base and select the desired indexing method.

High Quality Indexing Mode

Enable Q&A Mode (Optional, Community Edition Only)

When this mode is enabled, the system segments the uploaded text and automatically generates Q&A pairs for each segment after summarizing its content.

Compared with the common Q to P strategy (user questions matched with text paragraphs), the Q&A mode uses a Q to Q strategy (questions matched with questions).

This approach is particularly effective because the text in FAQ documents is often written in natural language with complete grammatical structures.

The Q to Q strategy makes the matching between questions and answers clearer and better supports scenarios with high-frequency or highly similar questions.

When a user asks a question, the system identifies the most similar question and returns the corresponding chunk as the answer. This approach is more precise, as it directly matches the user’s query, helping them retrieve the exact information they need.
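A hedged sketch of the Q to Q idea, reusing the toy embed helper from the sketch above (the Q&A pairs here are illustrative; Dify generates them automatically from your segments):

```python
# Q&A pairs the system might generate from one document segment (illustrative).
qa_pairs = [
    ("How do I reset my password?", "Go to Settings > Account > Reset Password."),
    ("How do I change my email address?", "Open Settings > Account > Email."),
]

# Q to Q: embed the generated *questions* and match the user's question against
# them, then return the answer stored alongside the closest question.
question_vectors = np.stack([embed(q) for q, _ in qa_pairs])
user_question = "I forgot my password, what should I do?"
best = int(np.argmax(question_vectors @ embed(user_question)))
print(qa_pairs[best][1])
```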

Economical

Using 10 keywords per chunk for retrieval, this method consumes no tokens, at the cost of reduced retrieval accuracy. For the retrieved chunks, only the inverted index method is available to select the most relevant ones; see the Economical retrieval setting below for details.

If the performance of the economical indexing method does not meet your expectations, you can upgrade to the High-Quality indexing method on the Knowledge settings page.

Setting the Retrieval Setting

Once the knowledge base receives a user query, it searches existing documents according to the preset retrieval method and extracts highly relevant content chunks. These chunks provide essential context for the LLM, ultimately affecting the accuracy and credibility of its answers.

Common retrieval methods include:

  1. Semantic retrieval based on vector similarity, where text chunks and queries are converted into vectors and matched via similarity scoring.

  2. Keyword matching using an inverted index, a standard search-engine technique.

Both retrieval methods are supported in Dify’s knowledge base. The specific retrieval options available depend on the chosen indexing method.
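For intuition about the second method, here is a minimal inverted index, a deliberate simplification of what a production full-text engine does (tokenization is naive whitespace splitting):

```python
from collections import defaultdict

docs = {
    0: "hybrid search combines vector search and full text search",
    1: "an inverted index maps each term to the documents containing it",
}

# Build: term -> set of document ids containing that term.
inverted = defaultdict(set)
for doc_id, text in docs.items():
    for term in text.lower().split():
        inverted[term].add(doc_id)

# Query: return documents containing any query term, ranked by term overlap.
def keyword_search(query: str):
    hits = defaultdict(int)
    for term in query.lower().split():
        for doc_id in inverted.get(term, ()):
            hits[doc_id] += 1
    return sorted(hits, key=hits.get, reverse=True)

print(keyword_search("inverted index search"))  # [1, 0]
```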

High Quality

In the High-Quality Indexing Method, Dify offers three retrieval settings: Vector Search, Full-Text Search, and Hybrid Search.

Vector Search

Definition: Vectorize the user’s question to generate a query vector, then compare it with the corresponding text vectors in the knowledge base to find the nearest chunks.

Vector Search Settings:

Rerank Model: Disabled by default. When enabled, a third-party Rerank model will sort the text chunks returned by Vector Search to optimize results. This helps the LLM access more precise information and improve output quality. Before enabling this option, go to Settings → Model Providers and configure the Rerank model’s API key.

Enabling this feature will consume tokens from the Rerank model. For more details, refer to the associated model’s pricing page.

TopK: Determines how many text chunks, deemed most similar to the user’s query, are retrieved. It also automatically adjusts the number of chunks based on the chosen model’s context window. The default value is 3, and higher numbers will recall more text chunks.

Score Threshold: Sets the minimum similarity score required for a chunk to be retrieved. Only chunks exceeding this score are retrieved. The default value is 0.5. Higher thresholds demand greater similarity and thus result in fewer chunks being retrieved.

The TopK and Score configurations are only effective during the Rerank phase. Therefore, to apply either of these settings, it is necessary to add and enable a Rerank model.
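A sketch of how TopK and Score Threshold interact with the rerank phase (rerank_score is a toy stand-in for a third-party Rerank model, and this is not Dify's actual pipeline code):

```python
def rerank_score(query: str, chunk: str) -> float:
    """Toy stand-in for a Rerank model: fraction of query terms found in the chunk."""
    q_terms = set(query.lower().split())
    return len(q_terms & set(chunk.lower().split())) / max(len(q_terms), 1)

def apply_retrieval_settings(query, candidates, top_k=3, score_threshold=0.5):
    # Rerank phase: score every candidate chunk against the query.
    scored = [(chunk, rerank_score(query, chunk)) for chunk in candidates]
    # Score Threshold: keep only chunks exceeding the minimum similarity score.
    kept = [(c, s) for c, s in scored if s > score_threshold]
    # TopK: return the highest-scoring chunks, at most top_k of them.
    return sorted(kept, key=lambda pair: pair[1], reverse=True)[:top_k]
```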

Full-Text Search

Definition: Indexes all terms in the document, allowing users to query any term and retrieve the text chunks containing it.

Rerank Model: Disabled by default. When enabled, a third-party Rerank model will sort the text chunks returned by Full-Text Search to optimize results. This helps the LLM access more precise information and improve output quality. Before enabling this option, go to Settings → Model Providers and configure the Rerank model’s API key.

Enabling this feature will consume tokens from the Rerank model. For more details, refer to the associated model’s pricing page.

TopK: Determines how many text chunks, deemed most similar to the user’s query, are retrieved. It also automatically adjusts the number of chunks based on the chosen model’s context window. The default value is 3, and higher numbers will recall more text chunks.

Score Threshold: Sets the minimum similarity score required for a chunk to be retrieved. Only chunks exceeding this score are retrieved. The default value is 0.5. Higher thresholds demand greater similarity and thus result in fewer chunks being retrieved.

The TopK and Score configurations are only effective during the Rerank phase. Therefore, to apply either of these settings, it is necessary to add and enable a Rerank model.

Hybrid Search

Definition: This process combines full-text search and vector search, performing both simultaneously. It includes a reordering step to select the best-matching results from both search outcomes based on the user’s query.

In this mode, you can either specify weight settings, which require no Rerank model API configuration, or enable a Rerank model for retrieval. A sketch of weighted scoring follows the settings below.

  • Weight Settings

    This feature enables users to set custom weights for semantic priority and keyword priority. Keyword search refers to performing a full-text search within the knowledge base, while semantic search involves vector search within the knowledge base.

    • Semantic Value of 1

      This activates only the semantic search mode. Utilizing embedding models, even if the exact terms from the query do not appear in the knowledge base, the search can delve deeper by calculating vector distances, thus returning relevant content. Additionally, when dealing with multilingual content, semantic search can capture meaning across different languages, providing more accurate cross-language search results.

    • Keyword Value of 1

      This activates only the keyword search mode. It performs a full match against the input text in the knowledge base, suitable for scenarios where the user knows the exact information or terminology. This approach consumes fewer computational resources and is ideal for quick searches within a large document knowledge base.

    • Custom Keyword and Semantic Weights

      In addition to enabling only semantic search or keyword search, we provide flexible custom weight settings. You can continuously adjust the weights of the two methods to identify the optimal weight ratio that suits your business scenario.


  • Rerank Model

    Disabled by default. When enabled, a third-party Rerank model will sort the text chunks returned by Hybrid Search to optimize results. This helps the LLM access more precise information and improve output quality. Before enabling this option, go to Settings → Model Providers and configure the Rerank model’s API key.

    Enabling this feature will consume tokens from the Rerank model. For more details, refer to the associated model’s pricing page.

The "Weight Settings" and "Rerank Model" settings support the following options:

TopK: Determines how many text chunks, deemed most similar to the user’s query, are retrieved. It also automatically adjusts the number of chunks based on the chosen model’s context window. The default value is 3, and higher numbers will recall more text chunks.

Score Threshold: Sets the minimum similarity score required for a chunk to be retrieved. Only chunks exceeding this score are retrieved. The default value is 0.5. Higher thresholds demand greater similarity and thus result in fewer chunks being retrieved.
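As promised above, here is a sketch of what weighted hybrid scoring can look like (the min-max normalization and the 0.7 weight are illustrative assumptions, not Dify's published formula):

```python
def normalize(scores):
    """Min-max normalize so keyword and semantic scores share a 0-1 scale."""
    lo, hi = min(scores.values()), max(scores.values())
    span = (hi - lo) or 1.0
    return {doc: (s - lo) / span for doc, s in scores.items()}

def hybrid_scores(semantic, keyword, semantic_weight=0.7):
    """Combine per-chunk scores from both retrievers with a user-set weight.

    semantic_weight=1.0 reproduces pure semantic search,
    semantic_weight=0.0 pure keyword search.
    """
    semantic, keyword = normalize(semantic), normalize(keyword)
    kw_weight = 1.0 - semantic_weight
    return {
        c: semantic_weight * semantic.get(c, 0.0) + kw_weight * keyword.get(c, 0.0)
        for c in set(semantic) | set(keyword)
    }

combined = hybrid_scores(
    semantic={"chunk_a": 0.82, "chunk_b": 0.41},
    keyword={"chunk_b": 2.0, "chunk_c": 1.0},
    semantic_weight=0.7,
)
print(sorted(combined, key=combined.get, reverse=True))  # ['chunk_a', 'chunk_b', 'chunk_c']
```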

Economical

In Economical Indexing mode, only the inverted index approach is available. An inverted index is a data structure designed for fast keyword retrieval within documents, commonly used in online search engines. Inverted indexing supports only the TopK setting.

TopK: Determines how many text chunks, deemed most similar to the user’s query, are retrieved. It also automatically adjusts the number of chunks based on the chosen model’s context window. The default value is 3, and higher numbers will recall more text chunks.
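A minimal sketch of this keyword-based retrieval (the frequency-based top_keywords function is a naive stand-in for whatever keyword extraction Dify actually applies):

```python
from collections import Counter

def top_keywords(text: str, n: int = 10):
    """Naive stand-in: the n most frequent terms act as the chunk's keywords."""
    return {term for term, _ in Counter(text.lower().split()).most_common(n)}

chunks = [
    "billing invoices are emailed at the start of every month",
    "workspace members can be invited from the settings page",
]
keyword_index = [top_keywords(c) for c in chunks]  # built once, no token cost

def economical_search(query: str, top_k: int = 3):
    q_terms = set(query.lower().split())
    ranked = sorted(
        range(len(chunks)),
        key=lambda i: len(q_terms & keyword_index[i]),
        reverse=True,
    )
    return [chunks[i] for i in ranked[:top_k] if q_terms & keyword_index[i]]

print(economical_search("when are invoices emailed"))
```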


Reference

After specifying the retrieval settings, you can refer to the following documentation to review how keywords match with content chunks in different scenarios.



  • Retrieval Test / Citation and Attributions