Dify
English
English
  • Getting Started
    • Welcome to Dify
      • Features and Specifications
      • List of Model Providers
    • Dify Community
      • Deploy with Docker Compose
      • Start with Local Source Code
      • Deploy with aaPanel
      • Start Frontend Docker Container Separately
      • Environment Variables Explanation
      • FAQs
    • Dify Cloud
    • Dify Premium on AWS
    • Dify for Education
  • Guides
    • Model
      • Add New Provider
      • Predefined Model Integration
      • Custom Model Integration
      • Interfaces
      • Schema
      • Load Balancing
    • Application Orchestration
      • Create Application
      • Chatbot Application
        • Multiple Model Debugging
      • Agent
      • Application Toolkits
        • Moderation Tool
    • Workflow
      • Key Concepts
      • Variables
      • Node Description
        • Start
        • End
        • Answer
        • LLM
        • Knowledge Retrieval
        • Question Classifier
        • Conditional Branch IF/ELSE
        • Code Execution
        • Template
        • Doc Extractor
        • List Operator
        • Variable Aggregator
        • Variable Assigner
        • Iteration
        • Parameter Extraction
        • HTTP Request
        • Agent
        • Tools
        • Loop
      • Shortcut Key
      • Orchestrate Node
      • File Upload
      • Error Handling
        • Predefined Error Handling Logic
        • Error Type
      • Additional Features
      • Debug and Preview
        • Preview and Run
        • Step Run
        • Conversation/Run Logs
        • Checklist
        • Run History
      • Application Publishing
      • Structured Outputs
      • Bulletin: Image Upload Replaced by File Upload
    • Knowledge
      • Create Knowledge
        • 1. Import Text Data
          • 1.1 Import Data from Notion
          • 1.2 Import Data from Website
        • 2. Choose a Chunk Mode
        • 3. Select the Indexing Method and Retrieval Setting
      • Manage Knowledge
        • Maintain Documents
        • Maintain Knowledge via API
      • Metadata
      • Integrate Knowledge Base within Application
      • Retrieval Test / Citation and Attributions
      • Knowledge Request Rate Limit
      • Connect to an External Knowledge Base
      • External Knowledge API
    • Tools
      • Quick Tool Integration
      • Advanced Tool Integration
      • Tool Configuration
        • Google
        • Bing
        • SearchApi
        • StableDiffusion
        • Dall-e
        • Perplexity Search
        • AlphaVantage
        • Youtube
        • SearXNG
        • Serper
        • SiliconFlow (Flux AI Supported)
        • ComfyUI
    • Publishing
      • Publish as a Single-page Web App
        • Web App Settings
        • Text Generator Application
        • Conversation Application
      • Embedding In Websites
      • Developing with APIs
      • Re-develop Based on Frontend Templates
    • Annotation
      • Logs and Annotation
      • Annotation Reply
    • Monitoring
      • Data Analysis
      • Integrate External Ops Tools
        • Integrate LangSmith
        • Integrate Langfuse
        • Integrate Opik
    • Extension
      • API-Based Extension
        • External Data Tool
        • Deploy API Tools with Cloudflare Workers
        • Moderation
      • Code-Based Extension
        • External Data Tool
        • Moderation
    • Collaboration
      • Discover
      • Invite and Manage Members
    • Management
      • App Management
      • Team Members Management
      • Personal Account Management
      • Subscription Management
      • Version Control
  • Workshop
    • Basic
      • How to Build an AI Image Generation App
    • Intermediate
      • Build An Article Reader Using File Upload
      • Building a Smart Customer Service Bot Using a Knowledge Base
      • Generating analysis of Twitter account using Chatflow Agent
  • Community
    • Seek Support
    • Become a Contributor
    • Contributing to Dify Documentation
  • Plugins
    • Introduction
    • Quick Start
      • Install and Use Plugins
      • Develop Plugins
        • Initialize Development Tools
        • Tool Plugin
        • Model Plugin
          • Create Model Providers
          • Integrate the Predefined Model
          • Integrate the Customizable Model
        • Agent Strategy Plugin
        • Extension Plugin
        • Bundle
      • Debug Plugin
    • Manage Plugins
    • Schema Specification
      • Manifest
      • Endpoint
      • Tool
      • Agent
      • Model
        • Model Designing Rules
        • Model Schema
      • General Specifications
      • Persistent Storage
      • Reverse Invocation of the Dify Service
        • App
        • Model
        • Tool
        • Node
    • Best Practice
      • Develop a Slack Bot Plugin
      • Dify MCP Plugin Guide: Connect Zapier and Automate Email Delivery with Ease
    • Publish Plugins
      • Publish Plugins Automatically
      • Publish to Dify Marketplace
        • Plugin Developer Guidelines
        • Plugin Privacy Protection Guidelines
      • Publish to Your Personal GitHub Repository
      • Package the Plugin File and Publish it
      • Signing Plugins for Third-Party Signature Verification
    • FAQ
  • Development
    • Backend
      • DifySandbox
        • Contribution Guide
    • Models Integration
      • Integrate Open Source Models from Hugging Face
      • Integrate Open Source Models from Replicate
      • Integrate Local Models Deployed by Xinference
      • Integrate Local Models Deployed by OpenLLM
      • Integrate Local Models Deployed by LocalAI
      • Integrate Local Models Deployed by Ollama
      • Integrate Models on LiteLLM Proxy
      • Integrating with GPUStack for Local Model Deployment
      • Integrating AWS Bedrock Models (DeepSeek)
    • Migration
      • Migrating Community Edition to v1.0.0
  • Learn More
    • Use Cases
      • DeepSeek & Dify Integration Guide: Building AI Applications with Multi-Turn Reasoning
      • Private Deployment of Ollama + DeepSeek + Dify: Build Your Own AI Assistant
      • Build a Notion AI Assistant
      • Create a MidJourney Prompt Bot with Dify
      • Create an AI Chatbot with Business Data in Minutes
      • Integrating Dify Chatbot into Your Wix Website
      • How to connect with AWS Bedrock Knowledge Base?
      • Building the Dify Scheduler
      • Building an AI Thesis Slack Bot on Dify
    • Extended Reading
      • What is LLMOps?
      • Retrieval-Augmented Generation (RAG)
        • Hybrid Search
        • Re-ranking
        • Retrieval Modes
      • How to Use JSON Schema Output in Dify?
    • FAQ
      • Self-Host
      • LLM Configuration and Usage
      • Plugins
  • Policies
    • Open Source License
    • User Agreement
      • Terms of Service
      • Privacy Policy
      • Get Compliance Report
  • Features
    • Workflow
Powered by GitBook
On this page
  • Why is Hybrid Search Needed?
  • Vector Search
  • Full-Text Search
  • Hybrid Search
  • Setting Retrieval Mode When Creating a Dataset
  • Modifying Retrieval Mode in Dataset Settings
  • Modifying Retrieval Mode in Prompt Arrangement
  1. Learn More
  2. Extended Reading
  3. Retrieval-Augmented Generation (RAG)

Hybrid Search

Why is Hybrid Search Needed?

The mainstream method in the retrieval phase of RAG (Retrieval-Augmented Generation) is vector search, which matches based on semantic relevance. The technical principle involves splitting the documents in the external knowledge base into semantically complete paragraphs or sentences, converting them into a series of numbers (multi-dimensional vectors) that the computer can understand, and performing the same conversion on the user's query.

The computer can detect subtle semantic relationships between the user's query and the sentences. For example, "cats chase mice" and "kittens hunt mice" will have a higher semantic relevance than "cats chase mice" and "I like eating ham." After finding the most relevant text content, the RAG system provides it as context for the user's query to the large model, helping it answer the question.

In addition to enabling complex semantic text retrieval, vector search has other advantages:

  • Understanding similar semantics (e.g., mouse/mousetrap/cheese, Google/Bing/search engine)

  • Multilingual understanding (cross-language understanding, such as matching English input with Chinese)

  • Multimodal understanding (support for similar matching of text, images, audio, video, etc.)

  • Fault tolerance (handling spelling errors and vague descriptions)

While vector search has clear advantages in the above scenarios, it performs poorly in certain situations, such as:

  • Searching for names of people or objects (e.g., Elon Musk, iPhone 15)

  • Searching for abbreviations or phrases (e.g., RAG, RLHF)

  • Searching for IDs (e.g., gpt-3.5-turbo, titan-xlarge-v1.01)

These weaknesses are precisely the strengths of traditional keyword search, which excels in:

  • Exact matching (e.g., product names, personal names, product numbers)

  • Matching with a few characters (vector search performs poorly with few characters, but many users tend to input only a few keywords)

  • Matching low-frequency words (low-frequency words often carry significant meaning in language, such as "Would you like to have coffee with me?" where "have" and "coffee" carry more importance than "you" and "like")

For most text search scenarios, the primary goal is to ensure that the most relevant potential results appear in the candidate results. Vector search and keyword search each have their advantages in the retrieval field. Hybrid search combines the strengths of both search technologies and compensates for their weaknesses.

In hybrid search, you need to establish vector indexes and keyword indexes in the database in advance. When a user query is input, the most relevant texts are retrieved from the documents using both retrieval methods.

"Hybrid search" does not have a precise definition. This article uses the combination of vector search and keyword search as an example. If we use other combinations of search algorithms, it can also be called "hybrid search." For instance, we can combine knowledge graph techniques for retrieving entity relationships with vector search techniques.

Different retrieval systems excel at finding various subtle relationships between texts (paragraphs, sentences, words), including exact relationships, semantic relationships, thematic relationships, structural relationships, entity relationships, temporal relationships, event relationships, etc. No single retrieval mode can be suitable for all scenarios. Hybrid search achieves complementarity between multiple retrieval technologies through the combination of multiple retrieval systems.

Vector Search

Definition: Generating query embeddings and querying the text segments most similar to their vector representations.

TopK: Used to filter the text fragments most similar to the user's query. The system will dynamically adjust the number of fragments based on the context window size of the selected model. The default value is 3.

Score Threshold: Used to set the similarity threshold for filtering text fragments, i.e., only recalling text fragments that exceed the set score. The system's default is to turn off this setting, meaning it does not filter the similarity values of recalled text fragments. When enabled, the default value is 0.5.

Rerank Model: After configuring the Rerank model's API key on the "Model Providers" page, you can enable the "Rerank Model" in the retrieval settings. The system will perform semantic re-ranking on the recalled document results after semantic retrieval to optimize the ranking results. When the Rerank model is set, the TopK and Score Threshold settings only take effect in the Rerank step.

Full-Text Search

Definition: Indexing all words in the document, allowing users to query any word and return text fragments containing those words.

TopK: Used to filter the text fragments most similar to the user's query. The system will dynamically adjust the number of fragments based on the context window size of the selected model. The default value is 3.

Rerank Model: After configuring the Rerank model's API key on the "Model Providers" page, you can enable the "Rerank Model" in the retrieval settings. The system will perform semantic re-ranking on the recalled document results after full-text retrieval to optimize the ranking results. When the Rerank model is set, the TopK and Score Threshold settings only take effect in the Rerank step.

Hybrid Search

Simultaneously performs full-text search and vector search, applying a re-ranking step to select the best results matching the user's query from both types of query results. Requires configuring the Rerank model API.

TopK: Used to filter the text fragments most similar to the user's query. The system will dynamically adjust the number of fragments based on the context window size of the selected model. The default value is 3.

Rerank Model: After configuring the Rerank model's API key on the "Model Providers" page, you can enable the "Rerank Model" in the retrieval settings. The system will perform semantic re-ranking on the recalled document results after hybrid retrieval to optimize the ranking results. When the Rerank model is set, the TopK and Score Threshold settings only take effect in the Rerank step.

Setting Retrieval Mode When Creating a Dataset

Set different retrieval modes by entering the "Dataset -> Create Dataset" page and configuring the retrieval settings.

Modifying Retrieval Mode in Dataset Settings

Modify the retrieval mode of an existing dataset by entering the "Dataset -> Select Dataset -> Settings" page.

Modifying Retrieval Mode in Prompt Arrangement

Modify the retrieval mode when creating an application by entering the "Prompt Arrangement -> Context -> Select Dataset -> Settings" page.

PreviousRetrieval-Augmented Generation (RAG)NextRe-ranking

Last updated 10 months ago

Hybrid Search
Vector Search Settings
Full-Text Search Settings
Hybrid Search Settings
Setting Retrieval Mode When Creating a Dataset
Modifying Retrieval Mode in Dataset Settings
Modifying Retrieval Mode in Prompt Arrangement