Predefined Model Integration
After completing the supplier integration, the next step is to integrate the models under the supplier.
First, we need to determine the type of model to be integrated and create the corresponding model type module in the directory of the respective supplier.
The currently supported model types are as follows:
llm: Text Generation Model
text_embedding: Text Embedding Model
rerank: Rerank Model
speech2text: Speech to Text
tts: Text to Speech
moderation: Moderation
Taking Anthropic as an example, Anthropic only supports LLM, so we create a module named llm in model_providers.anthropic.
For predefined models, we first need to create a YAML file named after the model under the llm module, such as: claude-2.1.yaml.
Preparing the Model YAML
model: claude-2.1  # Model identifier
# Model display name; can be set in en_US (English) and zh_Hans (Chinese). If zh_Hans is not set, it defaults to en_US.
# The label can also be omitted, in which case the model identifier is used.
label:
  en_US: claude-2.1
model_type: llm  # Model type; claude-2.1 is an LLM
features:  # Supported features: agent-thought supports Agent reasoning, vision supports image understanding
- agent-thought
model_properties:  # Model properties
  mode: chat  # LLM mode: complete for text completion models, chat for dialogue models
  context_size: 200000  # Maximum supported context size
parameter_rules:  # Model invocation parameter rules; only LLMs need to provide them
- name: temperature  # Invocation parameter variable name
  # There are 5 preset variable configuration templates: temperature/top_p/max_tokens/presence_penalty/frequency_penalty
  # Set the template name in use_template to use the default configuration from entities.defaults.PARAMETER_RULE_TEMPLATE
  # Any additional configuration parameters set here override the default configuration
  use_template: temperature
- name: top_p
  use_template: top_p
- name: top_k
  label:  # Invocation parameter display name
    zh_Hans: 取样数量
    en_US: Top k
  type: int  # Parameter type; supports float/int/string/boolean
  help:  # Help information describing the parameter's function
    zh_Hans: 仅从每个后续标记的前 K 个选项中采样。
    en_US: Only sample from the top K options for each subsequent token.
  required: false  # Whether the parameter is required; can be omitted
- name: max_tokens_to_sample
  use_template: max_tokens
  default: 4096  # Default parameter value
  min: 1  # Minimum parameter value; only applicable to float/int
  max: 4096  # Maximum parameter value; only applicable to float/int
pricing:  # Pricing information
  input: '8.00'  # Input unit price, i.e., the prompt unit price
  output: '24.00'  # Output unit price, i.e., the unit price of returned content
  unit: '0.000001'  # Price unit; with this unit, the prices above are per 1M tokens
  currency: USD  # Price currency
It is recommended to prepare all model configurations before starting the implementation of the model code.
Similarly, you can refer to the YAML configuration information in the directories of other suppliers under the model_providers directory. The complete YAML rules can be found in: Schema.
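Two behaviours described in the YAML comments can be illustrated with a minimal Python sketch: a rule's own keys overriding the template named in use_template, and the pricing unit scaling the unit price. The template dictionary below is a hypothetical stand-in for entities.defaults.PARAMETER_RULE_TEMPLATE (its real values may differ), and the cost formula of tokens times unit price times unit is an assumption about how the pricing fields combine.

```python
from decimal import Decimal

# Hypothetical stand-in for entities.defaults.PARAMETER_RULE_TEMPLATE;
# the real template values may differ.
PARAMETER_RULE_TEMPLATE = {
    "max_tokens": {"type": "int", "default": 64, "min": 1, "max": 2048, "required": False},
}


def resolve_rule(rule: dict) -> dict:
    """Merge a parameter rule over the template it names, if any."""
    template = PARAMETER_RULE_TEMPLATE.get(rule.get("use_template"), {})
    # Keys set explicitly in the YAML rule win over the template defaults.
    return {**template, **rule}


def estimate_cost(tokens: int, unit_price: str, unit: str) -> Decimal:
    """Assumed formula: tokens * unit price * price unit, using Decimal to avoid float rounding."""
    return Decimal(tokens) * Decimal(unit_price) * Decimal(unit)


# The max_tokens_to_sample rule from claude-2.1.yaml, expressed as a dict.
rule = resolve_rule({
    "name": "max_tokens_to_sample",
    "use_template": "max_tokens",
    "default": 4096,
    "min": 1,
    "max": 4096,
})

# One million input tokens at input: '8.00' with unit: '0.000001'.
cost = estimate_cost(1_000_000, "8.00", "0.000001")
```

Under these assumptions, the resolved rule keeps the template's type and required fields but takes default, min, and max from the YAML, and one million input tokens cost 8 USD.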
Implementing Model Invocation Code
Next, create a Python file named llm.py (matching the module name) under the llm module to hold the implementation code.
Create an Anthropic LLM class in llm.py, which we will name AnthropicLargeLanguageModel (name can be arbitrary), inheriting from the __base.large_language_model.LargeLanguageModel base class, and implement the following methods:
LLM Invocation
Implement the core method for LLM invocation, supporting both streaming and synchronous responses.
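As a rough illustration of serving both response styles from one entry point, the sketch below keeps yield out of the dispatching function and delegates to two helpers. All names and the single prompt parameter are hypothetical simplifications, not the actual method signature, and the fake chunk source stands in for a real provider call.

```python
from collections.abc import Generator
from typing import Union


def _fake_chunks(prompt: str) -> list[str]:
    # Hypothetical stand-in for a provider SDK call; echoes the prompt word by word.
    return prompt.split()


def _handle_sync_response(prompt: str) -> str:
    # Ordinary function: returns the complete text in one piece.
    return " ".join(_fake_chunks(prompt))


def _handle_stream_response(prompt: str) -> Generator[str, None, None]:
    # Generator function: the yield below means every call returns a generator.
    for chunk in _fake_chunks(prompt):
        yield chunk


def invoke(prompt: str, stream: bool = True) -> Union[str, Generator[str, None, None]]:
    # No yield appears here, so this stays an ordinary function that can
    # return either a full result or a generator depending on the flag.
    if stream:
        return _handle_stream_response(prompt)
    return _handle_sync_response(prompt)
```

If the yield were written directly inside invoke, calling it with stream=False would still return a generator, which is why the two paths live in separate functions.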
When implementing, use two separate functions to return data: one for synchronous responses and one for streaming responses. Because Python treats any function containing the yield keyword as a generator function, whose return type is always Generator, the synchronous and streaming paths need to be implemented separately. Simplified examples of this pattern omit most parameters; the actual implementation should follow the method's full parameter list.
Precompute Input Tokens
If the model does not provide a precompute tokens interface, return 0 directly.
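A minimal sketch of this fallback, assuming a simplified signature: the counter argument is a hypothetical hook representing a provider's token-counting interface, not part of the real method.

```python
from typing import Callable, Optional


def get_num_tokens(text: str, counter: Optional[Callable[[str], int]] = None) -> int:
    # counter is a hypothetical hook for a provider's token-counting interface.
    if counter is None:
        # No precompute interface is available, so return 0 directly.
        return 0
    return counter(text)
```

A caller with access to a tokenizer would pass it as counter; without one, the function reports 0 rather than guessing.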
Model Credentials Validation
Similar to supplier credentials validation, this validates the credentials for a single model.
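One common way to structure such a check is to send a minimal request and convert any failure into a single validation error. The sketch below assumes that approach; the exception class, the _ping_model helper, and the api_key credential name are all hypothetical stand-ins rather than the runtime's actual API.

```python
class CredentialsValidateFailedError(Exception):
    """Hypothetical stand-in for the runtime's credential validation error."""


def _ping_model(model: str, credentials: dict) -> None:
    # Hypothetical minimal request; a real implementation would call the provider API.
    if not credentials.get("api_key"):
        raise PermissionError("missing api_key")


def validate_credentials(model: str, credentials: dict) -> None:
    try:
        _ping_model(model, credentials)
    except Exception as ex:
        # Surface any failure as one validation error type, keeping the cause attached.
        raise CredentialsValidateFailedError(str(ex)) from ex
```

Returning None on success and raising on failure keeps the caller's logic simple: no exception means the credentials work for that model.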
Invocation Error Mapping Table
When a model invocation error occurs, it needs to be mapped to the InvokeError type specified by Runtime, so that Dify can handle different errors differently.
Runtime Errors:
InvokeConnectionError: Invocation connection error
InvokeServerUnavailableError: Invocation service unavailable
InvokeRateLimitError: Invocation rate limit reached
InvokeAuthorizationError: Invocation authorization failed
InvokeBadRequestError: Invocation parameter error
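A mapping table of this kind can be sketched as a dictionary from each runtime error type to the provider exceptions it covers. The InvokeError classes below are local stand-ins that reuse the names listed above, and the provider-side exceptions (two hypothetical classes plus Python built-ins) are placeholders for whatever the provider SDK actually raises.

```python
class InvokeError(Exception):
    """Stand-in base class for the runtime's invocation errors."""

class InvokeConnectionError(InvokeError): pass
class InvokeServerUnavailableError(InvokeError): pass
class InvokeRateLimitError(InvokeError): pass
class InvokeAuthorizationError(InvokeError): pass
class InvokeBadRequestError(InvokeError): pass


# Hypothetical provider-side exceptions used only for illustration.
class ProviderOverloadedError(Exception): pass
class ProviderRateLimitedError(Exception): pass


# Key: unified runtime error type. Value: provider exceptions mapped to it.
INVOKE_ERROR_MAPPING: dict[type[InvokeError], list[type[Exception]]] = {
    InvokeConnectionError: [ConnectionError],
    InvokeServerUnavailableError: [ProviderOverloadedError],
    InvokeRateLimitError: [ProviderRateLimitedError],
    InvokeAuthorizationError: [PermissionError],
    InvokeBadRequestError: [ValueError],
}


def to_invoke_error(error: Exception) -> InvokeError:
    # Return the first runtime error whose provider exceptions match.
    for invoke_error, provider_errors in INVOKE_ERROR_MAPPING.items():
        if isinstance(error, tuple(provider_errors)):
            return invoke_error(str(error))
    # Anything unmapped falls back to the generic base error.
    return InvokeError(str(error))
```

Keeping the table as data rather than a chain of except clauses makes it easy to see at a glance which provider failures land in which runtime category.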
For interface method descriptions, see: Interfaces, and for specific implementation, refer to: llm.py.