For the GenerationConfig_ResponseLLM field, enter values for the following properties:

| Property | Description |
| --- | --- |
| temperature | Controls the randomness of the model's output. A lower value close to 0 makes the output more deterministic, while a higher value close to 1 increases randomness and creativity. For example, if temperature is set to 0.5, the model balances between deterministic and creative outputs. |
| topP | Determines the cumulative probability threshold for token selection. The model considers the smallest set of tokens whose cumulative probability meets or exceeds topP. For example, if topP is set to 0.1, the model considers only the top 10% most probable tokens at each step. |
| max_tokens | Defines the maximum number of tokens the model can generate in its response. The token count of your prompt plus max_tokens can't exceed the model's context length. Most models have a context length of 2048 tokens. |
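Taken together, these properties form a small JSON configuration. The following is an illustrative sketch only; the values are placeholders, and the exact syntax expected by the Expression Editor may differ:

```json
{
  "temperature": 0.5,
  "topP": 0.1,
  "max_tokens": 1024
}
```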
- In the Deployment_ID field, enter the name of the model deployment. You must first deploy a model before you can make calls.
- In the API_Version field, enter the API version to use for this operation. The API version must use the YYYY-MM-DD or YYYY-MM-DD-preview format. For example, 2024-02-15-preview.
- In the Evaluation_Instruction field, update the instruction for the second LLM. By default, it contains an example behavior for the second LLM, which evaluates the response from the first LLM. You can customize the criteria and descriptions if required. The response from the LLM must be in a valid JSON format for further processing and output.
- In the GenerationConfig_EvaluationLLM field, enter the prompt instructions using the Expression Editor, as shown in the following sample code:
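For example, a minimal configuration could look like the following sketch. The values shown are placeholders; the properties are described in the table that follows:

```json
{
  "temperature": 0.2,
  "topP": 0.1,
  "maxOutputTokens": 512
}
```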
For the GenerationConfig_EvaluationLLM field, enter values for the following properties:

| Property | Description |
| --- | --- |
| temperature | Controls the randomness of the model's output. A lower value close to 0 makes the output more deterministic, while a higher value close to 1 increases randomness and creativity. For example, if temperature is set to 0.5, the model balances between deterministic and creative outputs. |
| topP | Determines the cumulative probability threshold for token selection. The model considers the smallest set of tokens whose cumulative probability meets or exceeds topP. For example, if topP is set to 0.1, the model considers only the top 10% most probable tokens at each step. |
| maxOutputTokens | Defines the maximum number of tokens the model can generate in its response. Setting a limit ensures that the response is concise and fits within the desired length constraints. |
- In the ModelID_EvaluationLLM field, enter the ID of the model that evaluates the response from the first LLM. For example, gemini-1.5-pro.
- In the Retry field, you can change the retry value to increase the number of attempts made to call the LLM if an error occurs.