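Sets the cumulative probability threshold for token selection (often called nucleus sampling). At each step, the model considers only the smallest set of most probable tokens whose cumulative probability meets or exceeds top_p. For example, if top_p is set to 0.1, the model samples only from the most probable tokens that together account for 10% of the probability mass, which is not the same as the top 10% of tokens by count.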
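The selection rule can be sketched in a few lines of Python. This is a minimal illustration of the general technique, not the service's actual implementation; the function name `nucleus` and the toy probability table are assumptions made up for the example.

```python
def nucleus(probs, top_p):
    """Return the smallest set of tokens whose cumulative probability
    meets or exceeds top_p (hypothetical helper for illustration)."""
    # Rank tokens from most to least probable.
    ranked = sorted(probs.items(), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for token, p in ranked:
        kept.append(token)
        total += p
        if total >= top_p:
            break
    return kept

# Toy next-token distribution (made-up values that sum to 1).
probs = {"the": 0.5, "a": 0.3, "an": 0.15, "this": 0.05}

print(nucleus(probs, 0.1))  # ['the']
print(nucleus(probs, 0.7))  # ['the', 'a']
print(nucleus(probs, 0.9))  # ['the', 'a', 'an']
```

With top_p set to 0.1, a single token already covers the threshold, so only that token remains eligible even though the vocabulary has four entries.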
temperature
Controls the randomness of the model's output. A lower value concentrates probability on the most likely tokens, making the output more deterministic; a higher value spreads probability across more tokens, making the output more varied. For example, a temperature of 0.5 gives moderately varied output that still favors the most likely tokens.
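The effect of temperature can be illustrated with the common textbook formulation, in which raw token scores (logits) are divided by the temperature before being turned into probabilities. This is a sketch under that assumption; the function name `softmax_with_temperature` and the example logits are hypothetical, and the service may implement the setting differently.

```python
import math

def softmax_with_temperature(logits, temperature):
    """Convert logits to probabilities after scaling by temperature.
    Assumes temperature > 0 (a value of 0 would divide by zero here)."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]

logits = [2.0, 1.0, 0.0]  # made-up scores for three candidate tokens

cool = softmax_with_temperature(logits, 0.5)
warm = softmax_with_temperature(logits, 1.5)

# The top token receives more probability at the lower temperature.
print(cool[0] > warm[0])  # True
```

Lowering the temperature sharpens the distribution around the highest-scoring token, while raising it flattens the distribution so less likely tokens are sampled more often.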
max_tokens
Defines the maximum number of tokens that the model can generate in its response. The limit caps the response length; it does not make the model write more concisely, so a response that reaches the limit is cut off, possibly mid-sentence.
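The truncating behavior can be sketched as a generation loop that stops once the cap is reached. The function `generate`, the token list, and the "length" and "stop" labels are assumptions for illustration only; they are modeled loosely on how completion APIs commonly report why generation ended.

```python
def generate(token_stream, max_tokens):
    """Collect tokens until the stream ends or max_tokens is reached.
    Returns the tokens and a hypothetical reason label."""
    out = []
    for token in token_stream:
        if len(out) >= max_tokens:
            # More tokens were available, so the response is truncated.
            return out, "length"
        out.append(token)
    # The stream ended on its own before hitting the cap.
    return out, "stop"

tokens = ["Paris", "is", "the", "capital", "of", "France", "."]

print(generate(tokens, 3))   # (['Paris', 'is', 'the'], 'length')
print(generate(tokens, 10))  # (['Paris', 'is', 'the', 'capital', 'of', 'France', '.'], 'stop')
```

In the first call the output stops after three tokens regardless of whether the sentence is complete, which is why a limit that is too low produces cut-off answers.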
For more information about these properties, see the Azure OpenAI documentation.