# Inference Config

## Table of contents

- Parameters Reference
- Level 1 — Global default (LMConfig)
- Level 2 — Per-agent default (AgentOptions)
- Level 3 — Per-request override
- Precedence
- Recommended Presets
`InferenceConfig` groups the sampling and penalty parameters that control how the model generates text. It follows the same three-level override pattern as Reasoning Control. Only non-null fields are forwarded to the server — unset fields fall through to the next level.
## Parameters Reference
| Field | Type | Description |
|---|---|---|
| `Temperature` | `double?` | Sampling temperature. Higher values produce more varied output. |
| `TopP` | `double?` | Top-p (nucleus) sampling cutoff. |
| `TopK` | `int?` | Top-k — limits the candidate token pool. |
| `MinP` | `double?` | Minimum probability threshold for token selection. |
| `PresencePenalty` | `double?` | Penalises tokens that have already appeared in the output. |
| `RepetitionPenalty` | `double?` | Multiplier applied to the logits of previously seen tokens. |
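Because every field is nullable, a config can set only the parameters you care about — the rest stay `null` and are simply not sent, falling through to the next level. For example:

```csharp
// Only Temperature and TopK are forwarded; the other fields stay null
// and fall through to the agent/global defaults (or the model's own).
var partial = new InferenceConfig
{
    Temperature = 0.3,
    TopK = 40,
};
```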
## Level 1 — Global default (LMConfig)
```csharp
var lm = new OpenAIBackend(new LMConfig
{
    Endpoint = "http://localhost:1234",
    ModelName = "your-model-name",
    Inference = new InferenceConfig
    {
        Temperature = 0.7,
        TopP = 0.8,
        TopK = 20,
        MinP = 0.0,
        PresencePenalty = 1.5,
        RepetitionPenalty = 1.0,
    },
});
```
## Level 2 — Per-agent default (AgentOptions)
```csharp
var agent = new Agent(lm, new AgentOptions
{
    SystemPrompt = "You are a creative writer.",
    Inference = new InferenceConfig
    {
        Temperature = 1.2,
        RepetitionPenalty = 1.1,
    },
});
```
## Level 3 — Per-request override
Pass `inference:` to any `Run*` / `Chat*` call:
```csharp
// Deterministic output for structured extraction
var result = await agent.ChatStreamAsync(
    "Extract the invoice number from this text.",
    inference: new InferenceConfig { Temperature = 0.0 });

// Creative variation for brainstorming
var ideas = await agent.ChatStreamAsync(
    "Give me 10 creative names for my product.",
    inference: new InferenceConfig { Temperature = 1.4, TopP = 0.95 });
```
## Precedence
```
per-request inference:
  → AgentOptions.Inference  (field-by-field merge)
  → LMConfig.Inference      (field-by-field merge)
  → not sent (model default)
```
Fields are merged individually. For example, if `LMConfig` sets `Temperature = 0.7` and `AgentOptions` sets only `RepetitionPenalty = 1.1`, the effective config uses `Temperature = 0.7` and `RepetitionPenalty = 1.1`.
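The merge can be pictured as per-field null-coalescing from the most specific level down. A minimal sketch — the `Merge` helper and the `record` stand-in for `InferenceConfig` are illustrations, not the library's actual API:

```csharp
// Minimal stand-in for the library's InferenceConfig (illustration only).
record InferenceConfig
{
    public double? Temperature { get; init; }
    public double? TopP { get; init; }
    public int? TopK { get; init; }
    public double? MinP { get; init; }
    public double? PresencePenalty { get; init; }
    public double? RepetitionPenalty { get; init; }
}

static class InferenceMerge
{
    // Hypothetical per-field merge: the most specific non-null value wins;
    // a field that is null at every level is simply not sent.
    public static InferenceConfig Merge(
        InferenceConfig? request, InferenceConfig? agent, InferenceConfig? global)
        => new()
        {
            Temperature       = request?.Temperature       ?? agent?.Temperature       ?? global?.Temperature,
            TopP              = request?.TopP              ?? agent?.TopP              ?? global?.TopP,
            TopK              = request?.TopK              ?? agent?.TopK              ?? global?.TopK,
            MinP              = request?.MinP              ?? agent?.MinP              ?? global?.MinP,
            PresencePenalty   = request?.PresencePenalty   ?? agent?.PresencePenalty   ?? global?.PresencePenalty,
            RepetitionPenalty = request?.RepetitionPenalty ?? agent?.RepetitionPenalty ?? global?.RepetitionPenalty,
        };
}
```

With the worked example above, `Merge(null, new() { RepetitionPenalty = 1.1 }, new() { Temperature = 0.7 })` yields `Temperature = 0.7` and `RepetitionPenalty = 1.1`.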
## Recommended Presets
**Deterministic / structured extraction**

```csharp
new InferenceConfig { Temperature = 0.0, TopP = 1.0 }
```

**Balanced (general purpose)**

```csharp
new InferenceConfig { Temperature = 0.7, TopP = 0.9 }
```

**Creative writing**

```csharp
new InferenceConfig { Temperature = 1.2, TopP = 0.95, RepetitionPenalty = 1.1 }
```

**Code generation**

```csharp
new InferenceConfig { Temperature = 0.2, TopP = 0.95, TopK = 40 }
```
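Presets like these can be kept as shared constants and passed per request, so call sites stay short. A small sketch — the `Presets` class name is an assumption, not part of the library:

```csharp
// Hypothetical holder for the presets above (names are illustrative).
static class Presets
{
    public static readonly InferenceConfig Deterministic =
        new() { Temperature = 0.0, TopP = 1.0 };

    public static readonly InferenceConfig Balanced =
        new() { Temperature = 0.7, TopP = 0.9 };

    public static readonly InferenceConfig Creative =
        new() { Temperature = 1.2, TopP = 0.95, RepetitionPenalty = 1.1 };

    public static readonly InferenceConfig CodeGen =
        new() { Temperature = 0.2, TopP = 0.95, TopK = 40 };
}

// Usage: pass a preset as the per-request override.
// var answer = await agent.ChatStreamAsync(
//     "Extract the invoice number from this text.",
//     inference: Presets.Deterministic);
```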