Interface OllamaParameters<Self extends OllamaParameters<Self>>

All Known Implementing Classes:
OllamaChatRequest, OllamaCreateModelRequest, OllamaEmbedRequest, OllamaGenerateRequest, OllamaRequest.WithOptions

public interface OllamaParameters<Self extends OllamaParameters<Self>>
  • Method Details

    • numGpu

      default Integer numGpu()
    • numGpu

      default Self numGpu(Integer numGpu)
      Indicates to llama.cpp how many GPUs are available. A value of 0 disables GPU use for the request, and a value greater than 1 can be used to force llama.cpp to allocate more VRAM. This is useful if ollama is offloading fewer layers to the GPU than possible, but it can cause CUDA out-of-memory (OOM) errors.
    • mirostat

      default Integer mirostat()
    • mirostat

      default Self mirostat(Integer mirostat)
      Enables Mirostat sampling for controlling perplexity. (Default: 0; 0 = disabled, 1 = Mirostat, 2 = Mirostat 2.0)
    • mirostatEta

      default Double mirostatEta()
    • mirostatEta

      default Self mirostatEta(Double mirostatEta)
      Influences how quickly the Mirostat algorithm responds to feedback from the generated text. The value acts as a learning rate: a lower value results in slower adjustments, while a higher value makes the algorithm more responsive. (Default: 0.1)
    • mirostatTau

      default Double mirostatTau()
    • mirostatTau

      default Self mirostatTau(Double mirostatTau)
      Controls the balance between coherence and diversity of the output. A lower value will result in more focused and coherent text. (Default: 5.0)
    • numCtx

      default Integer numCtx()
    • numCtx

      default Self numCtx(Integer numCtx)
      Sets the size of the context window used to generate the next token. (Default: 2048)
    • repeatLastN

      default Integer repeatLastN()
    • repeatLastN

      default Self repeatLastN(Integer repeatLastN)
      Sets how far back the model looks to prevent repetition. (Default: 64, 0 = disabled, -1 = num_ctx)
    • repeatPenalty

      default Double repeatPenalty()
    • repeatPenalty

      default Self repeatPenalty(Double repeatPenalty)
      Sets how strongly to penalize repetitions. A higher value (e.g., 1.5) will penalize repetitions more strongly, while a lower value (e.g., 0.9) will be more lenient. (Default: 1.1)
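To illustrate how repeatLastN and repeatPenalty interact, here is a minimal sketch of one common formulation of a repetition penalty. It is an assumption for illustration, not necessarily what llama.cpp does internally: the class and method names are hypothetical, and treating a negative repeatLastN as "the whole history" stands in for the -1 = num_ctx setting.

```java
class RepeatPenaltySketch {
    // Penalizes the logits of tokens that appear in the last repeatLastN entries of history.
    // Positive logits are divided by the penalty, non-positive ones are multiplied by it,
    // so a penalty above 1.0 always makes a repeated token less likely.
    static double[] penalize(double[] logits, int[] history, int repeatLastN, double repeatPenalty) {
        double[] out = logits.clone();
        // Assumption: a negative window means "look at everything we have"; 0 disables the penalty.
        int window = repeatLastN < 0 ? history.length : Math.min(repeatLastN, history.length);
        boolean[] seen = new boolean[logits.length];
        for (int i = history.length - window; i < history.length; i++) {
            seen[history[i]] = true;
        }
        for (int token = 0; token < out.length; token++) {
            if (!seen[token]) {
                continue;
            }
            out[token] = out[token] > 0 ? out[token] / repeatPenalty : out[token] * repeatPenalty;
        }
        return out;
    }

    public static void main(String[] args) {
        double[] logits = {2.0, -1.0, 0.5};
        int[] history = {0, 1}; // token ids generated so far
        double[] penalized = penalize(logits, history, 64, 2.0);
        System.out.println(java.util.Arrays.toString(penalized)); // [1.0, -2.0, 0.5]
    }
}
```

Token 2 never appeared in the history, so its logit is unchanged; with repeatLastN set to 0 the window is empty and no logit is touched.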
    • temperature

      default Double temperature()
    • temperature

      default Self temperature(Double temperature)
      The temperature of the model. Increasing the temperature will make the model answer more creatively. (Default: 0.8)
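As a rough illustration of why a higher temperature gives more varied answers, the sketch below applies the standard temperature-scaled softmax to a small set of logits. The class name and the example logits are hypothetical, and the code is not taken from Ollama or llama.cpp.

```java
class TemperatureSketch {
    // Converts logits to probabilities after dividing each logit by the temperature.
    static double[] softmax(double[] logits, double temperature) {
        double max = Double.NEGATIVE_INFINITY;
        for (double logit : logits) {
            max = Math.max(max, logit);
        }
        double[] probabilities = new double[logits.length];
        double sum = 0.0;
        for (int i = 0; i < logits.length; i++) {
            // Subtracting the maximum keeps Math.exp from overflowing.
            probabilities[i] = Math.exp((logits[i] - max) / temperature);
            sum += probabilities[i];
        }
        for (int i = 0; i < probabilities.length; i++) {
            probabilities[i] /= sum;
        }
        return probabilities;
    }

    public static void main(String[] args) {
        double[] logits = {2.0, 1.0, 0.0};
        // The most likely token dominates at a low temperature and loses share at a high one.
        System.out.println("T=0.5 top probability: " + softmax(logits, 0.5)[0]);
        System.out.println("T=1.5 top probability: " + softmax(logits, 1.5)[0]);
    }
}
```

At a lower temperature the distribution sharpens around the most likely token; at a higher one probability mass shifts toward less likely tokens, which is what makes the output read as more creative.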
    • seed

      default Integer seed()
    • seed

      default Self seed(Integer seed)
      Sets the random number seed to use for generation. Setting this to a specific number will make the model generate the same text for the same prompt. (Default: 0)
    • stop

      default List<String> stop()
    • stop

      default Self stop(Collection<String> stop)
      Sets the stop sequences to use. When a stop sequence is encountered, the LLM stops generating text and returns. Multiple stop patterns may be set by specifying multiple separate stop parameters in a modelfile.
    • stop

      default Self stop(String... stop)
      Sets the stop sequences to use. When a stop sequence is encountered, the LLM stops generating text and returns. Multiple stop patterns may be set by specifying multiple separate stop parameters in a modelfile.
    • tfsZ

      default Double tfsZ()
    • tfsZ

      default Self tfsZ(Double tfsZ)
      Tail free sampling is used to reduce the impact of less probable tokens on the output. A higher value (e.g., 2.0) will reduce the impact more, while a value of 1.0 disables this setting. (Default: 1)
    • numPredict

      default Integer numPredict()
    • numPredict

      default Self numPredict(Integer numPredict)
      Maximum number of tokens to predict when generating text. (Default: 128, -1 = infinite generation, -2 = fill context)
    • topK

      default Integer topK()
    • topK

      default Self topK(Integer topK)
      Reduces the probability of generating nonsense. A higher value (e.g. 100) will give more diverse answers, while a lower value (e.g. 10) will be more conservative. (Default: 40)
    • topP

      default Double topP()
    • topP

      default Self topP(Double topP)
      Works together with top-k. A higher value (e.g., 0.95) will lead to more diverse text, while a lower value (e.g., 0.5) will generate more focused and conservative text. (Default: 0.9)
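The way topK and topP can work together is sketched below: candidates are ranked by probability, at most topK are kept, and the list is cut as soon as the cumulative probability reaches topP. This is a simplified illustration under those assumptions, with hypothetical names, not the sampler Ollama actually runs.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;

class TopKTopPSketch {
    // Returns the indices of the tokens that survive top-k followed by top-p filtering.
    static List<Integer> candidates(double[] probabilities, int topK, double topP) {
        List<Integer> order = new ArrayList<>();
        for (int i = 0; i < probabilities.length; i++) {
            order.add(i);
        }
        // Most probable token first.
        order.sort(Comparator.comparingDouble((Integer i) -> probabilities[i]).reversed());

        List<Integer> kept = new ArrayList<>();
        double cumulative = 0.0;
        for (int index : order) {
            if (kept.size() >= topK) {
                break; // top-k limit reached
            }
            kept.add(index);
            cumulative += probabilities[index];
            if (cumulative >= topP) {
                break; // top-p mass reached
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        double[] probabilities = {0.5, 0.3, 0.15, 0.05};
        System.out.println(candidates(probabilities, 40, 0.9)); // [0, 1, 2]
        System.out.println(candidates(probabilities, 2, 0.9));  // [0, 1]
    }
}
```

Lowering either value shrinks the candidate set, which matches the "more conservative" behaviour described for both parameters.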
    • minP

      default Double minP()
    • minP

      default Self minP(Double minP)
      An alternative to top_p that aims to ensure a balance of quality and variety. The parameter p represents the minimum probability for a token to be considered, relative to the probability of the most likely token. For example, with p=0.05 and the most likely token having a probability of 0.9, tokens with a probability below 0.05 × 0.9 = 0.045 are filtered out. (Default: 0.0)
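The min_p rule described above can be worked through in a few lines. The sketch below uses hypothetical names and example probabilities: it computes the cutoff as minP times the highest probability and keeps only the tokens at or above it.

```java
import java.util.ArrayList;
import java.util.List;

class MinPSketch {
    // Returns the indices of tokens whose probability is at least minP * (highest probability).
    static List<Integer> keptIndices(double[] probabilities, double minP) {
        double max = 0.0;
        for (double p : probabilities) {
            max = Math.max(max, p);
        }
        double threshold = minP * max;
        List<Integer> kept = new ArrayList<>();
        for (int i = 0; i < probabilities.length; i++) {
            if (probabilities[i] >= threshold) {
                kept.add(i);
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        // The most likely token has probability 0.9, so with minP = 0.05 the cutoff is about 0.045.
        double[] probabilities = {0.9, 0.05, 0.04, 0.01};
        System.out.println(keptIndices(probabilities, 0.05)); // [0, 1]
    }
}
```

With the default of 0.0 the cutoff is zero, so no token is filtered out.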
    • parametersMap

      Map<String,Object> parametersMap()
    • parameter

      default <T> T parameter(String name)
    • parameter

      default Self parameter(String name, Object value)
    • self

      default Self self()
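The signatures on this page suggest a self-typed fluent design: each setter returns Self so calls can be chained on the concrete request type, and the typed accessors presumably read and write the map returned by parametersMap(). The sketch below re-creates that pattern with stand-in types so that it runs on its own. SketchParameters, SketchRequest, and the map keys "temperature" and "num_ctx" are assumptions for illustration, not the library's actual implementation.

```java
import java.util.LinkedHashMap;
import java.util.Map;

class FluentParametersDemo {
    public static void main(String[] args) {
        // Each setter returns SketchRequest, so the calls chain without casts.
        SketchRequest request = new SketchRequest().temperature(0.2).numCtx(4096);
        System.out.println(request.parametersMap()); // {temperature=0.2, num_ctx=4096}
        System.out.println(request.temperature());   // 0.2
    }
}

// Hypothetical stand-in for OllamaParameters; only the method names mirror this page.
interface SketchParameters<Self extends SketchParameters<Self>> {
    Map<String, Object> parametersMap();

    @SuppressWarnings("unchecked")
    default Self self() {
        return (Self) this;
    }

    @SuppressWarnings("unchecked")
    default <T> T parameter(String name) {
        return (T) parametersMap().get(name);
    }

    default Self parameter(String name, Object value) {
        parametersMap().put(name, value);
        return self();
    }

    // Typed accessors delegate to the generic ones; the key names are assumed.
    default Double temperature() {
        return parameter("temperature");
    }

    default Self temperature(Double temperature) {
        return parameter("temperature", temperature);
    }

    default Integer numCtx() {
        return parameter("num_ctx");
    }

    default Self numCtx(Integer numCtx) {
        return parameter("num_ctx", numCtx);
    }
}

// Hypothetical request type: the only thing it must supply is the backing map.
final class SketchRequest implements SketchParameters<SketchRequest> {
    private final Map<String, Object> parameters = new LinkedHashMap<>();

    @Override
    public Map<String, Object> parametersMap() {
        return parameters;
    }
}
```

In this arrangement an unset parameter reads back as null, which would explain why the accessors on this page use boxed types such as Integer and Double rather than primitives.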