LlamaChat 🦙🦙🦙

LlamaChat is a C++ library for running language models on top of the llama.cpp framework. It provides a simple interface for loading models, querying them, and streaming responses in C++ applications.

Supported Systems:

  • macOS
  • Windows
  • Linux

Installation

Add LlamaChat as a Submodule

First, add this library as a submodule in your project:

$ git submodule add https://github.com/developer239/llama-chat externals/llama-chat

Then initialize and fetch the module's dependencies:

$ git submodule update --init --recursive

Update Your CMake

In your project's CMakeLists.txt, add the following lines to include and link the LlamaChat library:

add_subdirectory(externals/llama-chat)
target_link_libraries(<your_target> PRIVATE LlamaChat)
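
A minimal consumer CMakeLists.txt might look like the sketch below (the project name, target name, and C++ standard are illustrative, not prescribed by the library):

cmake_minimum_required(VERSION 3.20)
project(my_app LANGUAGES CXX)

set(CMAKE_CXX_STANDARD 17)

# Build LlamaChat (and its bundled dependencies) from the submodule
add_subdirectory(externals/llama-chat)

add_executable(my_app src/main.cpp)
target_link_libraries(my_app PRIVATE LlamaChat)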

Usage

Basic Usage

To use the LlamaChat library, include the header and create an instance of the LlamaChat class. You can initialize the model and context separately, then run queries or stream responses.

#include "llama-chat.h"
#include <iostream>

int main() {
    LlamaChat llama;
    
    ModelParams modelParams;
    modelParams.nGpuLayers = 32;  // Adjust based on your GPU capabilities
    
    if (!llama.InitializeModel("path/to/model", modelParams)) {
        std::cerr << "Failed to initialize the model." << std::endl;
        return 1;
    }
    
    ContextParams ctxParams;
    ctxParams.nContext = 2048;
    
    if (!llama.InitializeContext(ctxParams)) {
        std::cerr << "Failed to initialize the context." << std::endl;
        return 1;
    }

    std::string systemPrompt = "You are a helpful AI assistant.";
    llama.SetSystemPrompt(systemPrompt);

    std::string userMessage = "How do I write hello world in C++?";

    llama.Prompt(userMessage, [](const std::string& piece) {
        std::cout << piece << std::flush;
    });

    return 0;
}

Streaming Responses

The Prompt method streams the response by invoking the callback you provide with each generated piece as soon as it is available. This is useful for displaying long outputs incrementally instead of waiting for the full completion.
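
For example, the callback can both print each piece as it arrives and accumulate the full reply for later use. This is a minimal sketch building on the basic example above; the prompt text is illustrative:

std::string fullResponse;

llama.Prompt("Explain RAII in one paragraph.", [&fullResponse](const std::string& piece) {
    std::cout << piece << std::flush;  // print tokens as they arrive
    fullResponse += piece;             // keep the complete reply
});

std::cout << "\nGenerated " << fullResponse.size() << " characters." << std::endl;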

API Reference

LlamaChat Class

The LlamaChat class provides methods to interact with language models loaded through llama.cpp.

Public Methods

  • LlamaChat(): Constructor. Initializes the LlamaChat object.
  • ~LlamaChat(): Destructor. Cleans up resources.
  • bool InitializeModel(const std::string& modelPath, const ModelParams& params): Initializes the model with the specified path and parameters.
  • bool InitializeContext(const ContextParams& params): Initializes the context with the specified parameters.
  • void SetSystemPrompt(const std::string& systemPrompt): Sets the system prompt for the conversation.
  • void ResetConversation(): Resets the conversation history.
  • void Prompt(const std::string& userMessage, const std::function<void(const std::string&)>& callback): Processes the user message and streams the response, invoking the callback function with each piece of the response (see the sketch after this list).
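
A sketch of a multi-turn exchange using these methods, assuming Prompt appends each exchange to the internal conversation history (which is what ResetConversation implies):

llama.SetSystemPrompt("You are a concise coding assistant.");

// First turn
llama.Prompt("What is a lambda in C++?", [](const std::string& piece) {
    std::cout << piece << std::flush;
});
std::cout << std::endl;

// Second turn: the previous exchange is part of the context
llama.Prompt("Show a short example of one.", [](const std::string& piece) {
    std::cout << piece << std::flush;
});
std::cout << std::endl;

// Discard the history to start a fresh conversation
llama.ResetConversation();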

Structs

  • LlamaToken: Represents a token in the model's vocabulary.

    • tokenId (int): The unique identifier of the token.
  • ModelParams: Parameters for model initialization.

    • nGpuLayers (int): Number of layers to offload to GPU. Set to 0 for CPU-only.
    • vocabularyOnly (bool): Load only the vocabulary, not the model weights.
    • useMemoryMapping (bool): Use memory mapping for faster loading.
    • useModelLock (bool): Force the system to keep the model in RAM.
  • ContextParams: Parameters for context initialization.

    • nContext (size_t): Size of the context window (in tokens).
    • nThreads (int): Number of threads to use for computation.
    • nBatch (int): Number of tokens to process in parallel.
  • SamplingParams: Parameters for text generation sampling (see the configuration sketch after this list).

    • maxTokens (size_t): Maximum number of tokens to generate.
    • temperature (float): Controls randomness in generation.
    • topK (int32_t): Limits sampling to the k most likely tokens.
    • topP (float): Limits sampling to a cumulative probability.
    • repeatPenalty (float): Penalty for repeating tokens.
    • frequencyPenalty (float): Penalty based on token frequency in generated text.
    • presencePenalty (float): Penalty for tokens already present in generated text.
    • repeatPenaltyTokens (std::vector): Recent tokens considered when applying the repeat penalty.
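
As a rough configuration sketch, the structs can be filled in before initialization. All values below are illustrative, and the method list above does not show where SamplingParams is passed, so treat its use as an assumption:

ModelParams modelParams;
modelParams.nGpuLayers = 0;           // CPU-only; raise to offload layers to the GPU
modelParams.useMemoryMapping = true;  // memory-map the weights for faster loading

ContextParams ctxParams;
ctxParams.nContext = 4096;  // context window in tokens
ctxParams.nThreads = 8;     // computation threads
ctxParams.nBatch = 512;     // tokens processed in parallel

SamplingParams samplingParams;
samplingParams.maxTokens = 256;     // cap on generated tokens
samplingParams.temperature = 0.7f;  // lower values give more deterministic output
samplingParams.topK = 40;
samplingParams.topP = 0.9f;
samplingParams.repeatPenalty = 1.1f;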
