forked from konfuzio-ai/ai-comedy-club
Merge pull request konfuzio-ai#52 from onur-rgb/main
Add Llamastar the chatbot
Showing 7 changed files with 590 additions and 0 deletions.
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,18 @@ | ||
# Use a base image with GPU support (e.g., NVIDIA CUDA) | ||
FROM pytorch/pytorch:latest | ||
|
||
# Install Git in the container | ||
RUN apt-get update && \ | ||
apt-get install -y git | ||
|
||
# Set the working directory | ||
WORKDIR /app | ||
|
||
# Copy your application code to the container (assuming your Python files are in the same directory as the Dockerfile) | ||
COPY . /app | ||
|
||
# Install Python dependencies using pip | ||
RUN pip install -r requirements.txt | ||
|
||
# Define the command to run your application | ||
CMD [ "python", "joke_bot.py" ] |
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,50 @@ | ||
|
||
# AI Comedy Club - Docker Setup with GPU Support | ||
|
||
Welcome to the AI Comedy Club, where humor meets technology! This repository includes a Dockerfile for running my AI comedian bot with GPU support. This README walks you through setting up and running the bot in a Docker container. | ||
|
||
## Prerequisites | ||
|
||
Before you begin, make sure you have the following prerequisites: | ||
|
||
- A system with an NVIDIA GPU. | ||
- NVIDIA Docker runtime (nvidia-docker2) installed. You can find installation instructions here: [NVIDIA Container Toolkit](https://github.com/NVIDIA/nvidia-container-runtime). | ||
|
||
## Step 1: Clone the Repository (if not already done) | ||
|
||
If you haven't already, clone the AI Comedy Club repository to your local machine: | ||
|
||
```bash | ||
git clone https://github.com/onur-rgb/ai-comedy-club.git | ||
|
||
``` | ||
|
||
## Step 2: Build the Docker Image | ||
|
||
|
||
1. Open a terminal and navigate to the project directory. | ||
|
||
2. Build the Docker image using the following command: | ||
|
||
```bash | ||
cd ai-comedy-club/bots/Llamastar | ||
docker build -t my-joke-bot . | ||
``` | ||
|
||
This command builds a Docker image named `my-joke-bot` with GPU support based on the contents of the Dockerfile. | ||
|
||
## Step 3: Run the Docker Container | ||
|
||
To run your AI comedian bot inside a Docker container with GPU support, use the following command: | ||
|
||
```bash | ||
docker run --gpus all -it my-joke-bot | ||
``` | ||
|
||
The AI comedian bot is now running within the Docker container. You can interact with it as needed. | ||
|
||
## About the Chatbot - Meet Llamastar | ||
Let's introduce you to the star of the show, Llamastar! This AI comedian bot was crafted using TheBloke's Llama-2-7B-chat-GPTQ model. But that's not all; I've spiced it up with Gradio and LangChain to provide you with an interactive and entertaining experience. Feel free to explore Llamastar's comedic talents and have a blast at the AI Comedy Club! | ||
|
||
This README is your backstage pass to the world of Llamastar and GPU-supported AI comedy. It's time to enjoy the show and let the laughter begin! 🎤😄 | ||
|
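A quick way to confirm that the container from Step 3 actually sees the GPU is to ask PyTorch from inside it. The following is a minimal sketch and not part of this commit: `gpu_status` is a hypothetical helper name, and the only assumption is that the `pytorch/pytorch` base image provides `torch`.

```python
import importlib.util


def gpu_status() -> str:
    """Report whether PyTorch can see a CUDA device (hypothetical helper)."""
    # Guard the import so the check also runs on machines without PyTorch.
    if importlib.util.find_spec("torch") is None:
        return "torch not installed"
    import torch

    if torch.cuda.is_available():
        return f"cuda available: {torch.cuda.device_count()} device(s)"
    return "cuda not available"


if __name__ == "__main__":
    print(gpu_status())
```

Saved in the build directory under a hypothetical name such as `check_gpu.py`, the script could be passed to `docker run --gpus all -it my-joke-bot` as `python check_gpu.py` in place of the default command. A "cuda not available" result would usually point at the `--gpus` flag or the NVIDIA runtime installation rather than at the bot itself.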
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -0,0 +1,230 @@ | ||
{ | ||
"cells": [ | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Setup" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"!pip install -q -U trl transformers accelerate git+https://github.com/huggingface/peft.git\n", | ||
"!pip install -q datasets bitsandbytes einops wandb" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Dataset" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from datasets import load_dataset\n", | ||
"\n", | ||
"\n", | ||
"dataset_name = 'Fraser/short-jokes' # Short jokes dataset from Hugging Face\n", | ||
"dataset = load_dataset(dataset_name, split=\"train\")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Loading the model" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"import torch\n", | ||
"from transformers import AutoModelForCausalLM, AutoTokenizer, BitsAndBytesConfig\n", | ||
"\n", | ||
"model_name = \"TinyPixel/Llama-2-7B-bf16-sharded\"\n", | ||
"\n", | ||
"bnb_config = BitsAndBytesConfig(\n", | ||
" load_in_4bit=True,\n", | ||
" bnb_4bit_quant_type=\"nf4\",\n", | ||
" bnb_4bit_compute_dtype=torch.float16,\n", | ||
")\n", | ||
"\n", | ||
"model = AutoModelForCausalLM.from_pretrained(\n", | ||
" model_name,\n", | ||
" quantization_config=bnb_config,\n", | ||
" trust_remote_code=True\n", | ||
")\n", | ||
"model.config.use_cache = False" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": 2, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Add the tokenizer\n", | ||
"tokenizer = AutoTokenizer.from_pretrained(model_name, trust_remote_code=True)\n", | ||
"tokenizer.pad_token = tokenizer.eos_token" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Adjustable PEFT (LoRA) config\n", | ||
"from peft import LoraConfig, get_peft_model\n", | ||
"\n", | ||
"lora_alpha = 16\n", | ||
"lora_dropout = 0.1\n", | ||
"lora_r = 64\n", | ||
"\n", | ||
"peft_config = LoraConfig(\n", | ||
" lora_alpha=lora_alpha,\n", | ||
" lora_dropout=lora_dropout,\n", | ||
" r=lora_r,\n", | ||
" bias=\"none\",\n", | ||
" task_type=\"CAUSAL_LM\"\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Loading the trainer" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from transformers import TrainingArguments\n", | ||
"\n", | ||
"output_dir = \"./results\"\n", | ||
"per_device_train_batch_size = 4\n", | ||
"gradient_accumulation_steps = 4\n", | ||
"optim = \"paged_adamw_32bit\"\n", | ||
"save_steps = 100\n", | ||
"logging_steps = 10\n", | ||
"learning_rate = 2e-4\n", | ||
"max_grad_norm = 0.3\n", | ||
"max_steps = 110\n", | ||
"warmup_ratio = 0.03\n", | ||
"lr_scheduler_type = \"constant\"\n", | ||
"\n", | ||
"training_arguments = TrainingArguments(\n", | ||
" output_dir=output_dir,\n", | ||
" per_device_train_batch_size=per_device_train_batch_size,\n", | ||
" gradient_accumulation_steps=gradient_accumulation_steps,\n", | ||
" optim=optim,\n", | ||
" save_steps=save_steps,\n", | ||
" logging_steps=logging_steps,\n", | ||
" learning_rate=learning_rate,\n", | ||
" fp16=True,\n", | ||
" max_grad_norm=max_grad_norm,\n", | ||
" max_steps=max_steps,\n", | ||
" warmup_ratio=warmup_ratio,\n", | ||
" group_by_length=True,\n", | ||
" lr_scheduler_type=lr_scheduler_type,\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"from trl import SFTTrainer\n", | ||
"\n", | ||
"max_seq_length = 512\n", | ||
"\n", | ||
"trainer = SFTTrainer(\n", | ||
" model=model,\n", | ||
" train_dataset=dataset,\n", | ||
" peft_config=peft_config,\n", | ||
" dataset_text_field=\"text\",\n", | ||
" max_seq_length=max_seq_length,\n", | ||
" tokenizer=tokenizer,\n", | ||
" args=training_arguments,\n", | ||
")" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Pre-process the model by upcasting the layer norms to float32 for more stable training\n", | ||
"for name, module in trainer.model.named_modules():\n", | ||
" if \"norm\" in name:\n", | ||
" module.to(torch.float32) # nn.Module.to() converts in place" | ||
] | ||
}, | ||
{ | ||
"cell_type": "markdown", | ||
"metadata": {}, | ||
"source": [ | ||
"Train the model" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"trainer.train()" | ||
] | ||
}, | ||
{ | ||
"cell_type": "code", | ||
"execution_count": null, | ||
"metadata": {}, | ||
"outputs": [], | ||
"source": [ | ||
"# Save the PEFT adapter weights\n", | ||
"model_to_save = trainer.model.module if hasattr(trainer.model, 'module') else trainer.model # Take care of distributed/parallel training\n", | ||
"model_to_save.save_pretrained(\"outputs\")" | ||
] | ||
} | ||
], | ||
"metadata": { | ||
"kernelspec": { | ||
"display_name": "joke_bot", | ||
"language": "python", | ||
"name": "python3" | ||
}, | ||
"language_info": { | ||
"codemirror_mode": { | ||
"name": "ipython", | ||
"version": 3 | ||
}, | ||
"file_extension": ".py", | ||
"mimetype": "text/x-python", | ||
"name": "python", | ||
"nbconvert_exporter": "python", | ||
"pygments_lexer": "ipython3", | ||
"version": "3.11.4" | ||
}, | ||
"orig_nbformat": 4 | ||
}, | ||
"nbformat": 4, | ||
"nbformat_minor": 2 | ||
} |
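The notebook's hyperparameters imply a fairly small training budget, which is easier to see when the arithmetic is written out. The sketch below copies the values from the training and LoRA cells. The single-GPU setting (`num_devices = 1`) and step-based checkpointing are assumptions, not something the commit states.

```python
# Values copied from the notebook's training and LoRA cells.
per_device_train_batch_size = 4
gradient_accumulation_steps = 4
max_steps = 110
save_steps = 100
lora_alpha = 16
lora_r = 64

# Assumption: a single GPU, so no data-parallel multiplier.
num_devices = 1

# Sequences contributing to each optimizer step.
effective_batch_size = (
    per_device_train_batch_size * gradient_accumulation_steps * num_devices
)

# Total sequences consumed over the whole run.
sequences_seen = effective_batch_size * max_steps

# Steps at which a periodic checkpoint would be written,
# assuming step-based saving every `save_steps` steps.
checkpoint_steps = list(range(save_steps, max_steps + 1, save_steps))

# LoRA scales its low-rank update by alpha / r.
lora_scaling = lora_alpha / lora_r

print(effective_batch_size, sequences_seen, checkpoint_steps, lora_scaling)
# prints: 16 1760 [100] 0.25
```

Under these assumptions the run sees 1,760 sequences and writes one periodic checkpoint at step 100, before the final `save_pretrained("outputs")` call stores the adapter.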