Merge branch 'fine-tuning' into data-loading
# Conflicts:
#	.gitignore
n-grieger committed Oct 4, 2023
2 parents 44c4a0e + 6d24f34 commit 9f2ee88
Showing 10 changed files with 201,164 additions and 2 deletions.
4 changes: 3 additions & 1 deletion .gitignore
@@ -1,2 +1,4 @@
**/cache-*.arrow
**/.chroma/**

.idea
35 changes: 34 additions & 1 deletion README.md
@@ -1,3 +1,36 @@
# GermEval 2023

In this repository, we will shortly share the code of our (Team CPAa) participation in Task 1 (Subtasks 1 and 2) of the GermEval 2023 Shared Task.

## Setup

Install PyTorch from here: https://pytorch.org/get-started/locally/

Install the remaining requirements with: `pip install -U -r requirements.txt`
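
As an example, the two setup steps might look like the sketch below on a CUDA Linux machine; the exact PyTorch install command depends on your platform and CUDA version, so pick it from the selector on the PyTorch page rather than copying this line:

```bash
# Example only: choose the right command for your platform/CUDA version at
# https://pytorch.org/get-started/locally/
pip install torch torchvision torchaudio

# Install the remaining dependencies of this repository
pip install -U -r requirements.txt
```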

## Fine-tuning

Prepare the Llama 2 models in HF (Hugging Face) format: either download them directly from Hugging Face, or download the original weights from https://github.com/facebookresearch/llama and convert them as described at https://github.com/facebookresearch/llama-recipes/#model-conversion-to-hugging-face
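
A rough sketch of the conversion step, assuming the original Meta weights have been downloaded and `transformers` is installed (the script and its flags follow the llama-recipes instructions linked above; the local paths are placeholders):

```bash
# Convert the original Llama 2 7B checkpoint to the Hugging Face format.
# --model_size must match the downloaded checkpoint (e.g. 7B or 70B).
python -m transformers.models.llama.convert_llama_weights_to_hf \
    --input_dir /path/to/llama-2-7b \
    --model_size 7B \
    --output_dir /path/to/llama-2-7b-hf
```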

Prepare data with [parse_data_alpaca_format.ipynb](fine-tuning/scripts/parse_data_alpaca_format.ipynb)
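
For orientation, an Alpaca-format record has the shape sketched below; the concrete instructions and field contents produced by the notebook are not reproduced here, so the placeholders are purely illustrative:

```bash
# Illustrative shape of one Alpaca-format record (placeholders only,
# not the actual prompts generated by parse_data_alpaca_format.ipynb)
cat <<'EOF'
{
  "instruction": "<task description given to the model>",
  "input": "<text from the GermEval data>",
  "output": "<expected annotation>"
}
EOF
```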

Set the path to the data and the path to the Llama 2 model in the fine-tuning scripts in `fine-tuning/scripts/`

Set `CUDA_VISIBLE_DEVICES` if you want to limit which GPUs are used

Set `per_device_train_batch_size` and `gradient_accumulation_steps` so
that `per_device_train_batch_size * gradient_accumulation_steps` is a multiple of 16 and the model fits on your GPU
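
For instance, one combination that satisfies this constraint would be the following (illustrative values, not necessarily the ones used in the provided scripts):

```bash
# Effective batch size = per_device_train_batch_size * gradient_accumulation_steps
#                      = 2 * 8 = 16   (a multiple of 16)
# Pick the split so that the per-device batch still fits into GPU memory.
per_device_train_batch_size=2
gradient_accumulation_steps=8
```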

Set `max_steps` to control the length of training (`save_steps` determines when checkpoints are created)

If you want to use the scripts with your own data, check the parameters `source_max_len` and `target_max_len`. The [data parsing script](fine-tuning/scripts/parse_data_alpaca_format.ipynb) contains code to determine the maximum length of the source and target sequences in your data. Adapt the values used in the fine-tuning scripts accordingly.
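
Putting the knobs from the previous paragraphs together, a hypothetical configuration inside one of the fine-tuning scripts could look roughly like this; the parameter names are the ones mentioned above, but the concrete values, the variable names for the paths, and the exact way they are wired into the training command are assumptions:

```bash
# Hypothetical values; adapt paths, steps, and lengths to your setup and data.
export CUDA_VISIBLE_DEVICES=0,1        # limit training to the first two GPUs

model_path=/path/to/llama-2-7b-hf      # Llama 2 model in HF format (placeholder name)
data_path=/path/to/alpaca_data.json    # output of parse_data_alpaca_format.ipynb (placeholder name)

max_steps=1000                         # total number of training steps
save_steps=250                         # write a checkpoint every 250 steps

source_max_len=1024                    # verify with the data parsing notebook
target_max_len=512                     # verify with the data parsing notebook
```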

Run fine-tuning:

* of the 7B cues model: `bash fine-tuning/scripts/finetune_spkatt_7b_cues.sh`
* of the 70B cues model: `bash fine-tuning/scripts/finetune_spkatt_70b_cues.sh`
* of the 7B roles model: `bash fine-tuning/scripts/finetune_spkatt_7b_roles.sh`
* of the 70B roles model: `bash fine-tuning/scripts/finetune_spkatt_70b_roles.sh`
