
mergekit

mergekit is a toolkit for merging pre-trained language models using a variety of merge methods, including TIES, linear, and SLERP. The toolkit also enables piecewise assembly of a language model from layers selected from other models using bakllama.py.

Merging Models with main.py

Usage

To merge models using the main.py script, specify the output directory for the final model and the models to be merged using the --merge option. Depending on the merge method chosen, other parameters such as --density, --weight, and --base-model might be necessary.

The script supports the following merge methods:

  • TIES ('ties', the default method), described in the paper Resolving Interference When Merging Models
    • Requires a base model.
    • Can specify per-model weights and densities.
  • Linear
    • Does not require a base model.
    • Must specify weights for all models being merged.
  • SLERP
    • Requires exactly two models.
    • Must specify a single weight to set the interpolation parameter between the two models (see the sketch after this list).
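
For intuition, SLERP (spherical linear interpolation) moves along the arc between the two models' weight vectors rather than the straight line between them, with the weight setting the position along that arc. The following is a minimal, self-contained sketch of that idea (illustrative only, not the toolkit's implementation):

    import numpy as np

    def slerp(t, v0, v1, eps=1e-8):
        # Spherical linear interpolation between two flattened weight tensors.
        # t=0 returns v0, t=1 returns v1; falls back to linear interpolation
        # when the two vectors are nearly parallel.
        v0 = np.asarray(v0, dtype=np.float64)
        v1 = np.asarray(v1, dtype=np.float64)
        dot = np.clip(np.dot(v0 / np.linalg.norm(v0), v1 / np.linalg.norm(v1)), -1.0, 1.0)
        if 1.0 - abs(dot) < eps:
            return (1.0 - t) * v0 + t * v1
        theta = np.arccos(dot)   # angle between the two weight vectors
        s = np.sin(theta)
        return (np.sin((1.0 - t) * theta) / s) * v0 + (np.sin(t * theta) / s) * v1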

Examples

  • Merging with TIES method and specifying per-model weights and densities:

    python main.py ./output-model --base-model TheBloke/Llama-2-13B-fp16 --cuda \
        --merge WizardLM/WizardLM-13B-V1.2 --weight 0.3 --density 0.5 \
        --merge garage-bAInd/Platypus2-13B --weight 0.5 --density 0.5
  • Merging with linear method and setting model weights:

    python main.py ./output-model --cuda --method linear \
        --merge garage-bAInd/Platypus2-13B --weight 0.6 \
        --merge WizardLM/WizardLM-13B-V1.2 --weight 0.2
  • Merging with SLERP method and setting interpolation parameter:

    python main.py ./output-model --cuda --method slerp --base-model garage-bAInd/Platypus2-13B \
        --merge WizardLM/WizardLM-13B-V1.2 --weight 0.5
  • SLERP with layer weight interpolated gradient:

    python main.py ./output-model --cuda --method slerp --base-model garage-bAInd/Platypus2-13B \
        --merge WizardLM/WizardLM-13B-V1.2 --layer-gradient "[0, 0.3, 0.7, 0, 0.1, 1.0]"
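    Here the listed gradient values are, as the option name suggests, spread across the model's layers so each layer gets its own interpolation weight rather than a single global weight.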

Refer to the script's help message (python main.py --help) for detailed information on all available options.

Piecewise layer combinations with bakllama.py

The bakllama.py script allows you to assemble a model piecewise with layers taken from other pre-trained models.

Configuration

To use the bakllama.py script, create a YAML configuration file that defines which layers to take from each source model and, optionally, specifies the sources for the embedding and LM head components.

The configuration file should have the following fields:

  • layer_slices: A list of layer slice objects, each specifying a range of layers to take from a source model.
    • model: The identifier or path of the source model.
    • start: The starting layer index (inclusive).
    • end: The ending layer index (exclusive).
    • scale: (Optional) A scaling factor for the weights of the layers.
  • embedding_source: (Optional) The model to take the embedding layer from. If not specified, it defaults to the first model listed in layer_slices.
  • lm_head_source: (Optional) The model to take the LM head from. If not specified, it defaults to the last model listed in layer_slices.
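
A minimal example configuration might look like the following (model names, layer ranges, and scale values are illustrative only):

    layer_slices:
      - model: WizardLM/WizardLM-13B-V1.2
        start: 0
        end: 20
      - model: garage-bAInd/Platypus2-13B
        start: 20
        end: 40
        scale: 0.5                                # optional scaling factor for these layers
    embedding_source: WizardLM/WizardLM-13B-V1.2  # optional; defaults to the first model in layer_slices
    lm_head_source: garage-bAInd/Platypus2-13B    # optional; defaults to the last model in layer_slices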

Usage

Once you have created the YAML configuration file, run the bakllama.py script with the config file and output path as arguments:

python bakllama.py path/to/your/config.yml ./output-model-directory
