
Blip2-Image-Captioning

For Mac (M1, M2, M3), Windows/Linux (CUDA), or CPU


Supported Formats

JPG, JPEG, PNG, BMP, GIF
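A minimal sketch of an extension filter for these formats (the constant and function names are assumptions for illustration; the repository may organize this differently):

from pathlib import Path

SUPPORTED_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}

def find_images(directory):
    """Return all files whose suffix is one of the supported image formats."""
    return [p for p in Path(directory).iterdir()
            if p.is_file() and p.suffix.lower() in SUPPORTED_EXTENSIONS]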

Installation

Create a Python virtual environment in the project directory!

Clone the repository

git clone https://github.com/MarkusR1805/blip2-image-captioning.git

Open a terminal in the cloned directory, e.g. /path/captioning/Blip2-Image-Captioning

python3.12 -m venv env
source env/bin/activate

Install the dependencies from requirements.txt

pip install --upgrade pip
pip install -r requirements.txt

or

pip install language-tool-python
pip install nltk
pip install pillow
pip install psutil
pip install torch
pip install transformers


Using the Model

Update the program regularly with

git pull

Start the program in the terminal with

python main.py

Salesforce blip2-opt-2.7b with ≈ 3,744,679,936 parameters

The model is approximately 15 GB in size. Either use the program as configured (the model is downloaded from Hugging Face), or store the model locally on your computer and change the path in main.py. In that case, adapt these lines of code:

#model_path = "/Volumes/SSD T7/Salesforce-blip2-opt-27b" # local path
model_path = "Salesforce/blip2-opt-2.7b" # Hugging Face path
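
For orientation, here is a minimal sketch of how such a model_path is typically used with the transformers BLIP-2 classes. This is an illustration under assumptions, not necessarily the exact code in main.py:

import torch
from PIL import Image
from transformers import Blip2Processor, Blip2ForConditionalGeneration

model_path = "Salesforce/blip2-opt-2.7b"  # or a local directory, see above

# Pick an available device: Apple Silicon (MPS), CUDA, or plain CPU.
device = "mps" if torch.backends.mps.is_available() else ("cuda" if torch.cuda.is_available() else "cpu")
dtype = torch.float16 if device != "cpu" else torch.float32  # fp16 halves memory on GPU/MPS

processor = Blip2Processor.from_pretrained(model_path)
model = Blip2ForConditionalGeneration.from_pretrained(model_path, torch_dtype=dtype).to(device)

image = Image.open("image1.png").convert("RGB")
inputs = processor(images=image, return_tensors="pt").to(device, dtype)
generated_ids = model.generate(**inputs, max_new_tokens=50)
caption = processor.batch_decode(generated_ids, skip_special_tokens=True)[0].strip()
print(caption)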

Usage

Attention! All files with the suffix .txt in the directory are deleted without confirmation!
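
The cleanup presumably boils down to something like this sketch (hypothetical; check main.py before relying on details), which is why no unrelated .txt files should live in that directory:

from pathlib import Path

def delete_text_files(directory):
    """Delete every *.txt file in the directory -- no confirmation is asked."""
    for txt_file in Path(directory).glob("*.txt"):
        txt_file.unlink()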

In the terminal, change into the program directory, then start the program with "python main.py".

This application is used via the terminal; here I show it using a MacBook M3 as an example.

The program asks four questions in sequence:

  1. First question: the path to the directory containing the images.
  2. Second question: the path to ignore_list.txt (leave empty if no such file exists; the default is the program directory).
  3. Third question: the path to allowed_list.txt (leave empty if no such file exists; the default is the program directory).
  4. Fourth question: additional keywords (2-3 or more) placed at the very beginning of each image description (enter them separated by commas).
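
The prompt sequence could look roughly like this sketch (variable names and prompt texts are assumptions, not the literal code from main.py):

image_dir = input("Path to the image directory: ").strip()

ignore_list = input("Path to ignore_list.txt (leave empty for the default): ").strip() or "ignore_list.txt"
allowed_list = input("Path to allowed_list.txt (leave empty for the default): ").strip() or "allowed_list.txt"

# Comma-separated keywords, prepended to every image description.
extra_keywords = [k.strip() for k in input("Additional keywords: ").split(",") if k.strip()]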

The program creates a text file with the same name as each image, for example image1.png produces image1.txt. First, image descriptions are created for all images; then keywords are extracted from each description and placed in front of it, in addition to the keywords you entered at the beginning (see the sketch after the list below). The following files are also created:

  1. gesamt.txt: all image descriptions in one file, ideal for use as a wildcard
  2. extracted_words.txt: all keywords of all images
  3. t_extracted_words.txt: as in 2, but with the tokens added
  4. a CSV table with the image description and image path
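
As one plausible way to implement the keyword step (the repository's exact filtering via ignore_list.txt and allowed_list.txt may differ), keywords can be pulled out of a caption with nltk and prepended together with your own keywords:

import nltk
from nltk import pos_tag, word_tokenize

nltk.download("punkt", quiet=True)  # newer NLTK versions may also need "punkt_tab"
nltk.download("averaged_perceptron_tagger", quiet=True)

def extract_keywords(caption):
    """Keep nouns and adjectives as keyword candidates."""
    tokens = word_tokenize(caption.lower())
    return [word for word, tag in pos_tag(tokens) if tag.startswith(("NN", "JJ"))]

caption = "a black cat sitting on a wooden table"
extra = ["studio photo", "high detail"]  # keywords entered at program start
line = ", ".join(extra + extract_keywords(caption)) + ", " + caption
print(line)  # studio photo, high detail, black, cat, wooden, table, a black cat ...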


You can adapt the Python code and change the model's parameters. Just experiment with the changes; you can also use the larger BLIP-2 model, but it is about 33 GB in size and takes longer to process the images.
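
Continuing the loading sketch from the model section above, these are typical generate() parameters worth experimenting with (the values are arbitrary starting points, not the repository's defaults):

generated_ids = model.generate(
    **inputs,
    max_new_tokens=60,  # upper bound on caption length
    num_beams=5,  # beam search often gives more coherent captions
    repetition_penalty=1.3,  # discourage repeated phrases
)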
