We made the following modifications from the original:
Inference files for the command lines (a rough sketch of such a loop is given below):
- demo_v2_command_line.py
- demo_v1_command_line.py
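For illustration only, a minimal sketch of what such a command-line inference loop could look like, assuming it reuses the Chat helper and a CONV_VISION_* conversation template from the original MiniGPT-4 demo; the argument names, template choice, and the encode_img guard are assumptions, not the exact contents of our scripts:

    # Hypothetical sketch -- not the actual contents of demo_v*_command_line.py.
    import argparse
    from PIL import Image
    from minigpt4.common.config import Config
    from minigpt4.common.registry import registry
    from minigpt4.conversation.conversation import Chat, CONV_VISION_Vicuna0  # template name varies by model/branch

    parser = argparse.ArgumentParser()
    parser.add_argument("--cfg-path", required=True, help="eval config, e.g. eval_configs/minigpt4_eval.yaml")
    parser.add_argument("--image", required=True, help="path to the input image")
    parser.add_argument("--gpu-id", type=int, default=0)
    parser.add_argument("--options", nargs="+", default=None)
    args = parser.parse_args()

    cfg = Config(args)
    model_cls = registry.get_model_class(cfg.model_cfg.arch)
    model = model_cls.from_config(cfg.model_cfg).to(f"cuda:{args.gpu_id}")
    vis_cfg = cfg.datasets_cfg.cc_sbu_align.vis_processor.train
    vis_processor = registry.get_processor_class(vis_cfg.name).from_config(vis_cfg)
    chat = Chat(model, vis_processor, device=f"cuda:{args.gpu_id}")

    conv, img_list = CONV_VISION_Vicuna0.copy(), []
    chat.upload_img(Image.open(args.image).convert("RGB"), conv, img_list)
    if hasattr(chat, "encode_img"):        # newer branches encode the image in a separate step
        chat.encode_img(img_list)
    while True:
        prompt = input("user> ")
        if prompt.strip() in {"exit", "quit"}:
            break
        chat.ask(prompt, conv)
        print("model>", chat.answer(conv=conv, img_list=img_list, max_new_tokens=300)[0])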
We add dataset files inside ~/FaceGPT/Archive/MiniGPT-4/minigpt4/datasets/datasets (a minimal sketch of such a dataset class follows below):
- sharegpt_dataset.py
- chatgpt4vision_datasets.py
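A minimal, hypothetical sketch of what one of these dataset classes could look like, assuming an annotation JSON plus an image root in the style of the existing caption datasets; the class name and the "image"/"caption" keys are illustrative assumptions about our data format:

    # Hypothetical sketch -- illustrative only, not the actual file contents.
    import json
    import os
    from PIL import Image
    from torch.utils.data import Dataset

    class GPT4VisionCaptionDataset(Dataset):
        """Image-text pairs read from a JSON annotation file and an image root."""

        def __init__(self, vis_processor, text_processor, vis_root, ann_path):
            self.vis_processor = vis_processor      # image transform (e.g. the BLIP-2 image processor)
            self.text_processor = text_processor    # caption cleaning / truncation
            self.vis_root = vis_root
            with open(ann_path) as f:
                self.annotation = json.load(f)      # assumed: list of {"image": ..., "caption": ...}

        def __len__(self):
            return len(self.annotation)

        def __getitem__(self, index):
            ann = self.annotation[index]
            image = Image.open(os.path.join(self.vis_root, ann["image"])).convert("RGB")
            return {
                "image": self.vis_processor(image),
                "answer": self.text_processor(ann["caption"]),
            }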
We add train config files inside ~/FaceGPT/Archive/MiniGPT-4/train_configs:
- minigpt4_stage2_finetune_gpt4vision.yaml
- minigptv2_finetune_gpt4vision.yaml
We add folders to hold the dataset configuration files (the training file paths),
to be found at ~/FaceGPT/Archive/MiniGPT-4/minigpt4/configs/datasets/:
- chatgpt4vision
- sharegpt
Inside the file MiniGPT-4/minigpt4/datasets/builders/image_text_pair_builder.py:
- we need a wrapper builder class for each of the finetuning datasets (ShareGPT_Face and FaceGPT4Vision); a sketch is given below
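A hypothetical sketch of what such a wrapper builder could look like, following the registration pattern of the existing builders in image_text_pair_builder.py; the registry key, config filename, build_info fields, and imported dataset class name are assumptions:

    # Hypothetical sketch -- the actual class/key names in our builders may differ.
    from minigpt4.common.registry import registry
    from minigpt4.datasets.builders.base_dataset_builder import BaseDatasetBuilder
    from minigpt4.datasets.datasets.chatgpt4vision_datasets import GPT4VisionCaptionDataset  # assumed class name

    @registry.register_builder("chatgpt4vision")
    class ChatGPT4VisionBuilder(BaseDatasetBuilder):
        train_dataset_cls = GPT4VisionCaptionDataset

        # Points at the dataset config folder added under minigpt4/configs/datasets/chatgpt4vision.
        DATASET_CONFIG_DICT = {"default": "configs/datasets/chatgpt4vision/default.yaml"}

        def build_datasets(self):
            self.build_processors()
            build_info = self.config.build_info   # assumed to carry ann_path / image_path
            return {
                "train": self.train_dataset_cls(
                    vis_processor=self.vis_processors["train"],
                    text_processor=self.text_processors["train"],
                    vis_root=build_info.image_path,
                    ann_path=build_info.ann_path,
                )
            }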
We then put our data inside /fast/vfourel/FaceGPT/Data/MiniFaceGPT4Data
For each dataset (a sketch of how the counts below were computed is given after this list):
- we have the ShareGPT data, with images annotated by GPT4Vision and/or by the ShareCaptioner from ShareGPT:
Number of images in each file:
subset_objects_gpt4vision_100k.json: 19514 images
subset_objects_share-captioner_coco_lcs_sam_1246k_1107.json: 17754 images
Number of overlapping image paths: 17759
- We have the GPT4VisionDataset that we collected by sending images to OpenAI's API for annotation:
We have the subsections:
Object counts in gpt4VisionCalls_Cleaned_BySection.json:
Pre-Prompt: 4 objects
Facial Expression due to contextual cues: 1497 objects
Facial Characteristics Long: 1511 objects
Factual Generalization: 1246 objects
Mouvement Description: 349 objects
Interpretative Generalization: 1346 objects
Intensity of the emotion: 1584 objects
Posture: 739 objects
Race: 1188 objects
Race and Gender: 192 objects
Facial Characteristics Short: 196 objects
Gender: 1205 objects
Human or not: 1182 objects
Feature by Feature Description: 766 objects
Fitzpatrick: 94 objects
Multi-Person Interaction: 0 objects
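The counts above can be reproduced with a short script along the following lines; the JSON layouts assumed here (a top-level list of records with an "image" field for the ShareGPT subsets, and a section-name-to-list mapping for gpt4VisionCalls_Cleaned_BySection.json) are assumptions about our data format, not a fixed schema:

    # Hypothetical sketch of how the counts above were obtained -- JSON layouts are assumed.
    import json

    def image_paths(path):
        with open(path) as f:
            data = json.load(f)
        return {entry["image"] for entry in data}    # assumes a list of dicts with an "image" key

    gpt4v = image_paths("subset_objects_gpt4vision_100k.json")
    sharecap = image_paths("subset_objects_share-captioner_coco_lcs_sam_1246k_1107.json")
    print(len(gpt4v), len(sharecap), len(gpt4v & sharecap))   # images per file and the overlap

    with open("gpt4VisionCalls_Cleaned_BySection.json") as f:
        by_section = json.load(f)                    # assumes {section name: [objects, ...]}
    for section, objects in by_section.items():
        print(f"{section}: {len(objects)} objects")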
The training loop is executed inside MiniGPT-4/minigpt4/tasks/base_task.py
We modify the file MiniGPT-4/train_configs/minigptv2_finetune_gpt4vision.yaml
to lower the number of dataloader workers for training to 1, and to raise the number of
gradient accumulation steps from 1 to 8 (each optimizer step then accumulates gradients over 8 batches).
In the file MiniGPT-4/minigpt4/models/minigpt_base.py
we modify the forward pass to return {"loss": loss, "outputs": outputs} # modifications by VF
so that we can delete the outputs afterwards to free memory.
We can then delete them in MiniGPT-4/minigpt4/tasks/base_task.py (a sketch of these changes follows below):
- we modify the function train_step(self, model, samples)
- we modify _train_inner_loop
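A minimal sketch of these memory-saving changes, assuming train_step simply runs the forward pass and extracts the loss, and that _train_inner_loop scales the loss by accum_grad_iters and steps the optimizer every 8 micro-batches; the exact signatures and surrounding code in base_task.py differ from this simplified version:

    # Hypothetical sketch of the memory-saving changes -- illustrative, not the exact diff.
    import torch

    # Method of the task class in minigpt4/tasks/base_task.py:
    def train_step(self, model, samples):
        output = model(samples)                 # forward now returns {"loss": ..., "outputs": ...}
        loss = output["loss"]
        del output["outputs"], output           # drop the heavy outputs as soon as the loss is extracted
        return loss

    # Simplified body of one iteration inside _train_inner_loop:
    def _train_one_iter(self, model, samples, optimizer, scaler, i,
                        accum_grad_iters=8, use_amp=True):
        with torch.cuda.amp.autocast(enabled=use_amp):
            loss = self.train_step(model=model, samples=samples)
        loss = loss / accum_grad_iters          # accum_grad_iters: 8 in the finetune yaml
        if use_amp:
            scaler.scale(loss).backward()
        else:
            loss.backward()
        if (i + 1) % accum_grad_iters == 0:     # update weights only every 8 micro-batches
            if use_amp:
                scaler.step(optimizer)
                scaler.update()
            else:
                optimizer.step()
            optimizer.zero_grad()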
As a last measure, we also add the following export before launching training,
to reduce CUDA memory fragmentation:
export PYTORCH_CUDA_ALLOC_CONF=max_split_size_mb:512