1S-Lab, Nanyang Technological University  2Microsoft Research, Redmond

These weights are for initializing Otter-MPT1B training.

They are converted directly from openflamingo/OpenFlamingo-3B-vitl-mpt1b-langinstruct.

You can load and try this model using

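import transformers
# OtterForConditionalGeneration is defined in the Otter code repository;
# import it from your checkout (the module path depends on the repo version).

# Load the init weights; device_map="sequential" fills the available GPUs in order.
model = OtterForConditionalGeneration.from_pretrained("luodian/OTTER-MPT1B-RPJama-Init", device_map="sequential")
model.text_tokenizer.padding_side = "left"  # left-pad prompts for generation
tokenizer = model.text_tokenizer
image_processor = transformers.CLIPImageProcessor()
model.eval()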

You can also start training Otter via the commands

python -m accelerate.commands.launch --config_file=./pipeline/accelerate_configs/accelerate_config_fsdp.yaml \
pipeline/train/instruction_following.py \
--pretrained_model_name_or_path=luodian/OTTER-MPT1B-RPJama-Init \
--mimicit_path=/data/azure_storage/otter/mimicit/xx/xx_instructions.json \
--images_path=/data/azure_storage/otter/mimicit/xx/xx.json \
--batch_size=4 --num_epochs=1 --report_to_wandb \
--wandb_entity=ntu-slab \
--external_save_dir=/data/bli/checkpoints \
--save_hf_model \
--run_name=OTTER-MPT1B \
--wandb_project=OTTER-MPT1B \
--workers=4 \
--lr_scheduler=cosine \
--learning_rate=1e-5 \
--warmup_steps_ratio=0.01

To initialize video instruction tuning, add

"max_num_frames": 128

to the config.json inside the model folder.
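If you prefer not to edit the file by hand, the change can be scripted. The following is a minimal sketch, assuming the weights have already been downloaded to a local folder that contains config.json; the helper name set_max_num_frames and the folder path in the usage note are hypothetical and not part of the Otter code.

```python
import json
from pathlib import Path


def set_max_num_frames(model_dir: str, max_num_frames: int = 128) -> dict:
    """Add or overwrite "max_num_frames" in <model_dir>/config.json.

    Hypothetical helper for illustration; it is not part of the Otter repo.
    """
    config_path = Path(model_dir) / "config.json"
    # Read the existing config so every other key is preserved.
    config = json.loads(config_path.read_text(encoding="utf-8"))
    config["max_num_frames"] = max_num_frames
    # Write the updated config back to the same file.
    config_path.write_text(json.dumps(config, indent=2) + "\n", encoding="utf-8")
    return config
```

For example, set_max_num_frames("/path/to/local/model_folder") would update the file in place; replace the placeholder path with wherever your copy of the weights lives.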

Leave us a message if you run into any errors or have questions. You can follow the Otter code (see the training section) to further tune your model on top of these weights.
