It is a simple application built on top of Alpaca-LoRA. It uses BLIP-2 to analyze pictures, then passes the resulting captions, together with the answers to additional visual question answering (VQA) prompts, to Alpaca as context. This allows Alpaca to use images as an additional input to guide its outputs.
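The flow described above can be sketched as a small prompt-building step. This is a minimal illustration, not the application's actual code: the helper names `build_image_context` and `build_prompt` are hypothetical, the caption and VQA answers are hard-coded placeholders standing in for BLIP-2 output, and the template is assumed to follow the standard Alpaca instruction/input/response layout.

```python
from typing import List, Tuple

# Assumption: the standard Alpaca prompt layout with an "Input" section.
ALPACA_TEMPLATE = (
    "Below is an instruction that describes a task, paired with an input "
    "that provides further context. Write a response that appropriately "
    "completes the request.\n\n"
    "### Instruction:\n{instruction}\n\n"
    "### Input:\n{context}\n\n"
    "### Response:\n"
)


def build_image_context(caption: str, vqa_pairs: List[Tuple[str, str]]) -> str:
    """Flatten a caption and (question, answer) pairs into plain text.

    Hypothetical helper: in the real application the caption and answers
    would come from BLIP-2 rather than being passed in by hand.
    """
    lines = [f"Image caption: {caption}"]
    for question, answer in vqa_pairs:
        lines.append(f"Q: {question} A: {answer}")
    return "\n".join(lines)


def build_prompt(
    instruction: str, caption: str, vqa_pairs: List[Tuple[str, str]]
) -> str:
    """Place the image-derived text in the Input section of the prompt."""
    context = build_image_context(caption, vqa_pairs)
    return ALPACA_TEMPLATE.format(instruction=instruction, context=context)


if __name__ == "__main__":
    # Placeholder values, not real model output.
    prompt = build_prompt(
        instruction="Write a short poem about the picture.",
        caption="a dog on a beach",
        vqa_pairs=[("What color is the dog?", "brown")],
    )
    print(prompt)
```

The resulting string is what would be handed to Alpaca for generation. Keeping the image description in the Input section leaves the user's instruction untouched, so the same instruction can be reused with different pictures.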