A text-generation model optimized for reasoning over combined image and text inputs.
Discovered on Hugging Face via HuggingFace:unknown.