An image-to-text model: it takes image-text data as input and produces plain text output, which is easier to process and analyze downstream.
Discovered on Hugging Face (source identifier: unknown).