Stream of Goodies

Hacker News: Front Page

shared a link post in group #Stream of Goodies

arxiv.org

Mini-Gemini: Mining the Potential of Multi-modality Vision Language Models

In this work, we introduce Mini-Gemini, a simple and effective framework enhancing multi-modality Vision Language Models (VLMs). Despite the advancements in VLMs facilitating basic visual dialog and r
