ImageBind: One Embedding Space to Bind Them All

ImageBind is a multimodal AI model open-sourced by Meta AI. It learns a single embedding space shared by six modalities: images and video, text, audio, depth, thermal, and IMU (motion sensor) data. Because all six share one space, content in one modality can be matched against content in any other. For example, given an audio clip of a train, it can find matching photos, videos, and text descriptions of trains and, when paired with a generative model, produce new images from that sound.
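
A shared embedding space is what makes cross-modal lookups like the train example possible: once every input is a vector in the same space, matching an audio clip to an image reduces to a nearest-neighbor search. The sketch below illustrates that idea only. It does not use the ImageBind library, and the three-dimensional vectors, the labels, and the helper names `cosine` and `retrieve` are all hypothetical stand-ins for real model outputs.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm_u = math.sqrt(sum(a * a for a in u))
    norm_v = math.sqrt(sum(b * b for b in v))
    return dot / (norm_u * norm_v)


def retrieve(query, candidates):
    """Return candidate labels sorted from most to least similar to the query."""
    return sorted(
        candidates,
        key=lambda label: cosine(query, candidates[label]),
        reverse=True,
    )


# Hand-written toy embeddings (assumption: a real model would produce
# much higher-dimensional vectors for each audio clip and image).
audio_train = [0.9, 0.1, 0.0]
image_embeddings = {
    "photo_of_train": [0.8, 0.2, 0.1],
    "photo_of_dog": [0.1, 0.9, 0.2],
    "photo_of_beach": [0.0, 0.2, 0.9],
}

ranking = retrieve(audio_train, image_embeddings)
print(ranking[0])  # photo_of_train
```

The query and the candidates come from different modalities, yet the comparison is an ordinary vector similarity. The same lookup works in any direction (text to audio, image to text, and so on) as long as every encoder maps into the same space.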
