Multimodal Foundation Models by Chunyuan Li, Paperback, 9781638283362 | Buy online at The Nile

Multimodal Foundation Models

From Specialists to General-Purpose Assistants

Authors: Chunyuan Li, Zhe Gan, Zhengyuan Yang, Jianwei Yang, Linjie Li, Lijuan Wang and Jianfeng Gao
Series: Foundations and Trends® in Computer Graphics and Vision

A comprehensive survey of the taxonomy and evolution of multimodal foundation models that demonstrate vision and vision-language capabilities, focusing on the transition from specialist models to general-purpose assistants. The focus encompasses five core topics, categorized into two classes.

Product Unavailable

PRODUCT INFORMATION

Description

This monograph presents a comprehensive survey of the taxonomy and evolution of multimodal foundation models that demonstrate vision and vision-language capabilities, focusing on the transition from specialist models to general-purpose assistants.

The survey covers five core topics, grouped into two classes: (i) well-established research areas, namely multimodal foundation models pre-trained for specific purposes, comprising two topics – methods of learning vision backbones for visual understanding, and text-to-image generation; and (ii) recent advances in exploratory, open research areas, namely multimodal foundation models that aim to play the role of general-purpose assistants, comprising three topics – unified vision models inspired by large language models (LLMs), end-to-end training of multimodal LLMs, and chaining multimodal tools with LLMs.

The monograph is aimed at researchers, graduate students, and professionals in the computer vision and vision-language multimodal communities who are eager to learn the basics of, and recent advances in, multimodal foundation models.

Product Details

Publisher
now publishers Inc
Published
6th May 2024
Pages
230
ISBN
9781638283362

Returns

This item is eligible for free returns within 30 days of delivery. See our returns policy for further details.