Novel Image Captioning
- Lijuan Wang, Microsoft
When does a machine “understand” an image? One definition is when it can generate a novel caption that summarizes the salient content within an image. This content may include objects that are present, their attributes, actions, or their relations with each other. Determining the salient content requires not only knowing the contents of an image, but also deducing which aspects of the scene may be interesting or novel through commonsense knowledge. This video demonstrates the quality of the latest image captioning model (after) compared with the old model (before).
-
-
Lijuan Wang
Principal Research Manager
-
-
다음 볼만한 동영상
-
-
Session: Compute & Trust (Systems)
- Ashish Panwar,
- Aditya Desai,
- Abhilash Jindal
-
Multimodal & Embodied Intelligence (Pt 1), Panel on Multimodal AI: Progress, Pitfalls, Possibilities
- Madhava Krishna,
- Sriram Ganapathy,
- Somak Aditya
-
Session on Compute & Trust (Security)
- Krishna Pillutla,
- Danish Pruthi
-
-
Session on Reasoning
- Hongxiang Fan,
- Nagarajan Natarajan
-
-
Session on Retrieval
- Lokesh Nagalapatti,
- Soumen Chakrabarti
-
-