Artificial Intelligence News Summaries
The killer app of Gemini Pro 1.5 is video
Gemini Pro 1.5 has a 1,000,000 token context size. This is huge: the previous largest context sizes were Claude 2.1 (200,000 tokens) and gpt-4-turbo (128,000 tokens), though differences between the models' tokenizers mean this isn't a perfectly direct comparison.
I’ve been playing with Gemini Pro 1.5 for a few days, and I think the most exciting feature isn’t so much the token count… it’s the ability to use video as an input.
The ability to extract structured content from text is already one of the most exciting use cases for LLMs. GPT-4 Vision and LLaVA expanded that to images. And now Gemini Pro 1.5 expands that to video.
The ability to analyze video like this feels SO powerful. Being able to take a 20-second video of a bookshelf and get back a JSON array of the books on it is just the first thing I thought to try.
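As a minimal sketch of the last step of that bookshelf experiment, the Python below pulls a JSON array out of a model's text reply. The prompt wording, the `parse_books` helper, and the sample reply are all hypothetical, made up for illustration; the call that actually sends the video to Gemini is deliberately left out, since it isn't described here.

```python
import json

# Hypothetical prompt; the exact wording used in the experiment is not given.
PROMPT = (
    "Return a JSON array of objects with 'title' and 'author' keys, "
    "one object per book visible in this video."
)


def parse_books(reply):
    """Extract a list of book objects from a model's text reply.

    Assumes the reply contains exactly one JSON array, possibly surrounded
    by extra prose, so we slice from the first '[' to the last ']'.
    """
    start = reply.find("[")
    end = reply.rfind("]")
    if start == -1 or end < start:
        raise ValueError("no JSON array found in reply")
    data = json.loads(reply[start:end + 1])
    # Keep only entries that look like books (objects with a title).
    return [item for item in data if isinstance(item, dict) and "title" in item]


if __name__ == "__main__":
    # Made-up reply standing in for real model output.
    sample_reply = (
        "Here are the books I can see:\n"
        '[{"title": "Example Book One", "author": "A. Author"},\n'
        ' {"title": "Example Book Two", "author": "B. Writer"}]'
    )
    for book in parse_books(sample_reply):
        print(book["title"], "by", book["author"])
```

Slicing between the outer brackets is a deliberately forgiving choice: models often wrap JSON in extra commentary, and this tolerates that at the cost of failing on replies that contain stray brackets outside the array.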
news/ai.txt · Last modified: 2024/02/21 22:12 by lmuszkie