Sure, you can slice a video up into images and process them separately - that's ... | Hacker News


simonw on Feb 21, 2024 | on: The killer app of Gemini Pro 1.5 is using video as...

Sure, you can slice a video up into images and process them separately - that's apparently how Gemini Pro works, it uses one frame from every second of video. But you still need a REALLY long context length to work with that information - the magic combination here is 1,000,000 tokens combined with good multi-modal image inputs.
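To make the arithmetic behind that point concrete, here is a rough sketch of sampling one frame per second and checking whether the frames fit in a context window. The function names are hypothetical, and `tokens_per_frame=250` is a placeholder assumption, not a figure from this thread; substitute whatever your model actually charges per image.

```python
def sample_timestamps(duration_seconds: float, frames_per_second: float = 1.0) -> list[float]:
    """Return the timestamps (in seconds) at which a frame would be grabbed."""
    step = 1.0 / frames_per_second
    count = int(duration_seconds * frames_per_second)
    return [i * step for i in range(count)]


def estimated_tokens(duration_seconds: float, tokens_per_frame: int,
                     frames_per_second: float = 1.0) -> int:
    """Estimate the tokens consumed by the sampled frames alone (no audio, no prompt)."""
    return len(sample_timestamps(duration_seconds, frames_per_second)) * tokens_per_frame


def fits_in_context(duration_seconds: float, tokens_per_frame: int,
                    context_window: int = 1_000_000) -> bool:
    """Check whether the sampled frames fit inside the given context window."""
    return estimated_tokens(duration_seconds, tokens_per_frame) <= context_window


if __name__ == "__main__":
    # tokens_per_frame=250 is an assumed placeholder value.
    one_hour = 60 * 60
    print(estimated_tokens(one_hour, tokens_per_frame=250))
    print(fits_in_context(one_hour, tokens_per_frame=250))
    print(fits_in_context(one_hour, tokens_per_frame=250, context_window=128_000))
```

Under that assumed per-frame cost, an hour of video sampled at one frame per second fits in a 1,000,000-token window but not in a 128,000-token one, which is why frame slicing alone does not get you there.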

keefle on Feb 21, 2024

I see, but I was wondering whether this feature could be partially transferred to other LLMs.

But fair enough, context length is key in this scenario.
