Long-Form Video Understanding: Bottlenecks and Design Choices – Part 1
A field guide to long-form video understanding that maps the design space along two axes: memory, from discard to keep, and compute, from external agents to agentified models.