Incentivizing Tool-augmented Thinking with Images for Medical Image Analysis
arxiv.org·1d
👁️Computer Vision



Abstract: Recent reasoning-based medical MLLMs have made progress in generating step-by-step textual reasoning chains. However, they still struggle with complex tasks that require dynamic, iterative focusing on fine-grained visual regions to achieve precise grounding and diagnosis. We introduce Ophiuchus, a versatile, tool-augmented framework that equips an MLLM to (i) decide when additional visual evidence is needed, (ii) determine where to probe and ground within the medical image, and (iii) seamlessly weave the relevant sub-image content back into an interleaved, multimodal chain …
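
The three capabilities listed in the abstract can be pictured as a simple loop. The following is only a toy sketch under stated assumptions, not the Ophiuchus implementation: the image is a nested list of integers, the only tool is a rectangular crop, and `brightest_pixel_policy` is a hypothetical stand-in for the model's decision about when and where to look. All names here (`crop`, `Step`, `reason_with_crops`, `brightest_pixel_policy`) are illustrative and do not come from the paper.

```python
from dataclasses import dataclass
from typing import Callable, List, Optional, Tuple

Box = Tuple[int, int, int, int]  # (top, left, bottom, right), end-exclusive
Image = List[List[int]]          # toy grayscale image as nested lists


def crop(image: Image, box: Box) -> Image:
    """Hypothetical 'zoom' tool: return the sub-image inside the box."""
    top, left, bottom, right = box
    return [row[left:right] for row in image[top:bottom]]


@dataclass
class Step:
    """One entry of the interleaved chain: text plus an optional sub-image."""
    thought: str
    sub_image: Optional[Image] = None


def reason_with_crops(
    image: Image,
    propose_box: Callable[[Image, List[Step]], Optional[Box]],
    max_steps: int = 3,
) -> List[Step]:
    chain = [Step("initial look at the full image")]
    for _ in range(max_steps):
        # (i) decide whether more evidence is needed, (ii) decide where to probe
        box = propose_box(image, chain)
        if box is None:
            break
        # (iii) weave the cropped region back into the chain
        chain.append(Step(f"zoom into {box}", crop(image, box)))
    chain.append(Step("final answer"))
    return chain


def brightest_pixel_policy(image: Image, chain: List[Step]) -> Optional[Box]:
    """Assumed stand-in for the model: zoom once around the brightest pixel."""
    if any(step.sub_image is not None for step in chain):
        return None  # already zoomed once; no further evidence requested
    height, width = len(image), len(image[0])
    r, c = max(
        ((r, c) for r in range(height) for c in range(width)),
        key=lambda rc: image[rc[0]][rc[1]],
    )
    return (max(r - 1, 0), max(c - 1, 0), min(r + 2, height), min(c + 2, width))


toy_image = [
    [0, 0, 0, 0],
    [0, 1, 1, 0],
    [0, 1, 9, 1],
    [0, 0, 1, 0],
]
chain = reason_with_crops(toy_image, brightest_pixel_policy)
for step in chain:
    print(step.thought, step.sub_image)
```

In a real system the policy would be the MLLM itself emitting tool calls, and the cropped region would be re-encoded as image tokens rather than stored as a list; the sketch only shows the control flow of deciding, probing, and interleaving.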
