Decoding Finance’s Visual Language

A benchmark assesses how well vision-language models reason over complex financial documents, specifically French prospectuses, Key Information Documents, and PRIIPs. Models answer questions spanning textual, tabular, and chart-based information, and a majority-vote protocol with a large language model as the judge decides whether each response is correct.

New research reveals the challenges vision-language models face when interpreting complex financial documents, particularly those containing charts and tables.
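
The majority-vote judging protocol can be illustrated with a minimal sketch. The source does not specify the number of votes, the judge prompt, or how ties are resolved, so everything here is an assumption: the `Judge` signature, the default of three votes, and the rule that a tie counts as incorrect are hypothetical choices, and `exact_match_judge` is a deterministic stand-in for a real LLM call.

```python
from typing import Callable, Sequence

# Hypothetical judge signature (an assumption, not from the source):
# (question, reference_answer, model_answer) -> True if judged correct.
Judge = Callable[[str, str, str], bool]


def majority_vote(verdicts: Sequence[bool]) -> bool:
    """Return True only if strictly more than half of the verdicts are True.

    Ties count as incorrect; this tie rule is an assumption.
    """
    if not verdicts:
        raise ValueError("at least one verdict is required")
    return sum(verdicts) * 2 > len(verdicts)


def judge_answer(
    question: str,
    reference: str,
    answer: str,
    judge: Judge,
    n_votes: int = 3,  # assumed default; the source gives no vote count
) -> bool:
    """Query the judge n_votes times and aggregate the verdicts by majority."""
    verdicts = [judge(question, reference, answer) for _ in range(n_votes)]
    return majority_vote(verdicts)


def exact_match_judge(question: str, reference: str, answer: str) -> bool:
    """Stand-in for an LLM judge: case-insensitive exact match on the answer."""
    return reference.strip().lower() == answer.strip().lower()


if __name__ == "__main__":
    is_correct = judge_answer(
        question="What is the fund's recommended holding period?",
        reference="5 years",
        answer="5 Years",
        judge=exact_match_judge,
    )
    print(is_correct)
```

In practice the stub would be replaced by a function that sends the question, reference answer, and model response to a language model and parses its verdict. Repeating the call and taking the majority is intended to reduce the effect of any single noisy judgment.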