Dmitry Stanishevskii, Nini Kamkia, Alexey Khoroshilov, Dmitry Zmitrovich, Denis Kokosinskii, Zhirayr Hayrapetyan, Andrei Kalmykov
Lime
FINESSE-Bench is a hierarchical benchmark suite for evaluating financial competencies in LLMs. It combines eight specialized datasets with 3,993 questions spanning foundational finance, exam-style reasoning, technical analysis, derivatives trading, and a Russian-language olympiad block. The suite is designed to measure domain breadth, performance degradation across difficulty levels, and transfer from classical public financial benchmarks to more professional finance tasks.
Three leaderboard views for three different kinds of financial competence.
Leaderboard columns: rank (#), model, score, confidence interval (CI), Δ public→exam, and Δ public→TA (technical analysis).
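The leaderboard columns (score, CI, and the two Δ columns) can be illustrated with a minimal sketch. The page does not define how these values are computed, so everything here is an assumption: the score is taken to be plain accuracy, the CI a percentile bootstrap interval, and Δ public→exam the exam accuracy minus the public-benchmark accuracy. The function names and the sample data are hypothetical and are not taken from the FINESSE-Bench code.

```python
import random


def accuracy(correct):
    """Fraction of questions answered correctly (list of 0/1 flags)."""
    return sum(correct) / len(correct)


def bootstrap_ci(correct, n_resamples=2000, alpha=0.05, seed=0):
    """Percentile bootstrap confidence interval for accuracy.

    Assumption: the benchmark may use a different CI method.
    """
    rng = random.Random(seed)
    n = len(correct)
    # Resample the per-question results with replacement and sort the scores.
    stats = sorted(
        accuracy(rng.choices(correct, k=n)) for _ in range(n_resamples)
    )
    low = stats[int((alpha / 2) * n_resamples)]
    high = stats[int((1 - alpha / 2) * n_resamples) - 1]
    return low, high


def transfer_delta(public_correct, target_correct):
    """Hypothetical delta: target accuracy minus public accuracy.

    Under this assumed sign convention, a negative value means the model
    does worse on the target group than on public benchmarks.
    """
    return accuracy(target_correct) - accuracy(public_correct)


# Made-up per-question results, for illustration only.
public = [1] * 80 + [0] * 20  # 80% correct on public benchmarks
exam = [1] * 60 + [0] * 40    # 60% correct on exam-style questions

ci_low, ci_high = bootstrap_ci(public)
delta_public_exam = transfer_delta(public, exam)
```

With this sign convention, a model that scores lower on exam-style questions than on public benchmarks gets a negative Δ public→exam; the same function would apply to the public→TA comparison.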
Eight specialized datasets covering financial exams, technical analysis, derivatives, and multilingual reasoning.
@article{stanishevskii2026finessebench,
title={FINESSE-Bench: A Hierarchical Benchmark Suite for Financial Domain Knowledge and Technical Analysis in Large Language Models},
author={Dmitry Stanishevskii and Nini Kamkia and Alexey Khoroshilov and Dmitry Zmitrovich and Denis Kokosinskii and Zhirayr Hayrapetyan and Andrei Kalmykov},
year={2026},
url={https://github.com/LimexAILab/FINESSE-Bench}
}