FINESSE-Bench: A Hierarchical Benchmark Suite for Financial Domain Knowledge and Technical Analysis in Large Language Models

Dmitry Stanishevskii, Nini Kamkia, Alexey Khoroshilov, Dmitry Zmitrovich, Denis Kokosinskii, Zhirayr Hayrapetyan, Andrei Kalmykov

Lime

3,993 questions, 8 specialized datasets, 3 benchmark groups, languages: en / ru

FINESSE-Bench is a hierarchical benchmark suite for evaluating financial competencies in LLMs. It combines eight specialized datasets with 3,993 questions spanning foundational finance, exam-style reasoning, technical analysis, derivatives trading, and a Russian-language olympiad block. The suite is designed to measure domain breadth, performance degradation across difficulty levels, and transfer from classical public financial benchmarks to more professional finance tasks.

Benchmark Groups

Three leaderboard views for three different kinds of financial competence.

Full Leaderboard

Switch between aggregated groups and individual datasets. Search and sort by score.

#	Model	Score	CI	Δ public→exam	Δ public→TA
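The page does not define the two Δ columns. One plausible reading, given the stated goal of measuring transfer from public benchmarks to more professional tasks, is that each is the difference between a model's score on the target group and its score on the public group. The sketch below illustrates that reading only; the function name, the group keys `public`, `exam`, and `ta`, and the sign convention are assumptions, not the benchmark's actual implementation.

```python
from typing import Dict


def transfer_deltas(group_scores: Dict[str, float]) -> Dict[str, float]:
    """Compute score changes relative to the public group.

    Assumption: delta = target group score - public group score,
    so a negative value means the model degrades on the harder group.
    The keys "public", "exam", and "ta" are hypothetical.
    """
    public = group_scores["public"]
    return {
        "public->exam": group_scores["exam"] - public,
        "public->TA": group_scores["ta"] - public,
    }


# Placeholder scores for illustration only, not real leaderboard results.
example = transfer_deltas({"public": 80.0, "exam": 65.0, "ta": 50.0})
print(example)
```

Under this convention, a model that loses points when moving from public benchmarks to exam-style or technical-analysis tasks shows negative deltas in both columns.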

Compare Models

Select up to 10 models and compare them on one benchmark view. CI overlay appears automatically when ciLow/ciHigh are provided.
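As a rough illustration of how optional confidence bounds might be handled when comparing two models, the sketch below treats `ciLow` and `ciHigh` as optional fields on a leaderboard row and checks whether two intervals overlap. Only the field names `ciLow` and `ciHigh` come from the page; the `LeaderboardRow` class, the helper functions, and the sample values are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LeaderboardRow:
    # Hypothetical row shape; only ciLow/ciHigh are named on the page.
    model: str
    score: float
    ciLow: Optional[float] = None
    ciHigh: Optional[float] = None


def has_ci(row: LeaderboardRow) -> bool:
    """A CI can only be drawn when both bounds are present."""
    return row.ciLow is not None and row.ciHigh is not None


def intervals_overlap(a: LeaderboardRow, b: LeaderboardRow) -> Optional[bool]:
    """Return True/False when both rows carry a CI, otherwise None."""
    if not (has_ci(a) and has_ci(b)):
        return None
    return a.ciLow <= b.ciHigh and b.ciLow <= a.ciHigh


# Placeholder values for illustration only, not real results.
row_a = LeaderboardRow("model-a", 71.0, ciLow=69.0, ciHigh=73.0)
row_b = LeaderboardRow("model-b", 74.0, ciLow=72.5, ciHigh=75.5)
row_c = LeaderboardRow("model-c", 60.0)  # no CI provided

print(intervals_overlap(row_a, row_b))
print(intervals_overlap(row_a, row_c))
```

Overlapping intervals suggest that a score difference between two models may not be meaningful, which is why returning `None` for rows without bounds is kept distinct from returning `False`.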

Dataset Explorer

Eight specialized datasets covering financial exams, technical analysis, derivatives, and multilingual reasoning.

Citation

@article{stanishevskii2026finessebench,
  title={FINESSE-Bench: A Hierarchical Benchmark Suite for Financial Domain Knowledge and Technical Analysis in Large Language Models},
  author={Dmitry Stanishevskii and Nini Kamkia and Alexey Khoroshilov and Dmitry Zmitrovich and Denis Kokosinskii and Zhirayr Hayrapetyan and Andrei Kalmykov},
  year={2026},
  url={https://github.com/LimexAILab/FINESSE-Bench}
}