Leaderboard | M3CoT

Main Statistics

Comparison with Other Datasets

Comparison of the distribution of steps in the rationale for existing benchmarks. Notably, the distributions for MMMU and VCR overlap.

Image diversity

Image diversity analysis (a) and representation comparison (b) between M³CoT and ScienceQA. In panel (b), the area covered by the points represents each dataset's image coverage of the semantic space.

Representation Distributions

Representations of the rationales in two datasets, obtained by RoBERTa encoding followed by t-SNE dimensionality reduction.

Rationale Quality Importance

Analysis of the correlation between multidimensional quality scores of model-generated rationales and final accuracy. The rationale quality scores are computed with ROSCOE.
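
To illustrate the kind of analysis described in this caption, here is a minimal sketch that correlates per-model quality scores with per-model accuracy. It makes several assumptions not stated on this page: the correlation measure is taken to be Pearson's r (the page does not say which measure was used), the dimension names `dimension_a` and `dimension_b` are hypothetical stand-ins for quality dimensions, and all numbers are placeholders, not M³CoT results.

```python
from math import sqrt


def pearson(xs, ys):
    """Return Pearson's r for two equal-length numeric sequences."""
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(xs) / len(xs)
    mean_y = sum(ys) / len(ys)
    # Sum of products of deviations (unnormalized covariance).
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    # Sums of squared deviations (unnormalized variances).
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / sqrt(var_x * var_y)


# Placeholder values only: one entry per (hypothetical) model.
# These are NOT numbers from M3CoT or from ROSCOE.
quality_scores = {
    "dimension_a": [0.2, 0.4, 0.6, 0.8],
    "dimension_b": [0.9, 0.7, 0.8, 0.6],
}
accuracy = [0.30, 0.45, 0.50, 0.70]

# One correlation coefficient per quality dimension.
correlations = {
    name: pearson(scores, accuracy)
    for name, scores in quality_scores.items()
}

for name, r in correlations.items():
    print(f"{name}: r = {r:.3f}")
```

A coefficient near +1 or -1 would suggest that a quality dimension tracks accuracy closely, while a value near 0 would suggest little linear relationship. A rank-based measure such as Spearman's could be substituted if the relationship is expected to be monotonic but not linear.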