AoT-PsyPhyBENCH:
Which Way Does Time Flow? A Psychophysics-Grounded Evaluation for Vision–Language Models

1 Kyoto University, Japan
2 Center for Information and Neural Networks, Japan
3 National Institute of Information and Communications Technology, Japan
4 The University of Osaka, Japan

[Teaser figure: the same clip shown in (A) forward playback and (B) backward playback]

If a video clip is played in reverse, humans can almost instantly detect violations of physical laws.

However, we found that:
  (1) most vision-language models cannot reliably tell whether a video is played forward or backward;
  (2) most models tend to answer “forward,” even when given a backward clip.

Abstract

We introduce AoT-PsyPhyBENCH, a psychophysically grounded benchmark that tests whether vision–language models (VLMs) can judge the arrow of time in natural videos (forward vs. backward), using stimuli and human baselines established in prior psychophysics experiments. Across open-weight and proprietary VLMs, most models perform near chance and fall well short of human accuracy, especially on physically irreversible processes and causal manual actions. These results indicate a gap in temporal and causal inductive biases despite strong visual–semantic capabilities.

Category Breakdown of AoT-PsyPhyBENCH

| Category | Description | Human F1 (Fwd / Bwd) | # samples | Included in AoT-PsyPhyBENCH? |
|---|---|---|---|---|
| (1) Proceed | forward locomotion of people, animals, or vehicles | 86.5 / 82.5 | 82 | Yes |
| (2) Fall | free-fall / ballistic motion under gravity | 86.9 / 82.8 | 84 | Yes |
| (3) Diffusion | centrifugal diffusion or small-particle explosions | 84.6 / 78.7 | 56 | Yes |
| (4) Division | division of material by hand or tool | 86.0 / 80.6 | 37 | Yes |
| (5) Put | addition / construction of material by hand | 84.1 / 77.4 | 67 | Yes |
| (6) Reciprocal | reciprocating (cyclic) motion | 71.6 / 38.5 | 148 | No |

Leaderboard

🏆 Overall ranking

| Rank | Family | Model | Reasoning / Setting | F1 (Forward) | F1 (Backward) | Acc. |
|---|---|---|---|---|---|---|

BibTeX

@misc{matta2025waydoestimeflow,
      title={Which Way Does Time Flow? A Psychophysics-Grounded Evaluation for Vision-Language Models},
      author={Shiho Matta and Lis Kanashiro Pereira and Peitao Han and Fei Cheng and Shigeru Kitazawa},
      year={2025},
      eprint={2510.26241},
      archivePrefix={arXiv},
      primaryClass={cs.CV},
      url={https://arxiv.org/abs/2510.26241},
}