We introduce AoT-PsyPhyBENCH, a psychophysically grounded benchmark that tests whether vision–language models (VLMs) can judge the arrow of time in natural videos (forward vs. backward), using the same video stimuli and human performance baselines as a prior human psychophysics study. Across open-weight and proprietary VLMs, most models perform near chance and fall well short of human accuracy, especially on physically irreversible processes (e.g., free fall, diffusion) and causal manual actions (e.g., dividing or adding material). These results indicate a gap in temporal and causal inductive biases despite the models' strong visual–semantic capabilities.
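Concretely, each clip poses a binary forward/backward judgment. Below is a minimal sketch of such a two-alternative forced-choice evaluation loop; the prompt wording and the `query_vlm` client are illustrative assumptions, not the paper's exact protocol.

```python
# Minimal sketch of a two-alternative forced-choice (2AFC) arrow-of-time
# evaluation loop. `query_vlm` is a hypothetical stand-in for whatever
# client sends a video plus a text prompt to the model under test; the
# prompt wording here is illustrative, not the paper's exact wording.

PROMPT = (
    "This video is played either in its original direction or in reverse. "
    "Which way does time flow? Answer with exactly one word: "
    "'forward' or 'backward'."
)

def parse_answer(reply: str) -> str | None:
    """Map a free-form model reply onto the two allowed labels."""
    text = reply.strip().lower()
    if "forward" in text and "backward" not in text:
        return "forward"
    if "backward" in text and "forward" not in text:
        return "backward"
    return None  # unparseable or refused answers are kept as None

def evaluate(samples, query_vlm):
    """samples: iterable of (video_path, gold_label) pairs.
    Returns a list of (gold, predicted) label pairs."""
    records = []
    for video_path, gold in samples:
        reply = query_vlm(video_path, PROMPT)  # hypothetical client call
        records.append((gold, parse_answer(reply)))
    return records
```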
| Category | Description + Example | Reversal easy for humans? | Human F1 (Fwd/Bwd) | # samples | Included in AoT-PsyPhyBENCH? |
|---|---|---|---|---|---|
| (1) Proceed | Forward locomotion of people, animals, or vehicles | ✅ | 86.5 / 82.5 | 82 | Yes |
| (2) Fall | Free-fall / ballistic motion under gravity | ✅ | 86.9 / 82.8 | 84 | Yes |
| (3) Diffusion | Centrifugal diffusion or small-particle explosions | ✅ | 84.6 / 78.7 | 56 | Yes |
| (4) Division | Division of material by hand or tool | ✅ | 86.0 / 80.6 | 37 | Yes |
| (5) Put | Addition / construction of material by hand | ✅ | 84.1 / 77.4 | 67 | Yes |
| (6) Reciprocal | Reciprocating (cyclic) motion | ❌ | 71.6 / 38.5 | 148 | No |
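The Human F1 (Fwd/Bwd) columns report F1 with each playback direction treated in turn as the positive class. A self-contained sketch of that scoring, plus overall accuracy, follows; this is the standard binary F1 definition, which is one natural reading of the Fwd/Bwd columns, though the paper's exact aggregation is not restated here.

```python
def f1_for_label(records, positive):
    """F1 treating `positive` ('forward' or 'backward') as the positive
    class. records: list of (gold, predicted) pairs; unparseable
    predictions (None) count as false negatives for the gold class."""
    tp = sum(1 for g, p in records if g == positive and p == positive)
    fp = sum(1 for g, p in records if g != positive and p == positive)
    fn = sum(1 for g, p in records if g == positive and p != positive)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

def accuracy(records):
    """Fraction of clips whose direction was judged correctly."""
    return sum(1 for g, p in records if g == p) / len(records)

# Example with dummy predictions:
recs = [("forward", "forward"), ("backward", "forward"),
        ("backward", "backward"), ("forward", None)]
print(f1_for_label(recs, "forward"))   # F1 Forward  -> 0.5
print(f1_for_label(recs, "backward"))  # F1 Backward -> 0.667
print(accuracy(recs))                  # Acc.        -> 0.5
```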
| Rank | Family | Model | Reasoning / Setting | F1 Forward | F1 Backward | Acc. |
|---|---|---|---|---|---|---|
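A typical way to produce an entry for this leaderboard is to load the benchmark, run the evaluation loop sketched earlier, and score with the F1/accuracy helpers above. A hedged loading sketch, assuming Hugging Face Hub hosting; the dataset identifier and the `video` / `label` column names are illustrative assumptions, not confirmed by this card.

```python
# Hedged sketch: assumes the benchmark is distributed via the Hugging Face
# Hub. The dataset ID and the "video" / "label" field names are illustrative
# assumptions; consult the actual repository for the real identifiers.
from datasets import load_dataset

ds = load_dataset("user/AoT-PsyPhyBENCH", split="test")  # hypothetical ID
for example in ds.select(range(3)):
    print(example["video"], example["label"])
```

From there, the (video, label) pairs can be fed through `evaluate()` and scored with `f1_for_label()` and `accuracy()` as defined above.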
@misc{matta2025waydoestimeflow,
title={Which Way Does Time Flow? A Psychophysics-Grounded Evaluation for Vision-Language Models},
author={Shiho Matta and Lis Kanashiro Pereira and Peitao Han and Fei Cheng and Shigeru Kitazawa},
year={2025},
eprint={2510.26241},
archivePrefix={arXiv},
primaryClass={cs.CV},
url={https://arxiv.org/abs/2510.26241},
}