Can Multimodal LLMs Perform Time Series Anomaly Detection?

This repo contains the official code and datasets for the paper "Can Multimodal LLMs Perform Time Series Anomaly Detection?"

🕵️‍♂️ VisualTimeAnomaly

Left: the workflow of VisualTimeAnomaly. Right: the performance comparison across various settings.

🏆 Contributions

The first comprehensive benchmark for multimodal LLMs (MLLMs) in time series anomaly detection (TSAD), covering diverse scenarios (univariate, multivariate, irregular) and varying anomaly granularities (point-, range-, variate-wise).
Several critical insights that advance the understanding of both MLLMs and TSAD.
A large-scale dataset of 12.4k time series images, released together with the code to foster future research.

🔎 Findings

MLLMs detect range- and variate-wise anomalies more effectively than point-wise anomalies;
MLLMs are highly robust to irregular time series, even with 25% of the data missing;
Open-source MLLMs perform comparably to proprietary models in TSAD. While open-source MLLMs excel on univariate time series, proprietary MLLMs demonstrate superior effectiveness on multivariate time series.
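To make the irregular-series finding concrete, here is a minimal sketch of what "25% of the data missing" can look like. It is an illustration only, not the repository's generation procedure: the function name `drop_points`, the sine-wave input, and the uniform random dropping are all assumptions.

```python
import math
import random


def drop_points(values, missing_ratio, seed=0):
    """Remove a random fraction of points; return (timestamps, kept_values).

    Illustrative helper (hypothetical, not part of this repo).
    """
    n = len(values)
    n_drop = int(round(n * missing_ratio))
    rng = random.Random(seed)
    dropped = set(rng.sample(range(n), n_drop))
    # Keep the original time index so the gaps stay visible when plotted.
    kept = [(t, v) for t, v in enumerate(values) if t not in dropped]
    timestamps = [t for t, _ in kept]
    kept_values = [v for _, v in kept]
    return timestamps, kept_values


# A regular series of 200 points, then an irregular version with 25% removed.
series = [math.sin(2 * math.pi * t / 50) for t in range(200)]
timestamps, observed = drop_points(series, missing_ratio=0.25)
print(len(series), len(observed))
```

Plotting `observed` against `timestamps` gives an unevenly sampled curve of the kind an MLLM would see as an image in the irregular scenario.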

⚙️ Getting Started

Environment

python 3.10.14
torch 2.4.1
numpy 1.26.4
transformers 4.49.0.dev0
huggingface-hub 0.24.7
openai 1.44.0
google-generativeai 0.8.3
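
A quick way to compare an existing environment against the versions listed above is sketched below. It relies only on the standard library's `importlib.metadata`; the helper name `find_mismatches` is hypothetical, and the exact-match policy (ignoring a local build suffix such as `+cu121`) is an assumption rather than a requirement stated by the authors.

```python
from importlib import metadata

# Versions copied from the Environment list above (Python itself is not checked here).
PINNED = {
    "torch": "2.4.1",
    "numpy": "1.26.4",
    "transformers": "4.49.0.dev0",
    "huggingface-hub": "0.24.7",
    "openai": "1.44.0",
    "google-generativeai": "0.8.3",
}


def find_mismatches(pinned, installed_version=None):
    """Return {package: (expected, found)} for every package that differs.

    `found` is None when the package is not installed.
    Hypothetical helper, not part of this repo.
    """
    if installed_version is None:
        installed_version = metadata.version
    mismatches = {}
    for name, expected in pinned.items():
        try:
            found = installed_version(name)
        except metadata.PackageNotFoundError:
            found = None
        # Ignore a local build tag such as "+cu121" when comparing.
        if found is None or found.split("+")[0] != expected:
            mismatches[name] = (expected, found)
    return mismatches


if __name__ == "__main__":
    for name, (expected, found) in find_mismatches(PINNED).items():
        print(f"{name}: expected {expected}, found {found or 'not installed'}")
```

No output means every listed package matches its pinned version.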

Dataset

Enter the src folder.

If you want to generate all datasets, run the script below:

./generator.sh

If you want to generate a specific dataset, run the command below:

python generator.py --category $category --scenario $scenario --anomaly_type $anomaly_type --num_ts $num_ts

For example, to generate 100 univariate time series images for global anomalies:

python generator.py --category synthetic --scenario univariate --anomaly_type global --num_ts 100
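
The same invocation can be assembled from Python, which helps when the arguments come from a config or a loop. This is a sketch under stated assumptions: the flag names are taken from the command above, while the helper names `build_generator_cmd` and `generate` are hypothetical and not part of the repository.

```python
import shlex
import subprocess
import sys


def build_generator_cmd(category, scenario, anomaly_type, num_ts):
    """Assemble the generator.py invocation shown above as an argument list."""
    return [
        sys.executable, "generator.py",
        "--category", category,
        "--scenario", scenario,
        "--anomaly_type", anomaly_type,
        "--num_ts", str(num_ts),
    ]


def generate(category, scenario, anomaly_type, num_ts, dry_run=True):
    """Print the command; execute it only when dry_run is False."""
    cmd = build_generator_cmd(category, scenario, anomaly_type, num_ts)
    print(shlex.join(cmd))
    if not dry_run:
        # Must be started from inside the src folder so generator.py is found.
        subprocess.run(cmd, check=True)
    return cmd


if __name__ == "__main__":
    # Same values as the example above; dry run by default.
    generate("synthetic", "univariate", "global", 100)
```

Passing `dry_run=False` runs the generator and raises `subprocess.CalledProcessError` if it exits with a non-zero status.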

Run

Enter the src folder.

If you want to run MLLMs on all datasets, run the script below:

./test.sh

If you want to run an MLLM on a specific dataset, run the command below:

python main.py --category $category --scenario $scenario --model_name $model_name --data $data

For example, to run GPT-4o on the univariate time series scenario with global anomalies:

python main.py --category synthetic --scenario univariate --model_name gpt-4o --data global
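
When only a subset of models and datasets is needed, the commands can be enumerated from Python. The sketch below is an assumption-laden illustration: the flags come from the command above, `sweep_commands` is a hypothetical helper, and only `gpt-4o` and `global` are values confirmed by this README.

```python
import itertools
import shlex
import sys


def sweep_commands(category, scenario, model_names, data_names):
    """Build one main.py argument list per (model, data) pair."""
    commands = []
    for model_name, data in itertools.product(model_names, data_names):
        commands.append([
            sys.executable, "main.py",
            "--category", category,
            "--scenario", scenario,
            "--model_name", model_name,
            "--data", data,
        ])
    return commands


if __name__ == "__main__":
    # Only gpt-4o and global appear in this README; extend the lists with
    # names the repository actually supports.
    for cmd in sweep_commands("synthetic", "univariate", ["gpt-4o"], ["global"]):
        print(shlex.join(cmd))
```

Each printed line can be pasted into a shell from inside the src folder, or the argument list can be passed to `subprocess.run(cmd, check=True)` to execute it directly.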

Acknowledgement

We sincerely appreciate the following GitHub repos for the code base and datasets:

https://github.com/Rose-STL-Lab/AnomLLM

https://github.com/datamllab/tods/tree/benchmark

📝 Citation

If you find our work useful, please cite the paper below:

@article{xu2025can,
  title={Can Multimodal LLMs Perform Time Series Anomaly Detection?},
  author={Xu, Xiongxiao and Wang, Haoran and Liang, Yueqing and Yu, Philip S and Zhao, Yue and Shu, Kai},
  journal={arXiv preprint arXiv:2502.17812},
  year={2025}
}
