PUBLICATIONS | Da Song

2026

Evaluating LLMs on Sequential API Call Through Automated Test Generation

Yuheng Huang, Da Song, Zhenlan Ji, Shuai Wang, and Lei Ma

arXiv preprint arXiv:2507.09481, 2025

TRUSTVIS: A Multi-Dimensional Trustworthiness Evaluation Framework for Large Language Models

Ruoyu Sun, Da Song^†, Jiayang Song, Yuheng Huang, and Lei Ma

ASE 2025 Tool Demonstration Track, 2025

Towards Understanding the Characteristics of Code Generation Errors Made by Large Language Models

Zhijie Wang^*, Zijie Zhou^*, Da Song^*, Yuheng Huang, Shengmai Chen, Lei Ma, and Tianyi Zhang

In 2025 IEEE/ACM 47th International Conference on Software Engineering (ICSE), 2025

Online safety analysis for LLMs: a benchmark, an assessment, and a path forward

Xuan Xie, Jiayang Song, Zhehua Zhou, Yuheng Huang, Da Song, and Lei Ma

IEEE Transactions on Artificial Intelligence, 2025

TestEval: Benchmarking large language models for test case generation

Wenhan Wang, Chenyuan Yang, Zhijie Wang, Yuheng Huang, Zhaoyang Chu, Da Song, Lingming Zhang, An Ran Chen, and Lei Ma

In Findings of the Association for Computational Linguistics: NAACL 2025, 2025

LUNA: A model-based universal analysis framework for large language models

Da Song, Xuan Xie, Jiayang Song, Derui Zhu, Yuheng Huang, Felix Juefei-Xu, and Lei Ma

IEEE Transactions on Software Engineering, 2024

PromptCharm: Text-to-image generation through multi-modal prompting and refinement

Zhijie Wang, Yuheng Huang, Da Song, Lei Ma, and Tianyi Zhang

In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 2024

DeepLens: interactive out-of-distribution data detection in NLP models

Da Song, Zhijie Wang, Yuheng Huang, Lei Ma, and Tianyi Zhang

In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 2023

DeepSeer: Interactive RNN explanation and debugging via state abstraction

Zhijie Wang, Yuheng Huang, Da Song, Lei Ma, and Tianyi Zhang

In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 2023