PUBLICATIONS

† Corresponding Author; * Contributed equally.

2026

  1. LeCov: Multi-level testing criteria for large language models
    Xuan Xie, Jiayang Song, Yuheng Huang, Da Song, Felix Juefei-Xu, and Lei Ma
    Journal of Systems and Software, 2026

2025

  1. Evaluating LLMs on Sequential API Call Through Automated Test Generation
    Yuheng Huang, Da Song, Zhenlan Ji, Shuai Wang, and Lei Ma
    arXiv preprint arXiv:2507.09481, 2025
  2. TRUSTVIS: A Multi-Dimensional Trustworthiness Evaluation Framework for Large Language Models
    Ruoyu Sun, Da Song, Jiayang Song, Yuheng Huang, and Lei Ma
    ASE 2025 Tool Demonstration Track, 2025
  3. Towards Understanding the Characteristics of Code Generation Errors Made by Large Language Models
    Zhijie Wang*, Zijie Zhou*, Da Song*, Yuheng Huang, Shengmai Chen, Lei Ma, and Tianyi Zhang
    In 2025 IEEE/ACM 47th International Conference on Software Engineering (ICSE), 2025
  4. Online safety analysis for LLMs: a benchmark, an assessment, and a path forward
    Xuan Xie, Jiayang Song, Zhehua Zhou, Yuheng Huang, Da Song, and Lei Ma
    IEEE Transactions on Artificial Intelligence, 2025
  5. TestEval: Benchmarking large language models for test case generation
    Wenhan Wang, Chenyuan Yang, Zhijie Wang, Yuheng Huang, Zhaoyang Chu, Da Song, Lingming Zhang, An Ran Chen, and Lei Ma
    In Findings of the Association for Computational Linguistics: NAACL 2025, 2025

2024

  1. Luna: A model-based universal analysis framework for large language models
    Da Song, Xuan Xie, Jiayang Song, Derui Zhu, Yuheng Huang, Felix Juefei-Xu, and Lei Ma
    IEEE Transactions on Software Engineering, 2024
  2. PromptCharm: Text-to-image generation through multi-modal prompting and refinement
    Zhijie Wang, Yuheng Huang, Da Song, Lei Ma, and Tianyi Zhang
    In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 2024

2023

  1. DeepLens: interactive out-of-distribution data detection in NLP models
    Da Song, Zhijie Wang, Yuheng Huang, Lei Ma, and Tianyi Zhang
    In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 2023
  2. DeepSeer: Interactive RNN explanation and debugging via state abstraction
    Zhijie Wang, Yuheng Huang, Da Song, Lei Ma, and Tianyi Zhang
    In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 2023