PUBLICATIONS
† Corresponding Author; * Contributed equally.
2026
- LeCov: Multi-level testing criteria for large language models. Journal of Systems and Software, 2026
2025
- Evaluating LLMs on Sequential API Call Through Automated Test Generation. arXiv preprint arXiv:2507.09481, 2025
- TRUSTVIS: A Multi-Dimensional Trustworthiness Evaluation Framework for Large Language Models. ASE 2025 Tool Demonstration Track, 2025
- Towards Understanding the Characteristics of Code Generation Errors Made by Large Language Models. In 2025 IEEE/ACM 47th International Conference on Software Engineering (ICSE), 2025
- Online safety analysis for LLMs: a benchmark, an assessment, and a path forward. IEEE Transactions on Artificial Intelligence, 2025
- Testeval: Benchmarking large language models for test case generation. In Findings of the Association for Computational Linguistics: NAACL 2025, 2025
2024
- Luna: A model-based universal analysis framework for large language models. IEEE Transactions on Software Engineering, 2024
- Promptcharm: Text-to-image generation through multi-modal prompting and refinement. In Proceedings of the 2024 CHI Conference on Human Factors in Computing Systems, 2024
2023
- DeepLens: interactive out-of-distribution data detection in NLP models. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 2023
- DeepSeer: Interactive RNN explanation and debugging via state abstraction. In Proceedings of the 2023 CHI Conference on Human Factors in Computing Systems, 2023