Yi Li

Associate Professor

College of Computing and Data Science (CCDS)
Nanyang Technological University (NTU)

Address: Block N4-02b-63
50 Nanyang Avenue, Singapore 639798
Phone: +65 6790 4287

A paper [1] in collaboration with Xiufeng Xu (my PhD student), Fuman Xie, Chenguang Zhu, Guangdong Bai, and Sarfraz Khurshid was accepted at ISSTA’25. A summary of the paper is below:

Modern AI- and data-intensive software systems rely heavily on data science and machine learning libraries that provide essential algorithmic implementations and computational frameworks. These libraries expose complex APIs whose correct usage has to follow constraints among multiple interdependent parameters. Developers using these APIs are expected to learn about the constraints through the provided documentation, and any discrepancy between the documentation and the code may lead to unexpected behaviors. However, maintaining correct and consistent multi-parameter constraints in API documentation remains a significant challenge for API compatibility and reliability. To address this challenge, we propose MPChecker for detecting inconsistencies between code and documentation, specifically focusing on multi-parameter constraints. MPChecker identifies these constraints at the code level by exploring execution paths through symbolic execution and further extracts corresponding constraints from documentation using large language models (LLMs). We propose a customized fuzzy constraint logic to reconcile the unpredictability of LLM outputs and detect logical inconsistencies between the code and documentation constraints. We collected and constructed two datasets from four popular data science libraries and evaluated MPChecker on them. The results demonstrate that MPChecker can effectively detect inconsistency issues with a precision of 92.8%. We further reported 14 detected inconsistency issues to the library developers, who have confirmed 11 issues at the time of writing.
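To make the idea of a multi-parameter constraint inconsistency concrete, here is a minimal Python sketch. It is not MPChecker's implementation: the `fit` function, its parameters, and the documented rule are all hypothetical, and the sketch replaces symbolic execution and LLM-based extraction with a brute-force enumeration over a tiny parameter domain. It only illustrates what it means for the code and the documentation to disagree on which parameter combinations are valid.

```python
from itertools import product


def fit(solver="lbfgs", penalty="l2", dual=False):
    """Hypothetical library API with two multi-parameter constraints."""
    if solver == "lbfgs" and penalty == "l1":
        raise ValueError("solver='lbfgs' supports only penalty='l2'")
    if dual and solver != "liblinear":
        raise ValueError("dual=True requires solver='liblinear'")
    return "ok"


def documented_constraint(solver, penalty, dual):
    """Constraint as stated in the (hypothetical) documentation.

    The docs mention only the solver/penalty rule and omit the rule
    that ties `dual` to `solver`.
    """
    return not (solver == "lbfgs" and penalty == "l1")


def code_accepts(**kwargs):
    """Return True if the implementation accepts this combination."""
    try:
        fit(**kwargs)
        return True
    except ValueError:
        return False


# Small, hand-picked domain for each parameter (an assumption of this sketch).
DOMAIN = {
    "solver": ["lbfgs", "liblinear"],
    "penalty": ["l1", "l2"],
    "dual": [False, True],
}


def find_inconsistencies():
    """List every combination on which code and documentation disagree."""
    names = list(DOMAIN)
    mismatches = []
    for values in product(*DOMAIN.values()):
        args = dict(zip(names, values))
        if code_accepts(**args) != documented_constraint(**args):
            mismatches.append(args)
    return mismatches


if __name__ == "__main__":
    for args in find_inconsistencies():
        print("code and documentation disagree on:", args)
```

In this toy setting the only disagreement is the combination of `solver="lbfgs"`, `penalty="l2"`, and `dual=True`, which the documentation allows but the code rejects. Enumeration works here only because the domain is tiny; real library APIs have far larger parameter spaces, which is why the paper reasons over constraints rather than concrete inputs.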

This year, ISSTA accepted 107 out of 550 submissions, for an acceptance rate of about 19.5%.

References

  1. Xu, X., Xie, F., Zhu, C., Bai, G., Khurshid, S., & Li, Y. (2025, June). Identifying Multi-Parameter Constraint Errors in Python Data Science Library API Documentations. Proceedings of the 34th ACM SIGSOFT International Symposium on Software Testing and Analysis (ISSTA).