Towards Understanding Third-Party Library Dependency in C/C++ Ecosystem
Wei Tang, Zhengzi Xu, Chengwei Liu, Jiahui Wu, Shouguo Yang, Yi Li, Ping Luo, and Yang Liu
In Proceedings of the 37th IEEE/ACM International Conference on Automated Software Engineering (ASE), 2022
Abstract: Third-Party libraries (TPLs) are frequently reused in software to reduce development cost and the time to market. However, external library dependencies may introduce vulnerabilities into host applications. The issue of library dependency has received considerable critical attention. Many package managers, such as Maven, Pip, and NPM, are proposed to manage TPLs. Moreover, a significant amount of effort has been put into studying dependencies in language ecosystems like Java, Python, and JavaScript except C/C++. Due to the lack of a unified package manager for C/C++, existing research has only few understanding of TPL dependencies in the C/C++ ecosystem, especially at large scale. Towards understanding TPL dependencies in the C/C++ ecosystem, we collect existing TPL databases, package management tools, and dependency detection tools, summarize the dependency patterns of C/C++ projects, and construct a comprehensive and precise C/C++ dependency detector. Using our detector, we extract dependencies from a large-scale database containing 24K C/C++ repositories from GitHub. Based on the extracted dependencies, we provide the results and findings of an empirical study, which aims at understanding the characteristics of the TPL dependencies. We further discuss the implications to manage dependency for C/C++ and the future research directions for software engineering researchers and developers in fields of library development, software composition analysis, and C/C++ package manager. Our dataset and source code used in this work are anonymously available at https://anonymous.4open.science/r/ccscanner-7491/.