Maintaining high software quality is a persistent challenge in software development, often impeded by a limited understanding of how specific code modifications influence quality metrics. This study addresses that gap with an approach for assessing and interpreting the quality impact of code modifications.
The underlying hypothesis is that code modifications inducing similar changes in software quality metrics can be grouped into distinct clusters, and that these clusters can be effectively described by an AI language model, providing a nuanced understanding of code changes and their quality implications.
To validate this hypothesis, we analyzed a substantial dataset from popular GitHub repositories, segmented into individual code modifications. Each modification was evaluated against software quality metrics measured before and after it was applied. Machine learning techniques were utilized to cluster modifications that induce similar changes in these metrics.
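As an illustrative sketch of this clustering step (the metric names, values, and choice of k-means here are hypothetical, not the study's actual pipeline), each modification can be represented by the vector of metric deltas between its pre- and post-application measurements, and those delta vectors can then be clustered:

```python
import numpy as np
from sklearn.cluster import KMeans

# Hypothetical quality-metric vectors for five code modifications,
# measured before (pre) and after (post) each modification is applied.
# Columns might be, e.g., cyclomatic complexity, maintainability index,
# and duplication percentage.
pre = np.array([
    [10.0, 70.0, 5.0],
    [12.0, 68.0, 6.0],
    [10.5, 71.0, 5.2],
    [30.0, 40.0, 20.0],
    [29.0, 42.0, 19.0],
])
post = np.array([
    [9.0, 74.0, 4.0],
    [11.0, 72.0, 5.0],
    [9.5, 75.0, 4.1],
    [35.0, 35.0, 25.0],
    [34.0, 36.0, 24.0],
])

# Represent each modification by its metric delta (post - pre);
# modifications inducing similar metric changes should share a cluster.
deltas = post - pre
labels = KMeans(n_clusters=2, n_init=10, random_state=0).fit_predict(deltas)
```

Here the first three modifications (small quality improvements) and the last two (sharp quality regressions) end up in separate clusters; in practice the number of clusters and the metric set would be chosen from the data.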