Journées GDR-GPL 2023

The GT VL is holding a session alongside the Journées GDR-GPL on Monday, June 5, 2023, from 15:00 to 17:30.

Program

15:00–15:30
HyperAST: Enabling Efficient Analysis of Software Histories at Scale
Quentin Le Dilavrec (IRISA)
Abstract Syntax Trees (ASTs) are widely used beyond compilers in many tools that measure and improve code quality, such as code analysis, bug detection, mining of code metrics, and refactoring. With the advent of fast software evolution and multistage releases, the temporal analysis of an AST history is becoming useful to understand and maintain code. However, analyzing thousands of versions of ASTs independently of one another faces scalability issues, mostly combinatorial, in terms of both memory and CPU usage. In this paper, we propose a novel type of AST, called HyperAST, that enables efficient temporal code analysis on a given software history by: 1/ leveraging code redundancy through space (between code elements) and time (between versions); 2/ reusing intermediate computation results. We show how the HyperAST can be built incrementally on a set of commits to capture multiple ASTs at once in an optimized way. We evaluated the HyperAST on a curated list of large software projects. Compared to Spoon, a state-of-the-art technique, the HyperAST is faster to construct by a factor ranging from ×6 up to ×8076 in CPU time and smaller by a factor ranging from ×12 up to ×1159 in memory footprint. While the HyperAST requires up to 2 h 22 min and 7.2 GB for the biggest project, Spoon requires up to 93 h 31 min and 2.2 TB. We further compared the HyperAST and Spoon on the task of finding references of declarations. We measured the average precision and recall, and observed no significant difference in search time.
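The idea of exploiting redundancy between code elements and between versions can be illustrated with a small sketch of structural sharing, where identical subtrees are stored only once. This is not the HyperAST implementation: the `SubtreeStore` class, the `build` helper, and the toy `(kind, label, children)` tree encoding are all assumptions made for illustration.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Node:
    """An immutable AST node whose children are ids of already stored nodes."""
    kind: str
    label: str
    children: tuple


class SubtreeStore:
    """Hypothetical store that keeps each distinct subtree exactly once."""

    def __init__(self):
        self._ids = {}     # Node -> id
        self._nodes = []   # id -> Node

    def intern(self, kind, label="", children=()):
        """Return the id of the subtree, inserting it only if it is new."""
        key = Node(kind, label, tuple(children))
        node_id = self._ids.get(key)
        if node_id is None:
            node_id = len(self._nodes)
            self._ids[key] = node_id
            self._nodes.append(key)
        return node_id

    def __len__(self):
        return len(self._nodes)


def build(store, tree):
    """Insert a nested (kind, label, children) tree bottom-up; return its root id."""
    kind, label, children = tree
    return store.intern(kind, label, [build(store, child) for child in children])


# Two toy versions of one file: only the body of `bar` changes.
v1 = ("file", "", [("method", "foo", [("return", "1", [])]),
                   ("method", "bar", [("return", "2", [])])])
v2 = ("file", "", [("method", "foo", [("return", "1", [])]),
                   ("method", "bar", [("return", "3", [])])])

store = SubtreeStore()
root1 = build(store, v1)
root2 = build(store, v2)
print(len(store))  # number of distinct subtrees stored for both versions
```

In this toy case the unchanged `foo` method is stored once and referenced by both version roots, so the second version only adds the nodes on the path from the change up to the root.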
15:30–16:00
A Theory of Community Package Maintenance Organizations
Théo Zimmermann (LTCI, Télécom Paris, Institut Polytechnique de Paris)
In many programming language ecosystems, developers rely more and more on external open-source dependencies made available through package managers. Key packages that are no longer maintained put at risk both the projects that depend on them and the ecosystems themselves. As a result, community initiatives can emerge within ecosystems to address this problem by adopting key packages that suffer from maintenance issues. In my talk, I will present the results of the article co-authored with Jean-Rémy Falleri and recently accepted for publication in Empirical Software Engineering, entitled “A Grounded Theory of Community Package Maintenance Organizations”. Its goal was to build a theory of these organizations (CPMOs), in particular of how they emerge and how they operate. To do so, we used a qualitative methodology called Grounded Theory. We analyzed existing documents from several CPMOs, such as documentation and discussions on public forums, complemented by interviews with CPMO initiators. I will also talk about how I applied this organizational model to the Coq ecosystem, with the creation of the Coq-community organization, and about the experience gained along the way.
16:30–17:00
On the Benefits and Limits of Incremental Build of Software Configurations: An Exploratory Study
Georges Aaron Randrianaina (IRISA)
Software projects use build systems to automate the compilation, testing, and continuous deployment of their software products. As software becomes increasingly configurable, building multiple configurations is a pressing need, but it is expensive and challenging to implement. The current state of practice is to build software independently (a.k.a. clean build) for a subset of configurations. While incremental build has been studied for software evolution and relatively small changes to the source code, it has surprisingly not been considered for software configurations. In this exploratory study, we examine the benefits and limits of building software configurations incrementally, rather than always building them cleanly. Using five real-life configurable systems as subjects, we explore whether incremental build works, outperforms a sequence of clean builds, is correct w.r.t. clean build, and can be used to find an optimal ordering for building configurations. Our results show that incremental build is feasible 100% of the time in four subjects and 78% of the time in one subject. On average, 88.5% of the configurations could be built faster with incremental build, and we also found several alternative faster incremental builds. However, only 60% of the faster incremental builds are correct. Still, when combining those correct incremental builds with clean builds, we could always find an optimal order that is faster than a mere collection of clean builds, with a gain of up to 11.76%.
17:00–17:30
Guiding Feature Models Synthesis from User-Stories: An Exploratory Approach
Thomas Georges (LIRMM)
Throughout the software lifecycle, a huge amount of knowledge is accumulated around the source code. In our work, we focus on agile software requirements, more specifically on user stories, and on the issues and merge requests opened on version control platforms to implement those user stories. In this paper, we present a method that leverages this knowledge to guide an SPL migration. In addition to user stories and the source code itself, we exploit domain ontologies to enrich and better organize this knowledge. We consider merge requests in version control systems as the hub between user stories (requirements) and the source code (implementation). In this work, we aim to synthesize feature models by combining several approaches. Natural language processing and clustering of user stories are used to identify features (NLP step). Formal concept analysis is used to classify them hierarchically (FCA step). Logical rules generated by analyzing the results of the NLP and FCA steps are used to refine feature constraints. We implemented and evaluated this method on a dataset from our industrial partner. The results showed that our method is effective at properly synthesizing feature models towards an SPL migration of our partner’s code base.
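To give an idea of what the FCA step computes, here is a minimal sketch of formal concept analysis on a toy context that relates user stories to terms. It is not the authors' pipeline: the story names, the terms, and the `formal_concepts` function are hypothetical, and the enumeration is deliberately naive (exponential in the number of stories).

```python
from itertools import combinations


def formal_concepts(context):
    """Enumerate all formal concepts (extent, intent) of a small context.

    `context` maps each object (here a user story) to its set of attributes
    (here terms). Every subset of objects is closed, so this sketch is only
    suitable for toy inputs.
    """
    objects = list(context)
    all_attrs = set().union(*context.values())

    def intent(objs):
        # Attributes shared by every object in objs (all attributes if empty).
        sets = [set(context[o]) for o in objs]
        return frozenset(set.intersection(*sets)) if sets else frozenset(all_attrs)

    def extent(attrs):
        # Objects that carry every attribute in attrs.
        return frozenset(o for o in objects if attrs <= set(context[o]))

    concepts = set()
    for size in range(len(objects) + 1):
        for objs in combinations(objects, size):
            shared = intent(objs)
            concepts.add((extent(shared), shared))
    return concepts


# Hypothetical user stories described by terms extracted from their text.
stories = {
    "us1": {"login", "password"},
    "us2": {"login", "oauth"},
    "us3": {"cart", "payment"},
}

concepts = formal_concepts(stories)
for ext, terms in sorted(concepts, key=lambda c: (len(c[0]), sorted(c[0]))):
    print(sorted(ext), sorted(terms))
```

Ordering the resulting concepts by inclusion of their extents yields a lattice; in this toy context the concept grouping `us1` and `us2` under the shared term `login` sits above the two concepts for the individual stories, which is the kind of hierarchy a feature tree can be derived from.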