Publications

2022

TestKnight: An Interactive Assistant to Stimulate Test Engineering Cristian Alexandru Botocan, Piyush Deshmukh, Pavlos Makridis, Jorge Romeu Huidobro, Mathanrajan Sundarrajan, Mauricio Aniche, Andy Zaidman Proceedings - 2022 ACM/IEEE 44th International Conference on Software Engineering, ICSE-Companion 2022, 2022

Software testing is one of the most important aspects of modern software development. To ensure the quality of the software, developers should ideally write and execute automated tests regularly as their code-base evolves. TestKnight, a plugin for the IntelliJ IDEA integrated development environment (IDE), aims to help Java developers improve th...

2021

Data-Driven Extract Method Recommendations: A Study at ING David van der Leij, Jasper Binda, Robbert van Dalen, Pieter Vallen, Yaping Luo, Maurício Aniche Proceedings of the 29th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE '21), 2021

The sound identification of refactoring opportunities is still an open problem in software engineering. Recent studies have shown the effectiveness of machine learning models in recommending methods that should undergo different refactoring operations. In this work, we experiment with such approaches to identify methods that should undergo an E...

Log-based Software Monitoring: A Systematic Mapping Study Jeanderson Cândido, Maurício Aniche, Arie van Deursen PeerJ, 2021

Modern software development and operations rely on monitoring to understand how systems behave in production. The data provided by application logs and runtime environment are essential to detect and diagnose undesired behavior and improve system reliability. However, despite the rich ecosystem around industry-ready log solutions, monitoring com...

Atoms of Confusion in Java Chris Langhout, Maurício Aniche 29th IEEE/ACM International Conference on Program Comprehension (ICPC), 2021

Although writing code seems trivial at times, problems arise when humans misinterpret what the code actually does. One of the potential causes are “atoms of confusion”, the smallest possible patterns of misinterpretable source code. Previous research has investigated the impact of atoms of confusion in C code. Results show that developers make s...

How Developers Engineer Test Cases: An Observational Study Maurício Aniche, Christoph Treude, Andy Zaidman Transactions on Software Engineering (TSE), 2021

One of the main challenges that developers face when testing their systems lies in engineering test cases that are good enough to reveal bugs. And while our body of knowledge on software testing and automated test case generation is already quite significant, in practice, developers are still the ones responsible for engineering test cases manua...

An Exploratory Study of Log Placement Recommendation in an Enterprise System Jeanderson Cândido, Jan Haesen, Maurício Aniche, Arie van Deursen Mining Software Repositories Conference (MSR), 2021

Logging is a development practice that plays an important role in the operations and monitoring of complex systems. Developers place log statements in the source code and use log data to understand how the system behaves in production. Unfortunately, anticipating where to log during development is challenging. Previous studies show the feasibili...

Learning Off-By-One Mistakes: An Empirical Study Hendrig Sellik, Onno van Paridon, Georgios Gousios, Maurício Aniche Mining Software Repositories Conference (MSR), 2021

Mistakes in binary conditions are a source of error in many software systems. They happen when developers use, e.g., < or > instead of <= or >=. These boundary mistakes are hard to find and impose manual, labor-intensive work for software developers. While previous research has been proposing solutions to identify errors in boundary ...

Search-Based Software Re-Modularization: A Case Study at Adyen Casper Schröder, Adriaan van der Feltz, Annibale Panichella, Maurício Aniche IEEE/ACM 43rd International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), 2021

Deciding what constitutes a single module, what classes belong to which module or the right set of modules for a specific software system has always been a challenging task. The problem is even harder in large-scale software systems composed of thousands of classes and hundreds of modules. Over the years, researchers have been proposing differen...

Automatically Identifying Parameter Constraints in Complex Web APIs: A Case Study at Adyen Henk Grent, Aleksei Akimov, Maurício Aniche IEEE/ACM 43rd International Conference on Software Engineering: Software Engineering in Practice (ICSE-SEIP), 2021

Web APIs may have constraints on parameters, such that not all parameters are either always required or always optional. Moreover, the presence or value of one parameter could cause another parameter to be required, or parameters could have restrictions on what kinds of values are valid. Having a clear overview of the constraints helps API consu...

Grading 600+ students: A Case Study on Peer and Self Grading Maurício Aniche, Frank Mulder, Felienne Hermans 43rd International Conference on Software Engineering: Joint Track on Software Engineering Education and Training (ICSE-JSEET), 2021

Grading large classes has become a challenging and expensive task for many universities. The Delft University of Technology (TU Delft), located in the Netherlands, has observed a large increase in student numbers over the past few years. Given the large growth of the student population, grading all the submissions results in high costs. We made...

The Prevalence of Code Smells in Machine Learning projects Bart van Oort, Luís Cruz, Maurício Aniche, Arie van Deursen 1st Workshop on AI Engineering – Software Engineering for AI – WAIN'21, 2021

Artificial Intelligence (AI) and Machine Learning (ML) are pervasive in the current computer science landscape. Yet, there still exists a lack of software engineering experience and best practices in this field. One such best practice, static code analysis, can be used to find code smells, i.e., (potential) defects in the source code, refactorin...

Interactive Static Software Performance Analysis in the IDE Aaron Beigelbeck, Maurício Aniche, Jürgen Cito 29th IEEE/ACM International Conference on Program Comprehension (ICPC), 2021

Detecting performance issues due to suboptimal code during the development process can be a daunting task, especially when it comes to localizing them after noticing performance degradation after deployment. Static analysis has the potential to provide early feedback on performance problems to developers without having to run profilers with expe...

Logging Practices with Mobile Analytics: An Empirical Study on Firebase Julian Harty, Haonan Zhang, Lili Wei, Luca Pascarella, Maurício Aniche, Weiyi Shang 8th IEEE/ACM International Conference on Mobile Software Engineering and Systems (MOBILESoft), 2021

Software logs are of great value in both industrial and open-source projects. Mobile analytics logging enables developers to collect logs from the end users at the cost of recording and transmitting logs across the Internet to a centralised infrastructure. The goal of this paper is to make the first step in the characterisation of logging practi...

2020

The Effectiveness of Supervised Machine Learning Algorithms in Predicting Software Refactoring Maurício Aniche, Erick Maziero, Rafael Durelli, Vinicius Durelli Transactions on Software Engineering (TSE), 2020

Refactoring is the process of changing the internal structure of software to improve its quality without modifying its external behavior. Before carrying out refactoring activities, developers need to identify refactoring opportunities. Currently, refactoring opportunity identification heavily relies on developers’ expertise and intuition. In t...

Selecting third-party libraries: The practitioners' perspective Enrique Larios Vargas, Maurício Aniche, Christoph Treude, Magiel Bruntink, Georgios Gousios The ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), 2020

The selection of third-party libraries is an essential element of virtually any software development project. However, deciding which libraries to choose is a challenging practical problem. Selecting the wrong library can severely impact a software project in terms of cost, time, and development effort, with the severity of the impact depending ...

OffSide: Learning to Identify Mistakes in Boundary Conditions Jón Arnar Briem, Jordi Smit, Hendrig Sellik, Pavel Rapoport, Georgios Gousios, Maurício Aniche The 2nd Workshop on Testing for Deep Learning and Deep Learning for Testing (DeepTest), 2020

Mistakes in boundary conditions are the cause of many bugs in software. These mistakes happen when, e.g., developers make use of ‘<’ or ‘>’ in cases where they should have used ‘<=’ or ‘>=’. Mistakes in boundary conditions are often hard to find and manually detecting them might be very time-consuming for developers. While researcher...

Domain-Based Fuzzing for Supervised Learning of Anomaly Detection in Cyber-Physical Systems Herman Wijaya, Maurício Aniche, Aditya Mathur The 1st International Workshop on Engineering and Cybersecurity of Critical Systems (EnCyCriS), 2020

A novel approach is proposed for constructing models of anomaly detectors using supervised learning from the traces of normal and abnormal operations of an Industrial Control System (ICS). Such detectors are of value in detecting process anomalies in complex critical infrastructure such as power generation and water treatment systems. The traces...

2019

Comprehending Test Code: An Empirical Study Chak Shun Yu, Christoph Treude, Maurício Aniche 35th IEEE International Conference on Software Maintenance and Evolution (ICSME), 2019

Developers spend a large portion of their time and effort on comprehending source code. While many studies have investigated how developers approach these comprehension tasks and what factors influence their success, less is known about how developers comprehend test code specifically, despite the undisputed importance of testing. In this paper...

Current Challenges in Practical Object-Oriented Software Design Maurício Aniche, Joseph W. Yoder, Fabio Kon 41st ACM/IEEE International Conference on Software Engineering, short paper, 2019

According to the extensive 50-year-old body of knowledge in object-oriented programming and design, good software designs are, among other characteristics, lowly coupled, highly cohesive, extensible, comprehensible, and not fragile. However, with the increased complexity and heterogeneity of contemporary software, this might not be enough. This...

An Empirical Catalog of Code Smells for the Presentation Layer of Android Apps Suelen Goularte Carvalho, Maurício Aniche, Júlio Veríssimo, Rafael Durelli, Marco Aurélio Gerosa Empirical Software Engineering journal (EMSE), 2019

Software developers, including those of the Android mobile platform, constantly seek to improve their applications’ maintainability and evolvability. Code smells are commonly used for this purpose, as they indicate symptoms of design problems. However, although the literature presents a variety of code smells, such as God Class and Long Method, ...

Monitoring-Aware IDEs Jos Winter, Maurício Aniche, Jürgen Cito, Arie van Deursen 27th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), 2019

Engineering modern large-scale software requires software developers to not solely focus on writing code, but also to continuously examine monitoring data to reason about the dynamic behavior of their systems. These additional monitoring responsibilities for developers have only emerged recently, in the light of DevOps culture. Interestingly, so...

Tracing Back Log Data to its Log Statement: From Research to Practice Daan Schipper, Maurício Aniche, Arie van Deursen IEEE/ACM 16th International Conference on Mining Software Repositories (MSR), 2019

Logs are widely used as a source of information to understand the activity of computer systems and to monitor their health and stability. However, most log analysis techniques require the link between the log messages in the raw log file and the log statements in the source code that produce them. Several solutions have been proposed to solve th...

Pragmatic Software Testing Education Maurício Aniche, Felienne Hermans, Arie van Deursen 50th ACM Technical Symposium on Computer Science Education, 2019

Software testing is an important topic in software engineering education, and yet highly challenging from an educational perspective: students are required to learn several testing techniques, to be able to distinguish the right technique to apply, to evaluate the quality of their test suites, and to write maintainable test code. In this paper,...

Factors Affecting Cloud Infra-Service Development Lead Times: A Case Study at ING Hennie Huijgens, Eric Greuter, Jerry Brons, Evert A. van Doorn, Ioannis Papadopoulos, Francisco Morales Martinez, Maurício Aniche, Otto Visser, Arie van Deursen 41st ACM/IEEE International Conference on Software Engineering, Software Engineering in Practice (SEIP), 2019

The development of Cloud Infra-Services has shifted over the past decade in the direction of a software code development process, also known as infrastructure as code (IaC). Contemporary continuous delivery settings in industry require fast feedback. As a consequence, companies need insight in time spent, especially in the development of such se...

2018

Understanding Developers’ Needs on Deprecation as a Language Feature Anand Ashok Sawant, Maurício Aniche, Arie van Deursen, Alberto Bacchelli 40th International Conference on Software Engineering (ICSE), 2018

Deprecation is a language feature that allows API producers to mark a feature as obsolete. We aim to gain a deep understanding of the needs of API producers and consumers alike regarding deprecation. To that end, we investigate why API producers deprecate features, whether they remove deprecated features, how they expect consumers to react, and ...

When Testing Meets Code Review: Why and How Developers Review Tests Davide Spadini, Maurício Aniche, Margaret-Anne Storey, Magiel Bruntink, Alberto Bacchelli 40th International Conference on Software Engineering (ICSE), 2018

Automated testing is considered an essential process for ensuring software quality. However, writing and maintaining high-quality test code is challenging and frequently considered of secondary importance. For production code, many open source and industrial software projects employ code review, a well-established software quality practice, but ...

PyDriller: Python Framework for Mining Software Repositories Davide Spadini, Maurício Aniche, Alberto Bacchelli 26th ACM Joint European Software Engineering Conference and Symposium on the Foundations of Software Engineering (ESEC/FSE), 2018

Software repositories contain historical and valuable information about the overall development of software systems. Mining software repositories (MSR) is nowadays considered one of the most interesting growing fields within software engineering. MSR focuses on extracting and analyzing data available in software repositories to uncover interesti...

How Modern News Aggregators Help Development Communities Shape and Share Knowledge Maurício Aniche, Christoph Treude, Igor Steinmacher, Igor Wiese, Gustavo Henrique Lima Pinto, Margaret-Anne Storey, Marco Aurélio Gerosa 40th International Conference on Software Engineering (ICSE), 2018

Many developers rely on modern news aggregator sites such as reddit and hackernews to stay up to date with the latest technological developments and trends. In order to understand what motivates developers to contribute, what kind of content is shared, and how knowledge is shaped by the community, we interviewed and surveyed developers that part...

Mock objects for testing java systems: Why and how developers use them, and how they evolve Davide Spadini, Maurício Aniche, Magiel Bruntink, Alberto Bacchelli Empirical Software Engineering (EMSE), 2018

When testing software artifacts that have several dependencies, one has the possibility of either instantiating these dependencies or using mock objects to simulate the dependencies’ expected behavior. Even though recent quantitative studies showed that mock objects are widely used both in open source and proprietary projects, scientific knowled...

Where does Google find API documentation? Christoph Treude, Maurício Aniche 2nd International Workshop on API Usage and Evolution, 2018

The documentation of popular APIs is spread across many formats, from vendor-curated reference documentation to Stack Overflow threads. For developers, it is often not obvious from where a particular piece of information can be retrieved. To understand this documentation landscape, we systematically conducted Google searches for the elements of ...

An Exploratory Study on Faults in Web API Integration in a Large-Scale Payment Company Joop Aué, Maurício Aniche, Maikel Lobbezoo, Arie van Deursen ICSE-SEIP '18: 40th International Conference on Software Engineering: Software Engineering in Practice Track, 2018

Service-oriented architectures are more popular than ever, and increasingly companies and organizations depend on services offered through Web APIs. The capabilities and complexity of Web APIs differ from service to service, and therefore the impact of API errors varies. API problem cases related to Adyen’s payment service were found to have dir...

Search-Based Test Data Generation for SQL Queries Jeroen Castelein, Maurício Aniche, Mozhan Soltani, Annibale Panichella, Arie van Deursen 40th International Conference on Software Engineering (ICSE), 2018

Database-centric systems strongly rely on SQL queries to manage and manipulate their data. These SQL commands can range from very simple selections to queries that involve several tables, subqueries, and grouping operations. And, as with any important piece of code, developers should properly test SQL queries. In order to completely test a SQL q...

The Adoption of JavaScript Linters in Practice: A Case Study on ESLint Kristín Fjóla Tómasdóttir, Maurício Aniche, Arie van Deursen Transactions on Software Engineering (TSE), 2018

A linter is a static analysis tool that warns software developers about possible code errors or violations to coding standards. By using such a tool, errors can be surfaced early in the development process when they are cheaper to fix. For a linter to be successful, it is important to understand the needs and challenges of developers when using ...

Unusual Events in GitHub Repositories Christoph Treude, Larissa Leite, Maurício Aniche Journal of Systems and Software (JSS), 2018

In large and active software projects, it becomes impractical for a developer to stay aware of all project activity. While it might not be necessary to know about each commit or issue, it is arguably important to know about the ones that are unusual. To investigate this hypothesis, we identified unusual events in 200 GitHub projects using a comp...

2017

To Mock or Not To Mock? An Empirical Study on Mocking Practices Davide Spadini, Maurício Aniche, Magiel Bruntink, Alberto Bacchelli IEEE 14th International Conference on Mining Software Repositories (MSR), 2017

When writing automated unit tests, developers often deal with software artifacts that have several dependencies. In these cases, one has the possibility of either instantiating the dependencies or using mock objects to simulate the dependencies’ expected behavior. Even though recent quantitative studies showed that mock objects are widely used i...

A Collaborative Approach to Teaching Software Architecture Arie van Deursen, Maurício Aniche, Joop Aué, Rogier Slag, Michael de Jong, Alex Nederlof, Eric Bouwers 48th ACM Technical Symposium on Computer Science Education (SIGCSE), 2017

Teaching software architecture is hard. The topic is abstract and is best understood by experiencing it, which requires proper scale to fully grasp its complexity. Furthermore, students need to practice both technical and social skills to become good software architects. To overcome these teaching challenges, we developed the Collaborative Softw...

An Experience Report on Applying Passive Learning in a Large-Scale Payment Company Rick Wieman, Maurício Aniche, Willem Lobbezoo, Sicco Verwer, Arie van Deursen 33rd IEEE International Conference on Software Maintenance and Evolution (ICSME), 2017

Passive learning techniques infer graph models on the behavior of a system from large trace logs. The research community has been dedicating great effort in making passive learning techniques more scalable and ready to use by industry. However, there is still a lack of empirical knowledge on the usefulness and applicability of such techniques in...

Why and How JavaScript Developers Use Linters Kristín Fjóla Tómasdóttir, Maurício Aniche, Arie van Deursen 32nd IEEE/ACM International Conference on Automated Software Engineering (ASE), 2017

Automatic static analysis tools help developers to automatically spot code issues in their software. They can be of extreme value in languages with dynamic characteristics, such as JavaScript, where developers can easily introduce mistakes which can go unnoticed for a long time, e.g., a simple syntactic or spelling mistake. Although research has...

Code smells for Model-View-Controller architectures Maurício Aniche, Gabriele Bavota, Christoph Treude, Marco Gerosa, Arie van Deursen Empirical Software Engineering Journal (EMSE), 2017

Previous studies have shown the negative effects that low-quality code can have on maintainability proxies, such as code change- and defect-proneness. One of the symptoms for low-quality code are code smells, defined as sub-optimal implementation choices. While this definition is quite general and seems to suggest a wide spectrum of smells that ...

2016

A Validated Set of Smells in Model-View-Controller Architecture Maurício Aniche, Gabriele Bavota, Christoph Treude, Marco Gerosa, Arie van Deursen 32nd International Conference on Software Maintenance and Evolution (ICSME), 2016

Code smells are symptoms of poor design and implementation choices that may hinder code comprehension, and possibly increase change- and defect-proneness. A vast catalogue of smells has been defined in the literature, and it includes smells that can be found in any kind of system (e.g., God Classes), regardless of their architecture. On the othe...

Developers' Perceptions on Object-Oriented Design and System Architecture Maurício Aniche, Christoph Treude, Marco Gerosa 30th Brazilian Symposium on Software Engineering (SBES), 2016

Software developers commonly rely on well-known software architecture patterns, such as MVC, to build their applications. In many of these patterns, classes play specific roles in the system, such as Controllers or Entities, which means that each of these classes has specific characteristics in terms of object-oriented class design and implement...

SATT: Tailoring Code Metric Thresholds for Different Software Architectures Maurício Aniche, Christoph Treude, Andy Zaidman, Arie van Deursen, Marco Gerosa 16th IEEE International Working Conference on Source Code Analysis and Manipulation (SCAM), 2016

Code metric analysis is a well-known approach for assessing the quality of a software system. However, current tools and techniques do not take the system architecture (e.g., MVC, Android) into account. This means that all classes are assessed similarly, regardless of their specific responsibilities. In this paper, we propose SATT (Software Arc...

2015-2010

Does test-driven development improve class design? A qualitative study on developers’ perceptions Maurício Aniche and Marco Aurélio Gerosa Journal of the Brazilian Computer Society, 2015

Background: Developers commonly affirm that writing unit tests improves the internal quality of software, besides a more obvious effect on external quality. This is particularly common among Test-Driven Development (TDD) practitioners, who leverage the act of writing tests to think about and improve class design. However, it is not clear how this...

Improving Code Quality on Automated Tests of Web Applications: A Set of Patterns Maurício Aniche, Eduardo Guerra, Marco A. Gerosa Conference on Pattern Languages of Programs (PLoP), 2014

There are different levels of automated testing, such as unit, integration or system testing. When dealing with web applications, the act of writing automated tests usually requires a huge effort from developers, like tests that usually open the browser and navigate through the web page. However, the more integrated the test, the more complicate...

Preparing for a Test Driven Development Session Eduardo Guerra, Maurício Aniche, Marco A. Gerosa, Joe Yoder Conference on Pattern Languages of Programs (PLoP), 2014

Test-driven development (TDD) is a development technique used to design classes in a software system by first creating tests before implementing the actual code. However, even before you start creating tests, there are some preparation tasks that the developer should do. This involves gathering information about the class(es) that will be worke...

Are the Methods in Your Data Access Objects (DAOs) in the Right Place? A Preliminary Study Maurício Aniche, Gustavo Oliva, Marco Aurélio Gerosa 6th International Workshop on Managing Technical Debt (MTD), 2014

Isolating code that deals with system infrastructure from code that deals with domain rules is a good practice when developing applications. Code that deals with the database, for example, is often isolated in classes following a Data Access Object (DAO) pattern. Developers often create a DAO for each domain entity in the system. However, as som...

MetricMiner: Supporting Researchers in Mining Software Repositories Francisco Zigmund Sokol, Maurício Finavaro Aniche, Marco Aurélio Gerosa IEEE 13th International Working Conference on Source Code Analysis and Manipulation (SCAM), 2013

Researchers use mining software repository (MSR) techniques for studying software engineering empirically, by means of analysis of artifacts, such as source code, version control systems meta data, etc. However, to conduct a study using these techniques, researchers usually spend time collecting data and developing a complex infrastructure, which...

What Do the Asserts in a Unit Test Tell Us about Code Quality? A Study on Open Source and Industrial Projects Maurício Aniche, Gustavo Oliva, Marco Gerosa 17th European Conference on Software Maintenance and Reengineering (CSMR), 2013

Unit tests and production code are intrinsically connected. A class that is easy to test usually presents desirable characteristics, such as low coupling and high cohesion. Thus, finding hard-to-test classes may help developers identify problematic code. Many different test feedbacks that warn developers about problematic code were already catal...

Increasing Learning in an Agile Environment: Lessons Learned in an Agile Team Maurício Aniche, Guilherme Silveira Agile Conference, 2011

Learning is an important part of the software development process. There are many advantages for developers willing to learn: increased internal and external quality of the produced software, and a reduced learning curve as beginners become high-skilled developers much faster than usual. However, learning is not taken seriously by many teams. T...

Most common mistakes in test-driven development practice: Results from an online survey with developers Maurício Aniche, Marco Aurélio Gerosa 1st International Workshop on Test-Driven Development (TDD), 2010

Test-driven development (TDD) is a software development practice that supposedly leads to better quality and fewer defects in code. TDD is a simple practice, but developers sometimes do not apply all the required steps correctly. This article presents some of the most common mistakes that programmers make when practicing TDD, identified by an onli...