Micro Analyzer

The problem

Many performance issues are caused by low-level implementation details, and such details are more likely to cause failures in software-intensive systems than erroneous feature implementations are. A microbenchmark is a type of performance test that focuses on measuring the performance of small, yet critical, code sections rather than the system as a whole.

Developing microbenchmarks requires expertise. When creating a microbenchmark, two primary tasks affect its accuracy. The first is identifying a performance-critical code section and wrapping it into a payload, i.e., an independent program that recreates the execution conditions occurring in the application. The second is running the payload a sufficient number of times to obtain a reliable estimate of the execution time.
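
As an illustration of the first task, the sketch below wraps a small code section into an independent payload using JMH, the Java Microbenchmark Harness. This is a minimal sketch under the assumption that the code under study is Java; the class, field, and method names are invented and not taken from Micro Analyzer.

```java
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.BenchmarkMode;
import org.openjdk.jmh.annotations.Mode;
import org.openjdk.jmh.annotations.OutputTimeUnit;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;

// Hypothetical payload: a small performance-critical code section wrapped
// into an independent benchmark method that a harness can execute repeatedly.
@State(Scope.Benchmark)
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
public class ConcatBenchmark {

    private String prefix = "value-";
    private int counter = 42;

    @Benchmark
    public String concatenate() {
        // Returning the result lets the harness consume it, so the work
        // cannot be removed as dead code.
        return prefix + counter;
    }
}
```

The harness then takes care of the second task, executing the payload repeatedly until it can report an estimate of the average execution time.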

Wrong conclusions can be drawn from microbenchmarking results for a number of reasons. The abstraction gap between applications and execution environments may cause a benchmark to produce different, or even contradictory, results on different hardware platforms. In addition, depending on which compiler is used, different code optimizations may be applied to the same code segment. A compiler may recognize patterns, such as dead code, and eliminate parts of the benchmark code, leading to code that runs faster than expected.
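
The dead-code pitfall can be illustrated with a small, hypothetical JMH example (the class and method names are invented). In the first method the computed value is never used, so the JIT compiler is free to remove the computation; in the second method the result is handed to a Blackhole, which keeps the work from being optimized away.

```java
import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Scope;
import org.openjdk.jmh.annotations.State;
import org.openjdk.jmh.infra.Blackhole;

@State(Scope.Benchmark)
public class DeadCodeExample {

    private double x = 42.0;

    @Benchmark
    public void flawed() {
        // The result is never used, so the JIT compiler may eliminate the
        // computation and the benchmark ends up measuring an empty method.
        Math.log(x);
    }

    @Benchmark
    public void correct(Blackhole bh) {
        // Consuming the result prevents dead-code elimination.
        bh.consume(Math.log(x));
    }
}
```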

There is also a warm-up phase to be aware of when developing microbenchmarks. Sources of influence on the measurements need to be recognized and discarded if the goal is to examine sustainable performance. Two such sources are class loading and just-in-time (JIT) compilation: the initial iterations of the benchmarked code include dynamic compilation, while later iterations are usually faster since the executed code has already been compiled and optimized. Warm-up executions of the benchmarked code can be conducted to reduce the effect of such influences.
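
Most harnesses let the warm-up be configured explicitly. The hypothetical JMH configuration below discards five warm-up iterations before recording ten measured iterations, so that class loading and JIT compilation have largely settled before any measurement is taken; the benchmark body itself is only a placeholder.

```java
import java.util.concurrent.TimeUnit;

import org.openjdk.jmh.annotations.Benchmark;
import org.openjdk.jmh.annotations.Fork;
import org.openjdk.jmh.annotations.Measurement;
import org.openjdk.jmh.annotations.Warmup;

public class WarmupExample {

    // Five un-recorded warm-up iterations let class loading and JIT
    // compilation settle; only the following ten iterations are measured.
    @Benchmark
    @Fork(1)
    @Warmup(iterations = 5, time = 1, timeUnit = TimeUnit.SECONDS)
    @Measurement(iterations = 10, time = 1, timeUnit = TimeUnit.SECONDS)
    public long sumArray() {
        long sum = 0;
        for (int i = 0; i < 1_000; i++) {
            sum += i;
        }
        return sum;
    }
}
```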

Performance-related code modifications are often introduced based on developers' assumptions rather than on actual performance observations. Such observations could be made if microbenchmarks were implemented. The complexity of designing useful and correct performance microbenchmarks, and in particular the lack of tool support, prevents the adoption of microbenchmarking among a wider audience of developers.

Step 1: Cloning the source data

Micro Analyzer uses the cloning module to load or retrieve the data to be studied. If the data is already available locally it is used directly; otherwise the framework copies it from a remote location.
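
As a rough sketch of what such a step can look like, the hypothetical class below reuses a local copy of a repository if one exists and otherwise clones it from its remote location, assuming Git repositories and the JGit library; the class and method names are illustrative and not taken from Micro Analyzer.

```java
import java.nio.file.Files;
import java.nio.file.Path;

import org.eclipse.jgit.api.Git;

// Hypothetical cloning step: reuse a local copy if present, otherwise
// copy the data from its remote location by cloning the repository.
public class RepositoryCloner {

    public Path obtain(String remoteUrl, Path localDir) throws Exception {
        if (Files.exists(localDir.resolve(".git"))) {
            return localDir; // data is already local, nothing to copy
        }
        try (Git git = Git.cloneRepository()
                .setURI(remoteUrl)
                .setDirectory(localDir.toFile())
                .call()) {
            return localDir;
        }
    }
}
```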

Step 2: Pre-processing the cloned data

This module pre-processes the cloned projects. Running it produces two datasets whose content is adapted to the internal data model of Micro Analyzer.
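
The exact contents of the two datasets are described in the thesis; purely as an illustration of this kind of step, the hypothetical sketch below walks a cloned project and turns its Java source files into simple structured records. The record type is invented and does not reflect Micro Analyzer's actual data model.

```java
import java.io.IOException;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.List;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Invented record type standing in for an internal data model entry.
record SourceFileRecord(String project, String relativePath, long lineCount) {}

public class Preprocessor {

    // Walk a cloned project and convert its Java files into structured
    // records that are easier to analyze than the raw repository contents.
    public List<SourceFileRecord> process(String project, Path projectRoot) throws IOException {
        try (Stream<Path> files = Files.walk(projectRoot)) {
            return files
                    .filter(p -> p.toString().endsWith(".java"))
                    .map(p -> toRecord(project, projectRoot, p))
                    .collect(Collectors.toList());
        }
    }

    private SourceFileRecord toRecord(String project, Path root, Path file) {
        String relative = root.relativize(file).toString();
        try (Stream<String> lines = Files.lines(file)) {
            return new SourceFileRecord(project, relative, lines.count());
        } catch (IOException e) {
            // Keep unreadable files with a zero line count instead of
            // aborting the whole pre-processing run.
            return new SourceFileRecord(project, relative, 0);
        }
    }
}
```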

Step 3: Analyzing the pre-processed data

To make mining studies easier to reproduce and replicate, the analysis tasks are implemented as plugins. Reproducing a study therefore only requires a copy of the analysis plugin and the dataset used in that study.
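
To make the plugin idea concrete, an analysis plugin could look roughly like the hypothetical sketch below; the interface, record, and class names are invented and do not reflect Micro Analyzer's actual API.

```java
import java.util.List;
import java.util.Map;
import java.util.stream.Collectors;

// Invented dataset entry: one discovered benchmark method.
record BenchmarkRecord(String project, String benchmarkMethod) {}

// Hypothetical plugin contract: an analysis task receives a pre-processed
// dataset and returns its result, so re-running the same plugin on the
// same dataset reproduces the study.
interface AnalysisPlugin<R> {
    String name();
    R analyze(List<BenchmarkRecord> dataset);
}

// Example plugin: count the discovered benchmark methods per project.
class BenchmarksPerProject implements AnalysisPlugin<Map<String, Long>> {

    @Override
    public String name() {
        return "benchmarks-per-project";
    }

    @Override
    public Map<String, Long> analyze(List<BenchmarkRecord> dataset) {
        return dataset.stream()
                .collect(Collectors.groupingBy(
                        BenchmarkRecord::project, Collectors.counting()));
    }
}
```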

Why Micro Analyzer?

To enable a wider adoption of microbenchmarking, better tools need to be developed that make it less complex to create microbenchmarks. One goal of mining software repositories (MSR) research is to motivate the development of such tools. To mine repositories means to extract knowledge from the data in software projects. Mining microbenchmarks on a larger scale may reveal how developers design their microbenchmarks and which parts of that process are complex and error-prone. By recognizing such patterns, existing tools can be improved and new tools created. By facilitating the benchmarking process and meeting the need for tool support, microbenchmarking practices are more likely to be widely adopted.

A mining study usually involves several steps, such as collecting raw data, pre-processing the collected data, and analyzing the pre-processed data. The unstructured form of raw data makes it difficult to analyze, which negatively affects the quality and accuracy of the analysis results. The raw data must therefore be processed into a structured form suitable for analysis. Since pre-processing and analyzing the data are time-consuming and complex tasks, especially at large scale, they need to be assisted by tools. Although a number of mining tools exist, they are not always publicly available, and when they are, they are often difficult to set up and use, which restricts their reuse in other mining studies. In addition, such software is usually developed specifically for the study in which it is used, so the available scripts and tools are limited to certain pre-processing tasks and analyses.

Performing a mining study is often a difficult and time-consuming task. Typically only a small number of software repositories are studied, due to the challenges of finding, retrieving, pre-processing, and analyzing large amounts of data. Parts of this process are commonly performed manually because of the lack of tool support. A drawback of analyzing a small number of repositories is that the generalizability of the studies is limited.

Micro Analyzer is an extensible framework for large-scale mining of software performance microbenchmarks, designed to solve the problems described above. It has a plug-in architecture, enabling more data sources, language parsers, and analyses to be added as needed. A description of the framework can be found in my master's thesis.
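
Purely as an illustration of such a plug-in architecture (the real extension points are defined in the thesis, and the interfaces below are invented), each kind of extension could be captured by its own small interface:

```java
import java.nio.file.Path;
import java.util.List;

// Invented interfaces illustrating the three extension points mentioned
// above; they are not Micro Analyzer's real API.

// A data source knows how to obtain a project to study, e.g. by cloning it.
interface DataSource {
    Path fetch(String projectIdentifier) throws Exception;
}

// A language parser turns a cloned project into pre-processed records.
interface LanguageParser<T> {
    List<T> parse(Path projectRoot) throws Exception;
}

// An analysis consumes pre-processed records and produces a result.
interface Analysis<T, R> {
    R run(List<T> dataset);
}
```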