Using ASM to calculate test coverage in a new way (Java)

 

When I first started working at HP, my manager gave me the task of writing unit tests for some legacy code that my team owned.

After mapping the sensitive code fragments and doing some refactoring, I created tests for the legacy code. When I finished, we ran a code coverage tool on the tested code and, to my amazement, I got only around 30% coverage.

I got the idea to calculate coverage in a different way, so I started to look into how the code coverage tool I had chosen, Cobertura, works.
Behind the scenes, Cobertura works with ASM.

 

Introduction

The goal of the ASM library is to generate, transform and analyze compiled Java classes, represented as byte arrays (as they are stored on disk and loaded in the Java Virtual Machine).

For this purpose, ASM provides tools to read, write and transform such byte arrays by using higher-level concepts than bytes.
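
For example (a minimal sketch, assuming the ASM jar is on the classpath; the class used here is only illustrative), a compiled class can be loaded and inspected as plain bytes without ever touching its source code:

import java.io.IOException;

import org.objectweb.asm.ClassReader;

public class ReadExample {
    public static void main(String[] args) throws IOException {
        // ClassReader loads the bytes of a compiled class from the classpath
        ClassReader reader = new ClassReader("java.lang.String");
        // The class file header can then be inspected directly
        System.out.println(reader.getClassName()); // java/lang/String
        System.out.println(reader.getSuperName()); // java/lang/Object
    }
}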

 

 

Program analysis, generation and transformation are useful techniques that can be used in many situations:

  • Program analysis can range from simple syntactic parsing to full semantic analysis.
  • Program transformation can be used to optimize or obfuscate programs, or to insert debugging or performance monitoring code into applications.

 

The Model

The ASM library provides two APIs for generating and transforming compiled classes: the core API provides an event-based representation of classes, while the tree API provides an object-based representation.

These two APIs can be compared to the Simple API for XML (SAX) and Document Object Model (DOM) APIs for XML documents: the event-based API is similar to SAX, while the object-based API is similar to DOM. The object-based API is built on top of the event-based one, just like DOM can be provided on top of SAX.

 

Interfaces and components

The ASM API for generating and transforming compiled classes is based on the ClassVisitor interface. Simple sections are visited with a single method call whose arguments describe their content and which returns void. Sections whose content can be of arbitrary length and complexity are visited with an initial method call that returns an auxiliary visitor interface (for example, the visitAnnotation method returns an AnnotationVisitor).
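
As a minimal sketch of this pattern (written against the ASM 3.x API, which is the style used in the code later in this post; the class name LoggingClassAdapter is just illustrative), here is an adapter that overrides one simple section and one section that returns an auxiliary visitor:

import org.objectweb.asm.AnnotationVisitor;
import org.objectweb.asm.ClassAdapter;
import org.objectweb.asm.ClassVisitor;

public class LoggingClassAdapter extends ClassAdapter {

    public LoggingClassAdapter(ClassVisitor cv) {
        super(cv);
    }

    // A simple section: one call, whose arguments describe the class header
    @Override
    public void visit(int version, int access, String name,
                      String signature, String superName, String[] interfaces) {
        System.out.println("visiting class " + name);
        super.visit(version, access, name, signature, superName, interfaces);
    }

    // A section of arbitrary complexity: the call returns an auxiliary visitor
    // (here we simply delegate, so the annotation's values are visited downstream)
    @Override
    public AnnotationVisitor visitAnnotation(String desc, boolean visible) {
        System.out.println("visiting annotation " + desc);
        return super.visitAnnotation(desc, visible);
    }
}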

 

ASM provides three core components based on the ClassVisitor interface to generate and transform classes:

  • The ClassReader class parses a compiled class given as a byte array,
    and calls the corresponding visitXxx methods on the ClassVisitor instance passed as argument to its accept method. It can be seen as an event producer.
  • The ClassWriter class is an implementation of the ClassVisitor interface
    that builds compiled classes directly in binary form. It produces as output a byte array containing the compiled class, which can be retrieved with the toByteArray method. It can
    be seen as an event consumer.
  • The ClassAdapter class is a ClassVisitor implementation that delegates all the method calls it receives to another ClassVisitor instance. It can be seen as an event filter.
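
Chaining the three components together, a typical transformation looks roughly like the following sketch (it reuses the illustrative LoggingClassAdapter from above; ClassWriter.COMPUTE_MAXS just asks ASM to recompute the maximum stack size):

import org.objectweb.asm.ClassReader;
import org.objectweb.asm.ClassWriter;

public class TransformExample {

    public static byte[] transform(byte[] originalClass) {
        ClassReader reader = new ClassReader(originalClass);            // event producer
        ClassWriter writer = new ClassWriter(ClassWriter.COMPUTE_MAXS); // event consumer
        LoggingClassAdapter adapter = new LoggingClassAdapter(writer);  // event filter
        reader.accept(adapter, 0);   // the reader drives the visit events
        return writer.toByteArray(); // the (possibly transformed) class bytes
    }
}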

 

In my (Cobertura) case

Cobertura works in two passes.
In the first pass, it visits all the classes, methods and lines and collects all the data it will need (basically, all the metadata). As a way of marking the weight of a method, we added a new annotation, @MethodCoverageWeight.
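
The annotation definition itself is not shown in this post; a plausible sketch, matching the Lcom/hp/coverage/MethodCoverageWeight; descriptor used below (the retention and target choices are my assumptions), would be:

package com.hp.coverage;

import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// CLASS retention is enough: the annotation only needs to be present in the
// bytecode so the ASM adapter can see it during instrumentation.
@Retention(RetentionPolicy.CLASS)
@Target(ElementType.METHOD)
public @interface MethodCoverageWeight {
    int value(); // the weight assigned to the annotated method
}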

In the first pass, we added the following to the adapter:

public AnnotationVisitor visitAnnotation(String desc, boolean visible) {
    // React only to our @MethodCoverageWeight annotation
    if ("Lcom/hp/coverage/MethodCoverageWeight;".equals(desc)) {
        return new AnnotationVisitor() {

            @Override
            public void visit(String name, Object value) {
                // Remember the weight declared for the current method
                weight = (Integer) value;
            } ...

 

And to visitLineNumber we added:

public void visitLineNumber(int line, Label start) {
    // Record initial information about this line of code
    currentLine = line;
    // Register the line together with the weight collected from the annotation
    classData.addLine(currentLine, myName, myDescriptor, this.weight);
    ...

In the second pass, Cobertura instruments the code: after each line (in visitLineNumber) it adds a static method call. This method increments a counter.

public void visitLineNumber(int line, Label start) {
    // Record initial information about this line of code
    currentLine = line;
    currentJump = 0;

    // Expected to push the owner class name (the String argument of touch)
    instrumentOwnerClass();

    // Mark the current line number as covered:
    // equivalent to TouchCollector.touch(className, line, weight)
    mv.visitIntInsn(SIPUSH, line);
    mv.visitIntInsn(SIPUSH, this.firstPass.getWeight());
    mv.visitMethodInsn(INVOKESTATIC,
            TOUCH_COLLECTOR_CLASS, "touch",
            "(Ljava/lang/String;II)V");

    super.visitLineNumber(line, start);
}

 

 


These instrumented classes are saved to disk. When you run your tests against the instrumented bytecode, a method that increments a counter is called after every line your tests execute. This way we keep a record of all the lines that were executed and can report the coverage of the tested code.
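
The collector class that receives these calls is not shown here; a minimal sketch of what such a touch collector could look like with the extra weight parameter (the class layout and field names are my assumptions, only the touch signature follows the (Ljava/lang/String;II)V descriptor used in the instrumentation above) is:

package com.hp.coverage;

import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ConcurrentMap;
import java.util.concurrent.atomic.AtomicLong;

public final class TouchCollector {

    // key = "className:line", value = accumulated (weighted) hit count
    private static final ConcurrentMap<String, AtomicLong> HITS =
            new ConcurrentHashMap<String, AtomicLong>();

    private TouchCollector() {
    }

    // Called from the instrumented bytecode after every executed line
    public static void touch(String className, int line, int weight) {
        String key = className + ":" + line;
        AtomicLong counter = HITS.get(key);
        if (counter == null) {
            AtomicLong fresh = new AtomicLong();
            AtomicLong existing = HITS.putIfAbsent(key, fresh);
            counter = (existing != null) ? existing : fresh;
        }
        counter.addAndGet(weight);
    }
}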

My idea was to give different weights to different parts of the code, so I added an annotation (with an integer parameter) that allows the programmer to assign a weight to a part of the code.
In the annotation visitor I added code that stores the weights of the different parts of the code, and in visitLineNumber I changed the instrumentation so that it calls a static method that increments each line's counter by its weight (instead of always incrementing by 1). There were also a few changes in the way the final calculation is performed.
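
The final calculation is not shown in the post; the idea is simply that the reported percentage becomes covered weight divided by total weight, instead of covered lines divided by total lines. A rough sketch (the LineInfo holder below is purely illustrative):

import java.util.List;

public class WeightedCoverage {

    // Purely illustrative holder for the metadata collected in the first pass
    // plus the hit counts collected at runtime
    public static class LineInfo {
        final int weight;
        final long hits;

        public LineInfo(int weight, long hits) {
            this.weight = weight;
            this.hits = hits;
        }
    }

    // Weighted coverage: sum of weights of executed lines over sum of all weights
    public static double percentage(List<LineInfo> lines) {
        long coveredWeight = 0;
        long totalWeight = 0;
        for (LineInfo line : lines) {
            totalWeight += line.weight;
            if (line.hits > 0) {
                coveredWeight += line.weight;
            }
        }
        return totalWeight == 0 ? 0.0 : (100.0 * coveredWeight) / totalWeight;
    }
}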

f1.png

Figure 1: Output of a conventional code coverage tool – 33% for testing the constructor and the importantCalculation method.

 

 

f2.png

Figure 2: The output of our solution – 77% for the same test.

 

We tried this jar on two modules in the BSM project, writing tests for every important piece of business logic in those modules.

The regular code coverage calculation gave us 23% coverage, while our calculation gave 87% coverage, which better represents the actual coverage of the modules.

 

In conclusion, ASM is a very powerful technology that allows the programmer to instrument code at different levels. If you want to read more about ASM, see the ASM website.

 

This article was written by Boaz Shor.
