Testing: Eastern Philosophy and Testing can go Hand in Hand

During the preparation for a talk about practical testing and test coverage analysis I found in the large web a small booklet „The Way of Testivus“ (http://www.agitar.com/downloads/TheWayOfTestivus.pdf). This small booklet is a humorous summary of testing principles and wisdom packaged into a nice form for everybody to read who is interesting in testing to get inspired and to have a fun time.

This book clearly shows, that testing is not a matter of good or bad or right and wrong, it is about thinking of the right way. There is not standard recipe to use to write good tests, enough tests or what so ever.

For my talk mentioned above, I found an additional post (at www.artima.com/forums/flat.jsp?forum=106&thread=204677) with a nice story written in the same style:

Testivus On Test Coverage

Early one morning, a programmer asked the great master:

“I am ready to write some unit tests. What code coverage should I aim for?”The great master replied:

“Don’t worry about coverage, just write some good tests.”The programmer smiled, bowed, and left.

Later that day, a second programmer asked the same question.

The great master pointed at a pot of boiling water and said:

“How many grains of rice should put in that pot?”The programmer, looking puzzled, replied:

“How can I possibly tell you? It depends on how many people you need to feed, how hungry they are, what other food you are serving, how much rice you have available, and so on.”“Exactly,” said the great master.

The second programmer smiled, bowed, and left.

Toward the end of the day, a third programmer came and asked the same question about code coverage.

“Eighty percent and no less!” Replied the master in a stern voice, pounding his fist on the table.The third programmer smiled, bowed, and left.

After this last reply, a young apprentice approached the great master:

“Great master, today I overheard you answer the same question about code coverage with three different answers. Why?”The great master stood up from his chair:

“Come get some fresh tea with me and let’s talk about it.”After they filled their cups with smoking hot green tea, the great master began to answer:

“The first programmer is new and just getting started with testing. Right now he has a lot of code and no tests. He has a long way to go; focusing on code coverage at this time would be depressing and quite useless. He’s better off just getting used to writing and running some tests. He can worry about coverage later.”

“The second programmer, on the other hand, is quite experience both at programming and testing. When I replied by asking her how many grains of rice I should put in a pot, I helped her realize that the amount of testing necessary depends on a number of factors, and she knows those factors better than I do – it’s her code after all. There is no single, simple, answer, and she’s smart enough to handle the truth and work with that.”

“I see,” said the young apprentice, “but if there is no single simple answer, then why did you answer the third programmer ‘Eighty percent and no less’?”

The great master laughed so hard and loud that his belly, evidence that he drank more than just green tea, flopped up and down.

“The third programmer wants only simple answers – even when there are no simple answers … and then does not follow them anyway.”The young apprentice and the grizzled great master finished drinking their tea in contemplative silence.

My experience is, that the test coverage percentages are totally meaningless. The numbers do not tell anything meaningful at all, it is only a rough number on how much testing is done, but nothing about the test quality. The only really interesting information are the lines of code which are not tested at all. For these lines, one should think about adding some meaningful tests or about removing the code, because it is not used at all like dead code.

 

Common Anti-Patterns: String Usage for Number Manipulation

This is a short post about a topic which is frustrating to me from time to time. The topic is about number manipulation by utilizing string operations.

I do not know why, but from time to time I see code and talk to people who have some tasks to do with numbers and they do not have a better idea than printing them into a string or character stream and applying string manipulation functionality to get their needed results. For me, it is strange to think about something like that, because I think it is against common sense to solve number issue with strings. Mathematics is the way to do that.

I have to extreme examples here in this post which also caused major issues in the developed software. There were other examples, but the two were selected as examples.

Example 1: Rounding of Numbers

I have seen this twice in my career. Once implemented in C and once again in Java. Once it was needed to round to a plain integer and the other time with a give number of digits for a needed precision. In both cases the implementation was done the same way:

  1. The original floating point number was printed into a string.
  2. For the decimal point the position was looked up.
  3. The string was cut at the decimal point and number in front of the point was taken and saved into a new variable.
  4. For checking whether to round up or down, the digit after the point was checked to be ‚5‘ or larger (by character comparison) to know whether to round up or down.
  5. The number in front of the point was converted back into an integer.
  6. If it was to be rounded up, 1 was added.

For the arbitrary precision case, the position of the decimal point was shifted by the number of digits related to the precision.

When I see something like that, I get frustrated. It is a waste of calculation power, because string operations are much more expensive than mathematical calculations and the procedure has several flaws:

  1. What happens if there is an exponential notion of the number? We need to cut at the ‚E‘, too and take care of the number behind it.
  2. If localization is involved, what happens then? I live in Germany and we use a comma as decimal sign and the point is used for thousands.
  3. It is quite hard to assure correct behavior is arbitrary precision is to be applied.

The most simple solution for this issue is always to have a look for an already distributed function like Math.round in Java. A general procedure can be in C code:

It is assumed for that algorithm that a cast just truncates the digits right from the point. This rounding procedure is much fast than string manipulation.

To add a functionality for arbitrary precision rounding, it can be easiest achieved by multiplying the precision. For example assume the function:

To round num to an arbitrary precision we can do the following for two digits:

We have rounded now to two digits.

Attention: I see the discussion coming already, but I do not take floating point number effects into account here. I know, that floating point numbers are not 100% correct due to internal representation, but this is not solved by string manipulation either as soon as the numbers are put back into doubles for example.

2. Example: Calculating Mantissa and Exponent

The task is here to extract the mantissa and the exponent from floating point number into a 4 byte signed integer for the mantissa and a signed byte for the exponent to send everything over the network wire to systems which might have different binary floating point representation, no floating point representation due to usage of fixed decimal values or no exponential representation in strings. This may happen in integration projects. The separation into integers is not a bad idea and the target system can calculate its internal representation as needed.

The procedure I have seen in C code was:

  1. Print the number into a string (via sprintf with „%e“).
  2. Cut the string into two pieces at the ‚e‘ to separate mantissa and exponent.
  3. Convert the exponent into an integer assuming it fits into the byte.
  4. The mantissa string’s dot is removed.
  5. The length of the mantissa string is checked for length and shortened to 9 characters to later fit into the 4 byte integer.
  6. The mantissa is converted into the final 4 byte integer.
  7. The exponent is subtracted by the length of the mantissa string – 1 to get the correct exponent for the large mantissa.

There also some flaws except of the performance impact especially in applications which deal heavily with number which was/is the case in the observed situation:

  1. Again, localization: What happens in Germany with the dot?
  2. It is assumed that the exponential string representation has always one digit in front of the decimal point.
  3. The exponent may become greater than 127 or smaller than -128.

It is much better to do some math here. I have to admit, the math here is not trivial any more because the exponential and logarithm functions need to be applied and the mathematical rules need to be present, but this is assumed to be know by software developers and software engineers. They do not need to know them by heart, but they should know they are there and where to look them up. Here is the mathematical algorithm in words:

  1.  Calculate the logarithm of base 10 of the number to be converted. Remember: log_10 (x)  = log(x) / log(10). Most programming languages only offer the natural logarithm with base e.
  2. Round the exponent from 1. up to an integer. Up rounding is needed later to not exceed the capacity of the 4 byte integer mantissa.
  3. Calculate the mantissa by dividing the number by 10 up to the power of the exponent got from 2.
  4. The number from 3. is guaranteed to be smaller than 1 due to the up-rounding of the exponent in 2.
  5. Multiply the mantissa by 1 billion to get the maximum precision of our two integer exponent representation and round it into our 4 byte integer.
  6. Subtract 9 from the exponent in 2. to get the correct exponent to the new mantissa.

That’s it. It is maybe not that easy to understand mathematically, but it is optimized for computers. A computer is better in computing numbers than in manipulating string.

Conclusion

In our daily work we are confronted with a lot of tasks which are not so easy to solve on first glance. We always should look out for the best solution for the computer to perform, for the developer to implement correctly and for other developers to understand.

Since computers are optimized for computations, mathematical solutions are to be preferred for performance reasons. It can also be assumed that software developers and software engineers have some understanding of mathematics to find correct solutions. If not, there are also forums to ask people for help. It is not very probable that a problem at hand was not solved by someone else, yet. The most problems occurred already somewhere else and a proofed solution is always better than something new.

JUnit test reporting

Some time ago in 2011, I asked a question in stackoverflow about test report enrichment of JUnit tests. Because I was asked several times on this topic afterwards, and today again, I write down  the solution Ideveloped and used that time. Currently, I do not have a better idea, yet.

The issue to solve

For a customer detailed test reports are needed for integration tests which not only show, that everything is pass, but also what the test did and to which requirement it is linked to. These reports should be created automatically to avoid repetitive tasks to save money and to not frustrate valuable developers and engineers.

For that, I thought about a way how to document the more complex integration tests (which is also valuable for component tests). A documentation would be needed for classes and test methods. For the colleagues responsible for QA it is a good help to see to which requirement, Jira ticket or whatever the test is linked to and what the test actually tries to do to double check it. If the report is good enough, it may also be sent out to the customers to prove quality.

The big question now is: How can the documentation for each method and each test class put into the JUnit reports? JUnit 4.9 was used at the beginning of this issue with Maven 3.0 and not I am on JUnit 4.11 and Maven 3.1.0.

The solution

I did what was suggested in stackoverflow, because I did not find a way to enhance the surefire reports directly. The solution for me was the development of a JUnit run listener which collects some additional information about the tests which can be later used to aggregate the reports.

The JUnit listener was attached to maven-surefire-plugin  like:

The listener needs to extend org.junit.runner.notification.RunListener and there you can overwrite the methods testRunStarted, testRunFinished and so forth. The listener needs to be put into a separate Maven project, because a listener cannot be build and used in the same project, so another project is to be created separately.
For the tests itself, I also created an annotation where I put it in the requirements id and other information needed:

A test class can then look like that:
In the run listener one gets the reference to the actual test which can be scanned for annotations via reflection.The Description class is quite handy, one can retrieve the test class as easy as:

The information found there can be stored away, what I did in a simple CSV file. After the tests run, these information can be gathered into a fancy report. The information which can be collection reaches from start timestampes, over the actual test class calls, the result of the tests (assertions and assumptions) to the finish time stamps. The classes can be check via reflection at all times and all information can be written into a file or even to a database or server.
The reporting was done offline later on, when I first used this solution. But, if some more work effort is possible, a Maven report plugin (or a normal plugin) can be written which collects the information in the post-integration-test phase for example. Here the limitation is the imagination only. It depends whether only a technical report is needed as TXT file or a fancy marketing campaign needs to be build around it. In daily life it is something in between.