Tag Archives: development

Test Approaches for Green and Brown Field Projects

The picture on the right shows the relationships between the different test levels. In green field projects, these levels act as test stages. Usually, the stages are worked through from bottom to top and by different parties or members of the team.

 

 

Green Field Projects (and also new code in brown field if possible)

Note that in Green Field Projects, testing (and code writing) usually proceeds in the following order:

  1. Writing Code (developer)
  2. Checking Code (developer and CI system with plugins)
  3. Writing Unit Tests (performed as block in TDD with steps 1 and 2)
  4. Writing Component Tests (developer and QA)
  5. Writing Integration Tests (developer and QA)
  6. Setting up automated GUI tests (QA)
  7. Manual Testing (QA and, if needed, developers guided by QA)
  8. Endurance Tests (QA)
  9. Performance Tests
  10. Load Tests (Stress Tests)
  11. Decoupled: Penetration Tests (continuously by all, but systematically by QA)

The first three steps go hand in hand. In TDD the numbering seems to be reordered to 3, 1 and 2, but test code is code as well.

From one step to the next, the tests become less detailed but cover more use cases and real-world scenarios. Their configuration and setup also become more and more realistic with respect to the target environment. Even patch levels need to be controlled, if needed.

Brown Field Projects

In Brown Field Projects it is the other way around. A legacy code base exists and is not under test. Writing unit tests first is not a viable approach here, because it is too much work and in most cases it takes a long time to understand the old legacy code. As a rough rule for a good unit test suite, one calculates with at least 1 unit test for every 100 lines. For a legacy code base of only 100,000 lines, this would mean writing at least 1,000 meaningful unit tests. This work would really suck…

The approach here is to write integration tests first to secure the basic functionality: the basic use cases need to work correctly. Integration tests cannot check every execution path, but they can check the basic behavior, the correct results for the most important use cases and the error handling for the most common error scenarios.
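A minimal sketch of what such a first check could look like, written in plain Java without a test framework. LegacyInvoiceService and its 19% tax rule are purely hypothetical stand-ins for an untested legacy entry point. The checks pin down one main use case and one common error scenario from the outside, without touching the internals:

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.List;

/** Hypothetical stand-in for an untested legacy entry point. */
final class LegacyInvoiceService {

    /** Sums the item prices (in cents) and adds 19% tax, as the assumed legacy system does. */
    static long totalInCents(List<Long> itemPricesInCents) {
        if (itemPricesInCents == null || itemPricesInCents.isEmpty()) {
            throw new IllegalArgumentException("no items");
        }
        long net = 0;
        for (long price : itemPricesInCents) {
            net += price;
        }
        return net + Math.round(net * 0.19);
    }
}

/** Coarse checks from the outside: one basic use case, one common error scenario. */
final class LegacyInvoiceChecks {

    static boolean basicUseCaseHolds() {
        // 1000 + 2000 = 3000 net, plus 570 tax
        return LegacyInvoiceService.totalInCents(Arrays.asList(1000L, 2000L)) == 3570L;
    }

    static boolean emptyInputIsRejected() {
        try {
            LegacyInvoiceService.totalInCents(Collections.<Long>emptyList());
            return false;
        } catch (IllegalArgumentException expected) {
            return true;
        }
    }
}
```

In a real project these checks would live in the test framework already in use and would run against the deployed system rather than a single class.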

As soon as functionality is added, changed or extended, new integration tests are implemented first to assure that the functionality still needed stays unchanged and that the new functionality is tested as required. Unit tests and other tests are implemented as needed and practical.

As soon as an integration test fails, the failure analysis reveals the details of the issue. These details are then checked with additionally developed component and unit tests. As long as this procedure is followed, more and more meaningful unit tests are developed and test coverage grows.

The same applies to GUI tests. At first, manual GUI testing is the only option. Later on, the most common use cases and the general GUI behavior can be tested automatically. This also leads to better test coverage.

In Brown Field Projects the tests of the different test groups are also developed in parallel. Integration tests are developed in parallel with manual testing, which assures that the software can currently be delivered. The situation at hand dictates what is to be done.

Hadoop Client in WildFly – A Difficult Marriage

(This article was triggered by a question „Hadoop Jersey conflicts with Wildfly resteasy“ on StackOverflow, because I hit the same wall…)

For a current project, I am evaluating Hadoop 2.7.1 for handling data. The current idea is to use Hadoop’s HDFS, HBase and Spark to handle larger amounts of data (1 TB range, no real Big Data).

The current demonstrator implementation uses Cassandra and Titan as databases. Due to some developments with Cassandra and Titan (Aurelius was acquired by DataStax), the stack does not seem to be future-proof. An alternative is to be evaluated.

The first goal is to use the Hadoop client

in WildFly 9.0.1. (The content of this article should also be valid for WildFly >=8.1.0.) At first, HDFS is to be used to store and retrieve raw files.

Setting up Hadoop in pseudo-distributed mode as described in „Hadoop: The Definitive Guide“ was a breeze. I was full of hope, added the dependency above to an EJB Maven module and wanted to use the client to connect to HDFS to store and retrieve single files. This is where the problems started…

I cannot provide all stack traces and error messages anymore, but roughly, this is what happened (one after another; when I removed one obstacle, the next came up):

  • Duplicate providers for Weld were reported as errors due to multiple providers in the Hadoop client. Several JARs are loaded as implicit bean archives, because they include JavaEE annotations. I did not expect that, and it seems strange to find them in a client library which is mainly used in a Java SE context.
  • The client dependency is not self-contained. At compile time, an issue arose due to missing libraries.
  • The client libraries contain dependencies which provide web applications. These applications are also loaded and WildFly tries to initialize them, but fails due to missing libraries which are set to provided, but not included in WildFly (but maybe in Geronimo?). Again, I am puzzled why something like that is packaged in a client library.
  • Due to providers delivered in sub-dependencies of the Hadoop client, the JSON provider was switched from Jackson 2 (default since WildFly 8.1.0) back to Jackson 1. This caused infinite recursions in trees I need to marshal into JSON, because the com.fasterxml.jackson.* annotations were not recognized anymore and the org.codehaus.jackson.* annotations were not provided.

The issues are manifold and are caused by a very strange, not to say messy, packaging of the Hadoop client.

Following are the solutions so far:

Broken implicit bean archives

Several JARs contain JavaEE annotations, which causes them to be loaded as implicit bean archives (see: http://weld.cdi-spec.org/documentation/#4). Implicit bean archive support needs to be switched off. For WildFly, it looks like this:

Change the Weld settings in WildFly’s standalone.conf from

to

This switches the implicit bean archive handling off. All libraries used for CDI need to have a /META-INF/beans.xml  file now. (After switching off the implicit archives, I found a lot of libraries with missing beans.xml  files.)

Missing dependencies

I added the following dependencies to fix the compile/linkage issues:

Services provided, but not working

After switching off the implicit bean archives and adding new dependencies to get the project compiled, I ran into issues during deployment. Mostly, these were missing runtime dependencies due to missing injection providers.

The first goal was to shut off all the (hopefully) unneeded stuff which was getting in the way. I excluded the Hadoop MapReduce Client App and JobClient (no idea what these are for). Additionally, I excluded the Jackson dependencies, because they are already provided by the WildFly container.

Broken JSON marshalling in RestEasy

After all the fixes above, the project compiled and deployed successfully. During testing I found that JSON marshalling was broken: I got infinite recursions while marshalling my file trees. It drove me crazy to find the cause. I was almost sure that WildFly 9 had switched the default Jackson implementation back to Jackson 1, but I did not find any release note for that. After a long while and some good luck, I found a YarnJacksonJaxbJsonProvider class which forces the container to use Jackson 1 instead of Jackson 2, messing up my application…

That was the point at which I decided (maybe too late) that I need a kind of galvanic isolation. Hadoop client and WildFly need to talk through a proxy of some kind, sharing no dependencies except one common interface.

Current Solution

I have now created a Hadoop connector EAR archive which contains the above-mentioned, fixed Hadoop client dependencies. Additionally, I created a Remote EJB and added it to the EAR; it provides the proxy for using Hadoop. The proxy implements a Remote Interface which is also used by the client. The client performs a lookup on the remote interface of the EJB. That setup seems to work so far…

The only drawback of this scenario at the moment is that I cannot pass streams through the EJB, because streams cannot be serialized. I am thinking about creating a REST interface for Hadoop, but I have no idea about the performance. Additionally, will the integration of HBase be as difficult as this!?
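One conceivable way around the stream limitation, which is not what the setup above does, is to let the common interface exchange plain byte arrays in chunks, since byte arrays are serializable. The following is a sketch under that assumption. The names FileStoreProxy and InMemoryFileStoreProxy are hypothetical, the EJB annotations are only mentioned in comments so that the sketch compiles with plain Java SE, and an in-memory map stands in for the Hadoop client:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

/**
 * Hypothetical common interface shared by client and connector.
 * In the real setup it would carry the EJB @Remote annotation.
 */
interface FileStoreProxy {

    void store(String path, byte[] content);

    /** Returns at most maxLength bytes starting at offset; an empty array signals the end of the file. */
    byte[] readChunk(String path, long offset, int maxLength);
}

/** In-memory stand-in for the connector bean that would wrap the Hadoop client. */
final class InMemoryFileStoreProxy implements FileStoreProxy {

    private final Map<String, byte[]> files = new HashMap<>();

    @Override
    public void store(String path, byte[] content) {
        files.put(path, content.clone());
    }

    @Override
    public byte[] readChunk(String path, long offset, int maxLength) {
        byte[] content = files.get(path);
        if (content == null) {
            throw new IllegalArgumentException("unknown path: " + path);
        }
        if (offset < 0 || maxLength < 0) {
            throw new IllegalArgumentException("offset and maxLength must not be negative");
        }
        if (offset >= content.length) {
            return new byte[0];
        }
        int from = (int) offset;
        int to = (int) Math.min((long) content.length, offset + maxLength);
        return Arrays.copyOfRange(content, from, to);
    }
}
```

A client would call readChunk in a loop with increasing offsets until an empty array comes back. Whether the per-call overhead of remote EJB invocations is acceptable for large files would have to be measured.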

For the next versions, maybe a fix can be in place. I found a Jira ticket HDFS-2261 „AOP unit tests are not getting compiled or run“.

Software Engineering and the 4 Causes of Aristotle

I stumbled on the principle of the 4 causes of Aristotle some years ago when I did some reading about cause and effect, leadership and management. The question is: How do we get things done? When do we get things done? And also very important: How do we get things done in the right way?

The Four Causes of Aristotle

Everything that happens has four causes. No more and no less: exactly four. Only if all four causes are present can something come into existence. This is what Aristotle formulated in one of his most famous books: Physica. (See also: http://en.wikipedia.org/wiki/Four_causes)

In the next sections I describe the four causes. These causes appear in the order I present them. To give an accessible example, I use the picture of building a house. You will see how the four causes apply there.

Causa Finalis: The final cause

The causa finalis or final cause is the basic reason for anything to happen. You can also translate it as need, requirement or wish. Only something like that provides the trigger to start a development.

In our house building example, we can think of it as the need to move to a new place because of a lack of space. For instance, a couple lives happily in a small flat, but a child is about to be born. They find that the place is too small with a small child: there is no good option for a children’s room, the bathroom is too narrow… They realize that the situation is changing and that a need for more space is coming up.

This is the causa finalis. A need or requirement which is not detailed, yet. There is just an issue to be solved, but there is not a detailed plan, yet. But, the final result is formulated. In our case: More space.

Causa Formalis: The form

Once the causa finalis is present, the second cause follows: the causa formalis or form. The final cause showed an issue and its abstract final solution; now the issue is thought through and a plan or form is formulated. The causa formalis brings the vision of how the causa finalis can be fulfilled.

In the house building example, our couple may have thought about renting the flat next door and removing a wall (which may not be allowed by the owner), moving into a bigger flat (which might be too expensive), or building a house. Maybe, after deciding to build a house, they go to an architect and plan how large the house will be, what it will look like, and so forth. At the end of this process, a clear vision exists of how the situation is to be solved. After the abstract final result, more space, the plan now has a real, detailed form, but is still not physical.

Causa Materialis: The material

Once a clear vision exists thanks to the causa formalis, this vision needs to be turned into the final solution somehow. For that, the last two causes are needed. The next one is the causa materialis or the material. For anything to happen, the material needs to be present, or in other words: the boundary conditions need to be met.

In the house building example, the material is what the house is built from, like stones, concrete, wood and so forth, but also the knowledge of how to build it. The physical material is obviously needed, but so is the knowledge. Without these, there is no chance that a house can be built.

Other things are needed as boundary conditions as well, like some ground to put the house on, time to build it and so forth. This is also part of the causa materialis.

Causa Efficiens: The execution

After the need started the process, the vision was formed and the material was organized, the last cause is the execution or causa efficiens. A trainer of mine once told me: A vision without execution is just hallucination. He is right. One can dream of the best things, have everything in place and so forth, but without execution nothing happens. That’s what the last cause is about. In this sense, it also links to the blog post I wrote some years ago about The Trinity of Action.

In the house building example, this is the actual building of the house. We now have everything in place to carry out the actual solution. We have the need, which gives us drive and energy, we have a vision and a plan, we have the material and the knowledge, and the last step is to put the material together with the knowledge we have to get the actual house built.

The Dependencies of the Four Causes

In most cases the four causes are listed in a different order, but in my opinion this is not optimal. They appear strictly in the order I described above.

In my opinion, the first cause is the causa finalis. It is a principle of nature that nothing takes place without energy. Without the need, requirement or issue, there is no energy to change the status quo. A causa finalis is always needed as the seed for any change. I cannot think of a situation where it is different.

The second cause, after the seeding by the causa finalis, is always a kind of plan. A lot of actions in our world seem to be performed without a plan, but a closer look reveals that even there a plan exists. It may not be well thought through, or may be based only on a pattern or experience, but there is one.

The causa materialis may also come second, because the material for the solution might already be there, but it is not seen as such without a kind of plan. On the other hand, a plan also reveals which material is missing and needs to be organized or waited for.

At the very end, the action, the causa efficiens, can take place, but without a plan or material nothing can be done.

The principle of the Four Causes helps me a lot in my daily private and professional life. On the one hand, it helps me to understand what is happening; on the other hand, it gives me a guideline for how to work in certain situations, because it gives me an order for what to work on.

The 4 Causes in Software Engineering

There is not much left to write. In Software Engineering, these principles are at work as well.

At first, a customer has a need to be satisfied, which leads to some requirements; these are the causa finalis. It is formulated what the final outcome has to be. After that, architecture and design documents are written and planning is done; these are the causa formalis. With the organization of hardware, software, developers, office space and everything else, the boundary conditions for the actual development are met; this is the causa materialis. The final development is the causa efficiens.

The trick is to reflect on the current status of a software project from time to time and to think about the four causes to find out whether all of them are present. If something is missing, the project cannot be finished successfully:

  1. If the causa finalis is not met, no customer will pay due to a lack of need.
  2. If the causa formalis is not met, no customer will pay, because the product is not useful.
  3. If the causa materialis is not met, the product cannot be developed, shipped or run. Again, nobody will pay.
  4. Finally, if the causa efficiens is missing, the actual product is not built and cannot be sold either.

That’s all the secret in here…

JavaEE: Arquillian Tests support Multi-WAR-EARs

Before version 1.0.2, Arquillian did not support EAR deployments which contained multiple WAR files. The issue was the selection of the WAR into which the Arquillian artifacts are to be placed for testing. The trick until then was to remove all WAR files not needed and to leave only one WAR within the EAR.

Starting from version 1.0.2 Arquillian does support Multi-WAR-EAR-Deployments as described in http://arquillian.org/blog/2012/07/25/arquillian-core-1-0-2-Final.

If you have an EAR which is used via

you can select a WAR with the following lines:

That’s it. After this selection Arquillian stops complaining about multiple WAR files within the EAR and the selected WAR is enriched and tested.

JavaEE WebSockets and Periodic Message Delivery

For a project I had the need to implement a monitoring functionality based on HTML5 and WebSockets. It is quite trivial with JavaEE 7, as I will explain below.

Let us assume the easy requirement of a simple monitoring which sends periodic status information to web clients. The web client shall show the information on a web page (inside a <div>…</div> for instance). For that scenario, the technical details are shown below…

The JavaScript code is quite easy and can be taken from JavaScript WebSocket books and tutorials. (A good introduction is, for instance, Java WebSocket Programming by Oracle Press.) A simple client might look like:

The functions for onopen, onclose and onerror are omitted, because we want to focus on JavaEE. The important stuff is shown above: with new WebSocket we connect to the URL which provides the periodic updates, and with onmessage we put the data somewhere into our web page. That’s it from the client side.

For JavaEE, there is a lot of documentation which shows how to create @ServerEndpoint classes. For instance:

But, how to make it send periodic messages easily? After some testing on WildFly 8.2, I came to this simple solution:

The trick is to make the @ServerEndpoint class also an EJB @Singleton. The @Singleton assures that only one instance lives at a time, and this instance can also keep the session provided during @OnOpen. In other words: the server endpoint instance is exactly the same instance the scheduler runs on. Without @Singleton, multiple instances may exist, so the session field may not be set in the instance on which @Schedule runs, which leads to a NullPointerException if not checked for.

Can Programs Be Made Faster?

Short answer: No. But, more efficient.

A happy new year to all of you! This is the first post in 2014 and it is a (not so) short post about a topic which keeps following me in discussions about high performance computing. In discussions and in projects I get asked how programs can be written to run faster. The problem is that this mind set is misleading. It always takes me some minutes to explain the correct mind set: programs cannot run faster, but they can run more efficiently and thereby save time.

If we neglect that we can scale vertically by using faster CPUs, faster memory and faster disks, the speed of a computer is constant (also neglecting CPUs which change their speed to save power). All programs always run at the same speed, and we cannot speed them up just by changing the programming. What we can do is use the hardware we have as efficiently as possible. The effect is: we get more done in less time. This reduces the program run time and the software seems to run faster. That is what people mean, but looking at efficiency puts us in the mind set to find the right levers for decreasing run time.

As soon as a program returns the correct results it is effective, but there is also the efficiency to be looked at. Have a look at my post about effectiveness and efficiency for more details about the difference between the two. To gain efficiency, we can do the following:

Use all hardware available

All cores of a multi-core CPU can be utilized, and all CPUs of the system if there is more than one. GPUs or physics accelerator cards can be used for calculation if present.

Especially in brown field projects, where the original code comes from single-core systems (before 2005 or so) or from systems which did not have appropriate GPUs (before 2009), developers did not pay attention to multi-threaded, heterogeneous programming. These programs have a lot of potential for performance gains.

Look out for:

CPU utilization

Introduce multi-threaded programming into your software. Check the CPU utilization during an actual run and look for CPU idle times. If there are any, check whether your software can do something useful at the time the idle times occur.
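As a small illustration, here is a sketch using Java parallel streams, which is only one of several ways to spread work across cores. The class name ParallelSum and the sum-of-squares workload are arbitrary choices for the example. The calculation is the same in both methods; only the second one lets the runtime split it across the worker threads of the common fork-join pool:

```java
import java.util.stream.LongStream;

final class ParallelSum {

    /** Sum of squares from 1 to n, computed on the calling thread only. */
    static long sequential(long n) {
        return LongStream.rangeClosed(1, n).map(i -> i * i).sum();
    }

    /** Same calculation, split by the runtime across the common fork-join pool. */
    static long parallel(long n) {
        return LongStream.rangeClosed(1, n).parallel().map(i -> i * i).sum();
    }
}
```

For very small inputs the coordination overhead can outweigh the gain, so it is worth watching the CPU utilization of a realistic run before and after such a change.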

GPU utilization

Introduce OpenCL or CUDA into your software to utilize the GPU board or physics accelerator cards if present. Check the utilization of the cards during calculation and look for optimizations.

Data partitioning for optimal hardware utilization

If a calculation does not need too much data, everything should be loaded into memory so that the data is present there for efficient access. Data can also be organized to allow access in different modes for the sake of efficiency. But if a calculation involves amounts of data which do not fit into memory, a good strategy is needed to avoid performing calculations on disk.

The data should be partitioned into smaller pieces. These pieces should fit into memory and the calculations on these pieces should run completely in memory. The bandwidth from CPU to memory is about 100 to 1000 times higher than from CPU to disk. Once you have done this, check for cache misses with tools and see whether you can optimize this.
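A minimal sketch of the partitioning idea, assuming the calculation can be expressed as a running aggregate over the pieces. The class name ChunkedSum and the byte-sum workload are made up for the example. Only one chunk of chunkSize bytes is held in memory at any time, no matter how large the underlying stream is:

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

final class ChunkedSum {

    /** Sums all bytes of a stream as unsigned values, holding at most chunkSize bytes in memory. */
    static long sumUnsignedBytes(InputStream in, int chunkSize) {
        if (chunkSize <= 0) {
            throw new IllegalArgumentException("chunkSize must be positive");
        }
        byte[] chunk = new byte[chunkSize];
        long sum = 0;
        try {
            int read;
            // each read fills the buffer with the next piece; the inner loop runs purely in memory
            while ((read = in.read(chunk)) != -1) {
                for (int i = 0; i < read; i++) {
                    sum += chunk[i] & 0xFF;
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return sum;
    }
}
```

The chunk size is the tuning knob here: large enough to keep IO efficient, small enough that the working set stays in memory.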

Intelligent, parallel data loading

The bottleneck for calculations is the CPU and/or GPU. They need to be utilized, because only they produce relevant results. All other hardware is just facilities around them. So, do everything to keep the CPUs and/or GPUs busy. It is not a good idea to load all data into memory (and let the CPU/GPU idle), then start a calculation (everything is busy) and store the results afterwards (and have the CPU/GPU idle again). Develop your software with dynamic data loading. While calculations run, new data can be fetched from disk to prepare the next calculations. The next calculations can run while the former results are written to disk. This may keep one CPU core busy with IO, but the other cores do meaningful work and the overall utilization increases.
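The overlap of loading and calculating can be sketched with a loader thread and a small bounded queue. This is an illustration under simplifying assumptions: the class name PrefetchPipeline is made up, the "disk" is just a list of arrays, and copying an array stands in for the actual read. While the calling thread sums one chunk, the loader already prepares the next ones:

```java
import java.util.List;
import java.util.concurrent.ArrayBlockingQueue;
import java.util.concurrent.BlockingQueue;

final class PrefetchPipeline {

    /** Marker object telling the consumer that no more chunks will arrive. */
    private static final long[] END = new long[0];

    static long sumAll(List<long[]> chunksOnDisk) {
        // bounded queue: the loader runs at most two chunks ahead of the calculation
        BlockingQueue<long[]> loaded = new ArrayBlockingQueue<>(2);

        Thread loader = new Thread(() -> {
            try {
                for (long[] chunk : chunksOnDisk) {
                    loaded.put(chunk.clone()); // stands in for reading a chunk from disk
                }
                loaded.put(END);
            } catch (InterruptedException e) {
                Thread.currentThread().interrupt();
            }
        });
        loader.setDaemon(true);
        loader.start();

        long total = 0;
        try {
            // calculation on the calling thread; END is recognized by identity
            for (long[] chunk = loaded.take(); chunk != END; chunk = loaded.take()) {
                for (long value : chunk) {
                    total += value;
                }
            }
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException("interrupted while waiting for data", e);
        }
        return total;
    }
}
```

Writing results back could be decoupled in the same way with a second queue and a writer thread.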

Do not do unnecessary things

Have a look at my post about the seven muda to get an impression of wastes. All these wastes can be found in software and they lead to inefficiency. Everything which does not directly contribute to the expected results of the software needs to be questioned. Everything which uses CPU power, memory bandwidth and disk bandwidth, but is not directly connected to the requested calculation, may be treated as potential waste.

As a starting point, look for, check and optimize the following:

Decide early

Decide early when to abort loops, which calculations to do and how to proceed. Some decisions are made at a certain position in the code, but sometimes these checks can be done earlier in the code or before loops, because the information is already present. This is something to be checked. During refactorings, other, more efficient positions for these checks might turn up. Look out for them.
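A small sketch of moving a decision in front of a loop. The class name DecideEarly and the optional weights parameter are invented for the example. Whether weights are present is known before the loop starts, so the second variant decides once instead of once per element:

```java
final class DecideEarly {

    /** The decision "are there weights?" is re-evaluated for every element. */
    static double weightedSumLate(double[] values, double[] weights) {
        double sum = 0.0;
        for (int i = 0; i < values.length; i++) {
            double weight = (weights == null) ? 1.0 : weights[i];
            sum += weight * values[i];
        }
        return sum;
    }

    /** The same decision is made once, before any loop runs. */
    static double weightedSumEarly(double[] values, double[] weights) {
        double sum = 0.0;
        if (weights == null) {
            for (double value : values) {
                sum += value;
            }
            return sum;
        }
        for (int i = 0; i < values.length; i++) {
            sum += weights[i] * values[i];
        }
        return sum;
    }
}
```

A JIT compiler may perform some of this hoisting on its own, so the real gain should be confirmed with a measurement. The pattern matters most when the check itself is expensive.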

Validate economically

Do not check the validity of your parameters inside functions. Check the model parameters at the beginning of the calculations. Do it once and thoroughly. If these checks are sufficient, there should be no illegal state related to the input data afterwards, so the data does not need to be checked over and over again.
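A sketch of this idea with a made-up model, a list of circle radii, and a hypothetical class ValidateOnce. The whole input is checked once at the entry point, and the inner function relies on that check instead of repeating it for every call:

```java
final class ValidateOnce {

    /** Thorough check of the whole input model, done once before the calculation starts. */
    static void validate(double[] radii) {
        if (radii == null) {
            throw new IllegalArgumentException("radii must not be null");
        }
        for (int i = 0; i < radii.length; i++) {
            double r = radii[i];
            if (Double.isNaN(r) || Double.isInfinite(r) || r < 0.0) {
                throw new IllegalArgumentException("invalid radius at index " + i + ": " + r);
            }
        }
    }

    /** Inner function: relies on the validated model and does not re-check its parameter. */
    private static double circleArea(double radius) {
        return Math.PI * radius * radius;
    }

    static double totalArea(double[] radii) {
        validate(radii);
        double total = 0.0;
        for (double r : radii) {
            total += circleArea(r);
        }
        return total;
    }
}
```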

Let it crash

Check input parameters of functions or methods only if a failure there would be fatal (like returning wrong results). Let there be a NullPointerException, IllegalArgumentException or whatever if something goes wrong. This is OK; exceptions are meant for situations like that. The calculation can be aborted that way and the exception can be caught in a higher-level function to abort the software or the calculation gracefully, whereas the cost of checking everything permanently is high. Besides: what will you do when a negative value comes into a square root function with double output, or when the matrix dimensions do not fit in a matrix multiplication? There is no meaningful way to proceed other than to abort the calculation. Check the input model and everything is fine.

Crash early

Include sanity checks in your calculations. As soon as the calculation does not gain any more precision, runs into a wrong result, gives the first NaN or inf values or behaves strangely in any way, abort the calculation and let the computer compute something more meaningful. It is a total waste of resources to let a program run which does not do anything meaningful anymore. It is also very social to let other people calculate stuff in the meantime.
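A sketch of such sanity checks for a simple fixed-point iteration. The class CrashEarly, its parameters and the choice of ArithmeticException are assumptions made for the example. The loop aborts on the first non-finite value and after a fixed step budget instead of running on without producing anything useful:

```java
import java.util.function.DoubleUnaryOperator;

final class CrashEarly {

    /** Iterates x = f(x) until two successive values differ by at most tolerance. */
    static double iterate(DoubleUnaryOperator f, double start, double tolerance, int maxSteps) {
        double x = start;
        for (int step = 1; step <= maxSteps; step++) {
            double next = f.applyAsDouble(x);
            // sanity check: abort on the first NaN or infinite value
            if (!Double.isFinite(next)) {
                throw new ArithmeticException("non-finite value in step " + step);
            }
            if (Math.abs(next - x) <= tolerance) {
                return next;
            }
            x = next;
        }
        // step budget used up without reaching the requested precision
        throw new ArithmeticException("no convergence after " + maxSteps + " steps");
    }
}
```

For example, iterating x -> 0.5 * (x + 2.0 / x) from 1.0 converges towards the square root of 2, while iterating x -> x * x from 10.0 overflows after a few steps and is aborted by the check.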

Organize data for efficient access

I have seen software which looks up data in arrays element by element, scanning from the first element to the position where the data is found. This leads to linear time behavior O(n) for the search. On sorted data, this can be done with binary search, for instance, which brings logarithmic time behavior O(log(n)). Sometimes, it is also possible to hold data in memory in a denormalized way to allow access to it in different ways. Sometimes a mapping is needed from index to data and sometimes the other way around. If memory is not an issue, think about keeping the data in memory twice for optimized access.
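Both ideas in a short sketch. The class names SortedLookup and TwoWayIndex are made up for the example. The binary search relies on java.util.Arrays.binarySearch and therefore requires an array sorted in ascending order; distinct elements are assumed so that both searches return the same index. TwoWayIndex keeps the same data twice, once per access direction:

```java
import java.util.Arrays;
import java.util.HashMap;
import java.util.Map;

final class SortedLookup {

    /** O(n): scans from the first element until the key is found. */
    static int linearIndexOf(int[] data, int key) {
        for (int i = 0; i < data.length; i++) {
            if (data[i] == key) {
                return i;
            }
        }
        return -1;
    }

    /** O(log n): requires data to be sorted in ascending order. */
    static int binaryIndexOf(int[] sortedData, int key) {
        int index = Arrays.binarySearch(sortedData, key);
        return index >= 0 ? index : -1;
    }
}

/** Keeps the data twice in memory: index to name and name to index. */
final class TwoWayIndex {

    private final String[] nameByIndex;
    private final Map<String, Integer> indexByName = new HashMap<>();

    TwoWayIndex(String... names) {
        nameByIndex = names.clone();
        for (int i = 0; i < names.length; i++) {
            indexByName.put(names[i], i);
        }
    }

    String nameOf(int index) {
        return nameByIndex[index];
    }

    int indexOf(String name) {
        Integer index = indexByName.get(name);
        return index == null ? -1 : index;
    }
}
```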

Conclusion

I hope I could show how the focus on efficiency brings the right insights into how to reduce software run times. The correct mind set helps to identify the weak points in software, and the selection of points above should indicate some directions in which to look for inefficiencies. A starting point is presented, but the way to go is different for every project.

5S Methodology and Software Development

The 5S methodology is used to keep the working environment clean, ordered and efficient. I came across this methodology when I was working in a production area of a semiconductor factory. This system was referenced from time to time when internal audits showed some weaknesses with regard to efficiency, order or cleanliness. For a short reference on the 5S methodology have a look at: http://en.wikipedia.org/wiki/5S_%28methodology%29.

The key topics Sorting, Straightening, Shine, Standardize, and Sustain are quite weak translations of the Japanese words Seiri, Seiton, Seiso, Seiketsu, and Shitsuke due to the wish to translate it into English words which also start with the S letter. Nevertheless, they express roughly the idea behind that and details are explained below. Again like for the post about the Seven Muda, I also translate these topics into the field of Software Engineering as I understand it.

Sorting (Seiri)

This principle has a strong relation to the Muda Inventory, Over-Processing and Over-Production.

Classic meaning:

The meaning here is: remove everything unnecessary. Check all tools, materials, and machinery and remove everything which is not needed. This clears out the workspace, frees up space and removes distraction. The rate of defects decreases due to a lower risk of using wrong tools or materials, and more space means fewer incidents.

Software Engineering:

For software engineering it is the same, but it is twofold:

  1. For the development process: Remove all tools and stuff in your workspace, IDE or PC which are not needed. These tools distract the developer; removing them makes the workspace cleaner, as mentioned above, and sometimes also more stable (everyone who uses Eclipse with a lot of plugins knows what I mean).
  2. For Architecture and Design: Remove all components, interfaces, libraries, and systems which are not needed. They are added as soon as they are needed. For example: implementing everything in patterns right from the beginning does not make much sense when it is not needed yet. The requirements may change, and what was thought to be needed at the beginning may not be needed later on. Only use and implement what is needed, and postpone everything else until it is needed.

Not exactly the original meaning, but also part of it, is duplicate code. Duplicate code is redundant, and redundancy is also something which needs to be cleaned out. Duplicate code is a nightmare for maintainability and should be avoided by all means.

Straightening (Seiton)

This principle has a relationship to Muda Transport, Motion and Waiting.

Classic meaning:

This principle is about straightening the processes. Everything should be processed in an efficient way. Transport routes need to be shortened, motions avoided, unnecessary tasks eliminated and wait times reduced. This principle can only be applied in iterations with close observation.

Software Engineering:

In software engineering this principle can be used as a driving factor for lean architecture and design. The Muda Transport, Motion and Waiting in the post about the Seven Muda give hints where to look in software engineering.

Shine (Seiso)

Classic meaning:

This is about cleaning and ordering the workspace. As soon as everything is clean and ordered, the station is ready for use. Cleaning and ordering should be scheduled in each shift or on a daily basis. In such a workspace, process flaws and defects are easier to find and the work is easier, cleaner and safer.

Software Engineering:

For software engineering, I would refer to Clean Code and Refactoring. Write clean code and clean the code as soon as bad smells are detected. This keeps the code clean and prevents erosion. Bugs are easier to spot in clean code and also easier to fix.

With refactoring, the architecture and design stay clean, too. This assures an understandable architecture which supports bug fixing and improvements.

These actions should take place during normal work, but should also be scheduled at the end of sprints, for instance. Code Reviews can also be put in place for critical parts of the software to assure the right level of quality. A time budget of 5% to 50%, depending on the state of the code, should be scheduled. In brown field projects with a lot of legacy code, massive cleanup can dramatically improve the later development of new functionality. But erosion also takes place in green field projects and should be fought with a time budget of at least 5%. For details, have a look at the books Refactoring by Martin Fowler and Clean Code by Robert C. Martin, or Working Effectively with Legacy Code by Michael C. Feathers.

Standardize (Seiketsu)

Classic meaning:

In manufacturing, all work stations should be standardized, which means that they should all look, function and feel the same. That way it is easier to train people on a new station, the quality is higher due to a lower defect rate, and it is also cheaper if procedures, tools and material can be reused.

Software Engineering:

For software engineering, there are two possible meanings:

  1. All developers should use the same tools for development. It is easier to maintain a development environment where only one IDE is present, one build system, one OS and so forth. Only for testing may there be some variation, but for pure development, it is easier to deal with one kind of tool for one purpose.
  2. Within the software everything should be handled in a standardized way. So the architecture should define standards, and so should the design. For instance, it could be a standard that all components of a larger system communicate with each other via a REST interface, and that only one REST library is used for this. It would be worse if every component talked a different protocol like SOAP, RMI, EJB and so forth. For design it is the same. Exception handling, for instance, should be defined: how is it to be done? How are files handled? Coding guidelines and so forth.

Standards help people to identify parts in larger systems more easily. Understandability and maintainability improve dramatically.

Sustain (Shitsuke)

The first four points are hard enough to accomplish, but this sustaining point is even harder. What you did and accomplished in the first four sections is a large step towards an efficient production environment. This is something which is done in the form of a project, but a project is time limited. The real art of 5S is now to keep the state that was accomplished and even to improve it. That’s a huge leap! The goal is to establish a control system which, for instance, checks the current situation from time to time in a kind of internal audit and raises the issues found. The issues should be fixed as soon as possible. With a regular check of the other 4S and a fix for each finding, the current state can be sustained and even improved. But this needs a lot of attention and energy.

In Software Engineering the buzzwords Code Smell (or just Smell), Refactoring and Architecture Refactoring come to mind. As soon as there are bad smells, action needs to be taken to fix the issue. The longer an issue is present, the more it manifests itself and the harder it is to fix. As the Asian proverb says: It is easy to change the direction of a river at its source…

Further Enhancements

Sometimes further points are added to the original 5S. They express enhancements which should be taken care of, too. I only explain them briefly, because they are quite self-explanatory.

Enhancement to 6S: ‚Get Used to it‘ (Shukan)

This point is often added to the original 5S. With the checks and fixes in the Sustaining part, people get trained to keep an open eye on 5S. Over time everybody should develop a habit of fixing everything which is not in order, so that life becomes easier and more efficient. It is good to create a company culture for 5S.

Safety

Classic meaning:

This is very easy: Keep everything in order and additionally, watch out for sources of accidents and prepare everything so that these accidents cannot happen.

Software Engineering:

Here it is about the quality of source code and architecture with regard to accidents like crashes, wrong results, instability and so forth.

Security

Classic meaning:

Keep people out who are not supposed to be in a certain area. Keep secrets secret and confidential data confidential.

Software Engineering:

Build your software in a way that only authenticated people can change the settings they are authorized to change, and keep data protected from people who are not allowed to see it.

The Seven Muda (Wastes) and Software Engineering

I was introduced to the term Muda (waste) when I was working for a semiconductor fab. I learned that there are seven of them and that a close lookout for these wastes can reduce costs and increase quality dramatically. A short introduction can be found here: http://en.wikipedia.org/wiki/Muda_%28Japanese_term%29.

I want to write about these wastes with a shifted focus. The seven Muda were formulated for classical production and ‚real world‘, but as we can see later on, these principles can be used for software engineering, too. Looking out for these wastes can help to make architecture, design and code lean and clean.

Transportation

Classic meaning:

The classic meaning is the reduction of transportation. This means one should look out for everything that is transported and for how these transports can be minimized. In fabrication it means, for example, that the transport of goods can be optimized by ordering larger quantities, by using vendors which are closer by and so forth. The transportation costs can be reduced and the margin can be increased. Transportation does not add value to the product, but it adds risk. During any transport operation there is a risk of breaking, losing or delaying the product.

Software engineering:

Transportation can be translated here, for example, to IO. Avoid transportation over the network, to disk or whatsoever. If IO can be reduced, the timing is better, bandwidth is saved and other applications and systems are not affected negatively by bandwidth exhaustion due to the excessive IO of one system. To save IO, good data models should be used for a high reuse of data. Double fetches can be avoided, and only the data which is needed should be fetched from a DB, for example, instead of reading the whole database just in case… Savings here lead directly to more responsive systems, reduced costs for IO facilities and higher throughput.

Inventory

Classic meaning:

The meaning of Waste of Inventory is quite easy. It is about the waste of having raw material, finished products and work in progress lying around without the prospect of monetizing it. It may or may not be sold. In this state it is a potential waste and should be avoided. A waste of inventory may lead to selling the products under value or even to dumping them.

Software engineering:

In software engineering inventory is twofold:

  1. Source Code: Writing source code which is not requested by the customer (directly or indirectly) may not lead to revenue. So it is a waste of time and also of resources to produce it. Only functionality which brings in revenue is to be developed. Everything else is potential waste, like finished products lying in a warehouse without demand from customers. A lot of money can be saved by letting developers produce software which adds value and brings in money. Trial and error development (or Programming by Coincidence, as described in The Pragmatic Programmer by Andrew Hunt et al.) is waste, too. All development versions which are dumped on the way to the final version are waste. Some thought in advance can save a lot of time, money and trouble.
  2. Data: With the ‚Big Data‘ discussion the Waste of Inventory has come into public view again. Every piece of data stored costs some money. Even if one pays only cents per gigabyte for storage, the sheer amount of data makes it expensive. Data should be stored only if needed and should be selected carefully. That way the costs for storage can be reduced. Because data has to be transported to the storage facilities, transport costs are reduced as well.

Motion

Classic meaning:

While transport is about bringing the goods from one facility to another or from one machine to another, Motion is about the handling of products during the production process. The more handling is involved during production, the more time is needed and the more risk is added for damage and low quality. Also, motion is mechanical motion and leads to wear of the machines used. For people it is the same. Too much handling of products leads to illnesses and other issues which also cost money in the form of sick days. Avoiding motion reduces costs for maintenance, sick days and broken products.

Software engineering:

Motion in software engineering is not so easy to define. The closest equivalents in my opinion are:

  1. Motion = unnecessary things done in software. This may be one animation too many, which is not needed, but might break the application by its presence and wastes CPU time. It may be one storage operation too many, just to save some data temporarily in case of a power failure, although the data could be recalculated if needed. There might be one watchdog too many. Defensive programming is fine, but too much of it is not needed. Things done unnecessarily lead to a waste of time and resources. The software runs longer and wastes CPU time. Sorry, I do not have a better explanation, but maybe you get the point.
  2. Motion = unnecessary things done during the software development process. This can mean unnecessary work due to Programming by Coincidence because of missing design sessions. It can also mean writing unnecessary documentation, design papers and such stuff. It can mean unnecessary meetings, conference calls and status presentations.

Waiting

Classic meaning:

Every work-in-progress product which waits for something does not add value and consumes space, and the delays may lead to a bad reputation. All wait times need to be reduced. This can be done by queue management. Have a look at the book The Principles of Product Development Flow: Second Generation Lean Product Development by Donald G. Reinertsen for more information.

Software engineering:

In software engineering, the most obvious waste due to waiting would be a programmed delay or sleep in a program to wait for something. This is obviously not a good design. A design with asynchronous execution and notification is better. A program should always do something meaningful if possible. Do not wait for something to happen. Do something in the meantime and wait for a notification, for example, or have some processes in parallel which fill up the CPU time of a sleeping thread. All waits are a waste of the customer’s time. This should be avoided, otherwise it leads to frustration without adding value to anything.
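Here is a small sketch of what asynchronous execution with notification could look like, using a thread pool from the Python standard library. The function slow_lookup is only a hypothetical stand-in for a slow operation such as a remote call, and the numbers are arbitrary.

```python
from concurrent.futures import ThreadPoolExecutor


def slow_lookup():
    """Hypothetical stand-in for a slow operation such as a remote call."""
    return sum(range(1000))


notifications = []  # filled by the callback once the background job is done

with ThreadPoolExecutor(max_workers=1) as pool:
    # Start the slow job in the background instead of waiting for it.
    future = pool.submit(slow_lookup)
    # Ask to be notified when the job finishes; no polling loop, no sleep.
    future.add_done_callback(lambda done: notifications.append(done.result()))
    # Do something meaningful in the meantime.
    local_work = [n * n for n in range(5)]
    # Block only at the point where the value is really needed.
    result = future.result(timeout=5)
```

The main thread never sleeps for a fixed time. It either works or, at the very end, waits exactly as long as the background job still needs.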

Over-processing

Classic meaning:

In classic fabrication this means: You do something better, more accurately or more beautifully than required. The customer is paying for a product with a negotiated specification. This specification needs to be met, but not exceeded. More work on the product will lead to higher production costs, more time needed and more risk of damage without monetary compensation.

Pay attention: Over processing can be a part of a marketing strategy and a customer satisfaction program. By over delivering a customer may be surprised positively which may lead to a returning customer, a higher order for the next time and so forth. This is not over processing as it is meant above. This is part of a strategy which brings higher revenue in future.

Software engineering:

Over processing is quite the same as in classic engineering. A software product calculates more accurately than needed. The performance tuning was done extensively to get the last microseconds out of the calculations. And there are many more things like that. As long as the product is good enough, we should stop working on features which are already done. It does not bring more value to the customer.

Here too: Please pay attention to over delivering. This is a magic tool if done right. See above in the classic meaning section.

Over-production

Classic meaning:

Over-production is simply the production of more pieces of a product than are needed at the time of production. There is a risk that not all products which were produced can be sold. The avoidance of over-production reduces costs, reduces the amount of resources needed for production and is also good for the environment.

Software engineering:

Over-production has two meanings in software engineering, as far as I can see it:

  1. Over production of results: A software product which produces more results than needed wastes resources and time. This is not what customers want and it is also nothing they want to pay for. At least, provide configuration possibilities.
  2. Over production of features: In software engineering (as in all engineering disciplines), engineers tend to over-engineer. Full-blooded engineers want to make the product perfect, feature rich, shiny and so forth. This might lead to feature bloat. Every feature which is not requested by the customer does not add value. A customer will not pay more money for functions they do not want to use. That’s why a lot of products come in different flavors like community, basic and enterprise versions. The customer chooses which features are needed and pays for exactly those.
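The configuration possibility mentioned for the over production of results could, as a rough sketch, look like this. The report fields and the order data are invented for the example; the point is only that a result is calculated when it is asked for and not before.

```python
def build_report(order, fields=("id", "total")):
    """Return only the requested fields; anything else is not calculated."""
    # Hypothetical calculators: each result is produced only on demand.
    calculators = {
        "id": lambda o: o["id"],
        "total": lambda o: sum(o["prices"]),
        "average": lambda o: sum(o["prices"]) / len(o["prices"]),
        "item_count": lambda o: len(o["prices"]),
    }
    return {name: calculators[name](order) for name in fields}


order = {"id": 7, "prices": [10.0, 20.0, 30.0]}

# The default configuration delivers the small report most users need.
default_report = build_report(order)

# A customer who wants more asks for it explicitly.
full_report = build_report(order, fields=("id", "total", "average", "item_count"))
```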

Defects

Classic meaning:

Defective products need repairing or, if they cannot be repaired, need to be dumped. Both choices cost money. Additionally, the reputation is influenced negatively, which costs future money because customers do not want to pay again for a product from the same manufacturer. It becomes even worse as soon as the defect damages something at the customer’s site and the customer asks (un-)politely for compensation. A good customer support division can compensate for a lot, but this is expensive, too. So: Defects should be avoided. They always waste a lot of money.

Software engineering:

For software defects the same is true as in mechanical engineering. Defects cost money and reputation. So, the best thing is not to have any. Avoiding defects by extensive testing and quality control is cheaper than handling angry customers, doing failure analysis, fixing bugs, releasing patches and losing future customers.

Additionally as 8th Muda: Latent skill

There is an additional unofficial 8th Muda: Latent skill. Officially, it is about utilizing the skills of employees. People who were hired to fill a certain position might be able to do much more, or more valuable, work than the position requires. These people should be given an opportunity to grow and do what they are capable of. Additionally, a lot of employees want to learn more and want to be trained. It is not only about getting a higher salary, but also about personal growth and satisfaction.

In my opinion there is another side to Latent Skill Waste: It is about machinery. Some high-tech machines are capable of doing more than they were bought to do. That capacity can be utilized where possible. In IT this is where cloud computing comes from. It is partly waste by waiting and partly waste by latent skill when servers are not utilized because there is too little work. With cloud computing the utilization of servers can be increased. This utilization comes in two flavors: Doing more of the same work (reducing waste of waiting) or running other services in parallel (reducing waste due to latent skill). A higher utilization means more revenue and therefore more profit, because the depreciation costs are the same.

A Final Thought

The Muda are not meant to be used for cost reduction in the first place. That mindset is not correct, in my opinion. The Muda are about efficiency. To reduce costs, efficiency needs to be increased, that is correct. But thinking about cost reduction only leads to decisions which might hurt quality and effectiveness. I might write about the difference in mindset and practical approach later on in another post.

Thoughts on the Agile Manifesto

From time to time, I discuss the agile methodology with clients and friends. The Agile Manifesto was published at http://agilemanifesto.org about 12 years ago. It was debated a lot and the debates are still going on. I try to present here a small insight I had during the last years.

The main statement is

„Manifesto for Agile Software Development

We are uncovering better ways of developing
software by doing it and helping others do it.
Through this work we have come to value:

Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan

That is, while there is value in the items on
the right, we value the items on the left more.“

Interestingly, the focus is shifted from the product, its documentation and the technical process (the planning) to customer focus and customer satisfaction. This is also part of another system which is called Total Quality Management (http://en.wikipedia.org/wiki/Total_quality_management). The focus shift is very obvious and necessary when we think about the only income source each company has: the customer. The customer (or different customers if an organization has different services to offer) is the only source of income and therefore of revenue, profit and growth. Any revenue increase can only happen when customers pay more for services or products, or when more customers are willing to spend money on the company’s services or products. It is therefore obvious that the focus needs to be on the customer and that the organization needs to be aligned to meet the needs and expectations of customers.

That’s why the first point about ‚Individuals and interactions‘ is the most important point. Translated into easy actions it means: Identify your customers, treat them individually and implement processes for easy communication and interaction. Only customers can tell you what they need, what they expect and what they are ready to pay for. Individual customers who are treated well will tell you in more detail what they need and bring new business ideas. Ask a group of people and there is no detail. But ask a single individual and listen closely. You might get a lot of insights.

In software development the main purpose of organizations is delivering working software and systems. These are the primary needs of their customers. They do not need a fancy manual in which to read what the software might be able to do after hours of reading and trying things out in tutorials; they need software which brings business value. That’s why the second point is important. Have running, valuable software which is self-explanatory, and the customer is willing to pay for it. You can save a tree by dropping the printed documentation. Have a look at Apple products. How many manuals are sold with this complex and feature rich software? Some online help and a self-explanatory UI, and everything is fine. This is one of the fundamentals of Total Quality Management: Only the customer can tell you what she wants and she is also the one who pays. Is there another way to work with this knowledge?

The third point is an enhancement of the first point. If the chance is there, try to work closely together with your customer. In Scrum and XP this is done by short release cycles and demos which show the customer the progress on a regular basis after each release cycle and ask for criticism, comments and new ideas. It helps to deliver software which is valuable for the customer and which is therefore paid for. An even better idea is to embed a representative of the customer into the development team. The responses are immediate and the customer’s acceptance testing is done the whole time. The possibility of developing software which the customer does not want to pay for is reduced dramatically. And again: The customer pays for the product. There is no better way to make a product than to build it together with your customer, and when the customer is part of the team, she is even more engaged and willing to help with development. At the very end, the willingness to pay is much higher when the product was kind of custom built.

By doing all this, be prepared: With each demo, feedback session and communication with the customer, there might be new ideas, comments and criticism. The requirements may change on a daily basis. That’s what the fourth value is about. Be open to changes. Customers only have a vague idea at the beginning of what they want. But during the development more ideas arise, some faulty ideas are dropped and new wishes pop up. That’s kind of normal and part of the process. This helps to make the product better in the end, and the business value is increased. A product like that can be priced higher. What is better than that?

Is „good enough“ good enough?

I thought about the term „good enough“ lately. It came to my mind that the term is used in a way which is misleading.

In a lot of books about quality and economics (books which deal with the combination of the two) it is written that development should not be perfect, because that is economically not meaningful, but that it should be good enough. The hint is right, but misleading. It is not explained in detail what ‚good enough‘ really means, at least not in the books I read about that topic.

The term good enough is treated in almost all cases I witnessed as: Good enough = Good enough to be sell-able. This leads to a focus on external design, usability and feature bloat to increase market value. The market value in most cases is only judged by the external attributes. Most buyers do not look into the products and judge the internal value, which in most cases would not make much sense either. That’s why colorful, well-designed products with a ton of features can be sold best.

There is nothing wrong with the fact that such products can be sold best and most easily, but from the business point of view this is not enough. What is good enough for the end customer is not necessarily good enough for the producer of the product. All products have more needs than just to be sell-able. Products need to be recyclable, maintainable, ecological and maybe also upgradable (and much more). These thoughts lead to the conclusion that there is more to think about than just the market and being good enough for the market.

In the software industry, for instance, the source code of a current product is used as the foundation for all further products. As soon as the source base gets unmaintainable, all future products are in danger. Software is very complex to develop and therefore very expensive. A completely new development of the source base of a product family is very expensive, and only large companies can handle and survive that.

It is therefore very important, right from the beginning, to spend some effort on code quality. The application of metrics, defect checks and conventions is crucial. Otherwise, the future of a company can be in danger.
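To give an idea of what the application of a metric can look like, here is a minimal sketch which flags functions that are longer than an agreed limit. The limit of 5 lines and the sample code are assumptions for the demo only; real projects would rather use an established static analysis tool, and the right thresholds are a team decision. It needs Python 3.8 or later.

```python
import ast

MAX_FUNCTION_LINES = 5  # hypothetical team convention, chosen small for the demo


def long_functions(source, limit=MAX_FUNCTION_LINES):
    """Return (name, length) for every function longer than `limit` lines."""
    findings = []
    for node in ast.walk(ast.parse(source)):
        if isinstance(node, (ast.FunctionDef, ast.AsyncFunctionDef)):
            # Length in source lines, from the def line to the last statement.
            length = node.end_lineno - node.lineno + 1
            if length > limit:
                findings.append((node.name, length))
    return findings


# Made-up sample code to check.
sample = '''
def short():
    return 1

def too_long():
    a = 1
    b = 2
    c = 3
    d = 4
    e = 5
    return a + b + c + d + e
'''

findings = long_functions(sample)
```

A check like this can run on every commit, so that erosion is noticed while it is still cheap to fix.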