Schlagwort-Archive: software

Can Programs Be Made Faster?

Short answer: No. But, more efficient.

A happy new year to all of you! This is the first post in 2014 and it is a (not so) short post about a topic which follows me all the time during discussions about high performance computing. During discussions and in projects I get asked about how programs can be programmed to run faster. The problem is, that this mind set is misleading. It always takes me some minutes to explain the correct mind set: Programs cannot run faster, but more efficient to save time.

If we neglect that we can scale vertically by using faster CPUs, faster memory and faster disks, the speed of a computer is constant (by also neglecting CPUs which change there speed so save power). All programs run always with the same speed and we cannot do anything to speed them up by just changing the programming. What we can do is, to use the hardware we got as efficient as possible. The effect is: We get more done in less time. This reduces the program run time and the software seem to run faster. That is what people mean, but looking on efficiency brings the mind set to find the correct leverages on how to decrease run time.

A soon as a program returns the correct results it is effective, but there is also the efficiency which is to be looked at. Have a look to my post about effectiveness and efficiency for more details about the difference between effectiveness and efficiency. To gain efficiency, we can do the following:

Use all hardware available

All cores of a multi-core CPU can be utilized and all CPUs of the system if we have more than one CPU in the system. GPU or physical accelerator cards can be used for calculation if present.

Especially in brown field projects, where the original code comes from single core systems (before 2005 or so) or system which did not have appropriate GPUs (before 2009), developers did not pay attention multi-threaded, heterogeneous programming. These programs have a lot of potential for performance gains.

Look out for:

CPU utilization

Introduce mutli-thread programming into your software. Check the CPU utilization during an actual run and look for CPU idle tines. If there are any, check your software whether it can do something at the time the idle times occur.

GPU utilization

Introduce OpenCL or CUDA into your software to utilize the GPU board or physics accelerator cards if present. Check the utilization of the cards during calculation and look for optimizations.

Data partitioning for optimal hardware utilization

If a calculation does not need too much data, everything should be loaded into memory to have the data present there for efficient access. Data can also organized to have access in different modes for sake of efficiency. But, if there are calculations with amounts of data which do not fit into memory, a good strategy is needed for not to perform calculations on disk.

The data should be partitioned into smaller pieces. These pieces should fit into memory and the calculations on these pieces should run in memory completely. The bandwidth CPU to memory is about 100 to 1000 faster than CPU to disk. If you have done this, check with tools for cache misses and check whether you can optimize this.

Intelligent, parallel data loading

The bottle neck for calculations are CPU and/or GPU. They need to be utilized, because only they bring relevant results. All other hardware a facilities around that. So, do everything to keep the CPUs and/or GPUs busy. It is not a good idea to load all data into memory (and let CPU/GPU idle), then start a calcuation (everything is busy) to store the results afterwards (and have the CPU/GPU idle again). Develop you software with dynamic data loading. During the time calculations run, new data can be caught from disk to prepare the next calculations. The next calculations can run during the time the former results are written onto disk.This maybe keeps a CPU core busy with IO, but the other cores do meaningful work and the overall utilization increases.

Do not do unnecessary things

Have a look to my post about the seven muda to get an impression about wastes. All these wastes can be found in software and these lead into inefficiency. Everything which does not directly contribute to the expected results of the software needs to be questioned. Everything which uses CPU power, memory bandwidth and disk bandwidth, but is not directly connected to the requested calculation may be treated as potential waste.

To have a starter look for, check and optimize:

Decide early

Decide early, when to abort loops, what calculations to do and how to proceed. Some decisions are made in code on a certain position, but sometimes these checks can be done earlier in code or before loops, because the information is already present. This is something to be checked. During refactorings there might be other, more efficient positions for these checks. Look out for them.

Validate economically

Do not check in functions the validity of your parameters. Check the model parameters at the beginning of the calculations. Do it once and thoroughly. If these checks are sufficient, there should be no illegal state afterwards related to the input data. So they do not need to be checked permanently.

Let it crash

Check only input parameters of functions or methods if a fail of those be fatal (like returning wrong results). Let there be a NullPointerException, IllegalArgumentException and what so ever if something happens. This is OK and exceptions are meant for situations like that. The calculation can be aborted that way and the exception can be caught in a higher function to abort the software or the calculation gracefully, but the cost to check everything permanently is high. On the other side: What will you do when a negative value come into a square root function with double output or the matrix dimensions do not fit in a matrix multiplication? There is no meaningful way to proceed, but to abort the calculation. Check the input model and everything is fine.

Crash early

Include sanity checks in your calculations. As soon as the calculation is not bringing more precision, runs into a wrong result, gives the first nan or inf values or behaves strangely in any way, abort the calculation and let the computer compute something more meaningful. It is a total waste of resources to let a program run, which does not do anything meaningful anymore. It is also very social to let other people calculate stuff in the meantime.

Organize data for efficient access

I have seen software which looks up data in arrays element wise by scanning from the first element to the position where the data is found. This leads into linear time behavior O(n) for the search. This can be done with binary search for instance which brings logarithmic time behavior O(log(n)). Sometimes, it is also possible to hold data in memory in a not normalized way to have access to it in different ways. Sometimes a mapping is needed from index to data and sometimes the other way around. If memory is not an issue, think about keeping the data in memory twice for optimized access.

Conclusion

I hope, I could show how the focus on efficiency can bring the right insights on how to reduce software run times. The correct mind set helps to identify the weak points in software and the selection of the points above should point out some directions to look into software to find inefficiencies. A starting point is presented, but the way to go is different for every project.

5S Methodology and Software Development

The 5s methodology is used to keep the working environment clean, ordered and efficient. I came across this methodology when I was working in a production area for a semiconductor factory. This system was referenced from time to time when internal audits showed some weaknesses in regard to efficiency, order or cleanness. For a short reference on 5s methodology have a look to: http://en.wikipedia.org/wiki/5S_%28methodology%29.

The key topics Sorting, Straightening, Shine, Standardize, and Sustain are quite weak translations of the Japanese words Seiri, Seiton, Seiso, Seiketsu, and Shitsuke due to the wish to translate it into English words which also start with the S letter. Nevertheless, they express roughly the idea behind that and details are explained below. Again like for the post about the Seven Muda, I also translate these topics into the field of Software Engineering as I understand it.

Sorting (Seiri)

This principle has a strong relation to the Muda Inventory, Over-Processing and Over-Production.

Classic meaning:

The meaning here is: Remove everything unnecessary. Check all tools, materials, and machinery and remove everything which is not needed. This cleans out the workspace, makes spaces and removes distraction. The rate of defects decreases due to a lowered risk to use wrong tools or materials and more space means less incidents.

Software Engineering:

For software engineering it is the same, but it is twofold:

  1. For development process: Remove all tools and stuff in your workspace, IDE or PC which is not needed. These tools distract the developer, make the workspace cleaner as mentioned above and also sometimes more stable (everyone who uses Eclipse with a lot of plugins know, what I mean).
  2. For Architecture and Design: Remove all components, interfaces, libraries, and systems which are not needed. These are added as soon as they are needed. For example: To implement everything in patterns right from the beginning does not make much sense, when  not needed, yet. The requirements may change and what was though at the beginning is needed, will not be needed further on. Only use and implement, what is needed and postpone everything else into the future when needed.

Not hitting the correct meaning, but also part of it is duplicate code. Duplicate code is something which is redundant. Redundancy is also something which needs to be cleaned out. Duplicate code is a nightmare for maintainability and should be avoided in all means.

Straightening (Seiton)

This principle has a relationship to Muda Transport, Motion and Waiting.

Classic meaning:

This principle is about straightening the processes. Everything should be processed in an efficient way. Transport ways need to be shortened, motions to be avoided, unnecessary tasks to be eliminated and wait time to be reduced. This principle can only be applied in iterations with close observations.

Software Engineering:

In software engineering this principle can be used as a driving factor for lean architecture and design. The Muda Transport, Motion and Waiting in post about the Seven Muda give hints were to look out in software engineering.

Shine (Seiso)

Classic meaning:

This is about cleaning and ordering the workspace. As soon as everything is clean and ordered, the station is ready for usage. In each shift or on daily basis, cleaning and ordering should be scheduled. On such a workspace, process flaws and defects are better to find and the work is easier, cleaner and safer.

Software Engineering:

For software engineering, I would refer to Clean Code and Refactoring. Write clean code and clean the code as soon as bad smells are detected. This keeps the code clean and erosion is prevented. Bugs are easier to spot in clean code and also easier fixed.

With refactoring the architecture and design stays clean, too. This assures an understandable architecture which support bug fixing and improvements.

These actions should take place during normal work, but also scheduled at the end of sprints for instance. Code Reviews can also set in place in critical parts of the software to assure the right measure of quality. A time budget from 5% to 50% depending of the state of the code should be scheduled. In brown field projects with a lot of legacy code massive cleanup can help to improve the later development of new functionality dramatically. But, also in green field projects erosion takes place and should be fought with a 5% time budget at least. Have a look to the books Refactoring by Martin Fowler and Clean Code by Robert C. Martin for details, or Working Effectively with Legacy Code by Robert C. Martin.

Standardize (Seiketsu)

Classic meaning:

In fabrications all work stations should be standardized, what means that they should look, function and feel all the same. It is more easy to train people on a new station that way, the quality is higher due to a lower defect rate and it is also cheaper if it is possible to reuse procedures, tools and material.

Software Engineering:

For software engineering, there are two possible meanings:

  1. All developers should use the same tools for development. It is more easy then to maintain a development environment where only one IDE is present, one build system, on OS and so forth. Only for testing there may be some variation, but for pure development, it is easier to deal with one kind of tool for one purpose.
  2. Within the software everything should be handled in a standardized way. So the architecture should define standards and also design. For instance it could be a standard that all components of a larger system communicate to each other with a REST interface. There only one REST library is used. It would be worse if all components would talk with another protocol like SOAP, RMI, EJB and so forth. For design it is the same. Exception handling for instance should be defined how it should be done. How are files handled? Coding guide lines and so forth.

Standards help that people can identify parts in larger systems more easily. Understandability and Maintainability improve dramatically.

Sustain (Shitsuke)

The first four points are hard enough to accomplish, but this sustaining point is even harder. What you did and accomplish in the first four sections is a large step to an efficient production environment. This is something which is done in form of a project, but a project is time limited. The real art of 5S is now to keep the state what was accomplished and to even improve it. That’s a huge leap! The goal is to establish a control system which checks for instance from time to time the current situation in a kind of internal audit and to raise the issues found. The issues should be fixed as soon as possible. With a regular check of the other 4S and an improvement of the findings, the current state can be sustained and even improved. But, this needs a lot of attention and energy.

In Software Engineering the buzz words Code Smell (or just Smell), Refactoring and Architecture Refactoring come to mind. As soon as there are bad smells, an action needs to be taken to fix this issue. The longer the issues is present, the more it manifests itself and the harder is it to be fixed. As the Asian proverb says: It is easy to change the direction of a river at its source…

Further Enhancements

Sometimes some enhancements are added. These points express some enhancements which should be taken care of, too. I only explain them shortly, because they are quite self explanatory.

Enhancement to 6S: ‚Get Used to it‘ (Shukan)

This point is often added to the original 5S. With the checks and fixes in the Sustaining part, people get trained to keep an open eye on 5S. Over time everybody should develop a habit of fixing everything which is not in order to have an easier and more efficient life. It is good to create a company culture for 5S.

Safety

Classic meaning:

This is very easy: Keep everything and order and additionally, watch out for sources of accidents and prepare everything that this accidents cannot happen.

Software Engineering:

Here it is about the quality of source code and architecture for accidents like crashes, wrong results, stability and so forth.

Security

Classic meaning:

Keep people out which are not supposed to be in certain area. Keep secrets secret and confidential data confidential.

Software Engineering:

Build  your software in a way that only authenticated people are allowed to change settings which they are authorized to and keep data protected from people which are not allowed to see them.

The Seven Muda (Wastes) and Software Engineering

I was introduced into the term Muda (waste) when I was working for a semiconductor fab. I learned that there a seven of them and that  a close look out to these wastes can reduce costs and increase quality dramatically. A short introduction about it can be found here: http://en.wikipedia.org/wiki/Muda_%28Japanese_term%29.

I want to write about these wastes with a shifted focus. The seven Muda were formulated for classical production and ‚real world‘, but as we can see later on, these principles can be used for software engineering, too. Looking out for these wastes can help to make architecture, design and code lean and clean.

Transportation

Classic meaning:

The classic meaning is the reduction of transportation. This means, one should look out for everything what is transported and how these transports can be minimized. In fabrication it means for example that the transport for goods can be optimized by ordering larger quantities, to use vendors which are closer by and so forth. The transportation costs can be reduced and the margin can be increased. Transportation does not add value to the product, but it adds risk. During any transport operation the risk is there for breaking, loosing and delaying the product.

Software engineering:

Transportation can be translated here for example to IO. Avoid transportation over network, to disc and what so ever. If IO can be reduced, the timing is better, bandwidth is saved and other applications and systems are not affected negatively by a bandwidth exhaustion due to excessive use of IO of one system. To safe IO, good data models should be used for a high reuse of data. Double fetches could be avoided, only data which is needed is fetched from a DB for example and not the whole database is read just in case… Savings here leads directly into more responsive systems, reduced costs for IO facilities and high throughput.

Inventory

Classic meaning:

The meaning of Waste of Inventory is quite easy. It is about the waste to have raw material, finished products and work in progress laying around without the prospect to monetize it. It may or may not be sold. In this state it is a potential waste and should be avoided. A waste of inventory may lead into selling the products under value or even into dumping them.

Software engineering:

In software engineering inventory is twofold:

  1. Source Code: Writing source code which is not requested by the customer (directly or indirectly) may not lead to revenue. So it is a waste of time and also resources to produce it. Only functionality which brings in revenue is to be developed. Everything else is potential waste, like finished products laying in a warehouse without a demand by customers. A lot of money can be saved by letting developers produce software which adds value and brings in money. Also trial and error development (or Programming by Coincidence like described in The Pragmatic Programmer by Andrew Hunt et. al.) is waste. All development versions which are dumped on the way to the final version are waste. Some thoughts in advance can save a lot of time, money and trouble.
  2. Data: With the ‚Big Data‘ discussion the Waste of Inventory is put into public again. Every piece of data stored costs some money. Even one pays storage by cents per gigabyte, the big amount of data makes it expensive. Data should be stored only if needed and selected carefully. Costs for storage can be reduced. Due to date is transported to the storage facilities, transport costs are also reduced.

Motion

Classic meaning:

When transport is about bringing the goods from one facility to another or from one machine to another, than Motion is about the handling of products during the production process. The more handling is involved during production, the more time is needed for that action and risk is added for damage and low quality. Also, motion is mechanical motion and lead into a degeneration of the machines used. For people it is the same. To much handling of products lead to illnesses and other issues which also cost money in form of sick days. Avoiding motion reduces costs for maintenance, sick days and broken products.

Software engineering:

Motion is software engineering is not so easy to define. The closes equivalents in my opinion are:

  1. Motion = unnecessary things done in software. This may be an animation too much which is not needed, but might break the application by its presence and wastes CPU time. It may be a storage operations too much just to be save some data temporarily for a case of power failure, but it the data could be recalculated if needed. There might be a watchdog to much. Defensive programming is find, but too much is not needed. Things done unnecessarily lead into waste of time and resources. The software runs longer and wastes CPU time. Sorry, I do not have a better explanation, but maybe you get the point.
  2. Motion = unnecessary things done during software development process. This can mean unnecessary work due to Programming by Coincidence due to missing design sessions. It can also mean writing unnecessary documentation, design papers and such stuff. It can mean unnecessary meetings, conference calls and status presentations.

Waiting

Classic meaning:

Every product in Work which waits for something, does not add value, consumes space and the delays may lead to a bad reputation. All wait times need to be reduced. This can be done by queue managing. Have a look to the book The Principles of Product Development Flow: Second Generation Lean Product Development by Donald G. Reinertsen for more information.

Software engineering:

In software engineering, the most obvious waste due to waiting would be a programmed delay or sleep in a program to wait for something. This is obviously not a good design. Better do a design with asynchronous execution and notification. A program should always do something meaningful if possible. Do not wait for something to happen. Do something in meanwhile and wait for notification for example or have some processes in parallel which fill up the CPU time of a sleeping thread. All waits are a waste of customers time. This should be avoided, otherwise it leads into frustration without adding value to anything.

Over-processing

Classic meaning:

In classic fabrication this means: You do something better, more accurate or more beautiful than required. The customer is paying for a product with a negotiated specification. This specification needs to be met, but not more. More work on the product will lead into higher production costs, more time needed and more risk for damage without a monetary compensation.

Pay attention: Over processing can be a part of a marketing strategy and a customer satisfaction program. By over delivering a customer may be surprised positively which may lead to a returning customer, a higher order for the next time and so forth. This is not over processing as it is meant above. This is part of a strategy which brings higher revenue in future.

Software engineering:

Over processing is quite the same as in classic engineering. A software product calculates more accurately than needed. The performance tuning was done extensively to get the last microseconds out of the calculations. And there are much more things like that. As long as the product is good enough, we should stop working on features already done. It does not bring more value to the customer.

Here too: Please pay attention for over delivering. This is a magic tool if done right. See above at the classic engineering section.

Over-production

Classic meaning:

Over-production is simply the production of more pieces of a product than needed the time of production. There is a risk that not all products which were produced can be sold. The avoidance of over production reduces costs, reduces the amount of resources needed for production and is also good for the environment.

Software engineering:

Over-production has two meanings in software engineering, as far as I can see it:

  1. Over production of results: A software product which produces more results than needed, wastes resources and time. This is not what customers want and that is also nothing they want to pay for. At least, provide configuration possibilities.
  2. Over production of features: In software engineering (as in all engineering disciplines), engineers tend to over-engineering. Full blood engineers want to make the product perfect, feature rich, shiny and so forth. This might lead to feature bloat. Every feature which is not requested by the customer does not add value. A customer will not pay more money for functions they do not want to use. That’s why a lot of products come in different flavors like community, basic and enterprise version. The customer chooses what features are needed and pays for exactly them.

Defects

Classic meaning:

Defect products need repairing or if they can not be repaired, need to be dumped. Both choices cost money. Additionally, the reputation is influenced negatively which costs future money due to customers not wanting to pay again for a product from the same manufacturer. It becomes even worse as soon as the defect damages something on customer site and the customer asks (un-)politely for regress. A good customer support division can compensate a lot, but this is expensive, too. So: Defects should be avoided. They always waste a lot of money.

Software engineering:

For software defects the same facts are valid like for mechanical engineering. Defects cost money and reputation. So, the best is not to have any. Avoiding defects by excessive testing and quality control is cheaper than handling angry customers, doing failure analysis, bug fixing, patch releasing and loosing future customers.

Additionally as 8th Muda: Latent skill

There is an additional unofficial 8th Muda: Latent skill. Officially, it is spoken about utilizing the skills of employers. People which were hired to fill out a certain position might be able to do much more or more valuable work than what the position requires. These people should be given an oportunity to grow and do what they are capable of. Additionally, a lot of employees want to learn more and want to be trained. It is not only about getting a higher salary, but also about personal grow and satisfaction.

In my opinion there is another site of Latent Skill Waste: It is about machinery. Some high-tech machines are capable of doing more, than there were bought to do. They can be utilized if it is possible. In IT this is were cloud computing was invented. It is partly waste by waiting and waste by latent skill, when servers are not utilized due to too less work. With cloud computing utilization of servers can be increased. This utilization comes in two flavors: Doing more of the same work (reducing waste of waiting) or running other services in parallel (reducing waste due to latent skill). A higher utilization means more revenue and therefore more profit, because the deprecation costs are the same.

A Final Thought

The Muda are not meant to be used for cost reduction in first place. The mind set is not correct, in my opinion. The Muda are about efficiency. To do cost reduction, efficiency needs to be increased, that is correct. But, to think about cost reduction only leads into decisions which might hurt quality and effectiveness. About the difference in mind set and practical approach, I might write about later on in another post.

Thoughts on the Agile Manifesto

From time to time, I discuss the agile methodology with clients and friends. The Agile Manifesto was published at http://agilemanifesto.org about 12 years ago. It was debated a lot and the debates still are going on. I try present here a small inside I had during the last years.

The main statement is

„Manifesto for Agile Software Development

We are uncovering better ways of developing
software by doing it and helping others do it.
Through this work we have come to value:

Individuals and interactions over processes and tools
Working software over comprehensive documentation
Customer collaboration over contract negotiation
Responding to change over following a plan

That is, while there is value in the items on
the right, we value the items on the left more.

Interestingly, the focus is shifted from the product, its documentation and the technical process (the planning) to customer focus and customer satisfaction. This is also part of another system which is called Total Quality Management (http://en.wikipedia.org/wiki/Total_quality_management). The focus shift is very obvious and necessary, when we think about the only income source each company has: The customer. The customer (or different customers if an organization has different services to offer) is the only source for income and therefore for revenue, profit and growth. Any revenue increase can only happen, when customers pay more for services or products or more customers are willing to spend money on the companies services or products. It is therefore obvious, that the focus needs to be on the customer and that the organization needs to be aligned to meet the needs and expectations of customers.

That’s why the first point about ‚Individuals and interactions‘ is the most important point. Translated to easy actions it means: Identify your customers, treat them individually and implement processes for easy communication and interaction. Only customers can tell you what they need, what they expect and what they are ready to pay for. Individual customers treated well, will tell you more detailed, what they need and bring new business ideas. Ask a group of people and there is no detail. But, ask a single individuals and listen closely. You might get a lot of insights.

In software development the main reason of organizations is delivery working software and systems. These are the primary needs of their customers. They do not need a fancy manual to read, what the software might be able to do after spending hours to read the manual and trying things out in tutorials, but they need a software which brings business value. That’s why the second point is important. Have a running, valuable software which is self-explanatory and the customer is willing to pay for it. You can save a tree by dropping the printed documentation. Have a look to Apple products. How much manuals are sold with this complex and feature rich software? Some only help and a self-explanatory UI and everything is fine. This is one of the fundamentals of Total Quality Management: Only the customer can tell you what she wants and she is also the one who pays. Is there another way to work with the knowledge?

The third point is the enhancement of the first point. If the chance is there, try to work with your customer closely together. In Scrum and XP it is done by short release cycles and demos to show the customer progress on regular basis after each release cycle and ask for critics, comments and new ideas. It helps to deliver software which is valuable for the customer and therefore, which is paid for. An even better idea is to embed a representative of the customer into the development team. The responses are immediate and customer’s acceptance testing is done the whole time. The possibility of developing software for what the customer does not want to pay, is reduced dramatically. And again: The customers pays for the product. There is no way to make a better product than to build the product together with your customer and when the customer is part of the team, she is even more engaged and willing to help for development. At the very end, the willing to pay is much higher, when the product was kind of custom built.

By doing all this, be prepared: With each demo, feedback session and communication to the customer, there might be new ideas, comments and critics. The requirements are about to change on daily basis. That’s what the fourth value is about. Be open for changes. Customer only have vague idea at the beginning, about what they want. But, during the development, more ideas arise, some faulty ideas are dropped and new wishes pop up. That’s kind of normal and part of the process. This helps to make the product better at the end and the business value is increased. A product like that can be priced higher, though. What is better than that?

Is „good enough“ good enough?

I thought about the term „good enough“ lately. It came into my mind, that the term is used in a kind of way which is misleading.

In a lot of books about quality and economics (books which deal with the combination of the two) it is written, that development should not be perfect, because it is economically not meaningful, but it should be good enough. The hint is right, but misleading. It is not explained in detail what ‚goo enough‘ really means, at least not in the books I wrote about that topic.

The term good enough is treated in almost all cases I witnessed as: Good enough = Good enough to be sell-able. This leads to a focus on external design, usability and feature bloat to increase market value. The market value in most cases is only judged by the external attributes. Most buyers do not look into the products and judge the internal value, what in most cases also would not make much sense. That’s why colorful products, well-designed and products with a ton of features can be sold best.

There is nothing wrong about the fact that such products can be sold best and most easy, but from the business point of view, this is not enough. What is good enough for the end customer, is not necessarily good enough for the producer of the product. All products have more needs than just to be sell-able. Products need to be recyclable, maintainable, ecological and maybe also upgradable (and much more). These thoughts lead to the conclusion, that there is more to think about than just the market and to be good enough for the market.

In software industry for instance, the source code of a current product is used as foundation for all further products. As soon as the source base gets unmaintainable, all future products are in danger. Software is very complex to develop and therefore, very expensive. A complete new development of a source base of a product family is very expensive and only large companies can handle and survive that.

It is therefore very important, right from the beginning, to spend some effort on code quality. The application of metrics, defect checks and conventions is crucial. Otherwise, the future of a company can be in danger.

Software Development and Licensing Issues: The License Maven Plugin

There a some common pitfalls in software development. Most can be overcome by the developers them self, but when it comes to software licensing, they are completely lost (all developers I met at least). For me, it is the same and I tried to find a lawyer who would know about that, but I did not find any, yet. I still have some questions, but I cannot get them answered professionally, but some website and books give at least some information and hints…

One good source is the book „Understanding Open Source and Free Software Licensing“ (http://shop.oreilly.com/product/9780596005818.do). Here some information is presented on what licenses can be used to produce Open Source Software and proprietary software and which are to be avoided. Terms like Copyright and copy-left are explained. But, do you always know what licenses are used libraries you use and link against?Are you sure, that the licenses of these libraries are correct? May it be possible that an Apache License licensed library uses a GPL licensed library? In that case, the used library is also copy-left… 🙁

At least for Java in combination with Maven there is a possibility to check that automatically, and it is advised to do this regularly with the build system. Maven and maven artifacts should contain a license hint in their poms about the used licenses and the dependencies of them should contain the same hint. So it is possible to check the licenses transitively for all dependencies in your project.

For an easy implementation have a look to https://github.com/RickRainerLudwig/license-maven-plugin and http://oss.puresol-technologies.com/license-maven-plugin.

Quality Assurance in Software Development from its Root

I checked the V-Model of software development again a couple of days ago during my holidays. (For example have a look to Wikipedia: http://en.wikipedia.org/wiki/V-Model_%28software_development%29). The model in general is very helpful to explain different levels of testing and their meaning. It is a difference whether I use unit testing to check different functions of the software or integration testing to check the system as whole to check the different parts working together. From functional point of view this V-Model is a good model, but it lacks the most basic building block of software: The source code itself.

Let us take car manufacturing as an example. The V-Model tells us to specify the basic requirements like the car needs to be able to go from A to B with four passengers and a big trunk. We specify then the system design like it is a person car which has a certain size and shape and go down via engine design to details like the different parts of the engine. We take the requirements and design everything from top to bottom by separating more and more smaller building blocks which are separated again until we end up with the most basic building blocks like screws, nuts and bolts. The test goes in production the other way around. First the screws, nuts and bolts are checked for their correct size before they are put into an engine for example. The engine is tested before it is put into the car and so on.

But, when we look to the car manufacturing, there is something which is still missing. It is the part where engineers think about the maintainability of the car. They already think during design phase of new car models about their maintainability. It would be a catastrophe if a mechanic has to dis-assemble the whole car just to change the front light bulb. (In my car I need to walk my way up to the front light starting from the front wheel! So they did not do a good job here…)

In software development it is the same. We think about the correct behavior of the code, but we forget, that we need to maintain the code in future and we also need to develop it further. So maintainability is a big deal, too. Here we need to pay more attention on software architecture, design and source quality to not block our development for further functionality. It is even more obvious, that we need to do this, when we think about the fact, that the source code of the current product is the base for all following products.

The Killer Difference between Mechanical Engineering and Software Engineering

I always run into the same discussions when it comes to project management and quality assurance in software development. Some people (mostly people without technical background in software development) do not really understand why classic project management and quality assurance methods fail in software development. Here is the killer argument why this does not work:

In hardware engineering for each piece of hardware product the same procedure is applied repeatedly. In software  the procedure is applied once and then the product is copied.

This is now my killer argument in most discussions related to that topic to stop discussions why these methods do fail. The difference should be easy to understand. For a deeper understanding, I explain a little more:

  1. Hardware engineering has a lot of physical constraints which limit the choices and there for timelines are more predictable.
  2. In hardware engineering it is not so easy to define new requirements and the requirements do not change on regular basis like in software development. A car needs to go fast and safe, but it is not to be intended to dive from one day to another. Nobody would think about requesting something like this and expects the car to be able to dive in the next release within a short period of time. For software it is a common pattern, because it is just easily put in, there is only some code to be written…
  3. In hardware production, due to its repeating nature, one can track cp and cpk values and drive the process to engineering. After some prototypes (which are fixed manually later or disposed), the production process is optimized and controlled. The products become better in quality over time due to continuous improvements. That’s classic quality assurance. For software: A product is produced once (and hopefully well tested), but shipped not just once, but up to thousands of time until a bug is reported. Hardware products are all a kind of special. Software is copied.

In this way, I could go on. I just take the classic methodologies and apply them to software development with the difference in mind.

Scrum and Extreme Programming (XP) for Software Quality Assurance

I talked to someone who owned a software company for outsourcing in Asia. The idea was, as always, to provide software developers with lower hourly rates for customer’s software projects. The specialty was a management of Europeans who drove a tight, Scrum based project and company management.

The question about quality assurance arose from my site. All I got as answer was, that the Software is developed with Scrum, with 2 week sprints, the source code and the results are shared on daily basis with the customer and that daily stand up meetings are held, sprint demos are organized together with the customer and some tools were in place for project monitoring. But, there was nothing told about source code quality, testing or code reviews. Assuming I was a customer who outsources its development projects to a company like, I wonder what source code and quality I get. The source code produced with outsourcing, is the property of the customer who pays for that. The customer wants high quality, but what is the interest of the development team in Asia about the quality of the source and the test coverage? I have no concise answer for that.

Scrum, as I understand it, is mainly a kind of project management. Scrum is not for quality management per se, but for satisfying stakeholders. Quality is not build into software, but into the process in form of retrospectives, demos and daily stand ups. Scrum assures deliveries in time and a transparent software development.

If you look into Extreme Programming (XP), that is quite different. Except of the terms, the Scrum methodology is also built into XP, but there is much more. XP also has Quality Assurance potential build in, in form of Pair-Programming, Test-Driven-Development (TDD), dedicated testers and other elements. Would it not be more suitable for companies providing outsourcing services?

Similarities in Quality Assurance between Cambodian Silk Painting and Software Development

I am currently travelling South-East-Asia and have some time to think and to absorb new ideas and impressions. Beside the personal interest and impressions, I have some time to think about technical and quality stuff as well.

For some clients, I have trouble to explain the difference in the quality processes between software development and classical production processes. Today, while visiting a workshop in Siem Reap, Cambodia, I saw how silk paintings are painted. The interesting fact was how the paintings are copied to sell a picture multiple times and to be able to sell it as originally hand-painted. There was a master painting standing at the wall. The painters used it as template and each painter draws her pictures as similar to the template as possible. A quality inspection checks the work afterwards on the final product to select work to be sold as class A or class B, which is to be reworked and what is to be disposed.

I found it a good metaphor for software development, with one minor tweak. The metaphor (and the tweak) is want I want to explain in the next paragraphs.

Quality Management was developed originally for building heavy machinery like cars. All statistics and processes are developed for a single process building a single product in mass production. The better the process is, the better the outcome quality is. Therefore, the focus is on the process and the measurement of the process capabilities and dealing with the facts coming out of the measurement process. Quality Management for heavy machinery is very well-developed and mature and has proven to lead to high quality if used accordingly. The main focus is therefore on the manufacturing process.

In Semiconductor Industry, the process is still present, but the products are produced in parallel with thousands of chips per wafer in a single production process. The statistics become more difficult and needs to be adopted. The chips which are on a single wafer are not statistically independent and the statistics need to be applied on per wafer level. The process is still in focus, but there is already an issue to use the production process which is applied once for a lot of chips in parallel as quality measure. The process and its process parameters are not enough to know the quality for all chips on the wafer. There are too much uncertainties. A detailed final test on each chip is needed to be sure about the quality of each chip.

It is the complexity of the parallel production process which breaks our quality process neck here. One process builds in parallel the same product, but the process can have issues which are not measured with process parameters. The process is monitored classically, but the the parallel production can not be monitored detailed enough to use the process parameters alone for quality assurance.

The software development process fits neither the car production nor the semiconductor process. It is even more complex and no recurring production process is present. The process can be compared with the silk painting mentioned at the beginning: Each ticket (It can also be called work item, work package, enhancement or change on source code. It doesn’t matter how you call it.)  is worked on manually by an individual and all of these changes are totally independent to each other. There is no common process for creating source and no statistics can be applied. A process can also not be introduced because each ticket is different and there is no common work pattern for the source code change itself.

In painting, I also have seen pictures painted by two persons in parallel. It is more complex since you need to control two people without a fixed manufacturing process. A final product inspection is the only chance to assure quality.

In Software industry there is nothing like a practical production process which is applied regularly and the first result coming out of the process is copied hundreds, thousands or even millions of times and shipped to the final customer. There is also not an easy statistical approach to the software making procedure. It is true for silk painting, too.

The difference and point of increased difficulty in software development is the fact that there is even no template for the source code change to make. This is the twaek in the silk painting metaphor. The source is built every time completely independently without any practical constraints during the work.

For silk painting the quality control is a check afterwards on the final product where the quality department checks for the correctness of the work. This is something which can be done and is done by manual and automated source code reviews. This is a first step in the right direction, but what is the right measure? What is the template to apply?

In software development the template are coding conventions, API rules, design rules and architecture constraints. These constraints need to enforced as rigorously as possible. For Java as example, there are tools available like Findbugs, Checkstyle, PMD and others. These tools need to be set in process to bring a certain quality into the actual work. This targets the maintainability of the software. This is comparable with the silk painting process itself and how good the painters are in copying the template. How good the template is, is a question of the template painter. In software development it is the quality of the design, the architecture and the rules for the source code to be check automatically.

I find this similarities quite interesting and something to think about it in the next time. Maybe there is something coming up…