What is the problem?
Is it a reliability problem?
If you suffer from reliability problems year after year, it is all too easy to blame the bottom-uppers, such as:
- The developers. True, they create the reliability. But the main question is whether it really is a reliability problem. If it is, two solutions could be:
  - Replace the responsible developers with better ones
  - Give them better development tools
- The reliability engineers, because they were not able to use their crystal ball in the right way. In other words:
  - Replace the responsible engineers with better ones
  - Give them better reliability tools
Or is it an unreliability problem?
If unreliability is the primary cause, then the problem lies in the top-down chain and not the bottom-up one. Solutions therefore have to be found in that top-down chain. The difficulty is that:
- This chain can be long, and any chain is only as strong as its weakest link
- The people responsible would have to replace themselves.
The strange thing is that the top-down chain is fully unaware of its contribution to unreliability. A solution could therefore be basic training in just those aspects that matter for taking the right decisions. In general, the top-down chain is interested in only two items:
- Time to market. The faster, the better.
- Cost price. The lower, the better.
It is like sitting on a chair with only two legs. This is not stable, and sooner or later one will fall. To create some stability one needs a third leg, called Reliability. The problem is that time to market and cost price have a big influence on the size of that third leg. Often it is a tiny leg: present, but of no use.
To make it more complicated, in many cases the problem is a combination of reliability and unreliability. Both can be cured with the right treatment.