No root-cause of production failures
Statement 4
We have failures during manufacturing with unknown root-causes
41% of the participants recognize this problem. The question, of course, is whether knowledge of root-causes is needed.
Let us have a look at a company that makes complete systems. Nowadays that type of production process is more or less an assembly factory. One takes a bunch of units, mounts them according to procedures, follows up with an adjustment procedure if needed, and ends up with a final working system. That is how it should be. There is no reason why a well-developed product would fail after assembly and adjustment. Each failure has a specific reason.
In reality, faults found during manufacturing are usually analyzed and solved right away. There are many types of failure:
- A component needs rework because of burrs.
- A PCB is replaced
- Bad contacts
- Broken cable
- Broken spring
- Interface problems
- etc.
Of course, replacing or repairing is a good corrective action to get a working system now. The important questions are:
- "Why do we need to repair or replace? We don't want it"
- "What do we do with the defect?"
Depending on the product, the defects are sent back to the supplier or just scrapped. And after some time they come back. Repaired. Or as a new one. Or it stays silent forever. And no one, at least at 41% of the participants, asks what the root-cause of the defect was. Why is that important?
Failures found during manufacturing, especially during the start-up of a new product, are the very first signals indicating a possible weak link in the design. These failures are not accidents or incidents. They are treasures. Diamonds. This is the ultimate chance to solve a potential problem before the bulk goes onto the market. Once it is on the market it costs a lot more money to solve it. This is the last opportunity to make a better product. Here is an example of a product that had been on the market for quite some time:
- "Why do you have so many rejects in the factory?"
- "We have a yield of 90% and that is part of the price"
- "So, the rejects are not analyzed?"
- "Of course not, we have thousands of them, far too many"
- "Why? Analyze a sample"
- So we took a sample of 200 rejects.
- In the analysis we found 4 main causes, accounting for about 90% of all rejects.
- 2 of these main causes were directly related to increasing customer complaints!
- The defective part costs €3, but a repair within warranty costs €350, the equivalent of half a million.
- After a redesign the yield rose to 98% and hardly any customer complaints were heard.
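The sample analysis in this example is essentially a Pareto count: tally the rejects per cause and see how few causes cover most of the sample. A minimal sketch of that idea follows. The cause names and counts are invented for illustration (the article only states 200 rejects and 4 main causes covering about 90%), and the function name `main_causes` is an assumption.

```python
from collections import Counter


def main_causes(causes, coverage=0.90):
    """Return the most frequent causes that together reach `coverage` of all rejects.

    Each entry is (cause, count, cumulative share of the sample).
    """
    counts = Counter(causes)
    total = sum(counts.values())
    selected = []
    cumulative = 0
    for cause, n in counts.most_common():
        if cumulative / total >= coverage:
            break  # the causes collected so far already cover enough rejects
        cumulative += n
        selected.append((cause, n, cumulative / total))
    return selected


# Hypothetical sample of 200 analyzed rejects; names and counts are made up.
sample = (
    ["cause A"] * 80 + ["cause B"] * 50 + ["cause C"] * 30 + ["cause D"] * 20
    + ["other 1"] * 8 + ["other 2"] * 7 + ["other 3"] * 5
)

result = main_causes(sample)
for cause, n, share in result:
    print(f"{cause}: {n} rejects, cumulative {share:.0%}")
```

With thousands of rejects it is not necessary to analyze them all: a sample of this size is already enough to show where the bulk of the rejects comes from.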
These main causes could have been found and taken care of much earlier, since the same failures also occurred during startup, on a much smaller scale. No specific action was taken because the factory had to start. Time to market is important. OK, but in the meantime the root-cause should be analyzed as well.
"I put my foot with 90 kg body weight onto it" is a good root-cause.
"There was a burr on that plate and therefore it jammed" is not a good root-cause. Of course it was the reason it didn't work. But it was not the root-cause! "We did the rework, after which it was fine". And that was the end of it. Even if it occurred more often. That burr was caused by the supplier. But why? No final inspection there? Inadequate processing? Educate your suppliers. If burrs are present but not severe enough to cause jamming, how sure can you be that they will not cause a failure at the customer? Well, they did.
"There was a defective capacitor on the PCB. It is replaced and now it works". "Why did you have a defective capacitor? There must be a reason for it."
Ask the WHY question 4 times and you will know.
- "The filling mechanism was broken in a new product, found at start up of the production"
- "WHY?"
- "The motor jammed"
- "WHY?"
- "The lever could not retract"
- "WHY?"
- "The spring was broken"
- "WHY?"
- "Because of metal fatigue"
Aha. An expensive, complete filling mechanism had to be replaced because of a cheap spring that broke due to metal fatigue. Is that bad? Yes, because of the metal fatigue. Metal fatigue is a phenomenon (Physics of Failure) that can always happen in metal parts in the long term. Because it was happening so early in the life cycle, alarm bells should ring. A diamond has been found. All products are suspect, even the good ones delivered to the field. Analysis proved that the spring was overstressed.
For lack of a root-cause analysis, the filling mechanism had to be replaced many times until someone found out that something was rotten. The fact that it occurred so early in the factory was just luck. "You hit the jackpot". Usually it takes a little longer before a customer finds out. Then it depends on the service logistics whether the alarm bell rings at the right place. Usually that takes much longer. Let alone the solution. And what about the remaining products at (often unknown) customers?
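The WHY chain for the filling mechanism can be written down as a small record, so that the answers are not lost once the repair is done. This is only a sketch: the class `WhyChain` and its methods are hypothetical names, and the last answer in the chain is no more than a candidate root-cause that still has to be confirmed by analysis (here, the overstressed spring).

```python
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class WhyChain:
    """A failure plus the successive answers to the WHY question."""
    failure: str
    answers: List[str] = field(default_factory=list)

    def why(self, answer: str) -> "WhyChain":
        # Each call records the answer to one more WHY.
        self.answers.append(answer)
        return self

    def deepest_cause(self) -> Optional[str]:
        # The last recorded answer is only a candidate root-cause.
        return self.answers[-1] if self.answers else None


chain = (
    WhyChain("The filling mechanism was broken")
    .why("The motor jammed")
    .why("The lever could not retract")
    .why("The spring was broken")
    .why("Metal fatigue")
)

print(f"{len(chain.answers)} x WHY -> {chain.deepest_cause()}")
```

A repair report that stops at "mechanism replaced" corresponds to an empty chain, which is exactly the situation in which the same replacement keeps coming back.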
This type of analysis sounds nice, but "it doesn't work since you depend on the cooperation of your suppliers". Right. See also article <ST09 Reliability of suppliers>. A reliability-aware supplier is always interested in the quality / reliability of his products. Maybe a little pressure is needed, since a supplier does not always know the impact of those defects.
It is also a very good idea to try to get all defects, even a bolt or a nut, back from the customer during the first year of production. The biggest problem is the service logistics and the related costs. In those cases where it was successful, a lot of reliability information was gathered, the products were improved and the future warranty costs dropped rapidly. A controlled pilot area can also be helpful if logistics is a problem.
The same is valid during the development track.
- A prototype didn't work
- Caused by a defective transistor
- After replacement the product worked fine, the defective part was thrown away and one just continued.
- "Why was that transistor defective?"
- That was not important since the product worked again after repair. Just bad luck.
- A missed opportunity, because later during normal production this transistor caused problems.
- The defective transistor was just a weak sample that died because the normal stress was a little too much.
- During normal production one gets transistors with a variety of strengths, and the weakest will fail too early. This again was an example of hitting the jackpot. For free.
- The design proved to be a little critical as indicated by the first failure of this kind.
- Now the same information was received from the field for a much higher price.
Certainly during start-up of the production it is important that developers pay attention. A common excuse is that development has no time. In fact it is not given the right priority top-down, because otherwise the time to market of the next product under development is endangered. But wait a minute. Something is wrong here. If development has no time to solve it NOW, why can they afford to spend much more time later on, once customers complain enough to get it fixed? Because they are customers. And by then it often takes more time to solve a problem that shouldn't be there. For more money. Because now there can be a lot of products out in the field suffering from the same potential problem. In the end it results in a longer time to market for the new product and a higher price for the current product, or at least less profit or even a loss. The next action could be a cost price reduction, but that also costs money and time. A no-win situation. Prevention is better.