I tried to resist writing this – not because I think it’s not worth to mention, but because I don’t really like Apple … and because I’m biased, I don’t want to talk bad about them. But I can’t ignore that one of most emerging software enterprises did fail in a really bad way.
The mistake was a really simple one (Sophos did a really good job in describing it here):
Even without deep knowledge of C++, you should be able to see the duplicated “goto fail;” line – that’s why it’s called the “goto fail bug”, now.
I don’t blame the developer to accidentally duplicate a line of code (shit happens), but I really have an issue with the fact that this bug has not been detected by a unit test. The whole purpose of the method is to check an SSL certificate and the call to “final” is essential to test the certificate. Unfortunately (because of the duplication) this line cannot be reached under ANY circumstances.
There are 4 “things” that should have detected the bug before it went to production:
- Compiler warnings: in C#, when you compile some code, the compiler will complain about “unreachable code”. You have to actively insert compiler directives to silence that warning. But may be the compiler Apple uses does not have such a clever compiler – I don’t know.
- Unit tests a): someone must have been inserted the line with the call to SSLHashSHA1.final(). And this person must have done that intentionally – it’s an additional and really important check to make sure the certificate is valid. So: Why the hell is there no unit test with an invalid certificate that tests for exactly this condition?
- Unit tests b): with unit tests you usually have something called “code coverage”, so you can see what code is tested and what is not. When there is code that has not been tested in a unit test, there should be someone who does have a look if that is ok – that person would clearly see the issue and would be able to fix it.
- Manual testing: obviously there is no manual test for a web page that does contain a certificate with a manipulated private key.
So my issue with this bug is NOT that someone did a mistake by duplicating one single line of code – as I already said: shit happens. But it’s really, really bad to not find that kind of issue in a security related area when there are 4 different checks/tests that should have been done and each of them would identify that problem. I really don’t want to think about all the other issues that might still be there inside that code, if they don’t fix compiler warnings, don’t try to write unit tests with high code coverage and they don’t test the scenarios that code should prevent.
There are also some “NSA conspiracy theories” out there, but: “Don’t assume malice when stupidity is an adequate explanation”.
What should you learn from this?
- Never ignore compiler warnings – review your code and fix them!
- Write Unit-Tests for your code, and don’t stop at 100% coverage – you should evaluate if your tests do include the use cases (and misuse cases) you did write the code for.
- Do peer reviews.