Static code analysis failures are costing enterprises money and reputation.
White-box security testing is inherently a flawed proposition for many reasons, but it all comes down to a very simple concept:
Machines do not execute source code, they execute machine code (compiled code). --Paul Anderson (GrammaTech)
If you think this through for a minute, you realize there are a few specific reasons why the statement above fundamentally changes the way people look at white-box testing, and why it is a losing proposition. Let's analyze this in the context of a web application project for a mythical online bank. The use-case: a bank with an online presence (the application currently being analyzed) that will be integrated with a series of existing legacy applications, partners, and external 3rd party components. Given this scenario, let's examine why white-box analysis (static source-code analysis) is doomed to fail this project with respect to security.
- Compiler Optimizers Break Things - Think of it this way: compilers are designed to make machine code from your source code. The compiler's sole purpose (in most cases) is to create machine code that is optimized and extremely fast-executing, but not necessarily secure. Often, security measures that developers build into source code are removed by compiler optimizers, and usually without our knowledge. These removals can undo many of the advanced security features that developers consciously insert into their code. Consider the following example:
- Developer is paranoid about data persistence in memory space, and wants to be doubly sure that sensitive variables are expired and destroyed
- Developer writes a routine whereby the variable will have a null value written to it before the memory is freed
- Compiler optimizer sees the null write as a dead store (the memory is freed immediately afterward, so the write has no observable effect), removes it, and simply frees the memory
- A potential security vulnerability is created with variable persistence in freed memory space
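The steps above can be sketched in C. This is a minimal illustration, not a claim about any specific compiler: whether the dead store is actually eliminated depends on the compiler, version, and optimization flags (e.g. `-O2`), so verifying the emitted machine code is the only way to be sure:

```c
#include <stdlib.h>
#include <string.h>

/* Naive cleanup: an optimizing compiler may treat the memset as a dead
 * store -- the buffer is freed immediately afterward, so the write has no
 * observable effect -- and remove it, leaving the secret in freed memory. */
void insecure_cleanup(char *secret, size_t len) {
    memset(secret, 0, len);  /* candidate for dead-store elimination */
    free(secret);
}

/* One common workaround: write through a volatile pointer, which the
 * optimizer is not permitted to elide. (C11's optional Annex K also
 * provides memset_s for this purpose, where available.) */
void secure_zero(void *p, size_t len) {
    volatile unsigned char *vp = p;
    while (len--) {
        *vp++ = 0;
    }
}

void secure_cleanup(char *secret, size_t len) {
    secure_zero(secret, len);
    free(secret);
}
```

Both versions look identical to a source-only scanner; only inspection of the compiled output, or dynamic analysis of freed memory, reveals the difference.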
This example demonstrates how a security vulnerability can be introduced in spite of the developer's best efforts to write secure code. Standard static-code analysis tools used to "scan code" at the static-file level will fail to catch this vulnerability. Quite simply: static code analysis fails if it is not supplemented with dynamic analysis.
- 3rd Party Library Integrations - There is another threat to developing and scanning static code in a white-box format. Inevitably, 3rd party libraries are used to supply features or functionality not natively provided by the local development effort. After all, no one re-invents the whole wheel every time - we build what we cannot reuse from someone else's work, then pull in publicly available 3rd party libraries to fill in the functionality and features that have already been written and (hopefully) tested. White-box testing (static code analysis) will absolutely fail to find flaws when it comes to these libraries: by their very nature, 3rd party libraries rarely come with source that can be scanned and checked for weaknesses that will affect your application. What you're left with is someone else's code (in machine-compiled format!) interacting with your application. Would you trust that model?
- Static Code Analysis Rarely Understands Data-Flow Modeling (Data Tracing) - If you're scanning your application with a source-code-only analysis tool, you're not only going to miss things that will almost certainly come back to haunt you - you may also be over-working yourself without a real purpose. Before I illustrate the point, allow me to explain the idea of "data-flow modeling" for those who are not familiar with it. Data-flow modeling seeks to understand how data moves through your application, not just how the application code is written. After all, that's the whole point of the application: to work with data. Vulnerabilities lie in manipulating data either to or from the end users or the server(s). Data-flow modeling maps the data in your application from its instantiation (maybe when the user types it in) to its resting state (maybe when it's finally written to a database, or handed off to another application or service for additional work).

With that established, consider an AJAX web application with 1,000 forms across 100 pages, written in the language of your choice. While no individual page does anything to validate user input (the data source), all variables (data) are filtered through a central validation module deep within the application logic. A standard source-code analysis tool (I have evaluated this and can honestly say this is a real use-case, but will not mention the tool) will flag each and every input that is not validated within the page as vulnerable to hundreds of weaknesses ranging from XSS (Cross-Site Scripting) to SQL Injection and other attack types. What you are left with is a very lengthy report with hundreds of critical and high vulnerabilities that you now, obviously, must address...
unless you do some dynamic analysis on the code and realize that *none* of those theoretical vulnerabilities are exploitable, because the application filters all data through the central validator/scrubber.
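The central-validator pattern can be sketched in miniature. This is a toy illustration in C with hypothetical names (`central_validate`, `dispatch`, `handle_form_field` are mine, not from any real product), and a real scrubber would encode or parameterize rather than reject outright:

```c
#include <stdio.h>
#include <string.h>

/* Hypothetical central validation module: every piece of user-supplied
 * data flows through here before reaching a query or a rendered page.
 * This toy version simply rejects common SQL-injection and XSS
 * metacharacters; 1 = clean, 0 = rejected. */
int central_validate(const char *input) {
    return strpbrk(input, "'\"<>;") == NULL;
}

/* A per-page form handler with no local validation. A source-only scanner
 * sees raw input used directly here and flags it as vulnerable. */
void handle_form_field(const char *field) {
    printf("stored: %s\n", field);  /* reached only by scrubbed data */
}

/* The single choke point deep in the application logic: at runtime, data
 * can only reach handle_form_field through this check, so the scanner's
 * per-page findings are false positives. */
void dispatch(const char *field) {
    if (central_validate(field)) {
        handle_form_field(field);
    } else {
        puts("rejected");
    }
}
```

A data-flow-aware analysis traces the input from `dispatch` through `central_validate` and concludes the sink is guarded; a file-by-file scan of the handler alone cannot see that.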
So, there you have it. Static code analysis is inherently doomed to fail; source-only white-box testing is flawed. The sky is falling, global warming will kill us all. In my next installment of this column, I'll give you what you need to know to avoid failing in your security initiatives at the development step of the SDLC - remember, knowing is half the battle.
If this information disturbs you and you would like to talk about it, please don't hesitate to email me directly. I am not a sensationalist; I pride myself on presenting practical, realistically attainable solutions to real-world problems. Thanks for reading.