CIS 4615 meeting -*- Outline -*-

* overview of static analysis

** background

------------------------------------------
        STATIC VS. DYNAMIC

def: *analysis* is a process of discovering

def: *static analysis* is an analysis that is done

def: *dynamic analysis* is an analysis that is done

------------------------------------------
        ... properties of a program's behavior
            ideally properties that are true of all possible executions
            e.g., does the program have a buffer overflow vulnerability?
                  does the program have a format string vulnerability?
                  could the program use memory after it has freed it?

            Code audits or code reviews are doing analysis

        ... before (or without) running the program
            from the text of the program itself
            e.g., virus scanning
                  type checking
                  gcc's format string warnings (show that!)

        ... while the program is running
            e.g., seeing what files a program reads
                  seeing what network traffic a program generates
                  testing of any sort, including fuzz testing

        Q: If you do a code review, is that static or dynamic analysis?
            static
        Q: If you do testing, is that static or dynamic analysis?
            dynamic

** motivation

        We have done code reviews, in homework 3...

------------------------------------------
        WHY DO PROGRAM ANALYSIS?

Because code reviews/audits are:

------------------------------------------
        ... - expensive,
                Why? because need expert programmers
            - time consuming
                Why? because reading code is hard (impossible, actually)
            - prone to missing vulnerabilities
                Why? because people get tired

        Q: What are some advantages of code reviews?
            They can be done without waiting for new theory and software tools
            They can leverage human creativity and insight

        But it may be cheaper and more rigorous to have a program do the analysis

*** static analysis

------------------------------------------
        STATIC ANALYSIS

Advantages:

Disadvantages:

------------------------------------------
        ...
            - can give guarantees about all program behaviors
            - for malware, can give warning *before* running program
            - techniques useful for compilers (optimization, critical systems)

        ...
            - will have false positives (says there is a problem when there is none)
                (think of virus scanning or type checking)
            - can be computationally expensive (O(n^3), where n is size of program)
                (there is a tradeoff between false positives and efficiency)

------------------------------------------
        PRECISION AND RECALL

def: the *precision* of an analysis is the fraction of

def: the *recall* of an analysis is the fraction of

Example:
  Suppose a program has 10 vulnerabilities
  and a tool identifies 8 places
  but only 6 of those are actual ones

  The precision is

  recall is
------------------------------------------
        ... identified problems that are actually problems
        ... the actual problems that are found
        ... 6/8 (true positives/number of problems reported)
        ... 6/10 (true positives/number of actual problems)

        Q: What is the goal for precision? For recall?
            ideally would like both precision and recall to be 1
        Q: Can we do that?
            No, that is impossible in general for program analysis
        Q: Which is worse for analysis of security vulnerabilities:
           poor precision or poor recall?
            Poor recall is certainly bad, means that you miss vulnerabilities
            But poor precision is also bad, means that people won't pay attention

*** dynamic analysis

------------------------------------------
        DYNAMIC ANALYSIS

Advantages:

Disadvantages:

------------------------------------------
        ...
            - will work on obfuscated software
            - never has false positives (precision is 1)

        ...
            - puts the machine doing the analysis at risk
            - may have many false negatives (low recall, may miss a lot)
                (consider testing smartphone apps: if the app waits for 3 days,
                 then it is unlikely that testing will catch it)

        Q: Do we have to decide between static and dynamic analysis?
            No, we can use both

** capabilities

*** taint checking

------------------------------------------
        TAINT CHECKING

Attacks on integrity often involve

Example Attacks?

------------------------------------------
        ... trusting user input
        ... SQL injection
            cross-site scripting
            format string
            command injection

------------------------------------------
        TAINT CHECKING

Desired guarantee:

What is trust?

How could a program know about trust?

------------------------------------------
        ... program does not trust user input
        ... for SQL injection: passing user input to SQL interpreter
            for XSS: sending user input to web page
            for format string: passing user input as a format string argument
            for command injection: passing user input to command interpreter

**** format string analysis

        Q: What would be a very simple possible static analysis
           for format string vulnerabilities?
            Make sure all format strings are given literally in the code
            like printf("%s\n", input);
        Q: What would be the problem with that?
            Low precision (i.e., too many false positives)
        How could we do better?
            Need to know when the format argument could have user inputs

------------------------------------------
  STATIC ANALYSIS FOR FORMAT STRING ATTACKS

Analysis information:

------------------------------------------
        ... The analysis has:
            1. a set of var-arg functions (printf, fprintf, ...)
            2. a set of user input functions
            3. track where the user-influenced information is
                   fgets(linebuf, sizeof(linebuf), stdin);
                   strncpy(buf, linebuf, sizeof(buf));
                   sscanf(buf, "%d", &val);

            Give errors when a var-arg function is called
            with a format string that may contain user input

            Example tool: gcc with -Wformat=2 (show that)

        The key to precision is to not give errors when such a function
        is called with input that doesn't come from the user.
        This is called taint analysis...

**** SQL injection attacks

------------------------------------------
  STATIC ANALYSIS FOR SQL INJECTION ATTACKS

Analysis:

------------------------------------------
        ... 1.
               have a set of SQL primitives that can't be called with user input
                 What if some function calls them through another function?
                   then that function should be flagged as a vulnerability
            2. have a set of user input functions (e.g., fgets, scanf, ...)
            3. track where the user-influenced information is

        Q: What's an error that the SQL injection analysis can find?
            Calling an SQL primitive with input (information) from the user

            Example tools: CheckLT, JIF

        Q: What could be done to improve precision?
            track sanitization routines or allow an annotation
            Could we do that for the format string analysis also?
              Yes!

**** XSS attacks

------------------------------------------
        STATIC ANALYSIS FOR XSS

Analysis for type 1 (reflected XSS):

------------------------------------------
        ... for XSS:
            1. have a set of primitives that output to the web page
            2. have a set of user input functions
            3. track where the user-influenced information is

        Q: What's an error that the XSS analysis can find?
            User input is output to a web page.

** limitations

        There are some fundamental (theoretical) limitations to be aware of

*** perfectly precise static analysis is impossible

**** background

------------------------------------------
        BACKGROUND: PROBLEMS

A *problem* is a specification of
a procedure's desired relation

Examples:

------------------------------------------
            between inputs and outputs
            (possibly with some resource constraints)

            given two integers, return their sum
            given a set of temperature, pressure, and humidity readings today,
              predict the weather for tomorrow

------------------------------------------
        BACKGROUND: EFFECTIVE PROCEDURE

An effective procedure is
   "a set of rules which tell us,
    from moment to moment,
    precisely how to behave."
      -- M. Minsky, Computation: Finite
         and Infinite Machines (p. 106)

Practical definition:

------------------------------------------
        ...
            a program (if we accept the Church-Turing thesis)

------------------------------------------
        EXAMPLE PROGRAM

In Java:

public class hailstone {
    public static long steps(long x) {
        if (x <= 1) {
            return 0;
        } else {
            return 1+steps(h(x));
        }
    }
    public static long h(long x) {
        if (odd(x)) {
            return (3*x+1) / 2;
        } else {
            return x/2;
        }
    }
    public static boolean odd(long x) {
        return x%2 == 1;
    }
}
------------------------------------------

        Q: What is steps(1)?
            0
        Q: What is steps(2)?
            1
        Q: What is steps(3)?
            5
        Q: What is steps(7)?
            11
        Q: What is steps(27)?
            70
        Q: Is it easy to predict the value of steps(x) for a given x?
            No, it seems very difficult; this is an infamous open
            mathematical problem (the Collatz, or 3x+1, conjecture).

------------------------------------------
        CAN PROGRAMS DO EVERYTHING?

program = effective procedure
          must always terminate

If some problems can't be solved by programs,
then set of problems

------------------------------------------
        Q: Was the code for steps() a program?
            Maybe, but we don't know if it always terminates!

        ... is not equal to what programs can do!

        Q: Can you think of something that is impossible for a program to do?
            Make a perfectly precise analysis of some interesting property...
            such as solving the halting problem...

------------------------------------------
        WHAT DOES "IMPOSSIBLE" MEAN?

Mathematical meaning:
   inconsistent with accepted axioms

E.g., - find fraction equal to square root of 2
      - solving the "halting problem"

What we DON'T mean:
   - very hard
   - socially unacceptable
   - illegal
   - against physical laws
------------------------------------------
        ... like whether steps() terminates
        ... like wearing something weird
        ... like murdering someone
        ...
            like traveling faster than light

**** the halting problem

------------------------------------------
        THE HALTING PROBLEM

Write a program, halts(P,I),
that takes code P, and input string I, and:

  halts(P, I) = true,  if P halts on input I,
  halts(P, I) = false, if P does not halt on input I,

  halts always returns true or false
  in a finite amount of time.
------------------------------------------
        This has to work for *all* programs and all inputs

        Q: Are we asking for precision or recall?
            100% precision
        Q: What do we mean by "halts"?
            finishes, stops running
        Q: Would this be useful?
            Sure!
        Q: How could you solve it? What would be an idea for an algorithm?

**** impossibility result

        The work of Alan Turing, around 1937.

        Alan Turing, On computable numbers, with an application to the
        Entscheidungsproblem, Proceedings of the London Mathematical Society,
        Series 2, Volume 42 (1937), pp 230--265,
        doi:10.1112/plms/s2-42.1.230.

        Alan Turing, On Computable Numbers, with an Application to the
        Entscheidungsproblem. A Correction, Proceedings of the London
        Mathematical Society, Series 2, Volume 43 (1938), pp 544--546,
        doi:10.1112/plms/s2-43.6.544.

------------------------------------------
  SOLVING THE HALTING PROBLEM IS IMPOSSIBLE!

Proof by contradiction:

- Suppose program F solves the halting problem.

- Let T be the program:

    T(P) = if F(P,P)
           then return loop()
           else return false
           end

- Consider T(T)

  By definition, T must either loop forever
  or return false.

  If it loops forever,

  If it returns false,

------------------------------------------
        ... then F said T would halt on input T,
            but it didn't (since it is looping forever)!
        ... then F said T wouldn't halt on input T,
            but it did halt!

        So in both cases, F doesn't fulfill its specification,
        which contradicts its existence.
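The two cases of the proof can be checked mechanically for any particular candidate F. The following is only a sketch of that case analysis, not an execution of T: it assumes F is a deterministic function that always returns, and it describes T's behavior with an enum instead of really looping. The names ProgramId, PROGRAM_T, Decider, Behavior, and the three sample deciders are hypothetical, introduced here for illustration.

```c
#include <stdbool.h>

/* Hypothetical model: a program is named by an opaque integer id. */
typedef int ProgramId;
enum { PROGRAM_T = 1 };   /* the id standing for the program T itself */

/* A candidate solution F: claims whether `program` halts on `input`. */
typedef bool (*Decider)(ProgramId program, ProgramId input);

/* What T does on some input, described rather than executed. */
typedef enum { LOOPS_FOREVER, RETURNS_FALSE } Behavior;

/* T(P) = if F(P,P) then return loop() else return false end */
Behavior behavior_of_T(Decider f, ProgramId p) {
    return f(p, p) ? LOOPS_FOREVER : RETURNS_FALSE;
}

/* True when F's claim about T on input T disagrees with what T does. */
bool decider_wrong_on_T(Decider f) {
    bool claims_halts = f(PROGRAM_T, PROGRAM_T);
    bool actually_halts =
        (behavior_of_T(f, PROGRAM_T) == RETURNS_FALSE);
    return claims_halts != actually_halts;
}

/* Three sample candidates; each one is necessarily wrong on (T, T). */
bool always_yes(ProgramId p, ProgramId i) { (void)p; (void)i; return true; }
bool always_no(ProgramId p, ProgramId i)  { (void)p; (void)i; return false; }
bool yes_unless_self(ProgramId p, ProgramId i) { return p != i; }
```

Calling decider_wrong_on_T on any of the sample candidates returns true, and the same holds for any other total, deterministic Decider: whatever F answers on (T, T), T is built to do the opposite.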
        Note it's important that we expect a solution (like F)
        to always halt and be correct
        It has to be 100% precise

**** other impossible problems

------------------------------------------
        OTHER IMPOSSIBLE PROBLEMS

Write AlwaysHalts(P) to decide if P halts on all inputs

Write HaltsOnEmpty(P) to decide if P halts on empty input

------------------------------------------
        AlwaysHalts is ...
            Unsolvable, since a solution could be used to solve the halting problem
            (the procedure N constructed below halts on all inputs
             exactly when P halts on I)!

        HaltsOnEmpty is ...
            Unsolvable, since we can construct from HaltsOnEmpty
            a procedure F that would solve the halting problem!

            F(P,I) =
                define N(E) = return P(I);   % ignore E and run P on I
                return HaltsOnEmpty(N)

------------------------------------------
        RICE'S THEOREM

There is no program that can decide
any non-trivial property of the behavior of programs.

------------------------------------------
        This follows from the undecidability of the halting problem.

        Here non-trivial means that some program (Turing machine) recognizes
        a language that has the property, and some other program recognizes
        a language that does not have the property.

        Q: What does this mean for static analysis?
            It can't be done with 100% precision for anything we care about

**** What can we do?

        Q: So can programs help programmers?
            Yes, but we have to cheat a bit...

***** an analogy

------------------------------------------
        ANALOGY: DO I NEED AN UMBRELLA?

Website tells if you need an umbrella

What should it do?

 Need umbrella? |  rains   | no rain
 ===============|==========|==========
     "YES"      |          |
                |          |
     "NO"       |          |
------------------------------------------
        There used to be a website, umbrellatoday.com

        Q: What are the possible outcomes?
        ...     good :-)     ok :-|

                bad :-(      good :-)

        Q: What should the web site say if there is only a 50% chance
           of rain in the forecast?
            "YES"!
            Customers shouldn't get wet (wet = unhappy)
            So safety (conservatism) says to err on the side of
            telling people to bring an umbrella
            Want to avoid them getting wet

***** virus checking

------------------------------------------
        VIRUS CHECKING

Program tells if another program has a virus

What should it do?

 Has virus?     | malicious | safe
 ===============|===========|==========
     "YES"      |           |
                |           |
     "NO"       |           |
------------------------------------------
        Q: What are the possible outcomes?
                good :-)     ok ?

                bad :-(      good :-)

        Q: So what should the tool do if it's not sure?
            better tell the user that it might have a virus,
            that is the safe thing to do

            However, will people tolerate that? It's unclear.
            Traditionally there has to be a very low rate of false positives

**** summary

------------------------------------------
        SUMMARY OF STATIC ANALYSIS LIMITS

Can't have 100% precise solutions
to interesting analysis questions

But we can do something less precise

   extreme: "your program is wrong!"

   goal: get higher precision with 100% recall
------------------------------------------

*** perfect recall won't happen with dynamic analysis

------------------------------------------
        LIMITS OF TESTING

To have airtight assurance about safety
need to

Problem: input space is practically

Consider testing this:

#define BYTE_BITS 8
int sign(long v) {
    /* return -1 if v < 0, otherwise return +1 */
    return +1 | (v >> (sizeof(long) * BYTE_BITS - 1));
}

"Testing can only reveal the presence of errors,
 not their absence" -- E. Dijkstra
------------------------------------------
        ... test all possible inputs
        ... infinite (so testing all inputs takes too long)

        The code is from Bit Twiddling Hacks
        By Sean Eron Anderson
        http://graphics.stanford.edu/~seander/bithacks.html#CopyIntegerSign

        Q: Why does testing work in other areas of engineering?
            Because laws that govern how things work are continuous,
            so some kind of continuity argument can be used
            to extrapolate from a small number of test results.
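To make the limits-of-testing point concrete, here is a minimal sketch of a boundary-value test for the sign() function from the slide. It uses CHAR_BIT from <limits.h> in place of BYTE_BITS and assumes that >> on a negative long is an arithmetic shift (the C standard leaves this implementation-defined). The helpers sign_reference, count_mismatches, and boundary_mismatches are hypothetical names introduced for illustration.

```c
#include <limits.h>
#include <stddef.h>

/* Bit-trick version from the notes. Assumes an arithmetic right shift
   for negative values, which is implementation-defined in C. */
int sign(long v) {
    return (int)(+1 | (v >> (sizeof(long) * CHAR_BIT - 1)));
}

/* Hypothetical oracle: an obviously correct version to compare against. */
int sign_reference(long v) {
    return v < 0 ? -1 : +1;
}

/* Returns how many of the n inputs make sign() disagree with the oracle. */
size_t count_mismatches(const long *inputs, size_t n) {
    size_t mismatches = 0;
    for (size_t i = 0; i < n; i++) {
        if (sign(inputs[i]) != sign_reference(inputs[i])) {
            mismatches++;
        }
    }
    return mismatches;
}

/* Boundary-value test: only five of all the possible long values. */
size_t boundary_mismatches(void) {
    const long inputs[] = { LONG_MIN, -1L, 0L, 1L, LONG_MAX };
    return count_mismatches(inputs, sizeof inputs / sizeof inputs[0]);
}
```

A result of 0 from boundary_mismatches() only says that those five inputs behave as expected. If long is 64 bits wide there are 2^64 possible inputs, and at a billion tests per second trying them all would take roughly 580 years, which is why testing alone cannot give the airtight assurance the slide asks about.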