CIS 4615 meeting -*- Outline -*-

* Stack-Smashing Attacks

Based on chapter 5 of the book:
24 Deadly Sins of Software Security
by M. Howard, D. LeBlanc, and J. Viega (McGraw-Hill, 2010)
and on chapter 7 of the book:
Building Secure Software
by John Viega and Gary McGraw (Addison-Wesley, 2002)
and on a web page from OWASP.

** background on machine architecture

*** memory layout

------------------------------------------
        MEMORY LAYOUT (Intel x86)

   high     |----------------|
 addresses  |  command line  |
            |args & env vars |
            |----------------|
            |     stack      |  local vars
            |(grows downwards)
   esp ---> |----------------|
            |       v        |
            |                |
            |    (unused)    |
            |                |
            |       ^        |
            |----------------|
            |      heap      |  dynamic mem.
            | (grows upwards)|
            |----------------|
            |      .bss      |  uninit globals
            |----------------|
            |     .data      |  init. globals
            |----------------|
            |      code      |
   low      |  (read only)   |
 addresses  |----------------|
------------------------------------------

*** stack frames

Q: What is a stack frame?
    Space on the stack (also called an activation record) that stores
      - local variables for that call
    which is deallocated when the function returns.

Q: Is there one stack frame per function call? Why?
    Yes, so that each call has its own locals (allows recursion)

------------------------------------------
        STACK FRAMES IN CONTEXT

int main() { f(); }
void f() { g(); }
void g() { /* ... */ }

          +---------------------+
          |  frame for main()   |
          |                     |
          +---------------------+
          |  frame for f()      |
          |                     |
          |                     |
          +---------------------+
          |  frame for g()      |
  stack   |                     |
  pointer |                     |
  esp --> +---------------------+
------------------------------------------

Q: If g() calls a function h(), what does the stack look like then?
    Like the above, but with a new frame for h() pushed (on bottom)

Q: If g() returns, what does the stack look like then?
    Like the above, but with g() popped from the stack.
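
The nesting of frames described above can be observed directly. The following is a minimal sketch, not part of the original notes: the names record_frames, show_frames, frame_addr, and MAX_DEPTH are made up for illustration. It records the address of one local variable per nested call. On x86 the addresses typically decrease as the call depth increases, but the C standard does not promise any particular direction, so the code only relies on each call having its own local.

```c
#include <assert.h>
#include <stdint.h>
#include <stdio.h>

#define MAX_DEPTH 3   /* hypothetical nesting depth, like main -> f -> g */

/* Address of one local variable per active call, indexed by call depth. */
uintptr_t frame_addr[MAX_DEPTH];

/* Record where this call's local lives, then make a nested call. */
void record_frames(int depth)
{
    volatile int local = depth;              /* stored in this call's frame */
    assert(depth >= 0 && depth < MAX_DEPTH);
    frame_addr[depth] = (uintptr_t)&local;
    if (depth + 1 < MAX_DEPTH) {
        record_frames(depth + 1);            /* a new frame is pushed here */
    }
}

/* Print the recorded addresses, outermost call first. */
void show_frames(void)
{
    record_frames(0);
    for (int d = 0; d < MAX_DEPTH; d++) {
        printf("depth %d: local at %p\n", d, (void *)frame_addr[d]);
    }
}
```

Calling show_frames() from main prints one address per call depth; comparing them shows which way the stack grows on the machine at hand.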
------------------------------------------
    INDIVIDUAL STACK FRAME (Intel x86)

  high      +---------------------+
 addresses  |   caller's stack    |
            |       frame         |
            |                     |
            +=====================+
            |                     |
            +---------------------+
            |                     |
            |                     |
            +---------------------+
            |                     |
            +---------------------+
 frame ptr  |                     |
   EBP--->  +---------------------+
            |                     |
            |                     |
 stack ptr  |                     |
   ESP--->  +=====================+
  low
 addresses
------------------------------------------
        ... call pushes: space for return value, actual arguments,
            the return address,
            the frame pointer of the caller (dynamic link),
            and space for local variables

Q: Are the roles of the information stored in stack frames separated?
    No, and this can lead to...

** the attack

------------------------------------------
           ATTACK BACKGROUND

Affects C and C++ code
plus other languages that
do not check array accesses.

In C (and C++):
  - buffers have

  - buffers can be

In many machine architectures:
  - return addresses are stored on the

  - data can be

------------------------------------------
        ... a fixed size (determined statically)
        ... allocated on the stack
        ... stack
        ... executed (run as code)

------------------------------------------
        STACK SMASHING ATTACKS

Attack Idea:

1. Find a stack-allocated buffer

2. Design malicious code to:
   a. contain

   b. overwrite a

3. Give the program

What happens when the program runs?

------------------------------------------
        ... that can be overflowed by user input
            (this could be any array that is stack allocated;
             finding it is done by using a disassembler or the source)

Q: What is a buffer?
    An array.

        ... backdoor code (e.g., start a shell), and
        ... return address (on the stack) with a value
            that will run the backdoor code.

    All you really need to know for the first 2 steps is that
    it can be done. See the Building Secure Software book for more details.

        ... the encoded malicious code as input
            which overwrites the buffer and the return address
        ...
            The attacker gives the malicious input,
            which overwrites the buffer's contents and the return address,
            so that the function's return jumps to the malicious code

Q: What key security service was violated?
    Integrity! Trusting the user input without checking it.

Q: Why do we still use programming languages like C (and C++)
   that don't check array bounds?
    It's more efficient...

Q: Why do we use machine architectures that mix control and data?
    Historical reasons

*** affected code

------------------------------------------
            AFFECTED CODE

The Morris finger worm:

    char buf[20];
    gets(buf);

The blaster worm:

    while (*pwszTemp != L'\\')
        *pwszServerName++ = *pwszTemp++;
------------------------------------------
    gets is always dangerous!
    It keeps going (trusts user), and has no way of knowing
    the size of the buffer...

    In the blaster example, there is no checking of the size of
    pwszServerName; this is like what strcpy does

------------------------------------------
            MORE PROBLEMS

Tricky C library interfaces:

    char buf[20];
    char prefix[] = "http://";
    strcpy(buf, prefix);
    strncat(buf, users_path, sizeof(buf));

Use of dangerous functions:

    char buf[MAX_PATH];
    sprintf(buf, "%s - %d\n", users_path, errno);

General confusion:

    char buf[32];
    strncpy(buf, user_data, strlen(user_data));
------------------------------------------
    The strcpy in the tricky example is okay, see why?
    The strncat function takes the number of characters left in the buffer
    (not counting the terminating null), not the buffer's total size.

    "It's nearly impossible to use sprintf safely."
    Even using it for logging can cause problems, as in MS Windows

    Why is the strncpy example bad?
    Because the size passed should be based on the size of buf,
    not on the length of user_data. Also, strncpy does not add a
    terminating null when the source fills the destination.

*** implications

------------------------------------------
             IMPLICATIONS

Possible effects:



History:
  The Morris worm exploited a buffer overflow
  in 'finger'
------------------------------------------
        ... crashes, attacker controls application,
            stealing cycles, ...

Q: Are these effects worse if the application runs as root?
    YES!
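
The strncat and strncpy pitfalls from the MORE PROBLEMS slide can be made concrete. The following is a minimal sketch under stated assumptions: the helper names build_url and copy_bounded are hypothetical, and refusing over-long input (rather than truncating it) is one possible policy, not the only one.

```c
#include <assert.h>
#include <stdio.h>
#include <string.h>

/* Build "http://" followed by path in out (capacity out_size bytes).
   Returns 0 on success, -1 if the result would not fit. */
int build_url(char *out, size_t out_size, const char *path)
{
    static const char prefix[] = "http://";
    assert(out != NULL && path != NULL);
    if (out_size < sizeof(prefix)) {          /* sizeof counts the '\0' */
        return -1;
    }
    strcpy(out, prefix);                      /* safe: size checked above */
    size_t room = out_size - strlen(out) - 1; /* chars left, minus the '\0' */
    if (strlen(path) > room) {
        return -1;                            /* refuse instead of truncating */
    }
    strncat(out, path, room);                 /* third arg = remaining space */
    return 0;
}

/* Copy src into dst (capacity dst_size > 0), always null-terminating.
   Returns 0 if src fit entirely, -1 if it was truncated. */
int copy_bounded(char *dst, size_t dst_size, const char *src)
{
    assert(dst != NULL && src != NULL && dst_size > 0);
    strncpy(dst, src, dst_size - 1);          /* bound by the DESTINATION */
    dst[dst_size - 1] = '\0';                 /* strncpy may not terminate */
    return strlen(src) < dst_size ? 0 : -1;
}
```

In both helpers the bound comes from the destination buffer, never from the length of the user's data, which is exactly what the slide's examples get wrong.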
*** summary

------------------------------------------
 KEY FEATURES OF BUFFER OVERFLOW ATTACK

1. input from user
     read from user input,
     read from file or the net
     read from command line

2. accepting user input

------------------------------------------
        ... in an unchecked way (writing beyond bounds)

*** related sins

------------------------------------------
             RELATED SINS

Integer overflow:

    ptr = malloc(sizeof(type) * users_count);

Unbounded write to array:

    a[users_index] = ...
------------------------------------------

Q: What happens if the user-supplied users_count is very large?
    The multiplication can overflow and wrap around to a small number,
    so the allocated block is much smaller than the program expects...

** remediation

*** in C

------------------------------------------
           REMEDIATION IN C

Code reviews:
  Look for:
    - user input
    - unsafe handling of user input
    - mistakes in arithmetic in calculating
      sizes, remaining size.

Replace unsafe string functions:
    gets
    strcpy
    strcat
    sprintf

Use better libraries:
    SafeCRT
------------------------------------------
    replace with:
        ... fgets
        ... strncpy
        ... strncat
        ... snprintf

*** in C++

------------------------------------------
          REMEDIATION IN C++

Use C++ Strings
    std::string or std::wstring

Use STL containers
    like vector
------------------------------------------

*** Use stack protection

------------------------------------------
           STACK PROTECTION

Canaries on the stack:
   special random values
   that can be tested to

Address Space Layout Randomization (ASLR)
   load code into random places
------------------------------------------
        ... see if the stack has been overwritten

Q: Why do they have to be random?
    Otherwise malicious code could write the expected value back
    and defeat the check

    Stack protection via canaries is available in gcc
    (-fstack-protector, enabled by default in many distributions)
    and is the default in VS (/GS flag)

    Unfortunately these can be defeated...

Q: What does ASLR do?
    Makes it harder for an attacker to write malicious code that works

*** Use analysis tools

Q: What does the "static" in "static analysis" mean?
    the code is checked before being run, so no runtime overhead!
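
Returning to the integer overflow sin listed under related sins: the size arithmetic can be checked before calling malloc. The following is a minimal sketch; the helper name alloc_array is hypothetical and is not from the books or from OWASP.

```c
#include <assert.h>
#include <stdint.h>
#include <stdlib.h>

/* Allocate count elements of elem_size bytes each.
   Returns NULL if count * elem_size would overflow size_t,
   or if malloc itself fails. */
void *alloc_array(size_t count, size_t elem_size)
{
    assert(elem_size > 0);
    if (count > SIZE_MAX / elem_size) {
        return NULL;                 /* the product would wrap around */
    }
    return malloc(count * elem_size);
}
```

With this check a huge user-supplied count yields NULL instead of a block that is far too small. Modern C libraries generally perform a similar check inside calloc(count, elem_size).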
------------------------------------------
         STATIC ANALYSIS TOOLS

Klocwork Code Coverity

PREfast in Visual Studio:
    use /analyze flag in VS

Fortify (for Java and .NET)
------------------------------------------

** examples

*** c++

------------------------------------------
   FOR YOU: IS THIS CODE SUSCEPTIBLE?

C++ code:

    char buf[BUFSIZE];
    cin >> buf;
------------------------------------------
    Yes, it's just like using gets!

------------------------------------------
   FOR YOU: IS THIS CODE SUSCEPTIBLE?

C++ code:

    std::string buf;
    cin >> buf;
------------------------------------------
    No, the C++ string type can handle any size input

*** C

------------------------------------------
   FOR YOU: IS THIS CODE SUSCEPTIBLE?

    char buf[64], in[MAX_SIZE];
    printf("Enter buffer contents:\n");
    size_t sz = read(0, in, MAX_SIZE-1);
    memcpy(buf, in, 64);
------------------------------------------
    No, we aren't trusting the user's input for the buffer copy
    and the read call can't overflow in.

------------------------------------------
   FOR YOU: IS THIS CODE SUSCEPTIBLE?

    char buf[64], in[MAX_SIZE];
    printf("Enter buffer contents:\n");
    read(0, in, MAX_SIZE-1);
    printf("Bytes to copy:\n");
    scanf("%d", &bytes);
    memcpy(buf, in, bytes);
------------------------------------------
    Yes, although the read is fine,
    we are trusting the user's number of bytes,
    potentially causing a buffer overflow.

    OWASP says that this kind of problem has caused trouble
    in various image, audio, and file processing libraries.

------------------------------------------
   FOR YOU: IS THIS CODE SUSCEPTIBLE?

    char *lccopy(const char *str) {
        char buf[BUFSIZE];
        char *p;
        strcpy(buf, str);
        for (p = buf; *p; p++) {
            if (isupper(*p)) {
                *p = tolower(*p);
            }
        }
        return strdup(buf);
    }
------------------------------------------
    Yes, if str is user input the strcpy could cause an overflow,
    and the code in the loop is no better.

------------------------------------------
   FOR YOU: IS THIS CODE SUSCEPTIBLE?

    char *lccopy(const char *str) {
        char buf[BUFSIZE];
        char *p;
        if (strlen(str) > BUFSIZE) {
            return NULL;
        } else {
            strncpy(buf, str, BUFSIZE);
        }
        for (p = buf; *p; p++) {
            if (isupper(*p)) {
                *p = tolower(*p);
            }
        }
        return strdup(buf);
    }
------------------------------------------
    Almost: if str is longer than BUFSIZE, the function returns NULL.
    But there is an off-by-one error: when strlen(str) == BUFSIZE the
    test passes, strncpy copies BUFSIZE characters without a terminating
    null, and the loop and strdup then run past the end of buf.
    The test should be strlen(str) >= BUFSIZE.
    With that test the use of strncpy is not needed, but doesn't hurt.

------------------------------------------
   FOR YOU: IS THIS CODE SUSCEPTIBLE?

    if (!(png_ptr->mode & PNG_HAVE_PLTE)) {
        /* Should be an error, but we can cope with it */
        png_warning(png_ptr, "Missing PLTE before tRNS");
    } else if (length > (png_uint_32)png_ptr->num_palette) {
        png_warning(png_ptr, "Incorrect tRNS chunk length");
        png_crc_finish(png_ptr, length);
        return;
    }
    /* ... */
    png_crc_read(png_ptr, readbuf, (png_size_t)length);
------------------------------------------
    This code is from the popular libPNG image decoder,
    which was used by Mozilla and by IE...

    Yes; the problem is that the check on length only happens in one case,
    and the warning doesn't stop execution of the last line...

    From OWASP:
    "The code appears to safely perform bounds checking because it
    checks the size of the variable length, which it later uses to
    control the amount of data copied by png_crc_read(). However,
    immediately before it tests length, the code performs a check on
    png_ptr->mode, and if this check fails a warning is issued and
    processing continues. Because length is tested in an else if block,
    length would not be tested if the first check fails, and is used
    blindly in the call to png_crc_read(), potentially allowing a stack
    buffer overflow."

------------------------------------------
   FOR YOU: IS THIS CODE SUSCEPTIBLE?

    if (!(png_ptr->mode & PNG_HAVE_PLTE)) {
        /* Should be an error, but we can cope with it */
        png_warning(png_ptr, "Missing PLTE before tRNS");
        return;
    }
    if (length > (png_uint_32)png_ptr->num_palette) {
        png_warning(png_ptr, "Incorrect tRNS chunk length");
        png_crc_finish(png_ptr, length);
        return;
    }
    /* ... */
    png_crc_read(png_ptr, readbuf, (png_size_t)length);
------------------------------------------
    No, the control structure now avoids the problem.

*** Unicode troubles

    In Windows, user names are in Unicode.

------------------------------------------
   FOR YOU: IS THIS CODE SUSCEPTIBLE?

    void getUserInfo(char *username,
                     struct _USER_INFO_2 info) {
        WCHAR unicodeUser[UNLEN+1];
        MultiByteToWideChar(CP_ACP, 0, username, -1,
                            unicodeUser,
                            sizeof(unicodeUser));
        NetUserGetInfo(NULL, unicodeUser, 2,
                       (LPBYTE *)&info);
    }
------------------------------------------
    Yes, the size calculation doesn't take the conversion into
    wide characters into account properly:
    it passes the buffer size in bytes instead of in characters.

    From OWASP:
    "The getUserInfo() function takes a username specified as a
    multibyte string and a pointer to a structure for user information,
    and populates the structure with information about the user. Since
    Windows authentication uses Unicode for usernames, the username
    argument is first converted from a multibyte string to a Unicode
    string. This function then incorrectly passes the size of
    unicodeUser in bytes rather than characters. The call to
    MultiByteToWideChar() may therefore write up to
    (UNLEN+1)*sizeof(WCHAR) wide characters, or
    (UNLEN+1)*sizeof(WCHAR)*sizeof(WCHAR) bytes, to the unicodeUser
    array, which has only (UNLEN+1)*sizeof(WCHAR) bytes allocated. If
    the username string contains more than UNLEN characters, the call
    to MultiByteToWideChar() will overflow the buffer unicodeUser."

------------------------------------------
   FOR YOU: IS THIS CODE SUSCEPTIBLE?

    void getUserInfo(char *username,
                     struct _USER_INFO_2 info) {
        WCHAR unicodeUser[sizeof(WCHAR)*UNLEN+1];
        MultiByteToWideChar(CP_ACP, 0, username, -1,
                            unicodeUser,
                            sizeof(unicodeUser));
        NetUserGetInfo(NULL, unicodeUser, 2,
                       (LPBYTE *)&info);
    }
------------------------------------------
    Yes, still: the array is larger now, but sizeof(unicodeUser) is
    still a count of bytes, and MultiByteToWideChar expects the buffer
    size in characters, so the count passed is still larger than the
    number of wide characters the array can hold.
    Passing sizeof(unicodeUser)/sizeof(unicodeUser[0]) would match
    the buffer.

** fuzz testing

*** description

    Based on https://www.owasp.org/index.php/Fuzzing

------------------------------------------
             FUZZ TESTING

Goal: find software faults automatically
      by automated search

Basic algorithm:
  - Use debugger to find likely
    vulnerabilities
  - Generate test data based on:
      - known problematic tests,
      - random guessing, or
      - heuristics
------------------------------------------
    the heuristics could be AI based

*** tools

------------------------------------------
        TOOLS FOR FUZZ TESTING

For web applications:
  ZAP
    https://www.owasp.org/index.php/
       OWASP_Zed_Attack_Proxy_Project
  JBroFuzz
    https://www.owasp.org/index.php/JBroFuzz

For web services:
  WSFuzzer
    https://www.owasp.org/index.php/WSFuzzer
  BurpSuite:
    https://portswigger.net/burp/download.html

For files (in apps):
  SDL MiniFuzz File Fuzzer
    https://www.microsoft.com/en-us/download/
       details.aspx?id=21769

For regular expressions:
  SDL Regex Fuzzer
    http://www.microsoft.com/en-us/download/
       details.aspx?id=20095
------------------------------------------
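
The "random guessing" branch of the basic algorithm can be sketched in a few lines of C. This is only an illustration under stated assumptions, not how any of the tools listed above work: store_name is a hypothetical function under test, and fuzz_store_name is a made-up driver that feeds it random strings of random length.

```c
#include <assert.h>
#include <stdlib.h>
#include <string.h>

#define BUFSIZE 16   /* hypothetical buffer size for the function under test */

/* Function under test (hypothetical): copies s into a fixed buffer.
   Returns 0 on success, -1 if s is too long to fit. */
int store_name(const char *s)
{
    char buf[BUFSIZE];
    assert(s != NULL);
    if (strlen(s) >= BUFSIZE) {
        return -1;
    }
    strcpy(buf, s);              /* safe only because of the check above */
    return 0;
}

/* Feed `trials` random strings of random length to store_name().
   Returns the number of inputs that were rejected. */
int fuzz_store_name(unsigned seed, int trials)
{
    char input[4 * BUFSIZE];
    int rejected = 0;
    srand(seed);                 /* fixed seed makes failures repeatable */
    for (int t = 0; t < trials; t++) {
        size_t len = (size_t)rand() % sizeof input;   /* 0 .. 63 */
        for (size_t i = 0; i < len; i++) {
            input[i] = (char)(1 + rand() % 255);      /* any non-zero byte */
        }
        input[len] = '\0';
        if (store_name(input) != 0) {
            rejected++;
        }
    }
    return rejected;
}
```

A driver like this only reveals a fault if the fault is observable, so it is usually run under a memory checker, for example by compiling with -fsanitize=address in gcc or clang, which turns a silent overflow into a reported error.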