Software Security

Introduction to Software Security

 

What is computer security?

  • Systems may fail for many reasons
  • Reliability deals with accidental failures
  • Usability deals with problems arising from operating mistakes by users
  • Security deals with intentional failures created by intelligent parties
    • Computing in the presence of an adversary

 

Examples of Software Problems

  • Therac-25 medical accelerator
    • Delivered massive radiation overdoses in at least six accidents (1985 to 1987), several of them fatal
  • Mars Climate Orbiter
    • Destroyed, units mismatch
  • AT&T long distance network
    • Switches crashed when they received a certain message
  • iPhone bug (2015)
    • Receiving a text containing certain characters crashed the phone

 

Adversarial Failures

  • Bugs are bad
  • Much worse when someone intentionally tries to exploit bugs
    • Force code into worst state
    • Violate security system
  • Common class of bugs: buffer overflow
  • Buffer overflow in Berkeley Unix finger daemon
    • Exploited by Morris Worm

 

Morris Worm (1988)

  • Vulnerabilities exploited: sendmail, finger, rsh/rexec, and weak passwords
  • The worm infected about 6,000 computers
  • First significant worm

 

Software Vulnerabilities are Everywhere

  • Heartbleed (OpenSSL)
  • WannaCry (Microsoft)
  • Cloudbleed

 

What is Software Security

  • System Model – used by several users simultaneously
  • Threat Model = adversary interacts with API provided by software
  • Properties = confidentiality, integrity, availability
    • Pick two??

 

Why do software vulnerabilities matter?

  • Software bugs are bad
  • An attacker who successfully exploits a vulnerability can:
    • Crash – no availability
    • Execute arbitrary code – no integrity
    • Obtain sensitive information – no confidentiality

 

Common Vulnerabilities

  • Buffer overflow
  • Integer overflow
  • Format string
  • Input validation
  • Race conditions

 

Buffer Overflows

 

What is a Buffer Overflow?

  • An anomalous condition in which a process attempts to store data beyond the boundaries of a fixed-length buffer
  • The extra data may overwrite adjacent memory locations that hold crucial data, which can cause a crash, incorrect results, or security and privacy breaches
  • One of the most common problems in C/C++ programs

Example code

  • Look at slides

 

How does memory work

  • The memory of a process is divided into three regions
    • Text: executable code from program
    • Heap: Dynamically allocated data
    • Stack: Local variables, function return addresses, stack pointer (grows/shrinks)

 

How the memory stack works

 

Anything wrong with this program?

strcpy does not perform bounds checking!

What happens if arg is larger than 16 bytes?
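
The program from the slide is not reproduced in these notes. A minimal sketch of the pattern the question points at, assuming a 16-byte local buffer and the hypothetical function names copy_unchecked and copy_checked, might look like this:

```c
#include <string.h>

#define BUF_SIZE 16

/* Unsafe pattern: strcpy copies until it finds the terminating NUL,
 * no matter how small the destination is. If arg holds 16 or more
 * characters, the copy runs past buf and overwrites adjacent stack data. */
void copy_unchecked(const char *arg) {
    char buf[BUF_SIZE];
    strcpy(buf, arg);
}

/* Safer variant: reject input that does not fit (including the NUL).
 * Returns 0 on success and -1 if arg is too long. */
int copy_checked(const char *arg) {
    char buf[BUF_SIZE];
    if (strlen(arg) >= sizeof buf) {
        return -1;
    }
    strcpy(buf, arg);   /* safe here: the length was checked above */
    return 0;
}
```

Calling copy_checked with a 15-character string succeeds, while a 16-character string is rejected because there would be no room for the terminating NUL.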

 

The attacker can:

  • Crash the program
  • Execute arbitrary code
    • Call libc functions

 

Buffer Overflow Summary

  • Requires an unsafe function
    • strcpy/strcat/strcmp
    • gets
    • printf/scanf
    • memcpy
  • The overflowing data must place the address of the attack code in the return-address position
  • The attacker must know where the buffer will be when the function is called

 

Software Security – Other Vulnerabilities

 

Introduction

 

Stack-based Exploits

  • Stack Smashing – like an axe
  • Return to libc – steak knife
  • Format string – scalpel

 

 

Bad coding Practices

  • Unsafe libc functions
    • strcpy/strcat/strcmp
    • gets
    • printf/scanf
    • memcpy 
  • What can we do?
    • strncpy/strncat/strncmp
    • fgets
  • Two steps for finding trivial vulnerabilities
    • Look for unsafe functions
    • Trace attacker-controlled input to these functions
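
The bounded functions still need care: strncpy does not write a terminating NUL when the source fills the whole destination. A minimal sketch of a wrapper that always terminates, using the hypothetical name bounded_copy, could be:

```c
#include <string.h>

/* Copies src into dst (capacity dst_size) and always NUL-terminates.
 * Returns 1 if the whole string fit, 0 if it was truncated. */
int bounded_copy(char *dst, size_t dst_size, const char *src) {
    if (dst_size == 0) {
        return 0;
    }
    /* strncpy does not add a NUL when src has dst_size or more
     * characters, so copy at most dst_size - 1 and terminate by hand. */
    strncpy(dst, src, dst_size - 1);
    dst[dst_size - 1] = '\0';
    return strlen(src) < dst_size;
}
```

The return value lets the caller decide whether silent truncation is acceptable for the input in question.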

 

Integer Overflow

Three main types of Integer Overflow

  • Assign a large type to a small type
  • Arithmetic overflow
  • Signedness bug

 

Large Type / Small Type

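int i = 0xAABBCCDD;
short s = i;   // narrowing conversion: s keeps only the low 16 bits (0xCCDD)
char c = i;    // narrowing conversion: c keeps only the low 8 bits (0xDD)

// Type casting problem: the high-order bits are silently discarded
// (assuming a 16-bit short and an 8-bit char)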
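
To make the truncation concrete, here is a small sketch that uses fixed-width unsigned types instead of int, short, and char (an assumption made so the result does not depend on the platform). The helper names low16 and low8 are hypothetical:

```c
#include <stdint.h>

/* Keep only the low 16 bits of a 32-bit value. */
uint16_t low16(uint32_t value) {
    return (uint16_t)value;
}

/* Keep only the low 8 bits of a 32-bit value. */
uint8_t low8(uint32_t value) {
    return (uint8_t)value;
}
```

For the value 0xAABBCCDD, low16 yields 0xCCDD and low8 yields 0xDD.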

 

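struct s {
    unsigned short len;
    char buf[];
};

int len = strlen(str);
struct s *p = malloc(len + 3);   // room for the len field, the text, and the NUL
p->len = len;                    // truncated if len does not fit in an unsigned short
strcpy(p->buf, str);

char buf[1000];
if (p->len < sizeof buf)         // passes when the truncated len is small
    strcpy(buf, p->buf);         // overflows buf when str is actually longer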
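
One way to close this hole is to reject strings whose length cannot be represented in the length field before storing it. The following is a sketch under that assumption, with the hypothetical names struct msg and msg_create:

```c
#include <limits.h>
#include <stdlib.h>
#include <string.h>

struct msg {
    unsigned short len;
    char buf[];          /* flexible array member */
};

/* Returns a newly allocated message, or NULL if str is too long for
 * the unsigned short length field or the allocation fails. */
struct msg *msg_create(const char *str) {
    size_t len = strlen(str);
    if (len > USHRT_MAX) {
        return NULL;     /* would be truncated when stored in len */
    }
    struct msg *p = malloc(sizeof *p + len + 1);
    if (p == NULL) {
        return NULL;
    }
    p->len = (unsigned short)len;
    memcpy(p->buf, str, len + 1);   /* includes the terminating NUL */
    return p;
}
```

With this check in place, the stored len always equals the real string length, so a later comparison against a destination size can be trusted.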

 

Arithmetic Overflow

  • The result of an arithmetic operation is too large for a variable
  • Example

unsigned int a = 0xffffffff;
unsigned int b = 1;
unsigned int c = a + b;

// Here c will be 0 because the sum wraps around (assuming a 32-bit unsigned int)
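
A common guard is to test whether the addition would wrap before performing it. A minimal sketch, assuming 32-bit unsigned operands and the hypothetical name checked_add_u32:

```c
#include <stdint.h>

/* Adds a and b. Returns 1 and stores the sum in *out on success,
 * returns 0 (leaving *out untouched) if the sum would wrap. */
int checked_add_u32(uint32_t a, uint32_t b, uint32_t *out) {
    if (a > UINT32_MAX - b) {
        return 0;
    }
    *out = a + b;
    return 1;
}
```

The same idea applies to size computations passed to malloc, where a wrapped sum leads to an allocation that is too small.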

 

Signedness bug

  • Compare two signed integers
  • Compare signed and unsigned integers
  • Treating a signed negative number as unsigned
  • Also be careful with type casting
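
As an illustration of the signed comparison and the signed-to-unsigned case, consider a length check with a hypothetical limit MAX_LEN. The function names below are also hypothetical:

```c
#include <stddef.h>

#define MAX_LEN 64

/* Buggy check: a negative len passes because both sides are signed,
 * yet a later conversion to size_t would turn it into a huge count. */
int naive_len_ok(int len) {
    return len < MAX_LEN;
}

/* Safer check: reject negative values before any unsigned conversion. */
int strict_len_ok(int len) {
    return len >= 0 && len < MAX_LEN;
}

/* What a signed length becomes when passed where size_t is expected. */
size_t as_size(int len) {
    return (size_t)len;
}
```

A length of -1 passes naive_len_ok, but as_size(-1) is the largest value size_t can hold, which is what a function such as memcpy would receive as its byte count.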

 

Return to LIBC

 

Why provide our own shellcode

  • OSes already have lots of libraries
  • Idea: point to libc instead of back into the stack
    • system(); exec*()
  • Modify “arguments” in addition to return address
  • Depending on situation can “chain” calls
    • setuid(); system(…)..

Above example:

  • Instead of returning to program code, the overwritten return address points to a function in libc (e.g., system), and the path of a shell is supplied as its argument
  • Because system is entered through a return rather than a call, it treats the next word on the stack as its own return address and the word after that as its argument. Pointing that return address at another function lets the attacker chain further calls and returns.

 

Format String

 

Exploit of printf family of functions

  • printf, fprintf, sprintf, snprintf, vprintf, vfprintf
  • Each one takes a format string
    • %c, %d, %i, %u, %x, %s
    • %n = store the number of characters written so far into the int that the matching argument points to

 

Ways printf can be exploited:

  • Reading the memory stack
  • Reading arbitrary memory
  • Overwriting memory

 

Examples

printf("%5d\n", 37); // prints 37 right-aligned in a field of width 5

int p;
printf("hi %d%n", 24, &p); // prints "hi 24", then stores 5 (the number of characters written so far) in p

 

This can be exploited as such:

 

char *evil; // evil = "%08x %08x %08x %08x"
printf(evil); // printf treats evil as a format string
// printf therefore reads whatever follows on the stack as its arguments and prints those values

 

How can it be fixed?

  • Use the fixed-format string
    • printf("%s", user_data);
  • Don't use %n!
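
The fixed-format rule can be sketched as a small helper. The name format_untrusted is hypothetical; snprintf is the standard C function:

```c
#include <stdio.h>

/* Formats untrusted text into out using a fixed "%s" format, so any
 * conversion specifiers inside user_data are copied literally instead
 * of being interpreted. Returns the value reported by snprintf. */
int format_untrusted(char *out, size_t out_size, const char *user_data) {
    return snprintf(out, out_size, "%s", user_data);
}
```

Passing the string "%08x %n %s" through this helper produces exactly those ten characters in out, and no stack contents are read or written.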

 

Software Security Defenses

 

What else can we do?

 

  • Use safe library functions (e.g., strncpy, strncat)
  • Use a “safe” language
  • Lots of testing
  • Static code analysis
  • NX/DEP/W^X
  • Runtime checks (stack canary)
  • Do our own bounds checking
  • Shadow stack
  • Address obfuscation 

 

Static Source Code Analysis

  • Statically check source code to detect buffer overflows (and other vulnerabilities)
  • Main idea: automate the code review process
  • Find lots of bugs

 

Possible bugs in source code

  • Crash Causing Defects
  • Null pointer dereference
  • Use after free
  • Double free
  • Array indexing errors
  • Mismatched array new/delete
  • Potential stack overrun
  • Potential heap overrun
  • Return pointers to local variables
  • Logically inconsistent code
  • Uninitialized variables
  • Invalid use of negative values
  • Passing large parameters by value
  • Underallocations of dynamic data
  • Memory leaks
  • File handle leaks
  • Network resource leaks
  • Unused values
  • Unhandled return codes
  • Use of invalid iterators 

 

Do our own bounds checking

char buf[80];

void function() {
    int len = read_int_in();
    char *p = read_string_in();
    if (len < 0 || (size_t)len > sizeof buf) {   // reject negative and oversized lengths
        error("length too large!");
        return;
    }
    memcpy(buf, p, len);
}

 

User input needs validation

  • Many sources of input for local applications:
    • Command line arguments
    • Environment variables
    • Configuration files and other files
    • Network packets
  • Other user inputs that need validation:
    • Web form input  
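
As one example of validating such input, a numeric command line argument or environment variable can be parsed strictly and range-checked before use. This is a sketch using the standard strtol function. The name parse_int_in_range and the choice of limits are assumptions:

```c
#include <errno.h>
#include <stdlib.h>

/* Parses text as a decimal integer in [min, max].
 * Returns 1 and stores the value in *out on success, 0 otherwise. */
int parse_int_in_range(const char *text, long min, long max, long *out) {
    char *end = NULL;
    if (text == NULL || *text == '\0') {
        return 0;                      /* missing or empty input */
    }
    errno = 0;
    long value = strtol(text, &end, 10);
    if (errno == ERANGE || *end != '\0') {
        return 0;                      /* overflow or trailing junk */
    }
    if (value < min || value > max) {
        return 0;                      /* outside the accepted range */
    }
    *out = value;
    return 1;
}
```

For instance, a port number taken from argv could be accepted only if parse_int_in_range(argv[1], 1, 65535, &port) returns 1.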

 

NX/DEP/W^X

No-eXecute bit

  • Can mark certain areas of memory as non-executable

 

Data Execution Prevention/Write XOR Execute

  • Separation of code and data
  • Force hardware-level exceptions if you try to execute those memory regions

 

Prevents basic stack-based exploits

Problems:

  • Does not defend against return-to-libc (and other) attacks
  • Can break backward compatibility with certain applications 

 

Stack Canary

  • Named after the canaries that miners carried to warn them when the air was unsafe
  • Used to detect a buffer overflow before execution of malicious code can occur  
  • Place a random value before the return address  
  • Check this random value before returning  
  • Reactive, not preventative

An overflow that reaches the return address smashes the canary first, so checking the canary's value before returning reveals the attack.
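
The check can be simulated in plain C. This is only an illustration, not how a compiler lays out a real stack frame: the frame is modelled as a struct, the canary is a fixed constant rather than a random value, and the names struct frame and copy_and_check are hypothetical:

```c
#include <stdint.h>
#include <string.h>

#define CANARY_VALUE 0xC0FFEE42u   /* a real canary is chosen at random */

struct frame {
    char buf[16];
    uint32_t canary;   /* sits between the buffer and the return address */
};

/* Simulates a function that copies n bytes into its frame without
 * checking against the size of buf. Returns 1 if the canary survived,
 * 0 if the write ran past buf and changed it. */
int copy_and_check(const unsigned char *data, size_t n) {
    struct frame f = { .canary = CANARY_VALUE };
    if (n > sizeof f) {
        n = sizeof f;              /* keep the simulation itself in bounds */
    }
    memcpy(&f, data, n);           /* the "overflowing" write */
    return f.canary == CANARY_VALUE;
}
```

Writing 16 bytes leaves the canary intact, while writing 20 bytes overwrites it and the check fails, which is the point at which a protected program would abort instead of returning.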

 

Shadow Stack

  • Keep an extra copy of return address in Kernel Memory
    • Only return if addresses match up
    • Doesn’t protect other memory, registers, etc. 
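
The matching rule can be sketched as a toy model in user space. A real shadow stack lives in protected memory and is maintained by the compiler, runtime, or hardware. The array, its depth, and the function names here are hypothetical:

```c
#include <stddef.h>
#include <stdint.h>

#define SHADOW_DEPTH 64

static uintptr_t shadow[SHADOW_DEPTH];
static size_t shadow_top = 0;

/* On call: record the return address.
 * Returns 0 if the shadow stack is full, 1 otherwise. */
int shadow_push(uintptr_t return_address) {
    if (shadow_top == SHADOW_DEPTH) {
        return 0;
    }
    shadow[shadow_top++] = return_address;
    return 1;
}

/* On return: pop the saved copy and compare it with the address found
 * on the regular stack. Returns 1 only if the two match. */
int shadow_check_and_pop(uintptr_t address_on_stack) {
    if (shadow_top == 0) {
        return 0;
    }
    return shadow[--shadow_top] == address_on_stack;
}
```

If an overflow has replaced the return address on the regular stack, the comparison fails and the program can stop before control is transferred.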

 

Address Obfuscation

  • Randomize address space
    • Introduce artificial diversity
    • Place stack, buffers at random location
    • Thus, attackers won’t know precise address to point control flow
    • E.g., PaX ASLR, Windows Vista and later, etc.

 

PaX ASLR (Address Space Layout Randomization)

 

PaX ASLR is part of the PaX patch set, a collection of security enhancements for the Linux kernel. ASLR makes exploits such as buffer overflows and return-to-libc attacks harder by randomizing where key areas of a process are placed, including the base of the executable and the positions of the stack, heap, and libraries.

Limitations of ASLR

  • Several limitations:
    • 16 bits of randomness is not a huge number (65,536 possibilities)
    • Doesn’t re-randomize on fork()
    • So an attacker can just try over and over until a guess matches the randomization
    • 64-bit addressing helps a lot here
    • Sometimes libraries (e.g., DLLs) aren’t randomized 
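
To put the brute-force argument in numbers, the following sketch computes the size of the search space for a given number of random bits. The function names are hypothetical, and the average assumes the layout stays fixed between attempts and the attacker never repeats a guess:

```c
#include <stdint.h>

/* Number of distinct layouts for a given number of random bits
 * (valid for bits < 64). */
uint64_t layout_count(unsigned bits) {
    return (uint64_t)1 << bits;
}

/* Average number of guesses, rounded down, when the layout stays fixed
 * and every guess is different. */
uint64_t average_guesses(unsigned bits) {
    return (layout_count(bits) + 1) / 2;
}
```

With 16 bits there are 65,536 layouts and about 32,768 guesses are needed on average, which a forking server that keeps the same layout makes practical. Every extra bit of randomness doubles both numbers.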

 

Solutions

 

  • Use a safe language…
    • Java, Ruby, etc.
    • Enforce bounds checking, garbage collection
    • Type safety
    • Don’t ever let programmers near the memory!
    • We can even run untrusted code in a sandbox 
  • Other issues: 
    • Vulnerabilities in JVM
    • Thread issues
    • Load malicious libraries
    • Efficiency? 
  • Interpreted languages aren’t always our friend
    • The new frontier is finding bugs in VMs
    • If you can run arbitrary “safe” code, then it gets lots of chances to work its way out