CS161 Midterm 1
NOP sledding
Pad the injected payload with a long run of no-op (NOP) instructions (i.e. instructions that just say go to the next instruction) placed in front of the shellcode. The overwritten return address then only has to land somewhere inside the sled: the CPU "slides" through the NOPs until it reaches the shellcode.
C things to remember
1. char greeting[64] = "Welcome back, "; size_t count = sizeof(greeting) - strlen(greeting); // size left --> if greeting is the array itself, sizeof(greeting) is 64 and count is 50. If greeting has decayed to a pointer (e.g. it was passed in as a function parameter), sizeof(greeting) evaluates to sizeof(char *), which is 4 on 32-bit x86 --> 4 - 14 cannot go negative because size_t is unsigned, so the subtraction wraps around and yields a very large unsigned number 2. read(int fildes, void *buf, size_t nbyte) --> reads up to nbyte bytes from file descriptor fildes (usually 0, i.e. stdin) into buffer buf. If a negative length is converted to size_t it becomes a huge nbyte, so read() can overflow buf 3. if the user can supply a character array (essentially a string in C) that is passed directly as the format string, then the user can inject their own format specifiers (e.g. %n) unless the print call already fixes the format string
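A minimal sketch of the wrap-around in item 1, simulated in Python (whose integers never wrap on their own). The 32-bit width of size_t and the pointer size of 4 are assumptions matching 32-bit x86; `size_t_sub` is a hypothetical helper, not a C or Python library function.

```python
# Simulate 32-bit size_t arithmetic (assumption: 32-bit x86, sizeof(char *) == 4).
SIZE_T_BITS = 32
SIZE_T_MASK = (1 << SIZE_T_BITS) - 1


def size_t_sub(a: int, b: int) -> int:
    """Return a - b as an unsigned 32-bit value, wrapping the way C's size_t does."""
    return (a - b) & SIZE_T_MASK


greeting = b"Welcome back, "  # strlen(greeting) == 14

# Real array in scope: sizeof(greeting) is 64, so 50 bytes remain.
count_array = size_t_sub(64, len(greeting))

# Array decayed to a pointer: sizeof yields 4, and 4 - 14 wraps around.
count_pointer = size_t_sub(4, len(greeting))

print(count_array)    # 50
print(count_pointer)  # 4294967286
```

A count like the second one, passed on as the nbyte argument of read(), is what turns the sizeof mistake into a buffer overflow.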
Registers to know
1. EIP/RIP 2. EBP 3. ESP
asymmetric cryptography methods
1. El Gamal 2. RSA
vulnerable places for heap based overflow attacks
1. Old versions of the GNU (GCC) implementation of malloc(), which keep chunk metadata next to user data 2. GOT (Global Offset Table) -- maps functions to memory addresses
methods to achieve integrity
1. MACs (Message Authentication Codes) 2. Digital Signatures
techniques by attackers to overcome challenges with stack-based overflow
1. NOP sledding 2. return-to-libc 3. jump-to-register (trampolining)
which of the following is true for an encryption algorithm with C_i = E_k(C_{i-1} ⊕ P_i) ⊕ C_{i-1}
1. P_i = D_k(C_i ⊕ C_{i-1}) ⊕ C_{i-1} 2. flipping any bit in P_i changes C_i, and flipping any bit in C_i changes the decrypted P_i 3. decryption is parallelizable, encryption is not 4. if C_i is lost, P_i and P_{i+1} cannot be decrypted, but P_{i-2}, P_{i-1}, and P_{i+2} still can
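The decryption formula and the lost-block behavior can be checked with a toy model. This is only a sketch: `E` and `D` below are a made-up 8-bit keyed permutation standing in for a real block cipher (an assumption, and not secure), and each "block" is a single byte.

```python
# Toy 8-bit "block cipher": a keyed permutation on 0..255. NOT secure.
def E(k: int, x: int) -> int:
    # XOR with the key, then an invertible affine map mod 256.
    return (((x ^ k) * 5) + 3) % 256


def D(k: int, y: int) -> int:
    # 205 is the inverse of 5 modulo 256 (5 * 205 = 1025 = 4 * 256 + 1).
    return (((y - 3) * 205) % 256) ^ k


def encrypt(k: int, iv: int, plaintext: list) -> list:
    blocks, prev = [], iv
    for p in plaintext:
        c = E(k, prev ^ p) ^ prev      # C_i = E_k(C_{i-1} XOR P_i) XOR C_{i-1}
        blocks.append(c)
        prev = c                       # each block needs the previous one: sequential
    return blocks


def decrypt(k: int, iv: int, ciphertext: list) -> list:
    # P_i = D_k(C_i XOR C_{i-1}) XOR C_{i-1}; every block is independent: parallelizable
    prevs = [iv] + ciphertext[:-1]
    return [D(k, c ^ prev) ^ prev for c, prev in zip(ciphertext, prevs)]


msg = list(b"attack at dawn")
ct = encrypt(0x5A, 0x10, msg)
print(decrypt(0x5A, 0x10, ct) == msg)  # True
```

Corrupting a single ciphertext block (say index 3) and decrypting again changes only plaintext blocks 3 and 4, which matches item 4.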
3 operations for TLB
1. R: can read 2. W: can write 3. X: can execute
stack canary limitations
1. can only protect against stack-based overflow attacks, not heap overflows 2. local variables can still be overwritten (since the stack canary only protects the return address) 3. the canary is only checked when the function returns, so an attack that takes effect before the function is over (e.g. through an overwritten local variable or function pointer) is never validated against the canary
encryption/decryption parallelizable
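1. decryption parallelizable: CBC/CFB/CTR can do this (CBC and CFB decryption of block i needs only the ciphertext blocks C_{i-1} and C_i, which are all known up front; CTR because every block depends only on its corresponding counter value, and all counter values can be created at once). OFB cannot, for instance, because each keystream block is the encryption of the previous one, so the cipher has to be run repeatedly in sequence 2. encryption parallelizable: not if C_i is defined in terms of C_{i-1} -- cannot do it for CBC or CFB (nor for OFB); CTR can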
Digital Signature Process
1. each person has a public-private key pair (private key = signing key, public key = verification key) 2. signer feeds the data to a hash function 3. signer applies the private key to the hash value, resulting in data with the signature at the bottom 4. verifier hashes the received data using the same hash function, applies the signer's public key to the signature, and sees if the two results are equal
ASLR limitations
1. entropy reduction attacks 2. address space information disclosure 3. doesn't protect against local data manipulation 4. not all apps work with it (some opt out via a linker flag)
integrity fails
1. if producing the ciphertext requires no secret info, e.g. schemes that encrypt with a public key (anyone can create a valid ciphertext) 2. if only one of the 2 parties possesses the secret key that encrypted the message 3.
why does a block cipher need to be a permutation
1. it must be invertible --> no two inputs may map to the same output, otherwise a ciphertext block could not be decrypted back to a unique plaintext block. A permutation on the n-bit blocks is exactly such a one-to-one mapping
format string vulnerabilities allow an attacker what
1. learn contents of stack frame 2. learn other memory 3. write to other addresses in memory 4. overwrite a saved return address
heap-based overflow vs. stack-based overflow
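1. heap-based overflows are more complex and harder to implement 2. heap-based overflows typically change data (or heap metadata), whereas stack-based overflows directly alter control flow by overwriting the saved return address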
requirements for IVs for CBC
1. must be non-repeating 2. must be unpredictable (i.e. attacker can't know what IV is until after encryption)-->just generate a random IV for every message you encrypt
properties for digital-signature scheme
1. Nonforgeability: must be difficult for attacker Eve to forge Alice's signature S_Alice(M) on a message M 2. Nonmutability: must be difficult for an attacker to take a signature S(M) and convert it into a valid signature on a different message N --> together these give nonrepudiation, i.e. it is difficult for Alice to claim she didn't sign a document M when she did
precondition/postcondition/invariant
1. precondition: what must hold before the call so that it avoids memory errors, e.g. a. s1 must be null terminated b. length of s1 must be less than length of s2 OR s2 must be null terminated 2. postcondition: what is guaranteed after the call. NOT HARD, it just states what the function does, e.g. "rounds the input down to the nearest 10th" 3. invariant: true both before and after (whereas the other 2 need only hold before or after, not necessarily both), e.g. at every loop iteration 0<=i<=strlen(s1)
How do you reduce size of the TCB
1. privilege separation -- you end up with more components, but a break-in to an individual component no longer violates every security goal 2. reduce the program's reliance on third-party components and software (e.g. libraries)
challenges for attackers with stack-based overflow
1. processes can't access the address spaces of other processes 2. the address space of a given process is unpredictable -- i.e. you don't know exactly where the buffer resides, so you don't know what will happen when you overflow it
ways to be nondeterministic in symmetric cryptosystem
1. randomization (key or something is randomized) 2. stateful (key values depend on state)
way to lose confidentiality with encryption
1. reuse key K for different purposes (e.g. if an eavesdropper sees MAC_K(C) and C = E_K(M) at the same time) 2. reveal the message through some other value sent alongside the ciphertext 3. anything deterministic (e.g. hash functions, textbook RSA)
vulnerabilities to look out for on test:
1. format string -- if you see printf(blah) where blah is user-controlled and there is no fixed format string such as "%s" as the first argument, then the program is vulnerable because the attacker can supply %n 2.
for this class what ISA do we use (and why)
32b x86 (because it is common and weak)
on 32b x86, how big is the canary value
32 bits (4 bytes) long, i.e. one machine word (it is 64 bits / 8 bytes on 64-bit x86)
PRNG
A pseudo-random number generator (PRNG) is an algorithm that generates a sequence of numbers that seems random but is actually completely predictable given the seed. Example (linear congruential generator): x_{i+1} = (a*x_i + b) mod n, where x_0 is a given random number (the seed) and a, b are chosen at random with 0 <= a, b <= n-1
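A small sketch of the recurrence above, to show that the output is fully determined by the seed. The parameter values (seed 7, a = 5, b = 3, n = 16) are arbitrary choices for illustration, and `lcg` and `take` are hypothetical helper names.

```python
def lcg(seed: int, a: int, b: int, n: int):
    """Yield x_1, x_2, ... where x_{i+1} = (a * x_i + b) mod n and x_0 = seed."""
    x = seed
    while True:
        x = (a * x + b) % n
        yield x


def take(gen, count: int) -> list:
    """Collect the next `count` values from a generator."""
    return [next(gen) for _ in range(count)]


first = take(lcg(seed=7, a=5, b=3, n=16), 5)
again = take(lcg(seed=7, a=5, b=3, n=16), 5)

print(first)           # [6, 1, 8, 11, 10]
print(first == again)  # True: same seed, same "random" sequence
```

Anyone who learns the seed and the parameters can reproduce every future output, which is why the seed has to come from a good entropy source.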
ASLR stands for what
Address Space Layout Randomization
MAC process
Alice, before sending a message, computes A = h(K || M), where A = the MAC, h = a hash function, K = the secret key Alice and Bob share, M = the plaintext message. She sends (M, A). Bob receives (M', A') and computes the MAC for the received message himself: A'' = h(K || M'). If A'' = A', then Bob is assured M' = M (i.e. the received message is the true message)
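A minimal sketch of this exchange, assuming SHA-256 plays the role of h (the notes do not say which hash is used). `mac` and `verify` are hypothetical helper names; `hashlib.sha256` and `hmac.compare_digest` are standard-library calls.

```python
import hashlib
import hmac


def mac(key: bytes, message: bytes) -> bytes:
    """A = h(K || M), with SHA-256 standing in for h (an assumption)."""
    return hashlib.sha256(key + message).digest()


def verify(key: bytes, message: bytes, tag: bytes) -> bool:
    """Bob recomputes A'' = h(K || M') and compares it with the received A'."""
    return hmac.compare_digest(mac(key, message), tag)  # constant-time comparison


shared_key = b"key shared by Alice and Bob"
sent_tag = mac(shared_key, b"pay Bob $10")

print(verify(shared_key, b"pay Bob $10", sent_tag))    # True
print(verify(shared_key, b"pay Bob $1000", sent_tag))  # False: message was altered
```

The plain h(K || M) construction is kept here because it is the one in the notes; with hashes like SHA-256 it is open to length-extension attacks, so practical code would more likely use the HMAC construction from the standard `hmac` module.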
RSA Digital Signature
Bob signs M by computing S = M^d mod n with his private key d. Anyone can verify the signature using Bob's public key (e, n) by checking that M = S^e mod n
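A toy run of the signing and verification equations. The primes 61 and 53 and the exponent 17 are tiny illustrative values (assumptions, far too small to be secure), and the message is signed directly rather than hashed first, as in the equations above.

```python
# Textbook RSA signature with tiny, insecure parameters (illustrative only).
p, q = 61, 53
n = p * q                  # 3233
phi = (p - 1) * (q - 1)    # 3120
e = 17                     # public exponent, relatively prime to phi
d = pow(e, -1, phi)        # private exponent; modular inverse needs Python 3.8+


def sign(m: int) -> int:
    """S = M^d mod n, computed with the private key d."""
    return pow(m, d, n)


def verify(m: int, s: int) -> bool:
    """Accept if M == S^e mod n, using only the public key (e, n)."""
    return pow(s, e, n) == m


signature = sign(65)
print(verify(65, signature))  # True
print(verify(66, signature))  # False: signature does not match this message
```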
public-key cryptography
Bob releases public key to world (e, n)
x86
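Intel's instruction set architecture family; "x86" on its own usually describes the 32-bit variant and the operating systems and software built for it.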
Complete mediation
Everyone goes through a security check process (breadth)
EBP stands for
Extended Base Pointer
EIP/RIP stands for what
Extended Instruction Pointer/Return Instruction Pointer
ESP stands for
Extended Stack Pointer
OFB
The IV is repeatedly encrypted to produce a keystream: V_0 = IV, V_i = E_K(V_{i-1}). Encryption: C_i = V_i ⊕ P_i. Ciphertext: C = C_0 || ... || C_l. Decryption: P_i = V_i ⊕ C_i
IND-CPA stands for
Indistinguishability under chosen-plaintext attack
IV stands for what
Initialization Vector
ECB
M is broken into n-bit blocks. C_i = E_K(M_i) [encryption]. C = C_1 || C_2 || ... || C_L (L = last block). P_i = D_K(C_i) [decryption]. (BTW P_i = M_i and Z_i = V_i, just different notation.) VERY FLAWED: redundancy in blocks reveals info --> if M_i = M_j then C_i = C_j, so ECB LEAKS INFO about the plaintext
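The leak is easy to see with a toy cipher. In this sketch `E` is just XOR with the key, which is a valid permutation on 4-byte blocks but not a secure cipher; it stands in for a real block cipher, and the key and message are made up.

```python
BLOCK = 4  # toy block size in bytes


def E(key: bytes, block: bytes) -> bytes:
    # Toy permutation on 4-byte blocks (XOR with the key). NOT a secure cipher.
    return bytes(b ^ k for b, k in zip(block, key))


def ecb_encrypt(key: bytes, message: bytes) -> list:
    """C_i = E_K(M_i): every block is encrypted independently with the same key."""
    assert len(message) % BLOCK == 0, "pad the message to a whole number of blocks"
    blocks = [message[i:i + BLOCK] for i in range(0, len(message), BLOCK)]
    return [E(key, m) for m in blocks]


ct = ecb_encrypt(b"\x13\x37\xc0\xde", b"YES!NO!!YES!")
print(ct[0] == ct[2])  # True: M_1 == M_3, so C_1 == C_3
print(ct[0] == ct[1])  # False
```

An eavesdropper who never learns the key still learns that the first and third blocks of the message are identical.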
what prevents someone from writing code into memory to execute
NX bit (i.e. a bit marking memory as non-executable, so someone who writes code into that memory can't then execute it... a page can be writable or executable, but not both at once)
MACs vs. Digital Signatures
MACs use a secret key shared by sender and receiver, so either party could have produced the tag. Digital signatures use public-key cryptography: only the sender of message M can produce the signature, using their private key, and anyone can verify it using the sender's public key. To secure a message with both, the message is first digitally signed as explained above, then this bundle is encrypted with the receiver's public key.
What security principle is being used: reuse same IV for counter mode, because you think an attacker won't notice it
Security through Obscurity
Arc Injection
Synonymous with return-to-libc attack--when you get program to use functions in libc (the set of instructions executed is the arc)
Defense in depth
Those who are checked are checked thoroughly with multiple checks in process (depth)
TOCTOU stands for
Time of Check/Time of Use
TLB
Translation Lookaside Buffer
global property for TLB
W XOR X
synonyms for executable space protection
W^X (Write XOR Execute), DEP (Data Execution Prevention)
Kerckhoffs's principle
a cryptography system (cryptosystem) should remain secure even if everything about it except the key is known to the attacker, i.e. it only requires that the key is unknown to the attacker
what is this: EIP/RIP
a register that points to the next instruction (i.e. it holds the address of the NEXT instruction to execute)
TOCTOU
a vulnerability that occurs when the validity check on an object and the actual use of that object are not performed as one operation that can't be interrupted, so the object can change in between. i.e. any time code acts on a validity check, it relies on the checked object being unchanged since the check
ASLR does what
a way to mitigate stack-based overflow attacks -- randomly rearranges the layout of a process's address space, making it hard to predict where to jump in order to execute the code
where is eip located
the saved eip (return address) sits on the stack just above the saved ebp; it holds the address of the instruction we want to return to in the parent (calling) function
AES
Advanced Encryption Standard, a symmetric block cipher with 128-bit blocks and keys that are 128, 192, or 256 bits long. Process for the 128-bit version: 10 rounds, each performing an invertible transformation on a 128-bit array called the state. X_0 = P ⊕ K [X_0 = first state, plaintext P XORed with key K]; round i builds X_i from X_{i-1}. Lookup tables can optimize speed. Steps in a round: 1. SubBytes: use a lookup table (S-box) to substitute each byte 2. ShiftRows: a permutation of the bytes in each row 3. MixColumns: a matrix multiplication step, like a Hill cipher 4. AddRoundKey: an XOR step with a round key derived from the 128-bit encryption key
at what number does overflow occur for 2's complement
after 0x7fff ffff
at what number does overflow occur for unsigned integers
after 0xffff ffff
buffer
allotted memory used to temporarily store data while it is being moved somewhere else -- in security this is often where user input is stored
Return-Oriented Programming
an attack where you don't inject any code (so it works even with executable space protection and code signing): you hijack control flow and bounce around between short instruction sequences already in the program ("gadgets"), chaining them in an order that is malicious
standard stack frame layout
from higher to lower addresses: args, return address, saved frame pointer (i.e. value of parent's ebp), exception handlers, canary, local variables, callee-saved registers
security through obscurity
assuming an attacker won't know--which is a bad thing to assume and is not true security
Stack Smashing attack
attacker causes a stack-based overflow that overwrites the return address, so that when the function returns, execution jumps to an address of the attacker's choosing rather than the intended return address
Stack-based buffer overflow
attacker inputs something that goes beyond the size of a buffer on the stack, which overwrites whatever is adjacent to the buffer in memory: local variables and, further along, control data such as the saved return address
Trampolining
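attackers know that most processes start off by loading commonly known external libraries into their address space, so they look in those libraries for an instruction that jumps to an address stored in one of the processor's registers (i.e. jump-to-register, e.g. jmp esp). Pointing the return address at that instruction makes the processor "trampoline" into the attacker's buffer without the attacker knowing the buffer's exact address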
authenticity vs. integrity
authenticity implies integrity, but not necessarily other way around
what are good/bad PRNGs for sources of entropy for a block cipher
bad: 1. based on time (an attacker can narrow the window of time to dramatically reduce entropy) 2. process IDs good: 1. hardware based (computer hardware, weather, dice rolls etc.) 2. any strong source ⊕ weak source (the strong one dominates, provided the two sources are independent)
bit flipping in CBC vs. CTR
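bit flipping can cause integrity loss for both (i.e. the decryption result can be different). CTR: flipping a bit of C_i flips exactly the same bit of P_i. CBC: flipping a bit of C_i garbles all of P_i and flips the same bit of P_{i+1}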
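A sketch of bit flipping against counter mode, where a flipped ciphertext bit flips exactly the same plaintext bit. The keystream here is built from SHA-256 over key, nonce, and counter as a stand-in for E_K(nonce || i); that substitution, the key, the nonce, and the message are all assumptions for illustration.

```python
import hashlib


def keystream_block(key: bytes, nonce: bytes, i: int) -> bytes:
    # Stand-in for E_K(nonce || i): SHA-256 used as a keyed function (assumption).
    return hashlib.sha256(key + nonce + i.to_bytes(8, "big")).digest()


def ctr_xor(key: bytes, nonce: bytes, data: bytes) -> bytes:
    """XOR data with the keystream; the same call encrypts and decrypts."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        ks = keystream_block(key, nonce, offset // 32)
        out.extend(d ^ k for d, k in zip(data[offset:offset + 32], ks))
    return bytes(out)


key, nonce = b"k" * 16, b"n" * 8
ct = bytearray(ctr_xor(key, nonce, b"PAY 100 TO BOB"))

# The attacker knows the message format but not the key.
ct[4] ^= ord("1") ^ ord("9")
tampered = ctr_xor(key, nonce, bytes(ct))
print(tampered)  # b'PAY 900 TO BOB'
```

Nothing in the decryption step notices the change, which is why encryption alone does not give integrity and a MAC is needed on top.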
symmetric cryptosystem building block
block cipher (used in all the ones we talk about) --> maps n bits to n bits using a k-bit random key K. E_K must be a permutation on the n-bit strings, called blocks
symmetric cryptography methods
block cipher modes: 1. CBC: cipher block chaining 2. CTR: counter mode 3. OFB: output feedback mode 4. ECB: electronic code book 5. CFB: cipher feedback mode
symmetric key cryptography
both endpoints share the same secret key K
birthday attack
brute-force attack on a hash function's collision resistance. Birthday paradox: as soon as there are 23 or more people in a room, there is a more than 50% chance that 2 of them share a birthday. With a b-bit output hash function H, the number of possible hash values is 2^b, and a collision is expected after about 2^(b/2) messages --> thus a 256-bit collision-resistant hash function has 128-bit security. Pr(ith message generated by attacker does not collide with previous i-1 messages) = 1 - (i-1)/2^b
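Multiplying the per-message "no collision" probabilities from the formula above gives the chance that k samples are all distinct. A short sketch, with n = 365 standing in for the 2^b possible hash values; `collision_probability` is a hypothetical helper name.

```python
def collision_probability(k: int, n: int) -> float:
    """Probability that k uniform samples from n possible values contain a repeat."""
    p_no_collision = 1.0
    for i in range(1, k + 1):
        # The ith sample avoids the previous i-1 with probability 1 - (i-1)/n.
        p_no_collision *= 1 - (i - 1) / n
    return 1 - p_no_collision


print(collision_probability(23, 365) > 0.5)  # True
print(collision_probability(22, 365) > 0.5)  # False
```

The crossover happens near the square root of n rather than near n itself, which is where the "half the bits" security level for collision resistance comes from.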
How do you ensure complete mediation
check every access to every object (you can assume based on commonalities later on)
CBC stands for
cipher block chaining
CFB
cipher feedback mode. Similar to CBC in that encryption of a block of plaintext P_i involves the ciphertext C_{i-1} of the previous block. C_i = E_K(C_{i-1}) ⊕ P_i [encryption: C_{i-1} is passed to function E, then XORed with P_i]. P_i = E_K(C_{i-1}) ⊕ C_i [decryption]
What security principle is being used: airports forcing everyone to go through same metal detector before boarding plane
complete mediation
What security principle is being used: users dont upgrade security on bank accounts, because doing so requires 50 question questionnaire
consider human factors
Dynamic Memory Allocation
allocating memory for variables while a program is running (i.e. on the heap as opposed to the stack)
block cipher
cryptographic method of encrypting a message into ciphertext one fixed-size block of data at a time, rather than bit by bit
PUSH
decrement stack pointer and store something there
division of trust
describes how many people are required to exercise a power: no single person should have all the power
methods to achieve authenticity
digital signatures
frames (in a stack)
each frame represents a function call
what point to what on the current stack frame
ebp points to top and esp points to bottom of current frame
Digital Signature
a value computed with the signer's private key and attached to a message; anyone can use the signer's public key to check that the message is indeed from the intended person
ciphertext
encrypted text
method to achieve confidentiality
encryption
entropy and stack-canaries
entropy = number of unpredictable bits a stack canary has, i.e. how hard it is to predict what random value it will take -- a stack canary needs sufficient entropy to be secure
CTR
every step of encryption and decryption can be done in parallel --> good for high-speed applications as an alternative to CBC. Start with a random seed s and compute the ith offset vector as V_i = E_K(s + i - 1), also written E(K, nonce || i), where i = counter. C_i = V_i ⊕ P_i [encryption]. P_i = V_i ⊕ C_i [decryption]
CBC
flaws: no authenticity; successive blocks are encrypted sequentially. C_0 = IV. C_i = E_K(C_{i-1} ⊕ P_i) [encryption]. P_i = D_K(C_i) ⊕ C_{i-1} [decryption]. C = C_0 || C_1 || ... || C_l [i.e. concatenate all of the ciphertext blocks to get the final ciphertext]. Decryption applies the operations in reverse order (decrypt the block, then XOR with the previous ciphertext block)
unicity distance
for a cryptosystem, this is the minimum number of characters of ciphertext needed so that there is only a single intelligible plaintext corresponding to it..... the unicity distance in characters is typically much less than the key length in bits
When should you start implementing security measures
from start as opposed to later
common C methods that results in stack-based buffer overflows
gets(), strcpy()
stack canary
goal is to protect the return pointer from being overwritten by an overflowing stack buffer. When the program starts up, it creates a random value (called a stack canary)
argue that the adversary can win the IND-CPA game
goal: prove that the adversary can guess which of the two messages was encrypted with probability significantly greater than 1/2 1. send 2 messages (M, M') which are unequal but of equal length as the challenge 2. write the resulting ciphertexts as expressions with XORs, IVs, etc., and show how the adversary tells them apart
How do you implement defense in depth
have multiple security checks in a row
IND-CPA
the attacker can keep requesting encryptions of chosen messages, then submits 2 possible plaintexts and sees the ciphertext of one of them.... if the attacker can't say which one was encrypted with more than negligible probability above random guessing, then the scheme is IND-CPA, which is a good thing
with nonexecutable stack and heap, what happens with buffer overflows
if you can overwrite the return address, you can still probably take advantage of the program by using the things already in the program's memory to change the flow of the program (i.e. use Return-Oriented Programming) -- but you can't do the typical stack smashing with a shellcode approach
memory leak for encryption test
if you flip one bit, does the rest of the ciphertext change? -- if not, then there is no "memory" carried from block to block; it does change for modes whose blocks are defined in terms of each other (like CBC, for example)
buffer overflow attacks with memory-safe languages
impossible, because memory-safe languages like Java check at runtime whether we put too much into a memory space (e.g. index-out-of-bounds errors); C does not
cryptosystem attacks types
in order of least to most info: 1. ciphertext-only attack: attacker has access to the ciphertext of one or more messages, all encrypted using the same key K. Goal is to discover K 2. known-plaintext attack: attacker has one or more plaintext-ciphertext pairs, each plaintext encrypted using the same key K. Goal is to find K 3. chosen-plaintext attack: attacker can choose one or more plaintext messages and get the associated ciphertext for each under the same key K (offline version --> attacker must choose all plaintexts in advance, adaptive --> attacker picks iteratively) 4. chosen-ciphertext attack: attacker can choose one or more ciphertexts and get the associated plaintext for each under the same key K (has offline and adaptive versions also)
memory safe language
language that makes sure input stays within the size of the buffer, e.g. Java throws a runtime error if an index is out of bounds
What security principle is being used: company has 5 levels of user accounts, each having access to more and more proprietary info
least privilege
what security principle is being considered when you give cardkey access only to TAs who need to use printer rather than all TAs
least privilege (NOT division of trust-->that has to do with requiring a certain NUMBER of people to access some info or power)
POP
load something and increment the stack pointer
what is stored on a stack frame
local variables and arguments of the call and the return address for the parent call, which contains the memory address where the program will run when this function call is done (i.e. a stack frame is like an environment diagram frame with a parent frame and locally defined variables)
TCB for locking your bike up
lock, pole, you, anyone with key
Executable Space Protection
mark certain portions of memory as non-executable (with an NX bit), so an attacker can no longer execute shellcode placed inside a buffer
MAC stands for what
message authentication code
what is a TCB
minimum set of hardware and software that we completely depend on for correct enforcement of security policy...all the things we need to work to be secure in a prevention of an attack
are block ciphers IND-CPA
no, because a block cipher on its own is deterministic --> it gives the same output for the same input. The only mode we really know that uses it this way is ECB --> solve this by adding entropy to block cipher modes: 1. CBC: uses an IV 2. CTR: uses a nonce (i.e. the start of the counter vector V)
NX bit
no-execute bit -- marks the segments of memory used for data storage as non-executable -- prevents memory from being both writable and executable at the same time
which quality alone is enough to ensure an attacker cannot read or tamper
none.... confidentiality, integrity, authenticity, availability --> you need multiple at once
in stack where do lower bytes go
on bottom
off-by-one error
one fewer iteration or one too many (e.g. fencepost error is an example)
OTP
one-time pad requirements: 1. length m of the block of keys must be equal to n, the length of the plaintext --> a key is a sequence of m shift amounts (k_0, ..., k_{m-1}), each taken modulo the alphabet size 2. each shift amount k_i must be chosen randomly
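A sketch of the shift-based pad described above, assuming an uppercase A-Z alphabet (26 symbols). `keygen` and `shift` are hypothetical helper names; `secrets.randbelow` is the standard-library call used to pick each shift amount at random.

```python
import secrets
import string

ALPHABET = string.ascii_uppercase  # assumption: messages use only A-Z


def keygen(length: int) -> list:
    """One random shift amount per plaintext character (key length == message length)."""
    return [secrets.randbelow(len(ALPHABET)) for _ in range(length)]


def shift(text: str, key: list, sign: int) -> str:
    """Shift each character by its own key amount; sign=+1 encrypts, sign=-1 decrypts."""
    assert len(key) == len(text), "the pad must be exactly as long as the message"
    return "".join(
        ALPHABET[(ALPHABET.index(ch) + sign * k) % len(ALPHABET)]
        for ch, k in zip(text, key)
    )


message = "ATTACKATDAWN"
pad = keygen(len(message))
ciphertext = shift(message, pad, +1)
print(shift(ciphertext, pad, -1) == message)  # True
```

The pad is generated fresh for this one message; using the same pad for a second message would break the "one-time" part of the name.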
what is the purpose of this: ESP
points to bottom of frame
what is the purpose of this: EBP
points to top of current frame
ASLR randomizes what
the start addresses of the different memory sections (stack, heap, libraries), i.e. pretty much everything except the text segment; it does not randomize the layout within a section, e.g. the order of each function's stack frame
confidentiality vs. integrity vs. authenticity
confidentiality: preventing adversaries from reading private data; integrity: preventing adversaries from altering it; authenticity: determining who created a given document
DEP
prevents certain memory sectors, e.g. the stack, from being executed. When combined with ASLR becomes exceedingly difficult to exploit vulnerabilities in applications using shellcode or return-oriented programming (ROP) techniques.
Least privilege
principle that talks about how much power an entity has on its own (whether it be a person or a piece of the software): it says that it should be the least amount of power needed to perform a task
Cryptographic Hash Function
provides a mapping that is deterministic, one-way, and collision resistant. H(M) = hash value computed from message M; it should be easy to compute. Collision resistance: hash function H maps input strings to smaller output strings, so collisions exist but should be hard to find. Weak collision resistance: given M, it is computationally hard to find ANOTHER M' ≠ M with H(M') = H(M). Strong collision resistance: it is computationally hard to find any M1 and M2 such that H(M1) = H(M2) and M1 ≠ M2.
RSA
public-key cryptography. Pick the following: n = pq (p and q are large primes), φ(n) = (p-1)(q-1), e relatively prime to φ(n), d = e^(-1) mod φ(n). Bob's public key is the pair (e, n) and the private key d is kept secret. C = M^e mod n [encryption]. M = C^d mod n [decryption]. When M is not relatively prime to n it must still be relatively prime to either p or q, since M < n. Security of RSA is tied to the difficulty of factoring n, which would reveal the values of p and q, but this is super hard. It is deterministic, however, so be careful. Need good algorithms for: 1. primality testing 2. computing GCD 3. computing modular inverse 4. modular power
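A toy walk-through of key setup, encryption, and decryption. The primes 61 and 53 and the exponent 17 are tiny illustrative values (assumptions, far too small to be secure), and no padding is applied, so this is textbook RSA only.

```python
from math import gcd

# Key setup with tiny, insecure primes (illustrative only).
p, q = 61, 53
n = p * q                      # public modulus
phi = (p - 1) * (q - 1)
e = 17
assert gcd(e, phi) == 1        # e must be relatively prime to phi(n)
d = pow(e, -1, phi)            # modular inverse; needs Python 3.8+


def encrypt(m: int) -> int:
    """C = M^e mod n, using the public key (e, n)."""
    return pow(m, e, n)


def decrypt(c: int) -> int:
    """M = C^d mod n, using the private key d."""
    return pow(c, d, n)


c = encrypt(65)
print(decrypt(c) == 65)    # True
print(encrypt(65) == c)    # True: deterministic, same M always gives same C
```

The second line of output is the "be careful" point: an eavesdropper can confirm a guessed plaintext just by encrypting the guess with the public key, which is why real deployments add randomized padding.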
ElGamal Cryptosystem
public-key cryptosystem that uses randomization. A number g is a generator mod p if for each positive integer i in Z_p (0 < i < p) there is an integer k such that i = g^k mod p --> test this by checking g^((p-1)/p_i) mod p ≠ 1 for each prime factor p_i of p-1. When we have a generator we can compute x = g^k mod p for any k; security depends on the difficulty of the discrete log problem (recovering k from x). Process: 1. setup: Bob chooses a random large prime number p and finds a generator g for Z_p. He picks a random number x between 1 and p-2 --> computes y = g^x mod p. x is Bob's secret key; his public key is (p, g, y) 2. encryption: Alice encrypts plaintext message M for Bob, beginning with his public key (p, g, y), by generating a random number k between 1 and p-2 and using modular multiplication to compute a = g^k mod p and b = M*y^k mod p...... Alice's encryption is the pair (a, b) 3. for each subsequent encryption, Alice must use a different random number k or information leakage would occur 4. decryption: given ElGamal ciphertext (a, b), Bob decrypts by using M = b*(a^x)^(-1) mod p. Note: Alice never knew Bob's secret key to encrypt and Bob never knew the random value k to decrypt
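A toy run of setup, encryption, and decryption. The prime p = 23 and generator g = 5 are tiny illustrative choices (assumptions, not secure), and `keygen`, `encrypt`, and `decrypt` are hypothetical helper names.

```python
import secrets

p, g = 23, 5   # tiny prime and a generator mod 23 (illustrative only)


def keygen():
    """Bob's setup: secret x with 1 <= x <= p-2, public y = g^x mod p."""
    x = 1 + secrets.randbelow(p - 2)
    return x, pow(g, x, p)


def encrypt(y: int, m: int):
    """Alice's step: fresh random k for every message, ciphertext is the pair (a, b)."""
    k = 1 + secrets.randbelow(p - 2)
    a = pow(g, k, p)
    b = (m * pow(y, k, p)) % p
    return a, b


def decrypt(x: int, a: int, b: int) -> int:
    """Bob's step: M = b * (a^x)^(-1) mod p (negative exponent needs Python 3.8+)."""
    return (b * pow(a, -x, p)) % p


x, y = keygen()
a, b = encrypt(y, 13)
print(decrypt(x, a, b))  # 13
```

Because k is random, encrypting the same M twice will usually give different pairs (a, b), unlike textbook RSA.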
MAC
short piece of info that authenticates that the message being sent via cryptography did come from true sender...i.e. verifies that message hasnt been tampered with
Return Oriented Programming
a stack-based buffer overflow attack, used when the stack and heap are non-executable, that changes which instructions will be executed next by chaining existing instructions instead of injecting new ones
symmetric vs. asymmetric encryption
symmetric is faster
Shannon's Maxim
the attacker knows the system they're trying to break into--i.e. they know how everything is working under the hood and that will be revealed one way or the other, i.e. all you need is the key to be private
SHA-256
the commonly used standard hash function producing hash values with 256 bits
shellcode
the malicious code that an attacker chooses to execute once he has gained the ability to execute code (e.g. using a stack-based overflow strategy); it typically starts up a shell (terminal) for the attacker to use for further attacks. Written as machine instructions (opcodes), usually produced from assembly language
stack is what in memory
the portion of memory address space that contains data related to function calls
plaintext
the text you are going to encrypt
TCB stands for what
trusted computing base
what's an example of a defense in depth
two factor authentication with passwords
Return-to-libc
uses the external libraries loaded at runtime, specifically libc, which contains the functions used in C. If attacker can determine location of a few of these functions, like execv for instance, then he can force the program to call this function with a sufficient overflow. stack contains args to existing functions not shellcode
TCB for preventing breakins to apartment
walls, floor, door, lock, windows, roof, you, anyone with a key
stack canary
way to prevent stack-based overflow attacks -- places a random value (canary) between the buffer and the control data (i.e. data that, if overwritten, changes what code is executed next). Before the function returns, the program checks whether the stack canary has been changed. If it has, that signals an attack, and the program aborts instead of returning. Thus, it should prevent the malicious code from being executed
Format String Attack
when a program passes user input directly as the format string, the input to the printf function controls the format.... then the attacker can use %n, which writes to memory. e.g. if the program uses printf(argv[1]).... an input containing %n makes that printf write to memory, which was probably not the intention
buffer overflow
when a program tries to put more info than can be stored in the allotted space for input, which can often allow an attacker to overwrite data beyond just the location of the buffer in memory---eventually allowing them to gain more control and take control of the process
memory leak problems occur when
when programmers allocate memory on the heap and don't deallocate (free) that block / memory leaks are memory locations that stay allocated but are no longer used
arithmetic overflow
when, with a 32-bit representation, you cannot represent a small enough or large enough number, and the next number just wraps around, e.g. the largest positive value becomes the most negative value -- this is dangerous if a program is checking a condition such as a number being less than 5.... because it'll work up to a point, i.e. until the number becomes greater than the maximum integer possible -- solve by checking the bound first and only incrementing while you are still below it
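The wrap-around can be reproduced in Python with fixed-width ctypes integers, which truncate silently like C does. This is a sketch: `passes_check` is a hypothetical bounds check of the "less than 5" kind mentioned above, not code from the course.

```python
import ctypes

INT_MAX = 0x7FFFFFFF     # largest 32-bit two's complement value
UINT_MAX = 0xFFFFFFFF    # largest 32-bit unsigned value

# ctypes integers do no overflow checking, so out-of-range values wrap.
signed_wrap = ctypes.c_int32(INT_MAX + 1).value
unsigned_wrap = ctypes.c_uint32(UINT_MAX + 1).value

print(signed_wrap)    # -2147483648
print(unsigned_wrap)  # 0


def passes_check(length: int) -> bool:
    """Hypothetical check 'length < 5' performed on a signed 32-bit int."""
    return ctypes.c_int32(length).value < 5


print(passes_check(4))           # True, as intended
print(passes_check(0x80000000))  # True as well: the huge value wrapped negative
```

If the same value is later treated as an unsigned size, for example as a copy length, it is enormous even though it passed the check.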
%n does what
writes the number of characters printed so far into the memory location given by the corresponding pointer argument
human factors and security
you have to make sure your security system is usable by people
threat model and security
you must know what type of attacks you should be worried about