Network Security

Syn Flood Attacks

https://www.cloudflare.com/learning/ddos/syn-flood-ddos-attack/

A SYN flood (half-open attack) is a type of denial-of-service (DoS) attack, often launched in distributed form (DDoS), which aims to make a server unavailable to legitimate traffic by consuming all available server resources. By repeatedly sending initial connection request (SYN) packets, the attacker is able to overwhelm all available ports on a targeted server machine, causing the targeted device to respond to legitimate traffic sluggishly or not at all.


In networking, when a server leaves a connection open but the machine on the other side of the connection does not, the connection is considered half-open. In this type of DDoS attack, the targeted server continuously leaves connections open and waits for each one to time out before the ports become available again. For this reason, the attack is also called a “half-open attack”.

 

3 Types of SYN flood attack

Direct Attack – A SYN flood where the IP address is not spoofed is known as a direct attack. In this attack, the attacker does not mask their IP address at all. As a result of the attacker using a single source device with a real IP address to create the attack, the attacker is highly vulnerable to discovery and mitigation.

Spoofed Attack – A malicious user can also spoof the IP address on each SYN packet they send in order to inhibit mitigation efforts and make their identity more difficult to discover.

Distributed attack (DDoS) – If an attack is created using a botnet the likelihood of tracking the attack back to its source is low. For an added level of obfuscation, an attacker may have each distributed device also spoof the IP addresses from which it sends packets.

 

Mitigating SYN attacks

Increase Backlog Queue:

Each operating system on a targeted device has a certain number of half-open connections that it will allow. One response to high volumes of SYN packets is to increase the maximum number of possible half-open connections the operating system will allow. 

Recycling the Oldest Half-Open TCP connection

Another mitigation strategy involves overwriting the oldest half-open connection once the backlog has been filled. This strategy requires that the legitimate connections can be fully established in less time than the backlog can be filled with malicious SYN packets.
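The recycling idea can be sketched with an ordered map, where insertion order stands in for age. This is only an illustration: the class name `HalfOpenBacklog`, its methods, and the tiny capacity are hypothetical, and a real TCP stack tracks far more state per entry.

```python
from collections import OrderedDict


class HalfOpenBacklog:
    """Toy backlog of half-open connections keyed by (src_ip, src_port)."""

    def __init__(self, capacity):
        self.capacity = capacity
        self.entries = OrderedDict()  # insertion order doubles as age

    def on_syn(self, src_ip, src_port):
        """Record a new SYN; return the evicted key if the backlog was full."""
        key = (src_ip, src_port)
        evicted = None
        if key not in self.entries and len(self.entries) >= self.capacity:
            # Backlog full: overwrite the oldest half-open entry.
            evicted, _ = self.entries.popitem(last=False)
        self.entries[key] = "SYN_RECEIVED"
        return evicted

    def on_ack(self, src_ip, src_port):
        """Final ACK of the handshake: the entry leaves the backlog if still present."""
        return self.entries.pop((src_ip, src_port), None) is not None


backlog = HalfOpenBacklog(capacity=2)
backlog.on_syn("10.0.0.1", 1000)
backlog.on_syn("10.0.0.2", 1000)
print(backlog.on_syn("10.0.0.3", 1000))  # the oldest entry is recycled
```

A legitimate client only loses out if its entry is evicted before its final ACK arrives, which is the timing condition described above.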

SYN Cookies

This strategy involves the creation of a cookie by the server. In order to avoid the risk of dropping connections when the backlog has been filled, the server responds to each connection request with a SYN-ACK packet but then drops the SYN request from the backlog, removing the request from memory and leaving the port open and ready to make a new connection.
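A minimal sketch of the cookie idea, assuming the server derives its initial sequence number from a keyed hash of the connection details and a coarse time slot, so it can verify the final ACK without storing anything. The secret, the 64-second slot, and the function names are assumptions made for illustration; real implementations also encode options such as the MSS, which this sketch ignores.

```python
import hashlib
import hmac
import time

SECRET = b"hypothetical-server-secret"  # assumption: a real server uses a random key
SLOT_SECONDS = 64                       # assumption: coarse time slot for expiry


def _cookie(conn, client_isn, slot):
    """Derive a 32-bit value from the 4-tuple, the client's ISN and a time slot."""
    msg = "|".join(map(str, (*conn, client_isn, slot))).encode()
    digest = hmac.new(SECRET, msg, hashlib.sha256).digest()
    return int.from_bytes(digest[:4], "big")


def make_syn_cookie(conn, client_isn, now=None):
    """Sequence number the server puts in its SYN-ACK; no state is stored."""
    slot = int((time.time() if now is None else now) // SLOT_SECONDS)
    return _cookie(conn, client_isn, slot)


def check_syn_cookie(conn, client_isn, ack_number, now=None):
    """Validate the final ACK, which should acknowledge cookie + 1."""
    slot = int((time.time() if now is None else now) // SLOT_SECONDS)
    echoed = (ack_number - 1) % 2**32
    # Accept the current and the previous slot to tolerate handshake delay.
    return any(_cookie(conn, client_isn, s) == echoed for s in (slot, slot - 1))


conn = ("198.51.100.7", 40000, "203.0.113.1", 80)  # src IP/port, dst IP/port
cookie = make_syn_cookie(conn, client_isn=12345)
print(check_syn_cookie(conn, 12345, (cookie + 1) % 2**32))
```

Because the connection can be reconstructed from the ACK alone, the backlog entry is not needed while the handshake is pending.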

 

Guide to Kaminsky DNS Vulnerability

http://unixwiz.net/techtips/iguide-kaminsky-dns-vuln.html

Terminology

  1. Zone = domain, a collection of hostnames
  2. Nameserver = Server software that answers DNS questions
  3. Authoritative Nameserver = Zone master. For every zone, somebody has to maintain a file of the hostname and IP address associations
  4. Resolver = Client of the DNS client/server system; asks the questions about hostnames. On Linux it is configured in /etc/resolv.conf
  5. Recursive Nameserver = A nameserver that goes out and finds the results for zones it’s not authoritative for, as a service to clients.
  6. Resource Record = Other kinds of DNS records besides the hostname-to-IP mapping
  7. Delegation = when a nameserver doesn’t have the contents of a zone but knows who to ask, it delegates service of that zone to another nameserver

Example:

  1. Client asks for www.unixwiz.net. It requests the “A Record”, which presents an IP address
  2. The ISP nameserver looks in its local zones and finds it is not authoritative for unixwiz.net; it also checks its cache. If there is no match, it goes out to the Internet
  3. Recursive nameservers are preconfigured with 13 root servers. Nameserver picks one at random and sends off query for A record.

 

Network Security

Layer 3: (Inter) Network Layer

  • Bridges multiple “subnets” to provide end-to-end internet connectivity between nodes
  • Provides global addressing (IP addresses)
  • Only provides best-effort delivery of data (i.e., no retransmissions, etc.)
  • Works across different link technologies

 

IP Packet “Header”

The payload is the “data” that IP is delivering. It may contain another protocol’s header and payload.

 

IP Packet Header Fields

Version number (4 bits)

  • Indicates the version of the IP protocol
  • Necessary for knowing what fields follow
  • “4” (for IPv4) or “6” (for IPv6)

Header length (4 bits)

  • How many 32-bit words (rows) in the header
  • Typically 5
  • Can provide IP options, too

Type-of-service (8 bits)

  • Allow packets to be treated differently based on different needs
  • Low delay for audio, high bandwidth for bulk transfer, etc.

Two IP addresses (IPv4)

  • Source (32 bits)
  • Destination (32 bits)

Destination address

  • Unique identifier/locator for the receiving host
  • Allows each node (end-host and router) to make forwarding decisions

Source address

  • Unique identifier/locator for the sending host
  • Recipient can decide whether to accept the packet
  • Allows destination to reply to the source
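To make the field layout concrete, here is a small sketch that unpacks the fixed 20-byte IPv4 header with Python's `struct` module. The function name `parse_ipv4_header` and the sample bytes are made up for illustration, and options (header length greater than 5) are not handled.

```python
import socket
import struct


def parse_ipv4_header(raw):
    """Unpack the fixed 20-byte part of an IPv4 header (options are ignored)."""
    (ver_ihl, tos, total_len, ident, flags_frag,
     ttl, proto, checksum, src, dst) = struct.unpack("!BBHHHBBH4s4s", raw[:20])
    return {
        "version": ver_ihl >> 4,         # upper 4 bits
        "header_words": ver_ihl & 0x0F,  # lower 4 bits, counted in 32-bit words
        "tos": tos,
        "total_length": total_len,
        "ttl": ttl,
        "protocol": proto,
        "src": socket.inet_ntoa(src),
        "dst": socket.inet_ntoa(dst),
    }


# Hand-built sample header: version 4, 5 words, TTL 64, protocol 6 (TCP).
sample = bytes([0x45, 0x00, 0x00, 0x14, 0, 0, 0, 0, 64, 6, 0, 0,
                192, 0, 2, 1, 198, 51, 100, 2])
print(parse_ipv4_header(sample))
```

The first byte shows why the version must come first: a parser cannot know which fields follow until it has read it.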

 

IP “Best Effort” Packet Delivery

  • Routers inspect destination address, determine “next hop” in the forwarding table
  • Best effort = “I’ll give it a try”
    • Packets may be lost
    • Packets may be corrupted
    • Packets may be delivered out of order

Fixing these is the job of the transport layer!

 

Attacks on IP

Source-Spoof

  • There is nothing in IP that enforces that your source IP address is really you
  • No protection of payload or the header

Why source-spoof?

  • Consider spam: send many emails from one computer
    • A defense that tries to block the source won’t work
  • Easy defense: block many emails from a given (source) IP address
  • Easy countermeasure: spoof the source IP address
  • Counter-countermeasure?


 

Defenses

How do you know if a packet you receive has a spoofed source?

  • Send a challenge packet to the (possibly spoofed) source (e.g., a difficult to guess, random nonce)
  • If the recipient can answer the challenge, then likely that the source was not spoofed

So do you have to do this with every packet? That would be expensive.

  • Every packet should have something that’s difficult to guess

 

Source Spoofing

Why source-spoof?

  • Consider DoS attacks: generate as much traffic as possible to congest the victim’s network
  • Easy defense: block all traffic from a given source near the edge of your network
  • Easy countermeasure: spoof the source address

Challenges won’t help here; the damage has been done by the time the packets reach the core of our network. Ideally, detect such spoofing near the source.

 

Egress Filtering

  • The point (router/switch) at which traffic enters your network is the ingress point
  • The point (router/switch) at which traffic leaves your network is the egress point
  • You don’t know who owns all IP addresses in the world, but you do know who in your own network gets what IP addresses
    • If you see a packet with a source IP address that doesn’t belong to your network trying to cross your egress point, then drop it

Egress filtering is not widely deployed
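The egress rule described above amounts to a membership test on the source address. A minimal sketch, assuming a single hypothetical internal prefix (`192.0.2.0/24`) and an illustrative function name:

```python
import ipaddress

# Assumption: the prefixes assigned to hosts inside our own network.
INTERNAL_PREFIXES = [ipaddress.ip_network("192.0.2.0/24")]


def allow_egress(src_ip):
    """Let a packet leave only if its source address belongs to our network."""
    addr = ipaddress.ip_address(src_ip)
    return any(addr in net for net in INTERNAL_PREFIXES)


print(allow_egress("192.0.2.55"))   # source is ours: forward
print(allow_egress("203.0.113.9"))  # source is not ours: likely spoofed, drop
```

The check needs no knowledge of who owns the rest of the address space, only of the network's own assignments.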

 

Eavesdropping / Tampering

  • No security built into IP
  • Deploy secure IP over IP

 

Virtual Private Networks (VPN)

  • Trusted vs Untrusted network
    • Allow the client to connect to trusted network from within untrusted network
  • Creates end-to-end encrypted/authenticated connection
  • Predominant way of doing this: IPSec

 

IPSEC

Operates at Layer 3 (compare with TLS and SSH, which operate above the transport layer)

Operates in a few different modes

  • Transport mode (comparable to TLS): simply encrypt the payload but not the headers
    • Payload gets integrity, cannot be read. Headers are exposed.
  • Tunnel mode (used for VPNs): encrypt the payload and the headers
    • Both Payload + Header
    • Routing accomplished by wrapping the encrypted Payload/Header packet into another IP packet

But how do you encrypt the headers? How does routing work?

  • Encrypt the entire IP packet and make that the payload of another IP packet

 

Tunnel Mode

The VPN server decrypts and then sends the payload (the full IP packet) as if it had been received from inside the network

From the client/server’s perspective: looks like client is physically on same network

 

Layer 4: Transport Layer (TCP)

  • End-to-end communication between processes
  • Different types of services provided:
    • UDP: unreliable datagrams
    • TCP: reliable byte stream
  • “Reliable” = keeps track of what data were received properly and retransmits as necessary

 

TCP Reliability

Given best-effort delivery, the goal is to ensure reliability

  • All packets are delivered to applications
  • … in order
  • … unmodified (with reasonably high probability)

Must robustly detect and retransmit lost data

 

TCP Bytestream service

Process A on host 1:

  • Send byte 0, byte 1, byte 2, byte 3, …

Process B on host 2:

  • Receive byte 0, byte 1, byte 2, byte 3, …

The applications do not see:

  • packet boundaries (looks like a stream of bytes)
  • lost or corrupted packets (they’re all correct)
  • retransmissions (they all only appear once)

 

Each byte is reliably delivered in order; in reality, however, packets are sometimes retransmitted and sometimes arrive out of order.

 

How does TCP achieve reliability?

TCP Congestion Control

  • Try to use as much of the network as is safe (does not adversely affect others’ performance) and efficient (makes use of network capacity)
  • Dynamically adapt how quickly you send based on the network path’s capacity 
  • When an ACK doesn’t come back, the network may be beyond capacity: slow down.

 

TCP Header and Ports

  • Ports are associated with OS processes 
  • Sandwiched between IP header and the application data
  • {src IP/port, dst IP/port}: this 4-tuple uniquely identifies a TCP connection
  • Some port numbers are well-known
    • 80 = HTTP
    • 53 = DNS

 

TCP Sequence Numbers

  • Each byte in the byte stream has a unique “sequence number” 
  • Each direction of the connection has its own sequence numbers
  • “Sequence number” in the header = sequence number of the first byte in the packet’s data
  • Next sequence number = previous seqno + previous packet’s data size
  • “Acknowledgment” in the header = the next seqno you expect from the other end-host
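The sequence number arithmetic above can be worked through in a few lines. This sketch assumes 32-bit wraparound (the TCP sequence number field is 32 bits wide); the function name and the segment sizes are illustrative only.

```python
SEQ_MODULUS = 2**32  # sequence numbers are 32-bit and wrap around


def segment_seqnos(initial_seq, sizes):
    """Return the seqno of each segment and the ACK the receiver sends back."""
    seq = initial_seq
    seqnos = []
    for size in sizes:
        seqnos.append(seq)                # seqno = number of the first byte
        seq = (seq + size) % SEQ_MODULUS  # next seqno = previous + data size
    return seqnos, seq                    # final value = next byte expected


seqnos, expected_ack = segment_seqnos(1000, [100, 200, 50])
print(seqnos, expected_ack)
```

With three segments of 100, 200 and 50 bytes starting at 1000, the segments carry 1000, 1100 and 1300, and the receiver acknowledges 1350.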

 

TCP Flags

  • SYN
    • Used for setting up a connection
  • ACK
    • Acknowledgments, for data and “control” packets 
  • FIN: Let’s shut this down (two-way)
    • FIN
    • FIN+ACK
  • RST: I’m shutting you down
    • Says “delete all your local state, because I don’t know what you’re talking about”

TCP Attacks

  • SYN Flooding
  • Injection Attacks

 

SYN Flooding

A is flooding B; B allocates resources to respond to A until B runs out of resources. Also, A never ACKs the SYN-ACK, so the connections stay half-open, locking up the resources.

SYN Flooding Details

  • Easy to detect many incomplete handshakes from a single IP address
  • Spoof the source IP address
    • It’s just a field in a header: set it to whatever you like
    • Prevents B from detecting multiple SYN requests from the same client
  • Problem: the host who really owns that spoofed IP address may respond to the SYN+ACK with a RST, deleting the local state at the victim
  • Ideally, spoof an IP address of a host you know won’t respond

 

SYN Cookies

Defense for SYN flooding attacks

 

Other Defenses

  • Filtering
  • Increasing Backlog (use more memory so you can have more TCP connections)
  • Reducing SYN-RECEIVED Timer
  • Recycling the Oldest Half-Open TCB (Transmission Control Block)
  • SYN Cache (reduces the amount of state hence less resources consumed, kind of like SYN cookie)
  • Hybrid Approaches (combinations of above)
  • Firewalls and Proxies
    • Firewalls and proxies check for TCP attacks and automatically block malicious traffic before it actually gets to the server

 

Injection Attacks

  • Suppose you are on the path between src and dst; what can you do? (MITM)
    • Trivial to inject packets with the correct sequence number
  • What if you are not on the path?
    • Need to guess the sequence number
    • Is this difficult to do? – YES it is difficult today

 

Initial Sequence Numbers

  • Initial sequence numbers used to be deterministic
  • What havoc can we wreak?
    • Send RSTs
    • Inject data packets into an existing connection
    • Initiate and use an entire connection without ever hearing the other end

 

MITNICK Attack (Kevin Mitnick 1994)

  • Christmas 1994: learned user behavior, then hijacked the client connection, and was able to log in remotely and get root access. Exploited the predictable behavior of sequence numbers.
  • Step 1: Information gathering
    • Find sequence number pattern
    • Find X-terminal/server trusted relationship
  • Step 2: The flood/DoS attack
  • Step 3: Trusted relationship hijacking
  • Step 4: Remote commands
    • Create a backdoor to come back later
  • Step 5: Clean up

.rhosts = config file listing who can remotely log in to the machine. Mitnick added his attacking computer, which then allowed him to access the machine remotely.

Defenses = initial sequence number must be difficult to predict!
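To see why predictability matters, the sketch below contrasts a deterministic generator with a random one. The fixed step of 64000 is a hypothetical constant chosen for illustration, and the function names are not taken from any real stack.

```python
import secrets

STEP = 64000  # assumption: a fixed per-connection increment, for illustration


def predictable_isn(previous_isn):
    """Deterministic generator: each new connection adds a constant."""
    return (previous_isn + STEP) % 2**32


def attacker_guess(observed):
    """An off-path attacker extrapolates from two ISNs it has observed."""
    step = (observed[-1] - observed[-2]) % 2**32
    return (observed[-1] + step) % 2**32


def random_isn():
    """Hard-to-predict alternative: 32 bits from a cryptographic RNG."""
    return secrets.randbits(32)


first = 1_000_000
second = predictable_isn(first)
print(attacker_guess([first, second]) == predictable_isn(second))
```

With the deterministic generator, two probe connections are enough to predict the number the server will hand to its next peer; with `random_isn` the same extrapolation gives the attacker nothing useful.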

 

Denial of Service Attacks

  • Denial-of-service attack (DoS attack)
    • Any attempt to overwhelm a party’s resources
    • Too much traffic for endpoint computer
    • Too much traffic for network (e.g., saturate link)
    • Use up memory or CPU resources on target
  • Distributed denial-of-service attack (DDoS attack)
    • Incoming traffic flooding the victim originates from many different sources
    • Often the result of botnets

 

Reflection and Amplification

    • Many DoS attacks use “reflection”
      • Send a request to some other machine, cause it to send traffic to your actual victim
      • Two motivations:
        • Reduce traceability
        • Amplify the original attack
    • Amplification attacks are used to magnify the bandwidth that is sent to a victim
      • Attacker uses a compromised endpoint to send UDP packets with spoofed IP addresses to a DNS recursor (spoofed address points to the real IP address of the victim)
      • Each one of the UDP packets makes a request to a DNS resolver (often requesting the most info possible)
      • After receiving the requests, the DNS resolver sends a large response to the spoofed IP address
      • IP address of the target receives the response and the surrounding network infrastructure becomes overwhelmed with the deluge of traffic
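The gain from amplification is a simple ratio. The sizes and rates below are hypothetical figures picked only to make the arithmetic easy to follow.

```python
def amplification_factor(request_bytes, response_bytes):
    """How many bytes reach the victim for each byte the attacker sends."""
    return response_bytes / request_bytes


def victim_bandwidth(attacker_bps, factor):
    """Traffic rate arriving at the victim, given the attacker's sending rate."""
    return attacker_bps * factor


# Hypothetical numbers: a 60-byte query that triggers a 3000-byte response.
factor = amplification_factor(60, 3000)
print(factor, victim_bandwidth(10_000_000, factor))
```

Under these assumed numbers, an attacker sending 10 Mbit/s of spoofed queries causes 500 Mbit/s to arrive at the victim.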

 

  • Smurf Attack

 

DDoS attack: ICMP packets are sent from the attacker’s router to many different devices, but the source (“from”) IP is set to the target’s IP. The replies therefore overwhelm the victim with an amplified response.

The traffic is amplified because the combined response going to the target is very large. The sheer volume of data overwhelms the victim.

Dynamic Host Configuration Protocol

Naming

  • IP addresses allow global connectivity
  • Useless for humans
    • Can’t be expected to pick own IP
    • Can’t be expected to remember others’ IPs
  • DHCP = assignment of IP addresses
  • DNS = mapping a memorable name to a routable IP address

 

DHCP Attacks

  • Requests are broadcast: attackers on same subnet can hear new host’s request
  • Race the actual DHCP server to replace
    • DNS server
      • Redirect any of a host’s lookups to a machine of the attacker’s choice (what IP address should I use when trying to connect?)
    • Gateway
      • The gateway is where the host sends all outgoing traffic. An attacker can set the gateway to intercept all of the user’s traffic
      • Then relay it to gateway (MITM)
      • How could the user detect this?

 

Rogue DHCP Remediation

  • These attacks would be done on the local subnet
  • DHCP Snooping – at the switch level drop any packets that are not from valid DHCP server. This could be done by keeping track of DHCP server MAC, IP, etc.
  • Wireshark – from client send out DHCP request and see what offers come back
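The snooping check can be sketched as an allow-list lookup on the switch. The MAC and IP values and the function names are hypothetical; real switches usually key this on trusted ports as well, which is left out here.

```python
# Assumption: the (MAC, IP) pairs of the legitimate DHCP servers are known.
TRUSTED_DHCP_SERVERS = {("aa:bb:cc:dd:ee:01", "192.0.2.1")}


def allow_dhcp_offer(src_mac, src_ip):
    """Forward a DHCP offer only if it comes from a known server."""
    return (src_mac.lower(), src_ip) in TRUSTED_DHCP_SERVERS


def filter_offers(offers):
    """Drop every offer that does not come from a trusted server."""
    return [o for o in offers if allow_dhcp_offer(o["mac"], o["ip"])]


offers = [
    {"mac": "AA:BB:CC:DD:EE:01", "ip": "192.0.2.1"},   # real server
    {"mac": "de:ad:be:ef:00:01", "ip": "192.0.2.66"},  # rogue server
]
print(filter_offers(offers))
```

The same list of offers is what a client-side capture would show: more than one distinct server answering a single request is a sign of a rogue.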

Hostnames and IP Addresses

  • Google.com is easy to remember but not routable
    • 74.125.228.65 is routable
  • Name resolution = process of mapping from one to another

 

Terminology

  • Domain names can map to a set of IP addresses
    • dig = Domain Information Groper
    • https://www.howtogeek.com/663056/how-to-use-the-dig-command-on-linux/

  • “zone” = a portion of the DNS namespace, divided up for administrative reasons
    • Think of it like a collection of hostname/IP address pairs that happen to be lumped together
    • www.google.com, mail.google.com, dev.google.com, …
  • Subdomains do not need to be in the same zone
    • Allows the owner of one zone (purdue.edu) to delegate responsibility to another (cs.purdue.edu)

    • Nameserver = piece of code that answers queries of the form “what is the ip address for foo.bar.com”
      • Every zone must run >= 2 nameservers
      • Several very common nameserver implementations:
        • BIND
        • PowerDNS (more popular in Europe)
    • Authoritative nameserver = every zone has to maintain a file that maps IP addresses to hostnames. One of the name servers in the zone has a master copy of the file – this is the authority of the mapping.
    • Resolver (DNS Resolver) = while name servers answer queries, resolvers ask queries
      • Every OS has a resolver, typically small and pretty dumb. All it typically does is forward the query to a local nameserver
      • The query is forwarded to a recursive nameserver
    • Recursive nameserver = nameserver which will do the heavy lifting, issuing queries on behalf of the client resolver until an authoritative answer returns

 

  • Prevalence

 

    • There is almost always a local (private) recursive name server
    • But it is very rare for name servers to support recursive queries otherwise
  • Record = “resource record” usually mapping between hostname and IP
    • But generally can map virtually anything to anything
  • Many record Types
    • (A) address records (IP <-> hostname)
    • Mail server (MX, mail exchanger)
    • SOA (start of authority, delineate different zones)
    • Others for DNSSEC to be able to share keys
  • Records are the unit of information

 

Nameservers

  • Nameservers within a zone must be able to give:
    • Authoritative answers (A) for hostnames in that zone
    • Pointers to nameservers (NS) who host zones in its subdomains
      • Each of these is a nameserver within the broader domain

 

DNS

Domain Name Service: a very high-level example workflow.

  1. The query starts at the local (recursive) nameserver
  2. If the local nameserver cannot answer, it reaches out to a root DNS server, which refers it to the TLD DNS server for the zone; that server cannot answer completely either, so it refers the query on to the next step.
  3. Note that each server along the way keeps its own cache in case another user asks the same question.
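The workflow can be imitated with a toy delegation table. Everything in `SERVERS` (server labels, the `example.net.` zone, the address) is invented for the sketch, and real resolvers deal with timeouts, TTLs and many record types that are ignored here.

```python
# Hypothetical delegation data: each server either refers to a child zone
# or holds the final answer.
SERVERS = {
    "root": {"referrals": {"net.": "tld-net"}, "answers": {}},
    "tld-net": {"referrals": {"example.net.": "ns-example"}, "answers": {}},
    "ns-example": {"referrals": {}, "answers": {"www.example.net.": "192.0.2.80"}},
}


def resolve(name, cache):
    """Walk root -> TLD -> authoritative; return (address, queries sent)."""
    if name in cache:
        return cache[name], 0  # answered locally, nothing leaves the resolver
    server, queries = "root", 0
    while True:
        queries += 1
        data = SERVERS[server]
        if name in data["answers"]:
            cache[name] = data["answers"][name]
            return cache[name], queries
        for zone, next_server in data["referrals"].items():
            if name == zone or name.endswith("." + zone):
                server = next_server  # follow the referral one level down
                break
        else:
            raise LookupError(name)


cache = {}
print(resolve("www.example.net.", cache))  # three queries: root, TLD, authoritative
print(resolve("www.example.net.", cache))  # zero queries: served from cache
```

The second lookup shows why caching matters for both performance and, as discussed later, for attacks.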

 

How do they know these IP addresses?

  • Local DNS server: host learned this via DHCP
  • A parent knows its children: part of the registration process
  • Root nameserver: hardcoded into the local DNS server (and every DNS server)
    • 13 root servers (logically): A-root, B-root, …, M-root
    • These IP addresses change very infrequently

 

Caching

  • Central to DNS’s success
  • Also central to attacks
  • “Cache poisoning” = filling a victim’s cache with false information

 

Queries

What’s in a response?

  • Many things, but for the attacks we’re concerned with
  • A record = gives the authoritative response for the IP address of this hostname
  • NS record = gives the name of the nameserver who should know more about how to answer the query
    • Often also contains ‘glue’ records (IP addresses of those name servers to avoid chicken and egg problem)
    • Resolver will generally cache all of this information

 

Query IDs

Query ID = 16-bit field in the DNS message header; this is how queries are matched to answers. Ex: client sends #16322 and nameserver responds with #16322 in the header
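A minimal sketch of how a resolver might match responses to outstanding queries by ID. The class and method names are hypothetical, and the IDs are drawn at random here, which is an assumption of the sketch rather than a description of any particular resolver.

```python
import secrets


class PendingQueries:
    """Tracks outstanding queries so responses can be matched by query ID."""

    def __init__(self):
        self.pending = {}  # query_id -> name that was asked

    def send_query(self, name):
        query_id = secrets.randbits(16)  # 16-bit ID, as in the DNS header
        while query_id in self.pending:
            query_id = secrets.randbits(16)
        self.pending[query_id] = name
        return query_id

    def accept_response(self, query_id, name):
        """Accept only a response whose ID and name match a pending query."""
        if self.pending.get(query_id) != name:
            return False
        del self.pending[query_id]
        return True


resolver = PendingQueries()
qid = resolver.send_query("www.example.com")
print(resolver.accept_response((qid + 1) % 65536, "www.example.com"))  # wrong ID
print(resolver.accept_response(qid, "www.example.com"))                # right ID
```

Anything that lets an outsider learn or guess the ID lets them produce a response the resolver will accept.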

 

Query IDs used to increment 

Attacker can poison a DNS cache by:

  1. Attacker asks the nameserver for some name, e.g., www.bank.com
  2. The local nameserver reaches out to the authoritative DNS server for bank.com
  3. Attacker races to beat the response from the authoritative DNS server, which creates an incorrect mapping in the nameserver’s cache for bank.com
    1. Attacker is able to guess the query ID that the nameserver will use to reach out to the authoritative server
    2. Attacker uses another request, for www.bad.com, to guess the next query ID
  4. Now the attacker receives all traffic intended for bank.com

 

Cache Poisoning

Based on manipulating queries, predicting responses and redirecting connection to one controlled by the attacker.

Details of getting the attack to work

  • Must guess query ID: ask for it, and go from there
    • Partial fix: randomize query IDs
    • Problem: small space
    • Attack: send a lot of forged responses, each with a different query ID
  • Must guess source port number
    • Typically constant for a given server (often 53)
  • The answer must not already be in the cache
    • Otherwise the server answers from its cache and never issues a query in the first place

 

DNS (review)

  • Each DNS resolver or authoritative server stores resource records in its cache or local zone file
  • When a DNS resolver or authoritative server receives a query, it searches its cache for a match
  • If there is no match, the server may return a referral response, containing a record of NS type whose label is “closer” to the domain which is the subject of the query
  • Instead of sending a referral response, the DNS resolver may also initiate the same query to an authoritative DNS server responsible for the domain name which is the subject of the query

 

Request for www.unixwiz.net

Glue Records

  • If the nameserver knows the answers to any queries that are related to its answer, it can pre-emptively supply those answers in the ADDITIONAL section without requiring additional queries
  • Many DNS cache implementations cache the glue records as normal records, thus a malicious response could get added to the cache

 

Cache Poisoning

  • An attacker may poison the cache by
    • Compromising an authoritative DNS server
    • Forging a response to a recursive DNS query sent by a resolver to an authoritative server
  • DNS records corresponding to popular domains are likely to be already stored in the cache prior to an attack and are thus not vulnerable to the basic forgery exploit
  • It is the ability to overwrite existing records that makes DNS response forgery such a devastating attack

 

Kaminsky Attack Part 1 – type of cache poisoning

  • If you’re trying to attack www.foobar.com, an attacker doesn’t try to race for that particular name
  • The server might not be willing to go out looking for www.foobar.com for a while (the name is likely cached already)
  • Attacker asks for 1.foobar.com, 2.foobar.com, 3.foobar.com, and so on
    • The subdomains might be limited
  • This increases his chances of guessing the transaction ID as well
  • Faked response: Attacker will feign ignorance: “83.foobar.com? I don’t know, ask www.foobar.com, here’s its (evil) address. Oh, and remember this for the next week.”

Cache Poisoning using Kaminsky’s Attack

Why can we overwrite this record?

Solutions

  • Randomizing query ID?
    • Not sufficient alone: only 16 bits of entropy
  • Randomize source port as well
    • There’s no reason for it to stay constant
    • Gets us another 16 bits of entropy
  • DNSSEC?
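The effect of the extra entropy can be put into numbers. This sketch assumes the attacker's guesses are distinct and that IDs and ports are uniformly random; 16 port bits is the upper bound mentioned above, and 100 forged responses per race is an arbitrary illustrative figure.

```python
def search_space(id_bits=16, port_bits=0):
    """Number of (query ID, source port) combinations a forgery must hit."""
    return 2 ** (id_bits + port_bits)


def expected_races(forged_per_race, id_bits=16, port_bits=0):
    """Average number of races needed when each race carries distinct guesses."""
    p = min(1.0, forged_per_race / search_space(id_bits, port_bits))
    return 1 / p


print(search_space())                       # query ID only
print(search_space(port_bits=16))           # query ID plus source port
print(expected_races(100))                  # races needed with ID only
print(expected_races(100, port_bits=16))    # races needed with ID and port
```

Under these assumptions the search space grows from 65,536 to roughly 4.3 billion combinations, so the same attacker needs about 65,536 times as many races on average.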

 

DNS Cache Poisoning – Computerphile

The attacker supplies a response carrying the right query ID before the authoritative server does. The challenge is getting the correct query ID. Back in the day IDs were sequential, so it was somewhat easier to predict the next one.

 

Poisoning the nameserver with the wrong IP impacts all clients that use this nameserver.

 

Dan Kaminsky

Instead of requesting google.com, he requests the sub-domains: 1.google.com, 2.google.com, which still causes the nameserver to do lookups. 

 

This exploit was also able to send a different nameserver IP to the clients, so they now go to a compromised nameserver.

 

Both of these vulnerabilities were remediated in 2008 by randomizing query IDs and source ports, which makes it much harder for an attacker to predict the values a forged response must match.

 

DNSSEC

Use of keys/signatures to verify Root, TLD and Authoritative DNS

Properties of DNSSEC

  • IF everyone has deployed it, and you know root keys, then prevents spoofed responses
    • Very similar to PKIs 
  • But unlike PKIs, we still want authenticity despite the fact that not everyone has deployed DNSSEC
    • What if someone replies without DNSSEC
    • Ignore = secure but you can’t connect to a lot of hosts
    • Accept = can connect but insecure
  • Back to notion of incremental deployment
    • DNSSEC is not all that useful incrementally

 

DNSSEC is growing. 

 

The 13 root DNS servers have root keys. Keys were updated in 2018.

 

DNS-over-TLS

DNS over TLS uses port 853 (chrome)

DNS over HTTPS uses port 443 (firefox)

DNSSEC – Youtube

Validates the DNS Query and Responses

The use of signatures to prove authenticity

  • A chain of trust
    • Root
    • TLD (top level domain)
    • Example.com

ICANN keeps management over the DNSSEC root. The root level of the chain is signed with ICANN’s root key.

 

Firewalls

(Network) Firewalls

  • Provide central “choke point” for all traffic entering and exiting the system
  • Main goals
    • Service control
      • What services can be accessed (inbound or outbound)
    • Behavior control
      • How services are accessed (e.g., spam filtering, web content filtering)
    • User/machine control
      • Controls access to services on a per-user/machine level
  • Other goals
    • Auditing (see also intrusion detection)
    • Network address translation
    • Can also run security functionality, e.g., IPSec, VPN
  • What they cannot protect against
    • Do not offer full protection against insider attacks
    • Users bypassing the firewall to connect to the Internet
    • Infected devices connecting to network internally

 

Firewalls Overview

  • Positive filter
    • Allow only traffic meeting certain criteria
    • i.e., the default is to reject
  • Negative filter
    • Reject traffic meeting certain criteria
    • i.e., the default is to accept

 

Need for Firewalls?

  • Why not just provision each computer with its own firewall/IDS?
    • Not cost effective
    • Different OS’s make management difficult
    • Patches must be propagated to all machines in the system
    • Does not protect against insider attacks that extend beyond the local network
  • Defense in depth
    • Can also have per-host firewalls as well

 

Packet Filtering

  • Apply a set of rules to each incoming/outgoing packet
  • Packet filtering may be based on any part(s) of the traffic header(s):
    • Source/destination IP address
    • Port numbers
    • Flags
    • Network interface (e.g., reject packet with internal IP address if coming from the wrong interface)
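A small sketch of first-match packet filtering with a configurable default. The `Rule` class, the example prefixes, and the rule set are all hypothetical; only source/destination address and destination port are checked, and flags and interfaces are left out.

```python
import ipaddress
from dataclasses import dataclass
from typing import Optional


@dataclass
class Rule:
    action: str                    # "allow" or "deny"
    src: str = "0.0.0.0/0"         # source prefix the rule applies to
    dst: str = "0.0.0.0/0"         # destination prefix the rule applies to
    dst_port: Optional[int] = None  # None matches any port

    def matches(self, src_ip, dst_ip, dst_port):
        return (ipaddress.ip_address(src_ip) in ipaddress.ip_network(self.src)
                and ipaddress.ip_address(dst_ip) in ipaddress.ip_network(self.dst)
                and (self.dst_port is None or self.dst_port == dst_port))


def filter_packet(rules, src_ip, dst_ip, dst_port, default="deny"):
    """Return the action of the first matching rule, else the default."""
    for rule in rules:
        if rule.matches(src_ip, dst_ip, dst_port):
            return rule.action
    return default


RULES = [
    Rule("deny", src="203.0.113.0/24"),                   # block a known-bad range
    Rule("allow", dst="192.0.2.10/32", dst_port=80),      # web server is reachable
]
print(filter_packet(RULES, "198.51.100.5", "192.0.2.10", 80))
print(filter_packet(RULES, "198.51.100.5", "192.0.2.10", 22))
```

Setting `default="deny"` gives the positive filter described earlier, while `default="allow"` gives the negative filter.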

DMZ

A DMZ sits at the outer border of the network rather than inside it, and acts as a first line of defense.

 

Disadvantages of Packet Filtering

  • Can be difficult to configure rules to achieve both usability and security
    • E.g., ftp uses a dynamically-assigned port number for the data transfer
  • Misconfigurations can be easily exploited
  • Does not examine application-level data
  • No user authentication
  • Does not address inherent TCP/IP vulnerabilities
    • E.g., address spoofing

 

Stateful Firewalls

  • Typical packet filtering applied on a packet-by-packet basis
  • Can also look at context
    • Ex., maintain list of active TCP connections (useful when port numbers are dynamically assigned)
    • Ex., look at sequence numbers and detect replays
  • Can also use global information (e.g., number of packets to/from a particular IP address)
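The connection-list idea can be sketched as a set of tracked flows. The class and method names are hypothetical, and the sketch tracks only the 4-tuple, not sequence numbers or timeouts.

```python
class ConnectionTracker:
    """Toy stateful filter: inbound packets must belong to a tracked flow."""

    def __init__(self):
        self.active = set()  # flows opened from the inside

    def outbound(self, src_ip, src_port, dst_ip, dst_port):
        """Record a connection initiated by an internal host."""
        self.active.add((src_ip, src_port, dst_ip, dst_port))

    def allow_inbound(self, src_ip, src_port, dst_ip, dst_port):
        """Allow a packet only if it is the reverse direction of a tracked flow."""
        return (dst_ip, dst_port, src_ip, src_port) in self.active

    def close(self, src_ip, src_port, dst_ip, dst_port):
        """Forget the flow once the connection is torn down."""
        self.active.discard((src_ip, src_port, dst_ip, dst_port))


tracker = ConnectionTracker()
tracker.outbound("192.0.2.10", 50000, "198.51.100.5", 443)
print(tracker.allow_inbound("198.51.100.5", 443, "192.0.2.10", 50000))  # reply
print(tracker.allow_inbound("198.51.100.5", 443, "192.0.2.10", 50001))  # unsolicited
```

Because the reply is matched against recorded state, no static rule for the dynamically chosen client port is needed.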

 

Host-Based Firewalls

  • Can be used on machines that are not part of a larger network (e.g., home machines)
  • Can also provide additional protection within a larger network
  • Filtering can be machine-specific

 

Multiple Firewalls

  • Can have multiple firewalls, each providing different protection

Everything you wanted to know about Tor but were afraid to ask
https://privacy.net/what-is-tor/

If you’re interested in online privacy, then you’ve no doubt heard about Tor (The Onion Router). The Tor Network (or just “Tor”) is an implementation of a program that was originally developed by the US Navy in the mid-1990s. It enables users greater anonymity online by encrypting internet traffic and passing it through a series of nodes.

When a user is connected to Tor (often through the Tor browser), their outgoing internet traffic is rerouted through a random series of at least three nodes (called relays) before reaching its destination (the website the user wants to visit). Your computer is connected to an entry node, and the final node traffic passes through is the exit node, after which it reaches its destination (the website you want to visit). Incoming traffic is rerouted in a similar manner.

While connected to the Tor network, your activity is very hard to trace back to your IP address. Similarly, your Internet Service Provider (ISP) won’t be able to view information about the contents of your traffic, including which website you’re visiting.

It’s very difficult, if not impossible, to become truly anonymous online, but Tor can certainly help you get there. All of your traffic arriving at its destination will appear to come from a Tor exit node, so will have the IP address of that node assigned to it. Because the traffic has passed through several additional nodes while encrypted, it can’t be traced back to you.

The darknet houses some legitimate websites, but it is better known for being a place rife with illicit activity.

You can access the clear net with Tor, but you can also access darknet websites, specifically .onion sites. These are sites which only people using the Tor browser can access, and have .onion as part of their URL. They are also referred to as “Tor hidden services.”

It is perfectly legal to use Tor, although it has been or is currently blocked in certain countries. Plus, there is still a stigma attached to it, so you probably shouldn’t assume you can use it trouble-free.

The major downside to using Tor is that it’s slow. Traffic isn’t going directly to its destination, so this will slow things down. Plus, the speed of traffic flowing between the nodes could be slower than your regular internet connection, further dampening the overall speed.

Due to these issues, the main use for Tor is general browsing. It isn’t suitable for streaming or torrenting, or anything else that requires a lot of bandwidth.

Another downside is that your ISP will be able to see that you’re using Tor. It won’t be able to read the contents of your traffic, but the fact that it detects you’re using Tor could have some repercussions. As mentioned earlier, using Tor alone is enough to raise suspicion from ISPs and authorities. One way around this is to use a VPN with Tor (more on that below).

Your traffic will go through the VPN server before it gets to the Tor entry node. This means that the VPN server can only see that you’re connected to Tor and can’t see where your traffic is going. Going back to your ISP, it only sees that you’re connected to a VPN server, and nothing beyond that. This means your ISP can’t see that you’re connected to a Tor entry node.

One Tor-related project you may be familiar with is Tor Messenger. This open source software was designed for use alongside existing networks such as Facebook, Twitter, and Google Talk. All Tor Messenger traffic is sent over Tor, and Off-The-Record chat is used to enforce encrypted conversations between users.

 

The Dining Cryptographers Problem: Unconditional Sender and Recipient Untraceability

https://users.ece.cmu.edu/~adrian/731-sp04/readings/dcnets.html

The article discusses the “Dining Cryptographers Problem,” presenting a protocol for achieving untraceable sender and recipient anonymity in communications. It explains a method where cryptographers (participants) can anonymously broadcast messages within a network, ensuring that the sender’s identity remains hidden. This is achieved through a combination of shared secrets and a clever protocol that guarantees anonymity and untraceability, even in the face of potential collusion among participants. The approach generalizes to any number of participants and includes discussions on practical considerations for implementing such a system, addressing issues like key distribution, communication techniques, and preventing disruption or misuse of the network.

The protocol operates by having each participant encrypt a message with a key shared with their neighbor, then combine all these encrypted messages together. If a participant wants to send an anonymous message, they modify their part of the combined message accordingly. The unique setup ensures that only the sender knows they are the source of the message, as each participant’s contribution is indistinguishable from random noise to others. This method allows for secure, anonymous communication without revealing the sender’s identity to any party, including those within the network. 
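
The combining step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: participants sit in a ring, each adjacent pair already shares a fresh random pad (key distribution is not shown), and "combining" means a bytewise XOR. The names `dc_net_round`, `sender`, and `msg_len` are hypothetical and do not come from the article.

```python
import secrets

def xor_bytes(a, b):
    """Bytewise XOR of two equal-length byte strings."""
    return bytes(x ^ y for x, y in zip(a, b))

def dc_net_round(n, msg_len, sender=None, message=b""):
    """Run one round with n participants; return (broadcasts, combined)."""
    # Assumption: pads[i] is a secret shared by participant i and (i + 1) % n.
    pads = [secrets.token_bytes(msg_len) for _ in range(n)]
    padded = message.ljust(msg_len, b"\0")

    broadcasts = []
    for i in range(n):
        # Each participant XORs the two pads it shares with its neighbors.
        out = xor_bytes(pads[i], pads[(i - 1) % n])
        if i == sender:
            # The sender additionally XORs in the message.
            out = xor_bytes(out, padded)
        broadcasts.append(out)

    # Every pad appears in exactly two broadcasts, so all pads cancel out.
    combined = bytes(msg_len)
    for b in broadcasts:
        combined = xor_bytes(combined, b)
    return broadcasts, combined

broadcasts, combined = dc_net_round(n=5, msg_len=8, sender=2, message=b"hi")
print(combined.rstrip(b"\0"))
```

Each individual broadcast is masked by random pads, so an observer who does not hold those pads cannot tell which contribution carried the message; only the XOR of all broadcasts reveals it.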

 

Privacy and Anonymity

Anonymity

  • A state of being not identifiable within a set of subjects/individuals
  • Why do we need it?
    • Internet is designed to be a public place
    • WWW = “Web Without Walls”
    • Routing information is public
    • IP packet headers identify source and destination
  • Even a passive observer can easily figure out who is talking to whom
  • Encryption does not and cannot hide identities
    • Encryption hides payload, but not routing information

Positive aspects

  • Avoid detection, retribution, and embarrassment
  • Freedom of expression
  • Whistle-blowing …

Negative aspects (Illegal activity)

  • Anonymous bribery
  • Copyright infringement
  • Harassment and financial scams
  • Disclosure of trade secrets …

 

Difference between Privacy and Anonymity

Privacy

  • Privacy is the claim of individuals, groups, or institutions to determine for themselves when, how, and to what extent information about them is communicated to others

Anonymity

  • The state of being not identifiable within a set of subjects/individuals
  • It is a property exclusively of individuals

Privacy != Anonymity

  • Anonymity is a way to maintain privacy, and sometimes it is not necessary
  • Aren’t privacy preserving protocols sufficient??

 

Key Differences

  • Identity vs. Information: Anonymity is primarily concerned with concealing a user’s identity, making sure that the user’s actions cannot be linked back to them. Privacy is focused on protecting the user’s information from being accessed, used, or shared without their consent.
  • Protection Focus: Anonymity protects the user against identification and tracking, while privacy safeguards the user’s data from exposure or theft.
  • Implementation Methods: Achieving anonymity might involve using services that mask one’s IP address and employing strategies to avoid identification. Privacy measures include encryption, secure passwords, and careful management of personal information and how it’s shared.

 

Anonymity Use Cases

Privacy preserving protocols are not pervasive

  • Reasons: Efficiency, Overhead, Law, Surveillance

The Internet has become a mass surveillance system

 

Anonymity Basics

Requirements of an anonymous communication network (ACN)

  • Provide anonymity at communication and application layers
  • Remove identifying information
  • Anonymize the communication trace

Anonymity Basics

  • Unlinkable Anonymity
    • Each action is anonymous
    • Cash
    • TOR
    • Crypto
  • Linkable Anonymity
    • Each action is linkable but still anonymous
  • Pseudonymity
    • There is a pseudonym that is registered and links all transactions
  • Verinymity
    • Each action is done with a real name

 

Modeling Anonymity: Attacker Types

Passive vs Active

  • Passive Eavesdropping
    • Listen to communication lines (e.g., traffic analysis)
    • Participate with a few entities, and actions
  • Active Attacks
    • Modify communicated messages or communication pattern
    • Replay sender messages or add malicious traffic
    • Actively block, inject, tamper messages

Global vs Local

  • Global (Eavesdropping)
    • Listening to all communication
    • Compromising all but a few entities
  • Local Attackers
    • Compromise some local (e.g. geographically restricted) subjects, communication links and network nodes (routers)
    • Launch active or passive attacks using those

 

Important Anonymity Properties

  • Subject Anonymity
    • Sender anonymity
    • Receiver anonymity
  • Relationship Anonymity
  • Unlinkability
  • Unobservability
  • Pseudonymity (opposite of unlinkability)

 

Anonymity Properties: Unlinkability

  • Unlinkability ensures that a user may make multiple uses of resources or services without others being able to link these uses together
  • Unlinkability of two or more IOIs from the attacker’s perspective means that the attacker cannot sufficiently distinguish whether these IOIs are related or not
    • Sender, Receiver, or Communication (action) unlinkability

Anonymous Communication Networks

  • Categorization of ACNs
  • Ideal: Low latency, low communication complexity, and high traffic analysis resistance
    • There is no known ideal protocol!
    • Pick 2 of the 3 above is generally the rule
  • General Categorization
    • Low latency ACNs e.g., Tor, Dining Cryptographers (DC) nets
    • High latency ACNs e.g., mix-networks

Low Latency ACNs


Low Latency ACN Protocols

  • Aim: To keep latency/delay due to the ACN protocol small such that its existence/usage remains transparent to the user
  • Useful for applications such as
    • web browsing
    • instant messaging, tele-conferencing
    • web services such as internet banking
  • Examples:
    • Low communication cost and low traffic analysis resistance
      • anonymizer.com, Tor
    • High communication cost and high traffic analysis resistance
      • DC-nets, DISSENT, P5

 

Single Proxy Protocol

  • Example: https://www.anonymizer.com
    • Use a single trusted proxy to hide your IP address
    • All traffic goes through them as a proxy, you get anonymity
  • Drawbacks
    • Traffic analysis
    • Trust on anonymizer.com
    • What happens when they go bankrupt?
    • NSA’s Prism program

The NSA’s PRISM program is a surveillance initiative that was first revealed to the public in June 2013 through documents leaked by Edward Snowden. PRISM is run by the United States National Security Agency (NSA) and allows for the collection of internet communications from at least nine major US internet companies. These companies reportedly include tech giants such as Google, Microsoft, Yahoo, Facebook, Apple, and others.

 

Anonymising Proxies – Anonymizer.com

  • Functionality:
    • Web proxy (single “Mix”) that forwards request on the user’s behalf
    • Does not forward IP address of end user
    • Eliminates information about user’s machine (e.g., previously visited sites)
    • Filters out cookies, JavaScript, active content
  • Limitations:
    • Single point of trust
    • Connection itself is not anonymised

 

Onion Routing

  • What about trusting multiple, geographically distributed proxies/hops instead?
  • Simply chaining proxies is not at all satisfactory
    • One bad proxy node is enough to break the anonymity completely
    • The attacker may be able to link communication by observing it at two different locations

 

TOR: Second Generation Onion Routing Protocol [The Onion Routing protocol]

  • Tor (https://www.torproject.org)
    • Intended to provide anonymity over the Internet
    • Running since October 2002
  • Tremendously successful!
    • > 1,500,000 users all over the world
    • > 6000 OR (volunteer) nodes/proxies/routers
  • The second most employed privacy enhancing technology after the TLS protocol

 

The TOR protocol

  • First, a user must establish a circuit
  • In the simplest setting, Alice learns the public key for all intermediate Tor nodes from the Tor public directory
  • Then create a layered, encrypted message for each Tor node in the path
    • Each connection is a layer (onion skin) with its own unique encryption
      • At the core (yellow below) would be the message, every other layer is just to make that edge connection
  • Alice sends $\mathrm{Enc}_{pk_{R_a}}(R_b, \mathrm{Enc}_{pk_{R_b}}(R_c, \mathrm{Enc}_{pk_{R_c}}(S, \mathrm{Enc}_{pk_S}(m))))$ to $R_a$
  • As the message goes through the network, at each node a layer of the onion is peeled such that at the last node they get the final core message
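
The layering and peeling steps can be sketched as follows. This is a toy illustration, not Tor's actual cell format or cryptography: a SHA-256 based XOR keystream with one symmetric key per node stands in for public-key encryption, and the helpers `wrap`, `peel`, `build_onion`, and `route` are hypothetical names chosen for this example.

```python
import hashlib
import json

def keystream_xor(key, data):
    """Toy XOR stream cipher (SHA-256 in counter mode). NOT secure."""
    out = bytearray()
    for offset in range(0, len(data), 32):
        pad = hashlib.sha256(key + offset.to_bytes(8, "big")).digest()
        out.extend(b ^ p for b, p in zip(data[offset:offset + 32], pad))
    return bytes(out)

def wrap(key, next_hop, payload):
    """Add one onion layer: Enc_key(next_hop, payload)."""
    layer = json.dumps({"next": next_hop, "payload": payload.hex()}).encode()
    return keystream_xor(key, layer)

def peel(key, onion):
    """Remove one layer; return (next_hop, inner_payload)."""
    layer = json.loads(keystream_xor(key, onion))
    return layer["next"], bytes.fromhex(layer["payload"])

def build_onion(path, keys, message):
    """Wrap the message from the inside out, ending with the first node's layer."""
    onion = wrap(keys[path[-1]], "", message)          # innermost: Enc_S(m)
    for i in range(len(path) - 2, -1, -1):
        onion = wrap(keys[path[i]], path[i + 1], onion)
    return onion

def route(first_hop, keys, onion):
    """Each node peels its own layer and forwards the rest to the next hop."""
    hop, trace = first_hop, []
    while True:
        trace.append(hop)
        next_hop, onion = peel(keys[hop], onion)
        if not next_hop:
            return trace, onion
        hop = next_hop

path = ["Ra", "Rb", "Rc", "S"]
keys = {name: hashlib.sha256(name.encode()).digest() for name in path}
trace, delivered = route("Ra", keys, build_onion(path, keys, b"hello"))
print(trace, delivered)
```

Each node only learns the next hop from the layer it can decrypt, which is why no single intermediate node sees both Alice and the final destination.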

 

TOR Goals and non-goals

  • Tor seeks to frustrate attackers from linking communication partners, or from linking multiple communications to or from a single user
  • Goals
    • Protection against local (partial) active attacker
    • Low latency
    • Low communication complexity
  • Non-goals
    • Not secure against global passive adversary
    • Not secure against end-to-end attacks
      • Traffic analysis, timing attacks, fingerprinting

 

Traffic Analysis Attacks on TOR

  • A number of traffic analysis attacks have been performed on the Tor network
  • Prominent Example: End-to-end timing correlation
    • First and last nodes can link packets using timing
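
A rough sketch of why timing leaks the link between first and last node: if relays forward packets almost immediately, the gaps between packets look nearly the same on entry and on exit. The simulated flows, the fixed 0.2 s delay, the jitter range, and the function names below are all assumptions made for illustration; real attacks use more robust statistics.

```python
import random

def gaps(timestamps):
    """Inter-packet gaps of one flow."""
    return [b - a for a, b in zip(timestamps, timestamps[1:])]

def distance(flow_a, flow_b):
    """Sum of absolute differences between the two flows' gap sequences."""
    return sum(abs(x - y) for x, y in zip(gaps(flow_a), gaps(flow_b)))

def correlate(entry_flows, exit_flows):
    """For each entry flow, return the index of the best-matching exit flow."""
    return [
        min(range(len(exit_flows)), key=lambda j: distance(flow, exit_flows[j]))
        for flow in entry_flows
    ]

rng = random.Random(0)

# Five simulated users, 20 packets each, with random send times.
entry_flows = []
for _ in range(5):
    t, flow = 0.0, []
    for _ in range(20):
        t += rng.expovariate(1.0)
        flow.append(t)
    entry_flows.append(flow)

# Exit side: same packets after a fixed delay plus small jitter, in shuffled order.
order = [3, 0, 4, 1, 2]
exit_flows = [[t + 0.2 + rng.uniform(0, 0.01) for t in entry_flows[i]] for i in order]

matches = correlate(entry_flows, exit_flows)
print(all(order[j] == k for k, j in enumerate(matches)))
```

The attacker never decrypts anything here; matching gap patterns alone is enough to pair each entry flow with its exit flow.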

 

High Latency ACN Protocols

High Latency Protocols

  • These protocols mitigate traffic analysis attacks by introducing latency while routing the user messages
  • Traffic analysis resistance is improved at the cost of higher latency
  • For example: Mix Networks

 

Mix Nodes

  • In cryptographic sense, mix nodes are similar to the onion routing (OR) nodes
  • In addition, they aim at hiding correspondence between inputs and outputs even at the communication level
  • How do you hide correspondence?
  • Pool Mixing
    • The mix flushes when a certain condition is fulfilled; it could be reaching a threshold or the expiration of a timeout
    • The mix may keep a certain amount of messages from one round to the other (to increase the anonymity)
  • Continuous Mixing (stop-and-go)
    • Messages are delayed following an exponential distribution (or other distributions)
    • The delay is independent of the traffic
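
The two mixing strategies can be sketched as below. This is a minimal model under assumptions: the pool mix flushes only on a size threshold (the timeout condition is left out), and `ThresholdPoolMix`, `pool_size`, and `stop_and_go_release_time` are hypothetical names for this example.

```python
import random

class ThresholdPoolMix:
    """Pool mix: flush `threshold` messages at random, keep `pool_size` behind."""

    def __init__(self, threshold, pool_size, rng=None):
        self.threshold = threshold
        self.pool_size = pool_size
        self.rng = rng or random.Random()
        self.buffer = []

    def receive(self, msg):
        """Buffer a message; return a shuffled batch on flush, else an empty list."""
        self.buffer.append(msg)
        if len(self.buffer) < self.threshold + self.pool_size:
            return []
        # Shuffling breaks the link between arrival order and departure order.
        self.rng.shuffle(self.buffer)
        batch = self.buffer[:self.threshold]
        # The remaining messages stay in the pool for a later round.
        self.buffer = self.buffer[self.threshold:]
        return batch

def stop_and_go_release_time(arrival_time, mean_delay, rng):
    """Continuous mixing: exponential delay chosen independently of traffic."""
    return arrival_time + rng.expovariate(1.0 / mean_delay)

mix = ThresholdPoolMix(threshold=3, pool_size=2, rng=random.Random(1))
outputs = [mix.receive(m) for m in range(5)]
print(outputs[-1], mix.buffer)
print(stop_and_go_release_time(10.0, mean_delay=2.0, rng=random.Random(0)))
```

Keeping a pool across rounds means a message that arrived in one round may leave in a later one, so an observer cannot assume that a flushed batch contains exactly the most recent arrivals.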

 

Mix Networks vs Onion Routing

  • Circuits vs Paths
    • In onion routing, circuits are employed. Each communication (including replies) associated with a user follows the same circuit/path for a pre-defined time.
    • Maintaining the message order is important
    • Consecutive messages from a user can be linked
    • In mix networks, every communication (including replies) may follow a different path.
    • Each message is unlinkable
  • Latency
    • The OR nodes process/transfer the messages immediately
    • The mix nodes induce some latency to communications by introducing pool or continuous mixing
  • Partial active adversary in onion routing vs global passive adversary in mixnets

 

Traffic Analysis Resistance without Latency

  • In mix networks, latency increases as we increase traffic analysis resistance
  • What if we do not want to hurt the latency while improving the traffic analysis resistance?
  • We can also increase the communication cost to improve traffic analysis resistance
    • Example: Dining Cryptographer (DC) Network

 

High vs Low ACN

High latency Anonymous Communication Networks (ACNs), such as mix networks, prioritize resistance to traffic analysis and are slower because their nodes deliberately delay, batch, and reorder messages before forwarding them. Low latency ACNs, such as Tor, forward traffic immediately, which keeps them fast enough for interactive use but leaves them more exposed to timing and traffic analysis attacks. The main difference lies in the balance each type of protocol strikes between speed (latency) and the degree of traffic analysis resistance provided. DC-nets take a third route: they keep latency low and resist traffic analysis, but pay for it with high communication cost.

 

DC Nets

DC (Dining Cryptographers) Network (Chaum, 1988)

  • Three cryptographers are having dinner.
    • Either NSA is paying for the dinner
    • Or one of them is paying, but wishes to remain anonymous
  • 2 stage protocol:
    • 1 = Each diner flips a coin and shows it to his left neighbor.
      • Every diner will see two coins: his own and his right neighbor’s
    • 2 = Each diner should announce (1) if the two coins are different and (0) otherwise.
      • If he is the payer, he lies and says the opposite
  • Outcome:
    • Even number of 1s ⇒ NSA is paying
    • Odd number of 1s ⇒ one of them is paying
    • Why: each coin is seen by exactly two diners, so with no lie the announcements XOR to 0; a single lie flips the parity
Assumption = all parties will be honest and follow the protocol
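
The dinner can be simulated in a few lines. This sketch assumes honest diners, fair coins, and the convention that a diner announces 1 when the two coins he sees differ; the function names are hypothetical. Under that convention each coin is counted by exactly two diners, so the announcements XOR to 0 unless exactly one diner lies.

```python
import secrets

def dinner(payer=None, n=3):
    """Return the n announcements; payer is a diner index, or None if the NSA pays."""
    coins = [secrets.randbits(1) for _ in range(n)]
    announcements = []
    for i in range(n):
        # Diner i sees his own coin and his right neighbor's coin.
        bit = coins[i] ^ coins[(i + 1) % n]   # 1 if the two coins differ
        if i == payer:
            bit ^= 1                          # the payer says the opposite
        announcements.append(bit)
    return announcements

def a_diner_paid(announcements):
    """Odd number of 1s means exactly one diner lied, so a diner paid."""
    return sum(announcements) % 2 == 1

print(a_diner_paid(dinner(payer=None)))  # NSA pays
print(a_diner_paid(dinner(payer=1)))     # diner 1 pays
```

The parity reveals that someone at the table paid, but since the coins are random, the individual announcements give the other diners no information about which one of them it was.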