CH2 - Application Layer (2.4-2.7)
(2.7) Review the UDPClient.py and UDPServer.py code
1. Client creates a socket() 2. Server bind()s its socket to a port number 3. Client sendto()s a message to the server, including the destination IP address and port number 4. Server recvfrom()s the packet, which also yields the client's address 5. Server sendto()s the response to that client address 6. Client recvfrom()s the response and close()s its socket
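The six steps above can be sketched in one runnable file. This is a minimal sketch, not the course's UDPClient.py/UDPServer.py: the function names are hypothetical, the server runs in a thread on the loopback address, and the port is picked by the OS (port 0) instead of a fixed number. The uppercase reply is an assumed stand-in for whatever the server does with the message.

```python
import socket
import threading


def run_udp_server(server_socket):
    # Step 4: wait for one datagram; recvfrom() also returns the sender's address.
    message, client_address = server_socket.recvfrom(2048)
    # Step 5: reply to the address the datagram came from.
    server_socket.sendto(message.decode().upper().encode(), client_address)
    server_socket.close()


def udp_round_trip(text):
    # Step 2: the server binds to a port (0 = let the OS choose a free one).
    server_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    server_socket.bind(("127.0.0.1", 0))
    server_address = server_socket.getsockname()
    server_thread = threading.Thread(target=run_udp_server, args=(server_socket,))
    server_thread.start()

    # Step 1: the client creates a UDP socket; no connection is set up.
    client_socket = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    client_socket.settimeout(5)
    # Step 3: every sendto() names the destination (IP, port) explicitly.
    client_socket.sendto(text.encode(), server_address)
    # Step 6: read the reply, then close the socket.
    reply, _server = client_socket.recvfrom(2048)
    client_socket.close()
    server_thread.join()
    return reply.decode()


if __name__ == "__main__":
    print(udp_round_trip("hello"))
```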
(2.6) What are CBR and VBR in videos?
- CBR = Constant bit rate (video encoding rate is fixed) - VBR = Variable bit rate (video encoding rate changes as the amount of spatial and temporal coding changes)
(2.6) What are CDNs? What do they do?
- CDN = Content Distribution Network - Manage world-wide servers, store copies of videos and other web content, redirect users to CDN locations that provide best experience - Private CDN = owned by content provider - Third-party CDN = distributes content on behalf of multiple content providers - Note: serving all content from a SINGLE "mega-server" does not scale (similar to how a single DNS server wouldn't work), which is why CDNs spread copies over many servers
(2.7) How does socket programming with TCP work?
- Client and server handshake first, setting up a connection identified by both sides' IP addresses and port numbers - Once connected, the application does NOT attach a destination address to each send; it just writes bytes into the connection - TCP server must be running as a process, with its welcoming socket created, before the client initiates contact 1. Client process initiates the TCP connection by creating a TCP socket 2. Client specifies the address of the server's welcoming socket (IP address, port #) 3. Three-way handshake takes place between client and server 4. During the handshake, the TCP server creates a new socket dedicated to that client 5. Application sees a "pipe" between the client socket and the server's new socket and can drop bytes into the stream
(2.6) What are some over the top (OTT) challenges of CDN?
- Coping with congested internet - Which CDN node to retrieve content from - How to deal with viewer behaviour in presence of congestion - What content to place in each CDN node
(2.6) What is DASH in HTTP streaming?
- DASH = Dynamic Adaptive Streaming over HTTP - Server divides video file into multiple chunks and encodes a version of each at different rates - Manifest file = provides URLs for the different chunks - Client will periodically measure server-to-client BW (ie. average throughput) - Client chooses when to request a chunk, what encoding rate to request, and where (which server) to request it from - TLDR: Video streaming = Encoding + DASH + Playout buffering
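The client's rate choice can be illustrated with a very simple rule: pick the highest encoding rate that does not exceed the measured throughput. This is only a sketch under that assumption; real DASH players use more elaborate logic (buffer level, smoothing), and the function name and rate ladder are hypothetical.

```python
def choose_encoding_rate(available_rates_kbps, measured_throughput_kbps):
    """Return the highest encoding rate that fits the measured throughput.

    Falls back to the lowest available rate when even that one is too high,
    so playback can continue at the worst quality instead of stalling.
    """
    affordable = [r for r in available_rates_kbps if r <= measured_throughput_kbps]
    if affordable:
        return max(affordable)
    return min(available_rates_kbps)


if __name__ == "__main__":
    # Hypothetical rate ladder taken from a manifest file, in kbps.
    ladder = [300, 750, 1500, 3000]
    print(choose_encoding_rate(ladder, 1000))
```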
(2.4) What is DNS?
- DNS = Domain Name System - It is a hierarchical, distributed naming system for anything that connects to the Internet - DNS Server translates hostnames (eg. facebook.com, google.com) into IP addresses - Typically employed by other application-layer protocols (HTTP, SMTP) to RESOLVE (ie. translate) names
(2.4) What are the two types of DNS protocol messages? What format do they have?
- DNS only has QUERY and REPLY messages - Share same format [Identification] [Flags] [# of questions] [# of answer RRs] [# of authority RRs] [# of additional RRs] [Questions] [Answers] [Authority] [Additional info] - ID = 16 bit number - Flags = query/reply, recursion desired, recursion available, is reply authoritative - Questions = name, type fields for query - Answers = RRs in response to query - Authority = records for authoritative server
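The six fixed header fields listed above are each 16 bits, so the header is 12 bytes. The sketch below packs and unpacks it with Python's struct module. The flag bit positions are taken from the standard DNS header layout (RFC 1035) rather than from these notes, and the function names are hypothetical.

```python
import struct

# Flag bit masks in the 16-bit flags field (standard DNS header layout).
FLAG_REPLY = 0x8000                # 0 = query, 1 = reply
FLAG_AUTHORITATIVE = 0x0400        # reply comes from an authoritative server
FLAG_RECURSION_DESIRED = 0x0100
FLAG_RECURSION_AVAILABLE = 0x0080


def build_dns_header(identification, recursion_desired=True, question_count=1):
    """Build the 12-byte header of a DNS query (no answer/authority/additional RRs)."""
    flags = FLAG_RECURSION_DESIRED if recursion_desired else 0
    # "!" = network byte order, "H" = unsigned 16-bit integer.
    return struct.pack("!HHHHHH", identification, flags, question_count, 0, 0, 0)


def parse_dns_header(header):
    """Split the first 12 bytes of a DNS message into named fields."""
    ident, flags, questions, answers, authority, additional = struct.unpack(
        "!HHHHHH", header[:12]
    )
    return {
        "id": ident,
        "is_reply": bool(flags & FLAG_REPLY),
        "authoritative": bool(flags & FLAG_AUTHORITATIVE),
        "recursion_desired": bool(flags & FLAG_RECURSION_DESIRED),
        "recursion_available": bool(flags & FLAG_RECURSION_AVAILABLE),
        "questions": questions,
        "answer_rrs": answers,
        "authority_rrs": authority,
        "additional_rrs": additional,
    }
```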
(2.5) How does a service like BitTorrent use P2P distribution?
- Files divided into 256 KB chunks - Tracker = tracks peers participating in a torrent - Torrent = group of peers exchanging chunks of a file - Churn = peers coming and going - When a peer joins a torrent, it registers with the tracker to get a list of peers, connects to a subset of them, and accumulates chunks over time --> Requesting chunks - A peer periodically asks its neighbours which chunks they have and requests the rarest missing chunks first --> Sending chunks (tit-for-tat) - Peer A sends chunks to the 4 peers currently sending A chunks at the highest rate - Other peers are "choked" (not being sent chunks) - Top 4 re-evaluated every ~10 sec - Every ~30 sec, a randomly chosen peer is un-choked ("optimistically unchoked") and may join the top 4 - Connecting with better trading partners = higher rate
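The two selection rules above (rarest chunk first when requesting, top 4 uploaders when sending) can be sketched as two small functions. This is an illustration only: the function names, peer names, and the alphabetical tie-breaking are assumptions, and the random ~30 sec un-choke is left out.

```python
from collections import Counter


def rarest_first(my_chunks, neighbour_chunks):
    """Order the chunks we are missing from rarest to most common.

    neighbour_chunks maps a peer name to the set of chunk ids that peer holds.
    """
    counts = Counter()
    for chunks in neighbour_chunks.values():
        counts.update(chunks)
    missing = [chunk for chunk in counts if chunk not in my_chunks]
    # Fewest copies first; ties broken by chunk id to keep the order stable.
    return sorted(missing, key=lambda chunk: (counts[chunk], chunk))


def top_uploaders(upload_rates, slots=4):
    """Return the peers currently sending us chunks at the highest rate."""
    ranked = sorted(upload_rates, key=lambda peer: (-upload_rates[peer], peer))
    return ranked[:slots]


if __name__ == "__main__":
    neighbours = {"p1": {0, 1, 2}, "p2": {1, 2}, "p3": {2, 3}}
    print(rarest_first({2}, neighbours))
    print(top_uploaders({"a": 10, "b": 50, "c": 30, "d": 20, "e": 40}))
```

Every peer not returned by top_uploaders would be choked until the next re-evaluation.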
(2.4) Give an example of how a DNS query is passed through the DNS hierarchy.
- Iterated query example - Local host makes DNS query (ie. wants www.amazon.com) - Query sent to local DNS server - Local DNS server checks its cache; if the name is not found, it queries a root server - Root server replies with a referral to the TLD servers for ".com" - Local server queries the TLD server, which replies with a referral to the authoritative server for "amazon.com" - Local server queries the authoritative server, which returns the IP address for "www.amazon.com" - Local server returns the IP address to the host
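The referral chain in this example can be simulated with a toy in-memory hierarchy. Everything in the table below is hypothetical (server names, the 192.0.2.10 documentation address); the point is only to show the local server re-querying after each referral and caching the final answer.

```python
# Hypothetical toy data: what each server in the hierarchy knows.
SERVERS = {
    "root": {"referrals": {"com": "tld-com"}},
    "tld-com": {"referrals": {"amazon.com": "ns.amazon.com"}},
    "ns.amazon.com": {"answers": {"www.amazon.com": "192.0.2.10"}},
}


def ask(server, hostname):
    """Query one server: returns ("answer", ip) or ("referral", next_server)."""
    data = SERVERS[server]
    if hostname in data.get("answers", {}):
        return "answer", data["answers"][hostname]
    for suffix, next_server in data.get("referrals", {}).items():
        if hostname == suffix or hostname.endswith("." + suffix):
            return "referral", next_server
    raise LookupError(f"{server} cannot help with {hostname}")


def resolve_iteratively(hostname, cache):
    """Act as the local DNS server: follow referrals until an answer arrives.

    Returns (ip, servers_contacted). A cache hit contacts no servers.
    """
    if hostname in cache:
        return cache[hostname], []
    server, contacted = "root", []
    while True:
        contacted.append(server)
        kind, value = ask(server, hostname)
        if kind == "answer":
            cache[hostname] = value
            return value, contacted
        server = value  # referral: re-query the server we were pointed to
```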
(2.4) What is a local DNS server?
- Not strictly bound to hierarchy - Each ISP has one ("default name server") - When host makes DNS query, query is sent to local DNS server - Local server has a cache of recent translations - Forwards query into hierarchy
(2.4) What are Root DNS servers?
- Official, contact-of-last-resort by name servers that can't resolve name - VERY important -> critical to the Internet - DNSSEC provides security (authentication and message integrity) - ICANN (Internet Corporation for Assigned Names and Numbers) manages Root DNS domain - 13 logical root name "servers" (that are replicated around the world)
(2.4) What is DNS caching?
- Once a name server learns a mapping, it can cache it - Cache entries time out/get deleted after a while (TTL - Time to live) - TLD servers usually cached in local name servers so root name servers aren't visited often - Cache entries may be out-of-date! Changed IP addresses may not be known until all existing TTLs expire
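A cache with TTL expiry can be sketched in a few lines. The class name and the injectable clock are assumptions made so the expiry can be demonstrated without waiting; the addresses are documentation placeholders.

```python
import time


class DnsCache:
    """Hostname -> IP cache whose entries disappear once their TTL has passed."""

    def __init__(self, clock=time.monotonic):
        self._clock = clock
        self._entries = {}  # hostname -> (ip, expiry time)

    def put(self, hostname, ip, ttl_seconds):
        self._entries[hostname] = (ip, self._clock() + ttl_seconds)

    def get(self, hostname):
        entry = self._entries.get(hostname)
        if entry is None:
            return None
        ip, expires_at = entry
        if self._clock() >= expires_at:
            # TTL expired: forget the mapping so it gets looked up again.
            del self._entries[hostname]
            return None
        return ip
```

Until the TTL runs out, get() keeps returning the cached address even if the real mapping has changed, which is the staleness problem noted above.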
(2.4) What are Authoritative servers?
- Organization's own DNS server(s), provides authoritative hostname to IP mappings for organization's named hosts - ie. Houses DNS records for publicly accessible DNS hosts - Organization itself or service provider can manage records
(2.5) What is P2P Distribution?
- P2P = Peer-to-peer - No always-on server, end-systems communicate with each other (peers request/provide services to each other) - P2P is self-scalable -> new peers bring both service capacity and demands
(2.4) What are resource records (RR) in DNS? What is their format?
- RRs are the unit of information entry in DNS servers - Format: (Name, value, type, TTL)
(2.4) How do you insert records into DNS?
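- Register domain name at a DNS registrar - Registrar is a commercial entity that checks the name is unique and enters it into the DNS database - You provide the registrar with the names and IP addresses of your authoritative name servers - Registrar inserts an NS record and an A record for them into the TLD server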
(2.4) What are Top-Level Domain (TLD) servers?
- Responsible for .com, .org, .net, .edu, etc. and countries like .cn, .uk, .fr, etc. - Network Solutions = authoritative registry for .com/.net - Educause = authoritative registry for .edu - Provide the IP addresses of authoritative servers
(2.7) How does socket programming with UDP work?
- Sending process attaches destination address to each PACKET of data - Packet passed out through sender's socket, internet uses address to route packet to receiving process - Receiving process gets packet, looks at content, takes action - Destination address = destination host IP address + port number - The sender's SOURCE address is also attached to the packet; that is done by the OS, NOT the UDP application code
(2.6) What are the main challenges of HTTP streaming?
- Server-to-client bandwidth varies over time; packet loss and delay due to congestion lead to poor quality or playback delay - Continuous playout constraint = once playout begins, playback should match original timing - Network delays are variable (jitter), which can make meeting the constraint difficult
(2.4) Why is DNS not centralized?
- Single point of failure - High traffic volume - Distant centralized database (not all locations are nearby) - Difficulty of maintenance - Conclusion -> a centralized DNS doesn't scale
(2.6) What is a video? How are digital images coded?
- Video = sequence of images displayed at a constant rate - Digital image = array of pixels -> each pixel represented by bits - Images coded by using redundancy WITHIN and BETWEEN images to decrease # of bits used - Spatial coding (within) = instead of sending same value N times, send 2 values -> 1 for colour, 1 for N - Temporal coding (between) = instead of sending complete i+1 frame, only send difference between frame i and i+1
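Both kinds of coding can be shown on tiny made-up frames. The sketch below uses run-length encoding for the spatial case (value plus repeat count N) and a changed-pixels dictionary for the temporal case; real video codecs are far more sophisticated, and the function names are hypothetical.

```python
def run_length_encode(pixels):
    """Spatial coding: replace N repeats of a value with one (value, N) pair."""
    runs = []
    for value in pixels:
        if runs and runs[-1][0] == value:
            runs[-1][1] += 1
        else:
            runs.append([value, 1])
    return [tuple(run) for run in runs]


def frame_difference(frame_i, frame_next):
    """Temporal coding: keep only the pixels that changed between two frames."""
    return {
        index: new
        for index, (old, new) in enumerate(zip(frame_i, frame_next))
        if old != new
    }


def apply_difference(frame_i, difference):
    """Rebuild frame i+1 from frame i plus the transmitted differences."""
    frame = list(frame_i)
    for index, value in difference.items():
        frame[index] = value
    return frame
```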
(2.6) How does HTTP streaming of videos work?
- Video stored at HTTP server as file with a URL - When user wants to see a video, client establishes a TCP connection with server and issues an HTTP GET request - Video bytes gather in the client buffer, and playback begins once the buffered amount exceeds a set threshold (the buffer does not need to be full)
(2.5) For P2P distribution of a file (F) to N peers, what is the total upload capacity of the system?
- u_total = sum of server + peers = u_s + u_1 + u_2 + ... + u_N - System must deliver total of NF bits at rate <= u_total
(2.7) Review the TCPClient.py and TCPServer.py code
1. Server creates a socket(), bind()s it to the server port number, and listen()s for clients 2. Client creates a socket() 3. Client connect()s to the server 4. Server accept()s the connection request, which returns a new connection socket 5. Client send()s a message through its socket and the TCP connection 6. Server recv()s (receives) the message and decode()s it 7. Server send()s the response back 8. Client recv()s the response and close()s its socket 9. Server close()s the connection socket
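The TCP call sequence can be sketched in one runnable file. This is a minimal sketch, not the course's TCPClient.py/TCPServer.py: the function names are hypothetical, the server runs in a thread on loopback, the port is chosen by the OS, and the uppercase reply is an assumed stand-in for the server's processing.

```python
import socket
import threading


def serve_one_client(welcome_socket):
    # accept() blocks until a client connects and returns a NEW socket
    # dedicated to that client; the welcoming socket only takes requests.
    connection_socket, _client_address = welcome_socket.accept()
    sentence = connection_socket.recv(1024).decode()
    connection_socket.sendall(sentence.upper().encode())
    connection_socket.close()
    welcome_socket.close()


def tcp_round_trip(text):
    # Server side: create, bind, and listen before any client connects.
    welcome_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    welcome_socket.bind(("127.0.0.1", 0))  # port 0 = OS picks a free port
    welcome_socket.listen(1)
    server_address = welcome_socket.getsockname()
    server_thread = threading.Thread(target=serve_one_client, args=(welcome_socket,))
    server_thread.start()

    # Client side: connect() triggers the three-way handshake.
    client_socket = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client_socket.settimeout(5)
    client_socket.connect(server_address)
    # No destination address here: bytes simply go into the connection.
    client_socket.sendall(text.encode())
    reply = client_socket.recv(1024).decode()
    client_socket.close()
    server_thread.join()
    return reply


if __name__ == "__main__":
    print(tcp_round_trip("hello"))
```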
(2.4) What are possible ways DNS security is threatened?
1. DDoS attacks - Bombard root servers with traffic - Historically has been ineffective due to packet filtering at root servers - Bombarding TLD servers is potentially more dangerous 2. Redirect attacks - Man-in-the-middle: attacker intercepts DNS queries and answers with bogus replies - DNS poisoning: attacker sends bogus replies to a DNS server, which caches them 3. Exploiting DNS for DDoS - Attacker sends queries with a spoofed source address (the target's IP), so the replies flood the target - Requires amplification (replies much larger than the queries)
(2.6) What are the two types of CDN server placements?
1. Enter deep - Deploying server clusters in access ISPs globally - Maintaining and managing can become challenging 2. Bring home - Building large clusters at a smaller # of sites - Lower maintenance, but possibility for more delay
(2.4) What services are offered by DNS?
1. Hostname to IP address translation 2. Host aliasing - Canonical hostname = original hostname, other versions are aliases 3. Mail server aliasing - DNS can get canonical hostnames for alias mail hostnames 4. Load distribution - Sites can be replicated over many servers - DNS associates the set of server IP addresses with one hostname
(2.4) What are the two types of queries that can occur in a DNS query?
1. Iterated query - Server replying will give referral to other DNS servers if it doesn't know - Local DNS server must re-query other servers 2. Recursive query - Server replying may query other DNS servers on client's behalf - Name resolution burden put on contacted name server -> can cause heavy load at upper levels of hierarchy
(2.7) What are the two types of network applications?
1. Open network applications - Operations are specified in publicly known protocol standards - Well-known port numbers used for sockets 2. Proprietary network applications - Application-layer protocol that is private, only publisher knows protocol standards - Well-known port numbers avoided
(2.4) What types of RRs are there?
1. Type A - name = hostname - value = IP address 2. Type NS - name = domain (eg. foo.com) - value = hostname of authoritative nameserver 3. Type CNAME - name = alias hostname - value = canonical hostname 4. Type MX - name = alias hostname - value = mail server associated with name
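The four record types can be illustrated with a small in-memory table in the (name, value, type, TTL) layout. All records below are hypothetical examples under foo.com with documentation addresses, and the lookup helpers are a sketch, not a real resolver.

```python
from collections import namedtuple

# Field order follows the RR format (name, value, type, TTL).
ResourceRecord = namedtuple("ResourceRecord", ["name", "value", "type", "ttl"])

# Hypothetical records for illustration only.
RECORDS = [
    ResourceRecord("foo.com", "dns.foo.com", "NS", 3600),
    ResourceRecord("dns.foo.com", "192.0.2.53", "A", 3600),
    ResourceRecord("www.foo.com", "web1.foo.com", "CNAME", 300),
    ResourceRecord("web1.foo.com", "192.0.2.80", "A", 300),
    ResourceRecord("foo.com", "mail.foo.com", "MX", 3600),
    ResourceRecord("mail.foo.com", "192.0.2.25", "A", 3600),
]


def lookup(name, rr_type):
    """Return the values of all records matching a name and type."""
    return [rr.value for rr in RECORDS if rr.name == name and rr.type == rr_type]


def resolve_address(name, max_aliases=5):
    """Follow CNAME aliases until a type A record gives an IP address."""
    for _ in range(max_aliases):
        addresses = lookup(name, "A")
        if addresses:
            return addresses[0]
        aliases = lookup(name, "CNAME")
        if not aliases:
            return None
        name = aliases[0]  # alias -> canonical hostname, then try again
    return None
```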
(2.6) How does Netflix work as a CDN?
Bob has a Netflix account. - Can manage account through Netflix registration and accounting servers - Amazon Cloud uploads copies of multiple versions of files to different CDN servers - Amazon Cloud allows Bob to browse catalogue and request files - When Bob wants to watch a specific video, a CDN (DASH) server will be selected, contacted, and streaming begins
(2.6) What are challenges of content distribution? What is the main solution?
Challenges: - Reaching billions of users - Heterogeneity of users (different internet rates, technology, etc.) Solution = distributed, application-level infrastructure
(2.4) What is the hierarchy of DNS?
Root DNS servers -> Top-level domain servers -> Authoritative servers - Local servers are not bound strictly to hierarchy, but act as a stepping stone between local host and DNS hierarchy
(2.6) Give an example of how a CDN works.
ex) Browser instructed to retrieve a video at http://video.netcinema.com/XXX 1. User visits NetCinema webpage and clicks on /XXX link 2. Host sends DNS query for video.netcinema.com 3. User's local DNS (LDNS) relays query to authoritative DNS (ADNS) for NetCinema 4. NetCinema's ADNS returns hostname in KingCDN's domain 5. LDNS sends another query, now for that hostname, to KingCDN's private DNS infrastructure 6. KingCDN's ADNS returns the IP address of a KingCDN server that holds /XXX 7. User's LDNS returns IP address to host 8. Host requests the video from that IP address over HTTP
(2.5) How long does it take to distribute a file (F) to N peers using client-server distribution vs P2P distribution?
Notation:
- u_s = server upload capacity
- d_i, u_i = ith peer's download/upload capacity
- d_min = minimum download rate among the peers
Client-server:
--> Server transmission
- Must sequentially send/upload N copies
- Time to send N copies = N*(F/u_s)
--> Client transmission
- Each client must download its own copy
- Slowest client needs at least F/d_min
THUS time to distribute F to N clients: D_cs >= max{N*(F/u_s), F/d_min}
P2P:
--> Server transmission
- Must send at least one copy
- Time to send 1 copy = F/u_s
--> Client transmission
- Each client must download a copy, so N*F bits in total
- Slowest client needs at least F/d_min
- Maximum total upload rate = u_s + sum(u_i)
THUS time to distribute: D_p2p >= max{F/u_s, F/d_min, (N*F)/(u_s + sum(u_i))}
- Note that as N grows, the minimum distribution time approaches a limit for P2P (each new peer also adds upload capacity), while for C-S it increases linearly
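The two lower bounds can be evaluated directly. The sketch below plugs in made-up numbers (the function names and values are assumptions for illustration); units only need to be consistent, e.g. bits and bits per second.

```python
def client_server_time(file_size, server_upload, download_rates):
    """Lower bound: D >= max{N*F/u_s, F/d_min}."""
    n = len(download_rates)
    return max(n * file_size / server_upload, file_size / min(download_rates))


def p2p_time(file_size, server_upload, peer_uploads, download_rates):
    """Lower bound: D >= max{F/u_s, F/d_min, N*F/(u_s + sum(u_i))}."""
    n = len(download_rates)
    total_upload = server_upload + sum(peer_uploads)
    return max(
        file_size / server_upload,
        file_size / min(download_rates),
        n * file_size / total_upload,
    )


if __name__ == "__main__":
    # Hypothetical numbers: F = 10, u_s = 10, 10 peers with u_i = 1 and d_i = 2.
    uploads, downloads = [1] * 10, [2] * 10
    print(client_server_time(10, 10, downloads))
    print(p2p_time(10, 10, uploads, downloads))
```

With these numbers the client-server bound is dominated by the N*F/u_s term, while the P2P bound is set by F/d_min and the shared upload term.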