Chapter 2
Web page
(also called a document) consists of objects
root DNS servers, top-level domain (TLD) DNS servers, and authoritative DNS servers
3 classes of DNS servers: ___________________________
cache server client
A ________ is both a server and a client at the same time. When it receives requests from and sends responses to a browser, it is a ______. When it sends requests to and receives responses from an origin server, it is a ______.
registrar names and IP addresses of your primary and secondary authoritative DNS servers
A ___________ is a commercial entity that verifies the uniqueness of the domain name, enters the domain name into the DNS database (as discussed below), and collects a small fee from you for its services. When you register the domain name with some registrar, you also need to provide the registrar with the _____________________________________________.
distributed hash table
A _______________ is a simple database, with the database records being distributed over the peers in a P2P system.
POP3
A mail access protocol, such as _____, is used to transfer mail from the recipient's mail server to the recipient's user agent.
Name, Value, Type, TTL
A resource record is a four-tuple that contains the following fields:
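As a rough illustration, the four-tuple can be modeled as a small record type. This is only a sketch: the class name, the example hostname, the IP address, and the TTL value are hypothetical and chosen for illustration.

```python
from typing import NamedTuple


class ResourceRecord(NamedTuple):
    """A DNS resource record as a four-tuple: (Name, Value, Type, TTL)."""
    name: str
    value: str
    type: str
    ttl: int  # seconds the record may remain in a cache


# Hypothetical Type A record: Name is a hostname, Value is its IP address.
a_record = ResourceRecord(
    name="relay1.bar.foo.com", value="145.37.93.126", type="A", ttl=3600
)
```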
Subsequent requests and responses between the same client and server can be sent over the same connection.
Advantage of persistent connections:
feeding her bits at the highest rate 10 seconds unchoked optimistically unchoked compatible rates tend to find each other new peers choked
Alice continually measures the rate at which she receives bits and determines the four peers that are ________________________. Every _________, she recalculates the rates and possibly modifies the set of four peers, which are said to be _______. Importantly, every 30 seconds, she also picks one additional neighbor at random and sends it chunks; this neighbor is said to be ____________. The effect is that peers capable of uploading at ___________________. The random neighbor selection also allows _________ to get chunks, so that they can have something to trade. All other neighboring peers besides these five peers (four "top" peers and one probing peer) are "_________," that is, they do not receive any chunks from Alice.
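The selection step described above can be sketched as follows. This is a simplified model, not BitTorrent's actual implementation: the function name, the `rates` dictionary, and the `rng` parameter are assumptions made for the example, and the 10-second and 30-second timers are left out.

```python
import random


def choose_unchoked(rates, top_n=4, rng=random):
    """Split neighbors into unchoked, optimistically unchoked, and choked.

    rates: dict mapping each neighbor to the rate (bits/s) at which
    Alice is currently receiving bits from that neighbor.
    """
    # Rank neighbors from the highest feeding rate to the lowest.
    ranked = sorted(rates, key=rates.get, reverse=True)
    unchoked = ranked[:top_n]
    others = ranked[top_n:]
    # One extra neighbor is picked at random from the rest.
    optimistic = rng.choice(others) if others else None
    choked = [peer for peer in others if peer != optimistic]
    return unchoked, optimistic, choked
```

Passing a seeded `random.Random` instance as `rng` makes the random pick reproducible when experimenting.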
0 to 255 where the host is located in the Internet bottom to top
An IP address looks like 121.7.106.83, where each period separates one of the bytes expressed in decimal notation from ______________. An IP address is hierarchical because as we scan the address from left to right, we obtain more and more specific information about ____________________________ (that is, within which network, in the network of networks). Similarly, when we scan a postal address from ________________, we obtain more and more specific information about where the addressee is located.
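A short sketch of the dotted-decimal notation, using the address from the card. The function names are made up for this example; the second function shows how the four bytes combine, left to right, into a single 32-bit value.

```python
def parse_dotted_decimal(address):
    """Split an address such as '121.7.106.83' into its four bytes."""
    parts = address.split(".")
    if len(parts) != 4:
        raise ValueError("expected four bytes separated by periods")
    octets = [int(part) for part in parts]
    if any(not 0 <= octet <= 255 for octet in octets):
        raise ValueError("each byte must be in the range 0 to 255")
    return octets


def to_32_bit(octets):
    """Combine four bytes into one integer, leftmost byte most significant."""
    value = 0
    for octet in octets:
        value = (value << 8) | octet
    return value
```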
IMAP, folder, INBOX
An _______ server will associate each message with a ______; when a message first arrives at the server, it is associated with the recipient's _______ folder.
geographically closest geographic location LDNS
Assign the client to the cluster that is _________________. Using commercial geo-location databases, each LDNS IP address is mapped to a __________. When a DNS request is received from a particular _______, the CDN chooses the geographically closest cluster.
cluster selection strategy IP address of the client's LDNS cluster based on this IP address
At the core of any CDN deployment is a ____________, that is, a mechanism for dynamically directing clients to a server cluster or a data center within the CDN. As we just saw, the CDN learns the __________________ server via the client's DNS lookup. After learning this IP address, the CDN needs to select an appropriate _________________.
DDoS bandwidth-flooding attack majority of legitimate DNS queries never get answered. top-level-domain servers, .com domain DNS servers, root servers, caching in local DNS servers man-in-the-middle attack, sends bogus replies to a DNS server
Attacks against DNS: For example, an attacker could attempt to send to each DNS root server a deluge of packets, so many that the ____________________________________________. A potentially more effective DDoS attack against DNS would be to send a deluge of DNS queries to ________________________________, for example, to all the top-level-domain servers that handle the _______________. It would be harder to filter DNS queries directed to _________; and top-level-domain servers are not as easily bypassed as are _______________. But the severity of such an attack would be partially mitigated by _______________. In a ________________, the attacker intercepts queries from hosts and returns bogus replies. In the DNS poisoning attack, the attacker _______________________, tricking the server into accepting bogus records into its cache.
stateless protocol
Because an HTTP server maintains no information about the clients, HTTP is said to be a ____________.
repeatedly try to send the message to Bob's mail server
By having Alice first deposit the e-mail in her own mail server, Alice's mail server can ___________________________________________________, say every 30 minutes, until Bob's mail server becomes operational.
DNS caching local memory cached information after a period of time two days root servers
DNS extensively exploits ____________ in order to improve the delay performance and to reduce the number of DNS messages ricocheting around the Internet. In a query chain, when a DNS server receives a DNS reply (containing, for example, a mapping from a hostname to an IP address), it can cache the mapping in its ___________. Because hosts and mappings between hostnames and IP addresses are by no means permanent, DNS servers discard _________________ (often set to __________). In fact, because of caching, _________________ are bypassed for all but a very small fraction of DNS queries.
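The caching behavior can be sketched with a toy cache that discards a mapping once its lifetime has passed. The class and method names are hypothetical, and the current time is passed in explicitly so the behavior is easy to follow; a real DNS server is considerably more involved.

```python
TWO_DAYS = 2 * 24 * 60 * 60  # a commonly cited cache lifetime, in seconds


class DnsCache:
    """Toy hostname-to-IP cache whose entries expire after a TTL."""

    def __init__(self):
        self._entries = {}  # hostname -> (ip_address, expiry_time)

    def put(self, hostname, ip_address, ttl, now):
        """Cache a mapping learned from a DNS reply at time `now`."""
        self._entries[hostname] = (ip_address, now + ttl)

    def get(self, hostname, now):
        """Return the cached address, or None if absent or expired."""
        entry = self._entries.get(hostname)
        if entry is None:
            return None
        ip_address, expiry = entry
        if now >= expiry:
            # The mapping is stale, so discard it.
            del self._entries[hostname]
            return None
        return ip_address
```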
the hostname of the server that houses the object and the object's path name
Each URL has two components:
distributing the file upload capacity peers P2P architecture each peer distributes portions of the file to the other peers server each bit of the file at least once into its access link F/us redistribute the bit among themselves F bits of the file in less than F/dmin seconds F/dmin upload rate of the server plus the upload rates of each of the individual peers NF utotal
Each peer can assist the server in ____________. In particular, when a peer receives some file data, it can use its own _______________ to redistribute the data to other _______. Calculating the distribution time for the ________________ is somewhat more complicated than for the client-server architecture, since the distribution time depends on how _______________________. At the beginning of the distribution, only the _____ has the file. To get this file into the community of peers, the server must send ____________________. Thus, the minimum distribution time is at least _______. (Unlike the client-server scheme, a bit sent once by the server may not have to be sent by the server again, as the peers may ________________________.) As with the client-server architecture, the peer with the lowest download rate cannot obtain all ________________________. Thus the minimum distribution time is at least __________. Finally, observe that the total upload capacity of the system as a whole is equal to the _________________________, that is, utotal=us+u1+⋯+uN. The system must deliver (upload) F bits to each of the N peers, thus delivering a total of __ bits. This cannot be done at a rate faster than _____. Thus, the minimum distribution time is also at least NF/utotal.
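Taken together, the three observations give the lower bound max{F/us, F/dmin, NF/utotal} on the P2P distribution time. The sketch below simply evaluates that bound; the function and parameter names are chosen for this example.

```python
def p2p_distribution_time(F, us, peer_uploads, dmin):
    """Lower bound on the P2P distribution time, in seconds.

    F: file size in bits
    us: upload rate of the server (bits/s)
    peer_uploads: list of upload rates u1..uN, one per peer (bits/s)
    dmin: download rate of the slowest peer (bits/s)
    """
    N = len(peer_uploads)
    u_total = us + sum(peer_uploads)  # total upload capacity of the system
    return max(F / us, F / dmin, N * F / u_total)
```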
mailbox
Each recipient, such as Bob, has a _____________ located in one of the mail servers.
tracker informs the tracker that it is still in the torrent participating in the torrent
Each torrent has an infrastructure node called a _________. When a peer joins a torrent, it registers itself with the tracker and periodically ___________________. In this manner, the tracker keeps track of the peers that are _________________.
publicly accessible DNS records hosts to IP addresses authoritative DNS server service provider primary and secondary (backup) authoritative DNS server
Every organization with publicly accessible hosts (such as Web servers and mail servers) on the Internet must provide ___________________ that map the names of those ________________. An organization's ____________________ houses these DNS records. An organization can choose to implement its own authoritative DNS server to hold these records; alternatively, the organization can pay to have these records stored in an authoritative DNS server of some ________________. Most universities and large companies implement and maintain their own ___________________________________.
top-level domains com top-level domain edu top-level domain large and complex authoritative DNS servers
For each of the _________________ there is a TLD server (or server cluster). The company Verisign Global Registry Services maintains the TLD servers for the _____________________, and the company Educause maintains the TLD servers for the __________________. The network infrastructure supporting a TLD can be __________________. TLD servers provide the IP addresses for _____________________________.
variable-length alphanumeric characters, IP addresses
Furthermore, because hostnames can consist of _____________________________, they would be difficult to process by routers. For these reasons, hosts are also identified by so-called _________________.
a client program and a server program
HTTP is implemented in two programs:
pull push
HTTP is mainly a _______ protocol—someone loads information on a Web server and users use HTTP to pull the information from the server at their convenience. SMTP is primarily a _______ protocol—the sending mail server pushes the file to the receiving mail server.
conditional GET
HTTP mechanism that allows a cache to verify that its objects are up to date
(1) the request message uses the GET method (2) the request message includes an If-Modified-Since: header line.
HTTP request message is a conditional GET message if:
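The two conditions can be checked mechanically. The sketch below builds a request message and tests whether it qualifies as a conditional GET; the helper names and the host, path, and date used with them are illustrative assumptions.

```python
def build_conditional_get(host, path, last_modified):
    """Return a GET request carrying an If-Modified-Since: header line."""
    lines = [
        f"GET {path} HTTP/1.1",
        f"Host: {host}",
        f"If-Modified-Since: {last_modified}",
        "",  # blank line terminates the header section
        "",
    ]
    return "\r\n".join(lines)


def is_conditional_get(request):
    """True if the method is GET and an If-Modified-Since: line is present."""
    head = request.split("\r\n\r\n", 1)[0]
    request_line, *header_lines = head.split("\r\n")
    method = request_line.split(" ", 1)[0]
    # Header names are compared case-insensitively.
    names = {line.split(":", 1)[0].strip().lower() for line in header_lines}
    return method == "GET" and "if-modified-since" in names
```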
TCP
HTTP uses ____ as its underlying transport protocol
persistent
HTTP uses _____ connections in its default mode
This can easily be done with the nslookup program.
How to send a DNS query message directly from the host you're working on to some DNS server?
message queue
If Alice's server cannot deliver mail to Bob's server, Alice's server holds the message in a _______________ and attempts to transfer the message later.
entered into the form fields
If the value of the method field is POST, then the entity body contains what the user ____________.
torrent no chunks uploads chunks to other peers leave the torrent remain in the torrent and continue to upload chunks to other peers subset of chunks rejoin
In BitTorrent lingo, the collection of all peers participating in the distribution of a particular file is called a _____________. When a peer first joins a torrent, it has _________. While it downloads chunks it also ________________. Once a peer has acquired the entire file, it may (selfishly) ________________, or (altruistically) ______________________. Also, any peer may leave the torrent at any time with only a _________, and later _________ the torrent.
large number of servers, organized in a hierarchical fashion and distributed around the world hosts distributed across the DNS servers
In order to deal with the issue of scale, the DNS uses a ___________________________________. No single DNS server has all of the mappings for all of the ________ in the Internet. Instead, the mappings are _____________________________.
Content Distribution Networks (CDNs) geographically distributed locations best user experience private third-party CDN
In order to meet the challenge of distributing massive amounts of video data to users distributed around the world, almost all major video-streaming companies make use of __________________. A CDN manages servers in multiple __________________, stores copies of the videos (and other types of Web content, including documents, images, and audio) in its servers, and attempts to direct each user request to a CDN location that will provide the ______________. The CDN may be a ____________ CDN, that is, owned by the content provider itself. The CDN may alternatively be a ______________ that distributes content on behalf of multiple content providers; Level-3, for example, operates a third-party CDN.
scale design implemented in the Internet
In summary, a centralized database in a single DNS server simply doesn't _______. Consequently, the DNS is distributed by _______. In fact, the DNS is a wonderful example of how a distributed database can be _______________________________.
download-and-keep mail server different machines work, home
In the _________________ mode, the user agent leaves the messages on the ________________ after downloading them. In this case, Bob can reread messages from ________________; he can access a message from _______ and access it again later in the week from _______.
user agents, mail servers, and the Simple Mail Transfer Protocol (SMTP)
Internet mail system has three major components:
redistribution scheme lower bound self-scaling redistributors as well as consumers of bits
It turns out that if we imagine that each peer can redistribute a bit as soon as it receives the bit, then there is a ______________ that actually achieves this ___________. Thus, applications with the P2P architecture can be _________________. This scalability is a direct consequence of peers being ____________________________.
Dcs peers one copy of the file to each of the N peers NF/us dmin F bits of the file F/dmin minimum distribution time
Let's first determine the distribution time for the client-server architecture, which we denote by ____. In the client-server architecture, none of the _____ aids in distributing the file. We make the following observations: - The server must transmit __________________. Thus the server must transmit NF bits. Since the server's upload rate is us, the time to distribute the file must be at least ____. - Let ______ denote the download rate of the peer with the lowest download rate, that is, dmin = min{d1, d2, ..., dN}. The peer with the lowest download rate cannot obtain all _________ in less than ________ seconds. Thus the ______________________ is at least F/dmin.
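The two observations combine into the lower bound max{NF/us, F/dmin}. A small sketch that evaluates it, with names chosen for this example:

```python
def client_server_distribution_time(F, us, dmin, N):
    """Lower bound on the client-server distribution time, in seconds.

    F: file size in bits
    us: upload rate of the server (bits/s)
    dmin: download rate of the slowest peer (bits/s)
    N: number of peers that want a copy of the file
    """
    return max(N * F / us, F / dmin)
```

When N is large enough that the NF/us term dominates, multiplying N by a thousand multiplies the bound by a thousand as well, which is the linear growth noted elsewhere in this set.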
base HTML file
Most Web pages consist of a ___________ and several referenced objects.
the Amazon cloud and its own private CDN infrastructure simplify and tailor its CDN design directly tells the client to use a particular CDN server push caching dynamically during cache misses
Netflix video distribution has two major components: ______________________________________. However, because Netflix uses its own private CDN, which distributes only video (and not Web pages), Netflix has been able to __________________________. In particular, Netflix does not need to employ DNS redirect to connect a particular client to a CDN server; instead, the Netflix software (running in the Amazon cloud) ________________________. Furthermore, the Netflix CDN uses ____________ rather than pull caching: content is pushed into the servers at scheduled times at off-peak hours, rather than ____________________.
Dynamic Adaptive Streaming over HTTP (DASH)
New type of HTTP-based streaming, often referred to as _________________________________.
a brand-new connection must be established and maintained for each requested object
Non-persistent connections shortcomings:
reliable data transfer service of TCP
SMTP can count on the ______________________________________ to get the message to the server without errors.
HTTP
SMTP is a push protocol, so in order for Bob to receive his data via a pull function, he must use another protocol like _____ to get his mail from the server.
7-bit ASCII format
SMTP requires each message, including the body of each message, to be in _____________.
resource records (RRs) hostname-to-IP address
The DNS servers that together implement the DNS distributed database store _______________, including ones that provide _______________ mappings.
manifest file various versions HTTP GET request message
The HTTP server also has a _____________, which provides a URL for each version along with its bit rate. The client first requests the manifest file and learns about the _______________. The client then selects one chunk at a time by specifying a ______ and a byte range in an ________________ for each chunk.
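A minimal sketch of the client side, under stated assumptions: the manifest is modeled as a list of (URL path, bit rate) pairs, and the policy of picking the highest bit rate that fits the available bandwidth is an assumption of this example, not something the card specifies. The `Range` header is the standard HTTP way to ask for a byte range.

```python
def pick_version(manifest, available_bps):
    """Choose a version from a manifest given the available bandwidth.

    manifest: list of (url_path, bit_rate) pairs, one per video version.
    Assumed policy: the highest bit rate not exceeding the available
    bandwidth, falling back to the lowest bit rate if none fits.
    """
    affordable = [version for version in manifest if version[1] <= available_bps]
    if affordable:
        return max(affordable, key=lambda version: version[1])
    return min(manifest, key=lambda version: version[1])


def chunk_request(url_path, host, first_byte, last_byte):
    """Build an HTTP GET request message for one chunk (a byte range)."""
    return (
        f"GET {url_path} HTTP/1.1\r\n"
        f"Host: {host}\r\n"
        f"Range: bytes={first_byte}-{last_byte}\r\n"
        "\r\n"
    )
```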
cookie
The browser sends ______ information to the server, permitting the server to identify the user throughout the user's session with the application.
GET POST
The entity body is empty with the ____ method, but is used with the _____ method.
host on which the object resides persistent connections close the connection after sending the requested object user agent (browser type that is making the request to the server) French version of the object default version
The header line Host: www.someschool.edu specifies the _______. With the Connection: close header line, the browser is telling the server that it doesn't want to bother with ______________; it wants the server to _________. The User-agent: header line specifies the _______________. The Accept-language: header line indicates that the user prefers to receive a _____________, if such an object exists on the server; otherwise, the server should send its ____________.
rarest rarest chunks first rarest chunks
The idea is to determine, from among the chunks she does not have, the chunks that are the _______ among her neighbors (that is, the chunks that have the fewest repeated copies among her neighbors) and then request those _________. In this manner, the _________ get more quickly redistributed, aiming to (roughly) equalize the numbers of copies of each chunk in the torrent.
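The rarest-first idea can be sketched by counting, for each chunk Alice lacks, how many neighbors hold it. The function name and the data layout (sets of chunk ids) are assumptions for this example, and breaking ties by chunk id is an arbitrary choice made here to keep the output deterministic.

```python
from collections import Counter


def rarest_first(my_chunks, neighbor_chunks):
    """Order the chunks Alice is missing from rarest to most common.

    my_chunks: set of chunk ids Alice already has.
    neighbor_chunks: dict mapping each neighbor to its set of chunk ids.
    """
    counts = Counter()
    for chunks in neighbor_chunks.values():
        # Count only chunks that Alice does not have yet.
        counts.update(chunks - my_chunks)
    # Fewest copies first; ties are broken by chunk id.
    return sorted(counts, key=lambda chunk: (counts[chunk], chunk))
```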
A single point of failure Traffic volume Distant centralized database Maintenance
The problems with a centralized design include: ___________________. If the DNS server crashes, so does the entire Internet! _________________. A single DNS server would have to handle all DNS queries (for all the HTTP requests and e-mail messages generated from hundreds of millions of hosts). ___________________________. A single DNS server cannot be "close to" all the querying clients. If we put the single DNS server in New York City, then all queries from Australia must travel to the other side of the globe, perhaps over slow and congested links. This can lead to significant delays. _______________________. The single DNS server would have to keep records for all Internet hosts. Not only would this centralized database be huge, but it would have to be updated frequently to account for every new host.
recursive query iterative iterative or recursive recursive iterative
The query sent from cse.nyu.edu to dns.nyu.edu is a _____________, since the query asks dns.nyu.edu to obtain the mapping on its behalf. But the subsequent three queries are ________ since all of the replies are directly returned to dns.nyu.edu. In theory, any DNS query can be ________________. The query from the requesting host to the local DNS server is ____________, and the remaining queries are _____________.
access links us ui di F N get a copy of the file to all N peers
The server and the peers are connected to the Internet with __________. Denote the upload rate of the server's access link by ___, the upload rate of the ith peer's access link by __, and the download rate of the ith peer's access link by __. Also denote the size of the file to be distributed (in bits) by __ and the number of peers that want to obtain a copy of the file by __. The distribution time is the time it takes to ________________.
the protocol version field, a status code, and a corresponding status message
The status line has three fields:
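A small sketch of splitting a status line into its three fields. The function name is made up for this example; the split is limited to two cuts because the status message itself may contain spaces.

```python
def parse_status_line(line):
    """Split a status line into (protocol version, status code, message)."""
    version, code, message = line.rstrip("\r\n").split(" ", 2)
    return version, int(code), message
```

For instance, `parse_status_line("HTTP/1.1 404 Not Found")` yields the version string, the integer 404, and the message "Not Found".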
LAN delay, the access delay, and the Internet delay.
The total response time is the sum of the:
root name servers organizations TLD servers
There are over 400 _______________ scattered all over the world. They are managed by 13 different ________________. They provide the IP addresses of the ______________.
+OK -ERR
There are two possible responses: ______ (sometimes followed by server-to-client data), used by the server to indicate that the previous command was fine; and ________, used by the server to indicate that something was wrong with the previous command.
local DNS server hierarchy of servers residential ISP or an institutional ISP local DNS server local DNS servers
There is another important type of DNS server called the _________________. It does not strictly belong to the ________________ but is nevertheless central to the DNS architecture. Each ISP—such as a __________________—has a _____________________ (also called a default name server). When a host connects to an ISP, the ISP provides the host with the IP addresses of one or more of its __________.
POP3, remote folders, messages, IMAP
This is not possible with ___________, because it does not provide any means for a user to create _____________ and assign _____________ to folders. - ________ solves this problem.
Content Distribution Networks (CDNs)
Through the use of __________________, Web caches are increasingly playing an important role in the Internet.
linearly peers N 1,000
Thus, the distribution time increases ______ with the number of _________. So, for example, if the number of peers from one week to the next increases a thousand-fold from a thousand to a million, the time required to distribute the file to all peers increases by _________.
ISP
Typically a Web cache is purchased and installed by an ____.
directly interacts translating hostnames to their underlying IP addresses
Unlike these applications, the DNS is not an application with which a user _________________. Instead, the DNS provides a core Internet function—namely, _______________________________________, for user applications and other software in the Internet.
hostname, IP address mnemonic hostname identifier, hierarchically structured IP addresses IP addresses domain name system (DNS) distributed database implemented in a hierarchy of DNS servers application-layer protocol that allows hosts to query the distributed database
We have just seen that there are two ways to identify a host—by a ______________ and by an ________________. People prefer the more ___________________, while routers prefer fixed-length, ___________________. In order to reconcile these preferences, we need a directory service that translates hostnames to _________________. This is the main task of the Internet's ____________________. The DNS is (1) a _______________________________________________, and (2) an _________________________________________________.
1) A Web cache can substantially reduce the response time for a client request, particularly if the bottleneck bandwidth between the client and the origin server is much less than the bottleneck bandwidth between the client and the cache. 2) Web caches can substantially reduce traffic on an institution's access link to the Internet, so the institution does not have to upgrade bandwidth as quickly, thereby reducing costs.
Web caching has seen deployment in the Internet for two reasons:
HyperText Transfer Protocol (HTTP)
Web's application-layer protocol, is at the heart of the Web
HTTP protocol, POP3, IMAP protocol, HTTP, SMTP, SMTP
When a recipient, such as Bob, wants to access a message in his mailbox, the e-mail message is sent from Bob's mail server to Bob's browser using the _________ rather than the _____ or ____________. When a sender, such as Alice, wants to send an e-mail message, the e-mail message is sent from her browser to her mail server over _______ rather than over _______. Alice's mail server, however, still sends messages to, and receives messages from, other mail servers using ______.
HEAD debugging
When a server receives a request with the _______ method, it responds with an HTTP message but it leaves out the requested object. - often used for:
persistent
When transferring the files, both persistent HTTP and SMTP use ___________ connections.
authorization, transaction, and update username, password messages deletion, deletion marks, mail statistics quit, deletes
With the TCP connection established, POP3 progresses through three phases: _________________________. During the first phase, the user agent sends a _________________ and a ____________ (in the clear) to authenticate the user. During the second phase, the user agent retrieves __________; also during this phase, the user agent can mark messages for __________, remove ___________, and obtain _____________. The third phase occurs after the client has issued the ______ command, ending the POP3 session; at this time, the mail server __________ the messages that were marked for deletion.
HTTP, SMTP
______ transfers files (also called objects) from a Web server to a Web client (typically a browser); ______ transfers files (that is, e-mail messages) from one mail server to another mail server.
Load distribution different end system, different IP address canonical hostname
________________: DNS is also used to perform load distribution among replicated servers, such as replicated Web servers. Busy sites, such as cnn.com, are replicated over multiple servers, with each server running on a __________________ and each having a ____________________. For replicated Web servers, a set of IP addresses is thus associated with one ___________________.
Mail server aliasing [email protected] hostname
____________________: it is highly desirable that e-mail addresses be mnemonic. For example, if Bob has an account with Yahoo Mail, Bob's e-mail address might be as simple as ________________. However, the _________________ of the Yahoo mail server is more complicated and much less mnemonic than simply yahoo.com.
Host aliasing
________________________: A host with a complicated hostname can have one or more alias names.
object
a file—such as an HTML file, a JPEG image, a Java applet, or a video clip—that is addressable by a single URL
Web cache (also called a proxy server)
a network entity that satisfies HTTP requests on the behalf of an origin Web server
persistent connections
all of the requests and their corresponding responses sent over the same TCP connection
PUT
allows a user to upload an object to a specific path (directory) on a specific Web server
DELETE
allows a user, or an application, to delete an object on a Web server
(1) a cookie header line in the HTTP response message (2) a cookie header line in the HTTP request message (3) a cookie file kept on the user's end system and managed by the user's browser (4) a back-end database at the Web site.
cookie technology has four components:
port number 80
default port number for HTTP
non-persistent connections
each request/response pair sent over a separate TCP connection
World Wide Web
first Internet application that caught the general public's eye
www.someSchool.edu /someDepartment/picture.gif
http://www.someSchool.edu/someDepartment/picture.gif URL hostname: URL path name:
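The same split can be reproduced with Python's standard `urllib.parse` module. This sketch uses `netloc` rather than `hostname` because `hostname` would return the name in lowercase.

```python
from urllib.parse import urlsplit

url = "http://www.someSchool.edu/someDepartment/picture.gif"
parts = urlsplit(url)

hostname = parts.netloc   # the server that houses the object
path_name = parts.path    # the object's path name on that server
```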
Web browsers (such as Internet Explorer and Firefox)
implement the client side of HTTP
Web server
implement the server side of HTTP, house Web objects, each addressable by a URL
GET, POST, HEAD, PUT, and DELETE GET
The method field can take on several different values, including _______________. The great majority of HTTP request messages use the ____ method.
SMTP
principal application-layer protocol for Internet electronic mail
method field, URL field, and HTTP version field
request line three fields:
1) human readable 2) as many lines as needed 3) first line: request line 4) other lines: header lines
request message:
initial status line, six header lines, and the entity body entity body
response message has three sections: - meat of the message—it contains the requested object itself
access delay
the delay between the two routers at either end of the access link
total response time
time from the browser's request of an object until its receipt of the object
round-trip time (RTT)
time it takes for a small packet to travel from client to server and then back to the client
two RTTs plus the transmission time at the server of the HTML file.
total response time for HTTP request/response
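As a quick worked version of this answer, the sketch below adds two RTTs to the transmission time of the HTML file at the server. The function and parameter names, and the numbers in the usage note, are assumptions for illustration.

```python
def response_time(rtt, file_bits, server_rate_bps):
    """Two RTTs plus the transmission time of the file at the server.

    rtt: round-trip time in seconds
    file_bits: size of the HTML file in bits
    server_rate_bps: rate at which the server transmits the file (bits/s)
    """
    transmission_time = file_bits / server_rate_bps
    return 2 * rtt + transmission_time
```

With a hypothetical RTT of 0.1 seconds and an 8,000-bit file sent at 1 Mbps, the two RTTs contribute 0.2 seconds and the transmission time contributes 0.008 seconds.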
request messages and response message
two types of HTTP messages: