LS170 - Networking Foundations
What are two different ways to encode a space in a query parameter?
%20 or +
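As a small illustration, Python's standard `urllib.parse` module can produce both forms. The sample text `red shoes` is an assumption chosen for the example.

```python
from urllib.parse import quote, quote_plus, unquote, unquote_plus

text = "red shoes"  # hypothetical query parameter value

percent_form = quote(text)    # encodes the space as %20
plus_form = quote_plus(text)  # encodes the space as +

# Each form decodes back to the original text with its matching function.
decoded_percent = unquote(percent_form)
decoded_plus = unquote_plus(plus_form)

print(percent_form)
print(plus_form)
```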
Authentication
a process to verify the identity of a particular party in the message exchange.
Ethernet
a set of standards and protocols that enables communication between devices on a local network.
ephemeral port
a short-lived transport protocol port for Internet Protocol communications. Ephemeral ports are allocated automatically from a predefined range by the IP stack software.
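One way to see automatic allocation in practice, sketched with Python's `socket` module: binding to port 0 asks the operating system to pick a free port. The loopback address is an assumption for the example, and the exact range the port comes from depends on the operating system.

```python
import socket

# Binding to port 0 lets the operating system choose a free port for us.
with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
    sock.bind(("127.0.0.1", 0))
    host, port = sock.getsockname()

print(f"The OS allocated port {port}")
```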
cipher suite
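a suite, or set, of ciphers. In TLS, it is the agreed set of algorithms used by the client and server during the secure message exchange.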
WLAN or Wireless LAN
a local area network in which devices connect through a wireless hub or switch rather than being plugged in via a cable.
Data Packet - TTL
every packet has a Time to Live (TTL) value. This is to ensure that any packets which don't reach their destination for some reason aren't left to endlessly bounce around the network. The TTL indicates the maximum number of network 'hops' a packet can take before being dropped. At each hop, the router which processes and forwards the packet will decrement the TTL value by one.
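The decrement-and-drop behaviour can be sketched in a few lines of Python. The function name `send_packet` and the hop counts are hypothetical, chosen only to illustrate the idea.

```python
def send_packet(ttl, routers_on_path):
    """Simulate a packet crossing a number of routers with a starting TTL."""
    for router in range(1, routers_on_path + 1):
        ttl -= 1  # each router decrements the TTL by one
        if ttl <= 0:
            return f"dropped at router {router}"
    return f"delivered with TTL {ttl}"

print(send_packet(ttl=64, routers_on_path=5))
print(send_packet(ttl=3, routers_on_path=5))
```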
Content-Type header
indicates the media type of the resource being sent in the message
What character is used between multiple query parameters?
The ampersand (&)
What character indicates the beginning of a URL's query parameters?
The question mark ?
The key points to remember about the TLS Handshake process are that it is used to:
- Agree which version of TLS is to be used in establishing a secure connection. - Agree on the various algorithms that will be included in the cipher suite. - Enable the exchange of symmetric keys that will be used for message encryption.
The more important fields in the TCP Segment header in terms of implementing reliability?
- CHECKSUM - SEQUENCE NUMBER and ACKNOWLEDGEMENT NUMBER - WINDOW SIZE - various Flag fields
What are 3 common methods by which information is shipped via the internet?
- Electricity - Light - Radio Waves
HTTP GET requests primary concepts?
- GET requests are used to retrieve a resource, and most links are GETs. - The response from a GET request can be anything, but if it's HTML and that HTML references other resources, your browser will automatically request those referenced resources. A pure HTTP tool will not.
Summary of how TLS provides security.
- HTTP Requests and Responses are transferred in plain text; as such they are essentially insecure. - We can use the Transport Layer Security (TLS) Protocol to add security to HTTP communications. - TLS encryption allows us to encode messages so that they can only be read by those with an authorized means of decoding the message - TLS encryption uses a combination of Symmetric Key Encryption and Asymmetric Key Encryption. Encryption of the initial key exchange is performed asymmetrically, and subsequent communications are symmetrically encrypted. - The TLS Handshake is the process by which a client and a server exchange encryption keys. - The TLS Handshake must be performed before secure data exchange can begin; it involves several round-trips of latency and therefore has an impact on performance. - A cipher suite is the agreed set of algorithms used by the client and server during the secure message exchange. - TLS authentication is a means of verifying the identity of a participant in a message exchange. - TLS Authentication is implemented through the use of Digital Certificates. - Certificates are signed by a Certificate Authority, and work on the basis of a Chain of Trust which leads to one of a small group of highly trusted Root CAs. - Certificates are exchanged during the TLS Handshake process. - TLS Integrity provides a means of checking whether a message has been altered or interfered with in transit. - TLS Integrity is implemented through the use of a Message Authentication Code (MAC).
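To make the Message Authentication Code idea concrete, here is a minimal sketch using Python's `hmac` module. It is a generic HMAC example, not the exact construction TLS uses (that depends on the negotiated cipher suite), and the key and messages are hypothetical.

```python
import hashlib
import hmac

shared_key = b"hypothetical-shared-secret"  # assumed to be known to both parties

def make_tag(message: bytes) -> bytes:
    """Compute a MAC over the message using the shared key."""
    return hmac.new(shared_key, message, hashlib.sha256).digest()

def is_unaltered(message: bytes, tag: bytes) -> bool:
    """Recompute the MAC and compare it with the one that was received."""
    return hmac.compare_digest(make_tag(message), tag)

message = b"transfer 10 to alice"
tag = make_tag(message)

print(is_unaltered(message, tag))
print(is_unaltered(b"transfer 99 to alice", tag))
```

The receiver accepts the message only when the recomputed tag matches, so a change made in transit is detected.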
The Evolution of Network Technologies Summary
- HTTP has changed considerably over the years, and is continuing to change. - Many of the changes to HTTP are focused on improving performance in response to the ever increasing demands of modern networked applications. - Latency has a big impact on the performance of networked applications. As developers and software engineers we need to be aware of this impact, and try to mitigate against it through the use of various optimizations. - In building networked applications, there are tools and techniques available to us that work around or go beyond the limitations of basic HTTP request-response functionality. - For certain use cases a peer-to-peer architecture may be more appropriate than a client-server architecture.
Contrasting UDP against TCP.
- It provides no guarantee of message delivery - It provides no guarantee of message delivery order - It provides no built-in congestion avoidance or flow-control mechanisms - It provides no connection state tracking, since it is a connectionless protocol
The Transport Layer Launch School Summary
- Multiplexing and demultiplexing provide for the transmission of multiple signals over a single channel - Multiplexing is enabled through the use of network ports - Network sockets can be thought of as a combination of IP address and port number - At the implementation level, sockets can also be socket objects - The underlying network is inherently unreliable. If we want reliable data transport we need to implement a system of rules to enable it. - TCP is a connection-oriented protocol. It establishes a connection using the Three-way-handshake - TCP provides reliability through message acknowledgement and retransmission, and in-order delivery. - TCP also provides Flow Control and Congestion Avoidance - The main downsides of TCP are the latency overhead of establishing a connection, and the potential Head-of-line blocking as a result of in-order delivery. - UDP is a very simple protocol compared to TCP. It provides multiplexing, but no reliability, no in-order delivery, and no congestion or flow control. - UDP is connectionless, and so doesn't need to establish a connection before it starts sending data - Although it is unreliable, the advantage of UDP is speed and flexibility.
Countermeasures for Session Hijacking
- One popular way of solving session hijacking is by resetting sessions. With authentication systems, this means a successful login must render an old session id invalid and create a new one. - Another useful solution is by setting an expiration time on sessions. Sessions that do not expire give an attacker an infinite amount of time to pose as the real user. Expiring sessions after, say 30 minutes, gives the attacker a far narrower window to access the app. - Finally, as we have already covered, another approach is to use HTTPS across the entire app to minimize the chance that an attacker can get to the session id.
Potential solutions for cross-site scripting (XSS)
- One way to prevent this kind of attack is by making sure to always sanitize user input. Eliminate problematic input, such as script tags, or disallowing HTML and JavaScript input altogether in favor of a safer format, like Markdown. - The second way to guard against XSS is to escape all user input data when displaying it. If you do need to allow users to input HTML and JavaScript, then when you print it out, make sure to escape it so that the browser does not interpret it as code.
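The escaping approach can be sketched with Python's standard `html.escape`. The sample input is hypothetical, and a real application would normally rely on its template engine's escaping.

```python
import html

user_input = '<script>alert("hi")</script>'  # hypothetical malicious input

# Escaping turns markup characters into entities, so the browser
# displays the text instead of interpreting it as code.
safe_output = html.escape(user_input)

print(safe_output)
```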
Query strings are great to pass in additional information to the server, however, there are some limits to the use of query strings, what are they?
- Query strings have a maximum length. Therefore, if you have a lot of data to pass on, you will not be able to do so with query strings. - The name/value pairs used in query strings are visible in the URL. For this reason, passing sensitive information like username or password to the server in this manner is not recommended. - Space and special characters like & cannot be used with query strings. They must be URL encoded, which we'll talk about next.
The general process of certificate public key authentication
- The server sends its certificate, which includes its public key. - The server creates a 'signature' in the form of some data encrypted with the server's private key. - The signature is transmitted in a message along with the original data from which the signature was created. - On receipt of the message, the client decrypts the signature using the server's public key and compares the decrypted data to the original version. - If the two versions match then the encrypted version could only have been created by a party in possession of the private key.
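The sign-then-verify steps can be sketched with textbook RSA and deliberately tiny numbers. This is a toy for illustration only: the primes, exponent, and data value are assumptions, and real certificates use large keys, hashing, and padding.

```python
# Toy key pair built from tiny primes (never do this in practice).
p, q = 61, 53
n = p * q                          # modulus, shared by both keys
e = 17                             # public exponent
d = pow(e, -1, (p - 1) * (q - 1))  # private exponent (needs Python 3.8+)

def sign(data: int) -> int:
    """Server side: 'encrypt' the data with the private key."""
    return pow(data, d, n)

def recover(signature: int) -> int:
    """Client side: 'decrypt' the signature with the public key."""
    return pow(signature, e, n)

original = 65
signature = sign(original)

# The client compares the recovered value against the original data.
print(recover(signature) == original)
```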
Protocols for Different Aspects of Communication?
- To ensure that a particular message is understood, we need to order the words within that message in a certain order. We can think of this word order as part of the syntactical rules that govern the structure of the message. - A different aspect of communication might be the flow and order of all the messages in the conversation. For example, speaking in turn rather than both at the same time. We can maybe think of these as message transfer rules for how speech is conducted.
The 5 components of a URL?
- http: The scheme. It always comes before the colon and two forward slashes and tells the web client how to access the resource. In this case it tells the web client to use the Hypertext Transfer Protocol or HTTP to make a request. Other popular URL schemes are ftp, mailto or git. You may sometimes see this part of the URL referred to as the protocol, and there is a connection between the two things in that the scheme can indicate which protocol (or system of rules) should be used to access the resource; in the context of a URL however, the correct term for this component is the scheme. - www.example.com: The host. It tells the client where the resource is hosted or located. - :88 : The port or port number. It is only required if you want to use a port other than the default. - /home/: The path. It shows what local resource is being requested. This part of the URL is optional. Sometimes, the path can point to a specific resource on the host. For instance, www.example.com/home/index.html points to an HTML file located on the example.com server. - ?item=book : The query string, which is made up of query parameters. It is used to send data to the server. This part of the URL is also optional.
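The five components can be pulled apart with Python's `urllib.parse`. The full URL below is an assumption, assembled from the pieces listed in this card.

```python
from urllib.parse import parse_qs, urlsplit

url = "http://www.example.com:88/home/?item=book"  # assembled for illustration
parts = urlsplit(url)

components = {
    "scheme": parts.scheme,
    "host": parts.hostname,
    "port": parts.port,
    "path": parts.path,
    "query": parts.query,
}
params = parse_qs(parts.query)  # query parameters as a dict of lists

for name, value in components.items():
    print(f"{name}: {value}")
```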
Disadvantages of TCP
- latency overhead in establishing a TCP connection due to the handshake process - Head-of-Line (HOL) blocking
What are two main reasons there are so many different protocols for network communications.
1.) Different protocols were developed to address different aspects of network communication. 2.) Different protocols were developed to address the same aspect of network communication, but in a different way or for a specific use-case.
A step-by-step description of the TLS Handshake process might look something like this:
1.) The TLS Handshake begins with a ClientHello message which is sent immediately after the TCP ACK. Among other things, this message contains the maximum version of the TLS protocol that the client can support, and a list of Cipher Suites that the client is able to use (we'll discuss Cipher Suites a little later on). 2.) On receiving the ClientHello message, the server responds with a message of its own. This message includes a ServerHello, which sets the protocol version and Cipher Suite, as well as other related information. As part of this message the server also sends its certificate (which contains its public key), and a ServerHelloDone marker which indicates to the client that it has finished with this step of the handshake. 3.) Once the client has received the ServerHelloDone marker, it will initiate the key exchange process. It's this key exchange process that ultimately enables both the client and server to securely obtain a copy of the symmetric encryption key that will be used for the bulk of the secure message transfer between the two parties. The exact process for generating the symmetric keys will vary depending on which key exchange algorithm was selected as part of the Cipher Suite (e.g. RSA, Diffie-Hellman, etc). 4.) The server also sends a message with ChangeCipherSpec and Finished flags. The client and server can now begin secure communication using the symmetric key.
When a CA issues a certificate, what two important things do they do?
1.) Verifies that the party requesting the certificate is who they say they are. The way that this is done is up to the CA and will depend to an extent on the type of certificate being issued. In the case of a domain validated server certificate, for example, it can involve proving that you own the domain by uploading a specific file to a server that is accessible by the domain for which the certificate is being issued. 2.) Digitally signs the certificate being issued. This is often done by encrypting some data with the CA's own private key and using this encrypted data as a 'signature'. The unencrypted version of the data is also added to the certificate. In order to verify that the certificate was issued by the CA, the signature can be decrypted using the CA's public key and checked for a match against the unencrypted version.
HTTP Status: 500 Internal Server Error
A 500 status code says "there's something wrong on the server side". This is a generic error status code and the core problem can range from a mis-configured server setting to a misplaced comma in the application code. But whatever the problem, it's a server side issue. Someone with access to the server will have to debug and fix the problem, which is why sometimes you see a vague error message asking you to contact your System Administrator. In the wild, a 500 error can be shown in a variety of ways, just like a 404 page can.
TCP Segment header
A TCP Segment header contains a number of different fields. As we saw earlier in this Lesson, two of these fields -- Source Port and Destination Port -- provide the multiplexing capability of the protocol. Most of the other header fields are related to the way that TCP implements reliable data transfer. [Diagram: fields of a typical TCP segment header] Note: there are a number of different TCP variants, and the exact header fields they contain will vary slightly. Most of these variants will at a minimum contain the fields shown in the diagram.
Uniform Resource Locator (URL)
A URL is like that address or phone number you need in order to visit or communicate with your friend. A URL is the most frequently used part of the general concept of a Uniform Resource Identifier or URI, which specifies how resources are located.
In terms of computer networks, what is a protocol?
A set of rules governing the exchange or transmission of data.
One common way to store session information: in a browser cookie.
A cookie is a piece of data that's sent from the server and stored in the client during a request/response cycle. Cookies or HTTP cookies, are small files stored in the browser and contain the session information. By default, most browsers have cookies enabled. When you access any website for the first time, the server sends session information and sets it in your browser cookie on your local computer. Note that the actual session data is stored on the server. The client side cookie is compared with the server-side session data on each request to identify the current session. This way, when you visit the same website again, your session will be recognized because of the stored cookie with its associated information.
Secure HTTP (HTTPS)
A form of HTTP protocol that supports encryption of the communications. With HTTPS every request/response is encrypted before being transported on the network. This means if a malicious hacker sniffed out the HTTP traffic, the information would be encrypted and useless. A resource that's accessed by HTTPS will start with https:// instead of http://, and usually be displayed with a lock icon in most browsers. HTTPS sends messages through a cryptographic protocol called TLS for encryption. Earlier versions of HTTPS used SSL, or Secure Sockets Layer, until TLS was developed. These cryptographic protocols use certificates to communicate with remote servers and exchange security keys before data encryption happens. You can inspect these certificates by clicking on the padlock icon that appears before the https://.
Multiplexing
A form of transmission that allows multiple signals to travel simultaneously over one medium. In the context of a communication network, this is the idea of transmitting multiple signals over a single channel.
What is Round-trip Time (RTT)?
A latency calculation often used in networking is Round-trip Time (RTT). This is the length of time for a signal to be sent, added to the length of time for an acknowledgement or response to be received.
What is Last-mile latency?
A lot of the delays described above can take place within the parts of the network which are closest to the end points. This is often referred to as 'last-mile latency' and relates to the delays involved in getting the network signal from your ISP's network to your home or office network. The 'hops' within the core part of the network are longer, with fewer interruptions for transmission, processing, and queuing. At the network edge, there are more frequent and shorter hops as the data is directed down the network hierarchy to the appropriate sub-network. You can think of the network edge as the 'entry point' into a network like a home or corporate LAN.
Local Area Network (LAN)
A network in which the nodes are located within a small geographic area.
Digital certificate (TLS)
A notice that guarantees a user or a website is legitimate. A data file that identifies individuals or organizations online and is comparable to a digital signature.
What is the bandwidth bottleneck?
A point at which bandwidth changes from relatively high to relatively low.
Transport Layer Security (TLS)
A protocol based on SSL 3.0 that provides authentication and encryption, used by most servers for secure exchanges over the Internet. A cryptographic protocol that ensures data security and integrity over public networks, such as the Internet. A data encryption technology used for securing data transmitted over the Internet. TLS succeeded SSL.
Statelessness in regards to Protocols?
A protocol is said to be stateless when it's designed in such a way that each request/response pair is completely independent of the previous one. In other words, the server does not hang on to information between each request/response cycle. Each request made to a resource is treated as a brand new entity, and different requests are not aware of each other. This statelessness is what makes HTTP and the internet so distributed and difficult to control, but it's also the same ephemeral attribute that makes it difficult for web developers to build stateful web applications.
HTTP headers
A set of meta-data that we include in our request that give the server extra information about how to handle the request. HTTP headers allow the client and the server to send additional information during the request/response HTTP cycle. Headers are colon-separated name-value pairs that are sent in plain text.
HTTP request/response cycle
A single HTTP message exchange consists of a Request and a Response. The exchange generally takes place between a Client and a Server. The client sends a Request to the server and the server sends back a Response.
Network Congestion
A situation that occurs when there is more data being transmitted on the network than there is network capacity to process and transmit the data. You can perhaps think of it as similar to a gridlock of vehicles on a road network. Instead of things coming to a standstill however, the 'excess vehicles' are simply lost. In the last lesson we looked at IP packets moving across the networks in a series of 'hops'. At each hop, the packet needs to be processed: the router at that hop runs a checksum on the packet data; it also needs to check the destination address and work out how to route the packet to the next hop on its journey to that destination. All of this processing takes time, and a router can only process so much data at once. Routers use a 'buffer' to store data that is awaiting processing, but if there is more data to be processed than can fit in the buffer, the buffer over-flows and those data packets are dropped.
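The buffer overflow behaviour can be sketched as a fixed-size queue. The function name, buffer size, and packet numbers are hypothetical, and no packets are processed while the burst arrives, which is a simplification.

```python
from collections import deque

def enqueue_packets(packets, buffer_size):
    """Queue arriving packets until the buffer is full, then drop the rest."""
    buffer = deque()
    dropped = []
    for packet in packets:
        if len(buffer) < buffer_size:
            buffer.append(packet)   # room left: packet waits for processing
        else:
            dropped.append(packet)  # buffer full: packet is lost
    return list(buffer), dropped

queued, dropped = enqueue_packets(packets=range(5), buffer_size=3)
print("queued:", queued)
print("dropped:", dropped)
```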
Caesar Cipher
A substitution cipher that shifts characters a certain number of positions in the alphabet.
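A minimal Python sketch of the shift, assuming only ASCII letters are shifted and everything else is left unchanged:

```python
def caesar(text, shift):
    """Shift each ASCII letter by `shift` positions, wrapping around the alphabet."""
    result = []
    for ch in text:
        if ch.isascii() and ch.isalpha():
            base = ord("A") if ch.isupper() else ord("a")
            result.append(chr((ord(ch) - base + shift) % 26 + base))
        else:
            result.append(ch)  # leave spaces, digits, and punctuation alone
    return "".join(result)

encrypted = caesar("Hello, World!", 3)
decrypted = caesar(encrypted, -3)  # shifting back by the same amount decrypts
print(encrypted)
print(decrypted)
```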
How does a switch "learn" MAC addresses?
A switch directs the frames to the correct device by keeping and updating a record of the MAC addresses of the devices connected to it, and associating each address with the Ethernet port to which the device is connected on the switch. It keeps this data in a MAC Address Table, a simple representation of which might look something like this:
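A minimal Python sketch of such a table. It assumes the usual behaviour of recording the source address of each incoming frame and sending frames with an unknown destination to every other port. The class name, port numbers, and MAC addresses are hypothetical.

```python
class Switch:
    def __init__(self, ports):
        self.ports = list(ports)
        self.mac_table = {}  # MAC address -> port number

    def receive(self, in_port, src_mac, dst_mac):
        """Record the sender's port, then return the ports to forward to."""
        self.mac_table[src_mac] = in_port
        if dst_mac in self.mac_table:
            return [self.mac_table[dst_mac]]
        # Destination not learned yet: send to every port except the sender's.
        return [p for p in self.ports if p != in_port]

switch = Switch(ports=[1, 2, 3])
first = switch.receive(1, "00:40:96:9d:68:16", "d8:d3:85:eb:12:e3")
second = switch.receive(2, "d8:d3:85:eb:12:e3", "00:40:96:9d:68:16")
print(first, second)
print(switch.mac_table)
```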
What is the Internet? (Khan Academy Definition by Tess Winlock)
A tangible, physical system that was made to move information. This information is in the form of binary information. Everything on the internet, whether it's words, emails, images, cat videos, puppy videos, all come down to these ones and zeroes being delivered by electronic pulses, light beams, and radio waves.
Session ID
A unique number that a Web site's server assigns a specific user for the duration of that user's visit (session). The session ID can be stored as a cookie, form field, or URL.
Three primary server-side infrastructure pieces.
A web server, an application server and a data store.
AJAX and what it means in the HTTP request/response cycle.
AJAX is short for Asynchronous JavaScript and XML. Its main feature is that it allows browsers to issue requests and process responses without a full page refresh. When AJAX is used, all requests sent from the client are performed asynchronously (not at the same time), which just means that the page doesn't refresh. AJAX requests are just like normal requests: they are sent to the server with all the normal components of an HTTP request, and the server handles them like any other request. The only difference is that instead of the browser refreshing and processing the response, the response is processed by a callback function, which is usually some client-side JavaScript code.
Within our network models, what is a Protocol Data Unit (PDU)?
An amount or block of data transferred over a network. Different protocols or protocol layers refer to PDUs by different names. At the Link/Data Link layer, for example, a PDU is known as a frame. At the Internet/Network layer it is known as a packet. At the Transport layer, it is known as a segment (TCP) or datagram (UDP). In all cases, the basic concept is effectively the same; the PDU consists of a header, a data payload, and in some cases a trailer or footer.
The one wayness of Asymmetric Key Encryption.
Alice wants to receive encrypted messages. She generates a public key and a private key. She makes the public key available but keeps the private key to herself. Bob uses Alice's public key to encrypt a message and send it to Alice. Alice decrypts Bob's message using her private key. An important thing to note here is that this encryption is primarily intended to work in one direction. Bob can send Alice messages encrypted with the public key which she can then decrypt with the private one. The same key pair would not be used in the other direction for secure communication, since anyone with access to the public key can decrypt the message.
Different Protocols for the Same Aspect of Communication?
All of these situations are concerned with the same aspect of communication, the flow and order of the message transfer, but use different sets of rules, or protocols.
Routing and Routing Tables
All routers on the network store a local routing table. When an IP packet is received by a router, the router examines the destination IP address and matches it against a list of network addresses in its routing table. As explained above, these network addresses define a range of addresses within a particular subnet. The matching network address will determine where in the network hierarchy the subnet exists. This will then be used to select the best route for the IP packet to travel.
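The lookup can be sketched with Python's `ipaddress` module. The table entries and next-hop names are hypothetical, and picking the most specific matching network (longest prefix) is an assumption about how the best route is chosen.

```python
import ipaddress

# Hypothetical routing table: network address range -> next hop.
routing_table = {
    ipaddress.ip_network("10.0.0.0/8"): "router A",
    ipaddress.ip_network("10.1.0.0/16"): "router B",
    ipaddress.ip_network("0.0.0.0/0"): "default gateway",
}

def next_hop(destination):
    """Match the destination against the table and pick the most specific network."""
    address = ipaddress.ip_address(destination)
    matches = [network for network in routing_table if address in network]
    best = max(matches, key=lambda network: network.prefixlen)
    return routing_table[best]

print(next_hop("10.1.2.3"))
print(next_hop("10.2.0.1"))
print(next_hop("8.8.8.8"))
```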
Asymmetric Key Encryption
Also known as public key encryption, uses a pair of keys: a public key, and a private key. Unlike the symmetric system where the same key is used to encrypt and decrypt messages, in the asymmetric system the keys in the pair are non-identical: the public key is used to encrypt and the private key to decrypt. The important thing to understand is that messages encrypted with the public key can only be decrypted with the private key. The public key is made openly available but the private key is kept in the sole possession of the message receiver.
Network Sockets at a conceptual level?
An abstraction for an endpoint used for inter-process communication. To allow for many processes within a single Host to use TCP communication facilities simultaneously, the TCP provides a set of addresses or ports within each host. Concatenated with the network and host addresses from the internet communication layer, this forms a socket.
Session Hijacking
An attack in which an attacker attempts to impersonate the user by using their session token. We also know that a session id serves as that unique token used to identify each session. Usually, the session id is implemented as a random string and comes in the form of a cookie stored on the computer. With the session id in place on the client side, every time a request is sent to the server this data is added and used to identify the session. In fact, this is what many web applications with authentication systems do. When a user's username and password match, the session id is stored on their browser so that on the next request they won't have to re-authenticate. Unfortunately, if an attacker gets a hold of the session id, both the attacker and the user now share the same session and both can access the web application. In session hijacking, the attacker never needs to know the username or password, and the user won't even know that an attacker is accessing their session.
Cross-Site Scripting (XSS)
An attack that injects scripts into a Web application server to direct attacks at clients. A vulnerability in dynamic web pages that allows an attacker to bypass a browser's security mechanisms and instruct the victim's browser to execute code, thinking it came from the desired website. This type of attack happens when you allow users to input HTML or JavaScript that ends up being displayed by the site directly.
Why understand the history and evolution of the HTTP?
An awareness of the history of HTTP, and the changes it has undergone over time, can provide us with more insight into the work-arounds used to deal with some of its limitations. This in turn can better enable us to make informed decisions when building networked applications.
Same Origin Policy
An important concept that permits unrestricted interaction between resources originating from the same origin, but restricts certain interactions between resources originating from different origins. It is an important guard against session hijacking attacks and serves as a cornerstone of web application security. What we mean by origin here is the combination of a URL's scheme, hostname, and port. So http://mysite.com/doc1 would be considered to have the same origin as http://mysite.com/doc2, but a different origin to https://mysite.com/doc2 (different scheme), http://mysite.com:4000/doc2 (different port), and http://anothersite.com/doc2 (different host). The policy doesn't restrict all cross-origin requests. Requests such as linking, redirects, or form submissions to different origins are typically allowed. Also typically allowed is the embedding of resources from other origins, such as scripts, CSS stylesheets, images and other media, fonts, and iframes.
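The origin comparison can be sketched in Python using the example URLs from this card. The helper names and the default-port table are assumptions for the sketch, and a browser's real check involves more detail.

```python
from urllib.parse import urlsplit

DEFAULT_PORTS = {"http": 80, "https": 443}  # used when the URL omits the port

def origin(url):
    """Return the (scheme, hostname, port) triple that defines an origin."""
    parts = urlsplit(url)
    port = parts.port if parts.port is not None else DEFAULT_PORTS.get(parts.scheme)
    return (parts.scheme, parts.hostname, port)

def same_origin(url_a, url_b):
    return origin(url_a) == origin(url_b)

base = "http://mysite.com/doc1"
print(same_origin(base, "http://mysite.com/doc2"))
print(same_origin(base, "https://mysite.com/doc2"))
print(same_origin(base, "http://mysite.com:4000/doc2"))
print(same_origin(base, "http://anothersite.com/doc2"))
```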
URLs with Query Strings/Parameters
Are used to pass additional data to the server during an HTTP Request. They take the form of name/value pairs separated by an = sign. Multiple name/value pairs are separated by an & sign. The start of the query string is indicated by a ?. Because query strings are passed in through the URL, they are most commonly used in HTTP GET requests. We'll talk about the different HTTP requests later in the book, but for now just know that whenever you type in a URL into the address bar of your browser, you're issuing HTTP GET requests. Most links also issue HTTP GET requests, though there are some minor exceptions.
Is Client-Server the only network paradigm?
As an architecture it is very prevalent on the web, and it might be easy to think that this is the only architecture available to us. Being aware that other network paradigms exist can be useful when making high-level design decisions about networked applications.
As a software engineer, what can you do about the Limitations of the Physical Networks?
As developers and software engineers, there's really not a lot we can do about the limitations of the physical network itself. If we want to improve the performance of the applications we build, then our influence is limited to the implementation of the application in terms of how we use the higher-level protocols. Having an understanding of these physical limitations can impact the way we think about those higher-level protocols, and therefore the decisions we make about how we use them within our applications.
TCP Congestion Avoidance
As we've already seen, TCP retransmits lost data. If lots of data is lost that means lots of retransmitted data, which is inefficient. Ideally we want to keep retransmission to a minimum. TCP actually uses data loss as a feedback mechanism to detect, and avoid, network congestion; if lots of retransmissions are occurring, TCP takes this as a sign that the network is congested and reduces the size of the transmission window. There are various different approaches and algorithms for determining the size of the initial transmission window, and how much it should be reduced or increased by depending on network conditions. The exact algorithm or approach used will depend on which variant of TCP is in operation.
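As one possible illustration, the sketch below grows the window by one when no loss is seen and halves it when loss is detected. This additive-increase, multiplicative-decrease rule is an assumption chosen for the example, and the actual behaviour depends on the TCP variant in use. All names and numbers are hypothetical.

```python
def adjust_window(window, loss_detected, minimum=1):
    """Shrink the window on loss, otherwise grow it slowly."""
    if loss_detected:
        return max(minimum, window // 2)  # loss is read as a sign of congestion
    return window + 1                     # no loss: probe for more capacity

window = 10
history = []
for loss in [False, False, True, False]:
    window = adjust_window(window, loss)
    history.append(window)

print(history)
```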
Interframe Gap
As well as using the Preamble and SFD to prepare a receiving device to process the frame data, Ethernet also specifies an interframe gap (IFG). This gap is a brief pause between the transmission of each frame, which permits the receiver to prepare to receive the next frame. The length of this gap varies according to the capability of the Ethernet connection. For example, for 100 Mbps Ethernet the gap is 0.96 microseconds (or just under one millionth of a second). This Interframe gap contributes to the Transmission Delay element of latency we looked at in the previous assignment.
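The 0.96 microsecond figure can be checked with a little arithmetic, assuming the standard Ethernet interframe gap of 96 bit times (a detail not stated in this card):

```python
IFG_BITS = 96  # assumed: the interframe gap is 96 bit times

def interframe_gap_microseconds(bits_per_second):
    """Time taken to 'send' 96 bits of silence at the given line rate."""
    return IFG_BITS / bits_per_second * 1_000_000

gap_100mbps = interframe_gap_microseconds(100_000_000)
gap_1gbps = interframe_gap_microseconds(1_000_000_000)

print(f"100 Mbps: {gap_100mbps:.2f} microseconds")
print(f"1 Gbps: {gap_1gbps:.3f} microseconds")
```

The gap shrinks as the line rate rises, which matches the statement that its length varies with the capability of the connection.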
Network Sockets at an implementation level?
At an implementation level it can be used to refer to different specific things: - UNIX socket: a mechanism for inter-process communication between local processes running on the same machine. - Internet sockets (such as a TCP/IP socket): a mechanism for inter-process communication between networked processes (usually on different machines).
Is bandwidth relatively the same across networks?
Bandwidth varies across the network, and isn't going to be at a constant level between the start point and the end point of our data's journey. For example, the capacity of the core network is going to be much higher than the part of the network infrastructure that ultimately connects to your home or office building. The bandwidth that a connection receives is the lowest amount at a particular point in the overall connection.
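In other words, the slowest link sets the limit. A minimal sketch with made-up link names and bandwidth figures:

```python
# Hypothetical bandwidth of each part of the path, in Mbps.
links_mbps = {
    "core network": 10_000,
    "ISP network": 1_000,
    "home connection": 50,
}

# The connection can only go as fast as its slowest link.
bottleneck = min(links_mbps, key=links_mbps.get)
effective_mbps = links_mbps[bottleneck]

print(f"Bottleneck: {bottleneck} at {effective_mbps} Mbps")
```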
The Application Layer
Both the TCP/IP model and the OSI model define an Application layer as the topmost layer in their respective layered systems. Something to be clear about here is that the application layer is not the application itself, but rather a set of protocols which provide communication services to applications. One thing both models have in common however is that the protocols which exist at the Application layer are the ones with which the application most directly interacts. That's not to say that networked applications are limited to interacting with only Application layer protocols. You can see many applications interacting with Transport layer protocols by, for example, opening a TCP socket. However, it is much less common to build applications which interact directly with protocols below the Transport layer.
Browser Optimizations
Browsers have their own optimizations in order to overcome the performance challenges of today's web. While every browser has specific optimizations, there are two broad types: Document-Aware Optimizations and Speculative Optimizations. These optimizations are the browsers' attempt to pre-empt some work before some user action prompts a request. By doing so it can complete the loading more quickly and efficiently, and thus create a better user experience.
What is a Content-Length header?
Indicates the size of the entity-body, in bytes, sent to the recipient. This helps the recipient determine where the HTTP message ends.
What is a network hub?
Central connection point for all devices in network. A hub is a basic piece of network hardware that replicates a message and forwards it to all of the devices on the network. Sending every frame to every device on the network isn't very efficient, especially for large networks. These days it is rare that you'll find a network that uses a hub; most modern networks instead use switches.
Does each certificate need to be signed by both an Intermediate CA and a Root CA in order to be valid?
Certificates are generally signed by one CA. An Intermediate CA will sign an end-user's certificate, and a Root CA will sign the Intermediate CA's certificate. This creates the 'chain' in the chain of trust.
When does URL Encoding need to happen?
Characters must be encoded if: 1.) They have no corresponding character within the standard ASCII character set. 2.) The use of the character is unsafe because it may be misinterpreted, or even possibly modified by some systems. For example % is unsafe because it can be used for encoding other characters. Other unsafe characters include spaces, quotation marks, the # character, < and >, { and }, [ and ], and ~, among others. 3.) The character is reserved for special use within the URL scheme. Some characters are reserved for a special meaning; their presence in a URL serves a specific purpose. Characters such as /, ?, :, @, and & are all reserved and must be encoded when they are not being used for their reserved purpose. For example & is reserved for use as a query string delimiter. : is also reserved to delimit host/port components and user/password.
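As a hedged illustration of these rules, Python's standard urllib.parse module can percent-encode a value before it is placed in a URL. The sample string is made up; quote() with safe="" encodes every reserved character, while quote_plus() uses the form-style convention of + for spaces.

```python
from urllib.parse import quote, quote_plus, unquote

raw = "rock & roll?"  # made-up value containing a space and two reserved characters

# Percent-encode everything that is not an unreserved character.
encoded = quote(raw, safe="")
# Form-style encoding: spaces become "+" instead of "%20".
form_encoded = quote_plus(raw)

print(encoded)           # rock%20%26%20roll%3F
print(form_encoded)      # rock+%26+roll%3F
print(unquote(encoded))  # rock & roll?
```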
Transport Layer Security (TLS)
Cryptographic protocols designed to provide communications security over a computer network.
How is data encapsulated in network communication?
Data is encapsulated into a Protocol Data Unit, creating separation between protocols operating at different layers.
What is Processing delay?
Data travelling across the physical network doesn't directly cross from one link to another, but is processed in various ways. We'll look at what this processing entails in more detail in a later assignment. This is probably stretching our driving analogy somewhat, but imagine if at every interchange there was some sort of checkpoint; the amount of time it takes to be processed at the checkpoint is the processing delay.
Demultiplexing
Demultiplexing is the process of reconverting a signal containing multiple analog or digital signal streams back into the original separate and unrelated signals. Although demultiplexing is the reverse of the multiplexing process, it is not the opposite of multiplexing. The opposite of multiplexing is inverse multiplexing (iMuxing), which breaks one data stream into several related data streams. Thus, the difference between demultiplexing and inverse multiplexing is that the output streams of demultiplexing are unrelated, while the output streams of inverse multiplexing are related.
What do we mean when we say that HTTP is a "stateless" protocol?
Each request/response cycle is independent of one another. Each request has no direct bearing on the next request nor should the previous request affect the current. Each request should contain all the information necessary for the request to be fulfilled.
Symmetric Key Encryption
Encryption system in which a single key is used for both encryption and decryption.
The three important security services that are provided by TLS?
Encryption, Authentication, and Integrity. Each of these services is important in its own right, but when combined they provide for very secure message exchange over what is essentially an unsecure channel. It isn't mandatory for an application which uses TLS that all three of these services are used simultaneously. For example, you could design your application to accept encrypted messages from a sender without authenticating who they are. In practice however, all three services are generally used together to provide the most secure connection possible.
What Protocol Data Unit does Ethernet use?
Ethernet Frames are a Protocol Data Unit, and encapsulate data from the Internet/ Network layer above. The Link/ Data Link layer is the lowest layer at which encapsulation takes place. At the physical layer, the data is essentially a stream of bits in one form or another without any logical structure. An Ethernet Frame adds logical structure to this binary data. The data in the frame is still in the form of bits, but the structure defines which bits are actually the data payload, and which are metadata to be used in the process of transporting the frame. An Ethernet-compliant network device is able to identify the different parts of a frame due to the fact that different 'fields' of data have specific lengths in bytes and appear in a set order.
The most commonly used protocol at the Link/Data Link Layer?
Ethernet protocol. Two of the most important aspects of Ethernet are framing and addressing.
A major part of the implementation of the Transmission Control Protocol (TCP) is finding the balance between......?
Finding a balance between reliability and performance
The consequences of using sessions to simulate statefulness.
First, every request must be inspected to see if it contains a session identifier. Second, if this request does, in fact, contain a session id, the server must check to ensure that this session id is still valid. The server needs to maintain some rules with regards to how to handle session expiration and also decide how to store its session data. Third, the server needs to retrieve the session data based on the session id. And finally, the server needs to recreate the application state (e.g., the HTML for a web request) from the session data and send it back to the client as the response. This means that the server has to work very hard to simulate a stateful experience, and every request still gets its own response, even if most of that response is identical to the previous response.
TCP Flow Control
Flow control is a mechanism to prevent the sender from overwhelming the receiver with too much data at once. The receiver will only be able to process a certain amount of data in a particular time-frame. Data awaiting processing is stored in a 'buffer'. The buffer size will depend on the amount of memory allocated according to the configuration of the OS and the physical resources available. Each side of a connection can let the other side know the amount of data that it is willing to accept via the WINDOW field of the TCP header. This number is dynamic, and can change during the course of a connection. If the receiver's buffer is getting full it can set a lower amount in the WINDOW field of a Segment it sends to the sender, the sender can then reduce the amount of data it sends accordingly. Although flow control prevents the sender from overwhelming the receiver, it doesn't prevent either the sender or receiver from overwhelming the underlying network. For that task we need a different mechanism: Congestion Avoidance.
What determines whether a request should use GET or POST as its HTTP method?
GET requests should only retrieve content from the server. They can generally be thought of as "read only" operations, however, there are some subtle exceptions to this rule. For example, consider a webpage that tracks how many times it is viewed. GET is still appropriate since the main content of the page doesn't change. POST requests involve changing values that are stored on the server. Most HTML forms that submit their values to the server will use POST. Search forms are a noticeable exception to this rule: they often use GET since they are not changing any data on the server, only viewing it.
HTTP Request Methods
GET, HEAD, POST, OPTIONS, PUT, DELETE, TRACE, CONNECT. You can think of this as the verb that tells the server what action to perform on a resource. The two most common HTTP request methods you'll see are GET and POST. When you think about retrieving information, think GET, which is the most used HTTP request method. Every request gets a response, even if the response is an error -- that's still a response. (That's not 100% technically true as some requests can time out, but we'll set those rare cases aside for now.)
How do protocols work in groups?
Groups of protocols work in a layered system. Protocols at one layer provide services to the layer above.
The importance of HTTP?
HTTP is at the core of what the web is about, and also at the core of dynamic web applications. Understanding HTTP is central to understanding how modern web applications work and how they're built.
Head-of-the-Line (HOL) blocking
Head-of-line blocking is a general networking concept, and isn't specific to TCP. In general terms it relates to how issues in delivering or processing one message in a sequence of messages can delay or 'block' the delivery or processing of the subsequent messages in the sequence.
Data Encapsulation
Hiding data from one layer by encapsulating it within a data unit of the layer below. This is the means by which protocols at different network layers can work together. Encapsulation is implemented through the use of Protocol Data Units (PDUs). The PDU of a protocol at one layer, becomes the data payload of the PDU of a protocol at a lower layer.
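The nesting described above can be sketched with a small data structure in which each PDU carries the PDU of the layer above as its payload. This is only an illustrative model: the PDU class, the header fields, and the addresses are hypothetical and far simpler than real protocol headers.

```python
from dataclasses import dataclass


@dataclass
class PDU:
    """Hypothetical, simplified Protocol Data Unit: metadata plus a payload."""
    protocol: str
    header: dict
    payload: object  # raw bytes, or the PDU of the layer above


# Application data becomes the payload of a TCP segment, which becomes the
# payload of an IP packet, which becomes the payload of an Ethernet frame.
http_message = b"GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"
segment = PDU("TCP", {"src_port": 49152, "dst_port": 80}, http_message)
packet = PDU("IP", {"src": "192.0.2.1", "dst": "192.0.2.2"}, segment)
frame = PDU("Ethernet", {"dst_mac": "00:40:96:9d:68:0a"}, packet)


def unwrap(pdu):
    """Peel off each layer in turn, returning the layer names and the raw data."""
    layers = []
    while isinstance(pdu, PDU):
        layers.append(pdu.protocol)
        pdu = pdu.payload
    return layers, pdu


print(unwrap(frame)[0])  # ['Ethernet', 'IP', 'TCP']
```

Note that each layer only reads its own header; the payload is opaque to it, which is what keeps the layers separate.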
FOCUS on this -> Be aware of the limitations of the physical network.
Higher-level protocols, such as TCP and HTTP, rely on the underlying network infrastructure in order to function. As such, they are bound by the limitations of that infrastructure, such as network bandwidth and latency. Understanding these limitations provides important context when learning about those protocols.
IP Addresses (IPv4)
IPv4 addresses are 32 bits in length and are divided into four sections of eight bits each. When converted from binary to decimal, each of those sections provides a range of numbers between 0 and 255 (since 2 to the power of 8 equals 256). For example 109.156.106.57.
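To make the relationship between the dotted-decimal form and the underlying 32 bits concrete, here is a short sketch using Python's standard ipaddress module on the example address above. The variable names are illustrative.

```python
import ipaddress

addr = ipaddress.IPv4Address("109.156.106.57")

# The whole address is a single 32-bit integer.
as_int = int(addr)

# Each of the four sections is one 8-bit value in the range 0-255.
octets = [int(part) for part in str(addr).split(".")]
binary = ".".join(f"{octet:08b}" for octet in octets)

print(octets)               # [109, 156, 106, 57]
print(binary)               # 01101101.10011100.01101010.00111001
print(as_int.bit_length())  # 31 here, and never more than 32
```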
IPv6 vs IPv4
IPv4: The most prevalent IP address format, which consists of four byte values (32 bits). This can yield 2^32 (roughly 4.3 billion) possible addresses. IPv6: In contrast to IPv4's 32-bit address space, IPv6 uses a 128-bit address space providing 2^128 (roughly 3.4 x 10^38) possible addresses. This comfortably meets the demand of a rapidly growing number of Internet-connected devices. As well as a difference in address structure, IPv6 has some other differences with IPv4 such as a different header structure for the packet and a lack of error checking (it leaves this to the Link Layer checksum).
TLS Certificates Contain
Identity of issuer, who produced certificate. Identity of subject. Public key of subject. Range of dates for which certificate is valid. Digital signature from issuer
UDP Datagram Header Fields
If we examine the header of a UDP Datagram, we can see that it's really quite simple. The header only has four fields: Source Port, Destination Port, UDP Length (the length, in bytes, of the Datagram, including any encapsulated data), and a Checksum field to provide for error detection. That's it. Even the Checksum field is optional if using IPv4 at the Network layer (if using IPv6, you need to include a Checksum in the Datagram since IPv6 packets don't include one themselves). Through the use of the Source and Destination Port numbers, UDP provides multiplexing in the same way that TCP does. Unlike TCP however, it doesn't do anything to resolve the inherent unreliability of the layers below it.
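The four header fields can be sketched with Python's struct module. Each field is 16 bits wide, so the header is 8 bytes, and the Length field counts the header plus the payload in bytes. This is a simplified illustration: the function names are made up, and the checksum is left at 0 (meaning "not computed", which is only permitted over IPv4) rather than actually calculated.

```python
import struct

UDP_HEADER_FORMAT = "!HHHH"  # four unsigned 16-bit fields, network byte order
UDP_HEADER_SIZE = struct.calcsize(UDP_HEADER_FORMAT)  # 8 bytes


def build_datagram(src_port: int, dst_port: int, payload: bytes, checksum: int = 0) -> bytes:
    """Prepend a UDP-style header to the payload (checksum not computed here)."""
    length = UDP_HEADER_SIZE + len(payload)  # length in bytes, header included
    header = struct.pack(UDP_HEADER_FORMAT, src_port, dst_port, length, checksum)
    return header + payload


def parse_datagram(datagram: bytes) -> dict:
    """Split a datagram back into its four header fields and its payload."""
    src_port, dst_port, length, checksum = struct.unpack(
        UDP_HEADER_FORMAT, datagram[:UDP_HEADER_SIZE]
    )
    return {
        "src_port": src_port,
        "dst_port": dst_port,
        "length": length,
        "checksum": checksum,
        "payload": datagram[UDP_HEADER_SIZE:length],
    }


datagram = build_datagram(49152, 53, b"hello")
print(parse_datagram(datagram))
```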
Certificate Authority (CA)
If you are presented with a piece of identification, you are much more likely to accept it as genuine if it has been issued by a trustworthy source. When it comes to digital certificates, the trustworthy sources are called Certificate Authorities (CAs). A CA is an entity trusted by one or more users as an authority in a network that issues, revokes, and manages digital certificates.
Postal Service network analogy (Internet Protocol and Transport Layer Protocol)
Imagine an apartment building. It has numerous apartments, but the building itself has a single street address. The postal worker delivers a bunch of mail to the building. The concierge of the building then sorts the mail and posts the individual letters to the appropriate mailbox in the foyer, each mailbox being identified by a specific apartment number. In this context we can think of the street address of the apartment building address as the IP address and the individual apartment numbers as port numbers. Furthermore, the postal service can be thought of as the Internet Protocol, and the building concierge as a Transport layer protocol (e.g. TCP or UDP).
Simplified analogy of connection-oriented networks communication.
Imagine that you could somehow replicate yourself so that there was one instance of you for each conversation you had to manage. Each instance of you could give their undivided attention to the specific conversation that they were dealing with, essentially creating a connection between you and the other participant in that conversation. Of course humans are not able to replicate themselves, but in code we can instantiate multiple objects of a certain type. By instantiating multiple socket objects, we can implement connection-oriented network communication between applications. In a connection-oriented system you could have a socket object defined by the host IP and process port, just as in the connectionless system, also using a listen() method to wait for incoming messages; the difference in implementation would be in what happens when a message arrives. At this point we could instantiate a new socket object; this new socket object wouldn't just be defined by the local IP and port number, but also by the IP and port of the process/host which sent the message. This new socket object would then listen specifically for messages where all four pieces of information matched (source port, source IP, destination port, destination IP). The combination of these four pieces of information is commonly referred to as a four-tuple. Any messages not matching this four-tuple would still be picked up by the original socket, which would then instantiate another socket object for the new connection.
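The four-tuple idea can be sketched without any real networking. In the hypothetical model below, a Listener stands in for the original listening socket and a dictionary entry stands in for each per-connection socket object; real socket APIs differ in detail, so treat this purely as a mental model.

```python
class Listener:
    """Hypothetical stand-in for a listening socket bound to a local IP and port."""

    def __init__(self, local_ip: str, local_port: int):
        self.local = (local_ip, local_port)
        # One entry per connection, keyed by the four-tuple.
        self.connections = {}

    def deliver(self, src_ip, src_port, dst_ip, dst_port, data):
        """Route an incoming message to the connection matching its four-tuple."""
        if (dst_ip, dst_port) != self.local:
            return None  # not addressed to this listener
        four_tuple = (src_ip, src_port, dst_ip, dst_port)
        if four_tuple not in self.connections:
            # No match: the listener creates a new "socket" for this peer.
            self.connections[four_tuple] = []
        self.connections[four_tuple].append(data)
        return four_tuple


server = Listener("192.0.2.10", 80)
server.deliver("198.51.100.7", 50001, "192.0.2.10", 80, "hello from A")
server.deliver("203.0.113.9", 50001, "192.0.2.10", 80, "hello from B")
server.deliver("198.51.100.7", 50001, "192.0.2.10", 80, "A again")

print(len(server.connections))  # 2 separate connections on the same local port
```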
Simplified analogy of connectionless networks communication.
Imagine you were holding separate conversations simultaneously with five different people. In this scenario the other parties in the conversations don't check to see if you were paying attention, they just speak whenever they are ready to do so. It would be pretty difficult to keep track of all these conversations; you could easily miss important parts of individual conversations or get confused about who said what and in what order. In a connectionless system we could have one socket object defined by the IP address of the host machine and the port assigned to a particular process running on that machine. That object could call a listen() method which would allow it to wait for incoming messages directed to that particular ip/port pair. Such messages could potentially come from any source, at any time, and in any order, but that isn't a concern in a connectionless system -- it would simply process any incoming messages as they arrived and send any responses as necessary.
An advantage of having a dedicated connection like in the connection oriented model?
Implementing communication in this way effectively creates a dedicated virtual connection for communication between a specific process running on one host and a specific process running on another host. The advantage of having a dedicated connection like this is that it more easily allows you to put in place rules for managing the communication such as the order of messages, acknowledgements that messages had been received, retransmission of messages that weren't received, and so on. The purpose of these types of additional communication rules is to add more reliability to the communication.
The fundamental elements required for reliable data transfer.
- In-order delivery: data is received in the order that it was sent - Error detection: corrupt data is identified using a checksum - Handling data loss: missing data is retransmitted based on acknowledgements and timeouts - Handling duplication: duplicate data is eliminated through the use of sequence numbers
Routers
In order to enable communication between networks, we need to add routers into the picture. Routers are network devices that can route network traffic to other networks. Within a Local Area Network, they effectively act as gateways into and out of the network.
TCP involves a lot of overhead in terms of establishing connections, and providing reliability through the retransmission of lost data. What two mechanisms are available with TCP to mitigate this?
In order to mitigate against this additional overhead, it is important that the actual functioning of data transfer when using the protocol occurs as efficiently as possible. In order to help facilitate efficient data transfer once a connection is established, TCP provides mechanisms for flow control and congestion avoidance.
The downside of Symmetric Key Encryption
In order to work securely, this system relies on the sender and receiver both having access to the key and no one else being able to access it. This raises the question: how do sender and receiver exchange encryption keys in the first place? The most secure way to exchange keys would be in person. If we want to use symmetric key encryption over the internet however, this isn't a feasible option. We also can't just send the key in a readable format to the other party in our message exchange; if the key was intercepted by a third-party, they could then use it to decrypt any subsequent messages between the sender and receiver. What we need is a mechanism whereby we can encrypt the encryption key itself, so that even if it is intercepted it can't be used.
Network Ports
In simple terms a port is an identifier for a specific process running on a host. This identifier is an integer in the range 0-65535. Sections of this range are reserved for specific purposes: - 0-1023 are well-known ports. These are assigned to processes that provide commonly used network services. For example HTTP is port 80, FTP is port 20 and 21, SMTP is port 25, and so on. - 1024-49151 are registered ports. They are assigned as requested by private entities. For example, companies such as Microsoft, IBM, and Cisco have ports assigned that they use to provide specific services. On some operating systems, ports in this range are also used for allocation as ephemeral ports on the client side. - 49152-65535 are dynamic ports (sometimes known as private ports). Ports in this range cannot be registered for a specific use. They can be used for customized services or for allocation as ephemeral ports.
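The three ranges can be expressed as a small lookup. This is just a sketch of the ranges listed above; the function name is illustrative.

```python
def classify_port(port: int) -> str:
    """Return which range a port number falls into."""
    if not 0 <= port <= 65535:
        raise ValueError(f"port out of range: {port}")
    if port <= 1023:
        return "well-known"
    if port <= 49151:
        return "registered"
    return "dynamic"


for port in (80, 25, 3000, 49152):
    print(port, classify_port(port))
```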
Do you need to know all the Differences Between Ethernet Standards?
In terms of understanding the general function of an Ethernet frame, these differences between standards don't matter too much. The main elements to focus on are the Data Payload field being used as an encapsulation mechanism for the layer above, and the MAC Address fields being used to direct the frame between network devices. These particular fields exist across all the different Ethernet standards.
The Internet/ Network Layer
In the OSI model, the Network layer is layer 3 (between the Data Link layer and the Transport layer). In the Internet Protocol Suite, the Internet layer is layer 2 (between the Link layer and the Transport layer). Within both models, the primary function of protocols at this layer is to facilitate communication between hosts (e.g. computers) on different networks. The Internet Protocol (IP) is the predominant protocol used at this layer for inter-network communication. There are two versions of IP currently in use: IPv4 and IPv6. We'll look at some of the differences between these protocol versions a bit later on. For a general overview of how IP works, we'll mostly be looking at IPv4. Although there are differences between the versions, the primary features of both versions are the same: - Routing capability via IP addressing - Encapsulation of data into packets
Network Sockets in Launch School
In this course, what we're primarily interested in is the concept of a socket and to a lesser extent the application of this concept for inter-network communication between networked applications, i.e. Internet Sockets. We're not going to look too much into the implementation detail of how Internet Sockets work. One important thing to be clear on though is that there is a distinction between the concept of a network socket and its implementation in code.
What do 5xx level response codes indicate?
Indicates an error or issue on the server side. For example, with a 505 (HTTP Version Not Supported) response the request was valid, and the server was able to understand it, but it replies to the client that it cannot communicate using the version of HTTP that the client wants to use.
What do 4xx level response codes indicate?
Indicates an error or issue on the client side, i.e. with the request. A 400 error is a general indication that there is a problem with the structure of the request; in other words the server did not understand the request due to its syntax. This is basically the server saying to the client 'I don't understand what you just asked me'.
Packet Sniffing
Inspecting information packets as they travel the Internet and other networks
Uniform Resource Identifier (URI)
Is a string of characters which identifies a particular resource within an information space. It is part of a system by which resources should be uniformly addressed on the Web. The Web is an information space. Human beings have a lot of mental machinery for manipulating, imagining, and finding their way in spaces. URIs are the points in that space. The terms URI and URL (Uniform Resource Locator) are often used interchangeably. URL is a subset of URI that includes the network location of the resource.
Web Server
Is typically a server that responds to requests for static assets: files, images, css, javascript, etc. These requests don't require any data processing, so can be handled by a simple web server.
Application Server
Is typically where application or business logic resides, and is where more complicated requests are handled. This is where your server-side code lives when deployed. The application server will often consult a persistent data store, like a relational database, to retrieve or create data
What is Cross-origin resource sharing or CORS and why was it developed?
It is a mechanism that allows interactions that would normally be restricted cross-origin to take place. It works by adding new HTTP headers, which allow servers to serve resources cross-origin to certain specified origins. While secure, same-origin policy is an issue for web developers who have a legitimate need for making these restricted kinds of cross-origin requests.
The benefit/challenge of HTTP being a stateless protocol?
It is important to be aware of HTTP as a stateless protocol and the impact it has on server resources and ease of use. In the context of HTTP, it means that the server does not need to hang on to information, or state, between requests. As a result, when a request breaks en route to the server, no part of the system has to do any cleanup. Both these reasons make HTTP a resilient protocol, as well as a difficult protocol for building stateful applications. Since HTTP, the protocol of the internet, is inherently stateless that means web developers have to work hard to simulate a stateful experience in web applications. The key concept to remember is that even though you may feel the application is stateful, underneath the hood, the web is built on HTTP, a stateless protocol. It's what makes the web so resilient, distributed, and hard to control. It's also what makes it so difficult to secure and build on top of.
An important thing to understand about the Internet Protocol, and its system of addressing?
It is intended to provide communication between hosts, or devices. These hosts can potentially be on the same local network, or on different local networks halfway around the world from each other. Either way, we can use IP to get a message from one host to the other, but not any more than that. As we know though, there are potentially many applications running on a single host. If IP can get us as far as the host, how do we provide communication between an application running on one host and an application running on another host (or potentially between two different applications or processes running on the same host)?
Mimicking Statefulness using Sessions.
It's obvious the stateless HTTP protocol is somehow being augmented to maintain a sense of statefulness. With some help from the client (i.e., the browser), HTTP can be made to act as if it were maintaining a stateful connection with the server, even though it's not. One way to accomplish this is by having the server send some form of a unique token to the client. Whenever a client makes a request to that server, the client appends this token as part of the request, allowing the server to identify clients. In web development, we call this unique token that gets passed back and forth the session identifier. Each request, however, is technically stateless and unaware of the previous or the next one.
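A minimal sketch of the session identifier idea follows, using an in-memory dictionary as the session store. The function name, the store, and the "visits" counter are all hypothetical; a real application would also handle expiry and would typically carry the identifier in a cookie.

```python
import secrets

# Hypothetical in-memory session store: session id -> session data.
sessions = {}


def handle_request(session_id=None):
    """Handle one independent request, recognising the client by its token."""
    if session_id is None or session_id not in sessions:
        # Unknown or missing token: issue a new unique session identifier.
        session_id = secrets.token_hex(16)
        sessions[session_id] = {"visits": 0}
    sessions[session_id]["visits"] += 1
    # The identifier is sent back so the client can include it next time.
    return session_id, sessions[session_id]["visits"]


sid, visits = handle_request()          # first request carries no token
sid, visits = handle_request(sid)       # the client appends the token
print(visits)                           # 2
```

Each call is still self-contained: the only thing linking two requests is the token the client chooses to send.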
How might we go about simulating a "stateful" application?
It's possible to send simple, limited data via URL parameters; this is one way in which we can simulate a stateful application. We can also persist data in the form of cookies. Cookies are temporary pieces of information that we can send back and forth via the HTTP requests and responses. We can also use session data to keep track of a user for a given web domain.
How does the TLS protocol encapsulate data?
Just like other protocols we've looked at in this course, TLS sends messages in a certain format. This format can vary depending on the particular function that TLS is performing, but when it is transporting application data TLS encapsulates that data in the same way that we've seen with other Protocol Data Units. In other words, the data to be transported forms a payload, and metadata is attached in the form of header and trailer fields. The main field that interests us in terms of providing message integrity is the MAC field.
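The general idea behind a MAC field can be sketched with Python's standard hmac module: the sender computes a digest from the payload and a shared key, and the receiver recomputes it to detect tampering. This is not TLS's actual record format; the key value, the choice of SHA-256, and the function names are assumptions made for illustration.

```python
import hashlib
import hmac


def compute_mac(key: bytes, payload: bytes) -> bytes:
    """Compute a keyed digest of the payload (HMAC-SHA256 chosen for illustration)."""
    return hmac.new(key, payload, hashlib.sha256).digest()


def verify_mac(key: bytes, payload: bytes, received_mac: bytes) -> bool:
    """Recompute the digest and compare it with the one that was received."""
    return hmac.compare_digest(compute_mac(key, payload), received_mac)


shared_key = b"placeholder key agreed by both parties"  # hypothetical value
payload = b"GET /account HTTP/1.1"
mac = compute_mac(shared_key, payload)

print(verify_mac(shared_key, payload, mac))         # True: payload is unchanged
print(verify_mac(shared_key, payload + b"!", mac))  # False: payload was altered
```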
Two main characteristics that measure the performance of a physical network?
Latency and Bandwidth
What is a network switch?
Like a hub, a switch is a piece of hardware to which you connect devices to create a network. Unlike a hub however, a switch uses the destination address in order to direct a frame only to the device for which it is intended.
HTTP Response Header
Like request headers, we can also use the inspector to view response headers. Response headers offer more information about the resource being sent back. Some common response headers are pictured. There are a lot more response headers, but just like request headers, it's not necessary to memorize them. They have subtle effects on the data being returned, and in some cases, they have subtle workflow consequences (eg, your browser automatically following a Location response header). Just understand that response headers contain additional meta-information about the response data being returned.
What is a MAC Address?
MAC means Media Access Control. A MAC address is a unique number stored in each NIC (Network Interface Card) so it can be used to identify a device on a network. Since this address is linked to the specific physical device, and (usually) doesn't change, it is sometimes referred to as the physical address or burned-in address. MAC Addresses are formatted as a sequence of six two-digit hexadecimal numbers, e.g. 00:40:96:9d:68:0a, with different ranges of addresses being assigned to different network hardware manufacturers.
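As a small illustration of the format, the example address above can be parsed into its six bytes. The function name is made up; the split into a manufacturer prefix (the first three bytes) and a device-specific part reflects the usual convention for how address ranges are assigned to manufacturers.

```python
def parse_mac(mac: str) -> bytes:
    """Convert a colon-separated MAC address into its six raw bytes."""
    parts = mac.split(":")
    if len(parts) != 6 or any(len(part) != 2 for part in parts):
        raise ValueError(f"not a MAC address: {mac!r}")
    return bytes(int(part, 16) for part in parts)


raw = parse_mac("00:40:96:9d:68:0a")
manufacturer_prefix = raw[:3]  # range assigned to the hardware manufacturer
device_part = raw[3:]          # chosen by the manufacturer for this device

print(len(raw) * 8)               # 48 bits in total
print(manufacturer_prefix.hex())  # 004096
```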
HTTP and your modern web browser.
Modern browsers provide numerous APIs that provide functionality which HTTP alone cannot. An awareness that these APIs exist and what they offer can help us when developing applications.
What is Queuing delay?
Network devices such as routers can only process a certain amount of data at one time. If there is more data than the device can handle, then it queues, or buffers, the data. The amount of time the data is waiting in the queue to be processed is the queuing delay. In our driving analogy, this would be the time spent in a queue of traffic waiting to cross the checkpoint.
FOCUS on this -> Build a general picture of the network infrastructure.
Networking is a deep topic, and there's much of it that we purposely don't cover here. We don't expect you to memorize every single detail of how things work at this level, that's the domain of network engineering. As a developer or software engineer, it helps to have a high-level picture of how the underlying infrastructure functions. Much of the specific detail isn't necessary. The idea instead is to build the general mental models that will provide sufficient context to understand protocols operating at higher levels of abstraction, such as TCP and HTTP.
The TLS handshake complexity implication?
One of the implications of this complexity is its impact on performance. It can add up to two round-trips of latency (depending on the TLS version) to the establishment of a connection between client and server prior to the point where any application data can be sent. This is on top of the initial round trip resulting from the TCP Handshake. The TLS Handshake must be performed before secure data exchange can begin; it involves several round-trips of latency and therefore has an impact on performance.
TCP Three-way Handshake to Establish a Connection: main takeaway
One of the main take-aways should be that there's a certain amount of complexity involved in the way that TCP manages connection state. This fact is particularly pertinent to the initial establishment of a connection, where a key characteristic of the process is that the sender cannot send any application data until after it has sent the ACK Segment. What this means in practical terms, is that there is an entire round-trip of latency before any application data can be exchanged. Since this hand-shake process occurs every time a TCP connection is made, this clearly has an impact on any application which uses TCP at the transport layer.
What characters can be used safely within a URL?
Only alphanumeric characters, the special characters $ - _ . + ! * ' ( ) and the comma, and reserved characters when used for their reserved purposes can be used unencoded within a URL. If a reserved character is not being used for its reserved purpose, it has to be encoded.
Building a Reliable Protocol Version 3 - Disadvantage?
Our protocol as it stands is reliable. Unfortunately, it's not very efficient. One of the main features of our protocol is that each message is sent one at a time, and an acknowledgement is received before the next message is sent. This type of protocol is known as a Stop-and-Wait protocol. It's the 'Wait' part that's the problem here. Within our system, much of the time is spent just waiting for an acknowledgement. This is not an efficient use of bandwidth.
HTTP POST requests
POST is used when you want to initiate some action on the server, or send data to a server. Typically from within a browser, you use POST when submitting a form. POST requests allow us to send much larger and sensitive data to the server, such as images or videos. For example, say we need to send our username and password to the server for authentication. We could use a GET request and send it through query strings. The flaw with this approach is obvious: our credentials become exposed instantly in the URL; that isn't what we want. Using a POST request in a form fixes this problem. POST requests also help sidestep the query string size limitation that you have with GET requests. With POST requests, we can send significantly larger forms of information to the server.
The flip-side of the benefits that TCP provides?
Performance challenges that come with its complexity. Just because that complexity is abstracted away from a developer working at the application level doesn't mean it doesn't exist. It's there, and it has a very real impact on performance. TCP does attempt to balance this impact on performance by providing mechanisms for flow control and congestion avoidance.
Building a Reliable Protocol Version 3
Problem: the message is received but acknowledgement is not received (or not in time), resulting in a duplicate message. Solution: add sequence numbers to the messages. Rules: - Sender sends one message at a time, with a sequence number, and sets a timeout - If message received, receiver sends an acknowledgement which uses the sequence number of the message to indicate which message was received - When acknowledgement is received, sender sends next message in the sequence - If acknowledgement is not received before the timeout expires, sender assumes either the message or the acknowledgement went missing and sends the same message again with the same sequence number - If the recipient receives a message with a duplicate sequence number it assumes the sender never received the acknowledgement and so sends another acknowledgement for that sequence number and discards the duplicate
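The rules above can be sketched as a small simulation. This is only an illustration under stated assumptions: the function name `stop_and_wait` and the `lost_acks` parameter are hypothetical, the "network" is a single loop, and a lost acknowledgement is modelled simply as the sender's timeout expiring.

```python
def stop_and_wait(messages, lost_acks):
    """Simulate the Version 3 rules.

    lost_acks is a set of sequence numbers whose FIRST acknowledgement
    goes missing (an assumption made purely for this demo).
    """
    delivered = []            # what the receiver hands to its application
    last_seq_received = -1    # receiver state: highest sequence number accepted
    transmissions = []        # every (seq, message) the sender puts on the wire
    pending_ack_loss = set(lost_acks)

    seq = 0
    while seq < len(messages):
        # Sender: send one message with its sequence number (timeout starts here).
        transmissions.append((seq, messages[seq]))

        # Receiver: accept new messages, discard duplicates, acknowledge either way.
        if seq > last_seq_received:
            delivered.append(messages[seq])
            last_seq_received = seq
        ack = seq

        # Channel: the acknowledgement may be lost on the way back.
        if ack in pending_ack_loss:
            pending_ack_loss.discard(ack)
            continue  # timeout expires: resend the same message, same sequence number

        seq += 1  # acknowledgement received: move on to the next message
    return delivered, transmissions


delivered, transmissions = stop_and_wait(["a", "b", "c"], lost_acks={1})
print(delivered)
print(transmissions)
```

Message "b" is transmitted twice, but the receiver recognises the repeated sequence number and only delivers it once.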
TCP Segment
Segments are the Protocol Data Unit (PDU) of TCP. Like the PDUs of protocols we've looked at for other network layers, it uses a combination of headers and payload to provide encapsulation of data from the layer above.
HTTP Request headers
Request headers give more information about the client and the resource to be fetched. Some useful request headers include Host, User-Agent, and Accept-Language. Don't bother memorizing any of the request headers, but just know that they are part of the request being sent to the server.
Internet Resources
Resource is a generic term for things you interact with on the Internet via a URL. This includes images, videos, web pages and other files. Resources are not limited to files and web pages. Resources can also be in the form of software that lets you trade stock or play a video game. There is no limit to the number of resources available on the Internet.
The Link/ Data Link Layer.
Simply having these devices being physically connected to each other isn't sufficient for them to communicate. They don't know how to communicate as they haven't established any rules for communication. One of the most important rules for transferring data from one place to another is identifying the device to which we want to send that data. The protocols operating at this layer are primarily concerned with the identification of devices on the physical network and moving data over the physical network between the devices that comprise it, such as hosts (e.g. computers), switches, and routers. We can think of what happens at this layer as an interface between the workings of the physical network and the more logical layers above.
What are the required components of an HTTP response? What are the additional optional components?
Status code is required. Headers and body are optional.
TCP (Connection or Connectionless?)
TCP is a connection-oriented protocol: it doesn't start sending application data until a connection has been established between application processes. In order to establish a connection, TCP uses what is known as a Three-way Handshake; this is where the SYN and ACK flags come into play. The FIN flag is used in a different process, the Four-way Handshake, which is used for terminating connections.
Transmission Control Protocol (TCP)
TCP is one of the corner-stones of the Internet. One of the key characteristics of this protocol is the fact that it provides reliable data transfer. What TCP essentially provides is the abstraction of reliable network communication on top of an unreliable channel. What this abstraction does is to hide much of the complexity of reliable network communication from the application layer: data integrity, de-duplication, in-order delivery, and retransmission of lost data. The services that TCP provides make it the protocol of choice for many networked applications. TCP also provides data encapsulation and multiplexing. It achieves this through the use of TCP Segments.
TCP Segment header: CHECKSUM
The Checksum provides the Error Detection aspect of TCP reliability. It is an error checking value generated by the sender using an algorithm. The receiver generates a value using the same algorithm and if it doesn't match, it drops the Segment. We've encountered Checksums already in this course, in other PDUs at other network layers such as IP Packets. Having a Checksum at the Transport Layer does render Checksums at lower layers redundant to a certain extent. IPv6 headers don't include a Checksum for this reason, based on the assumption that checksums are implemented at either the Transport or Link/ Data Link layers (or both).
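As a rough illustration of "sender and receiver run the same algorithm and compare", here is a sketch of the 16-bit ones' complement checksum described in RFC 1071. It is a simplification: a real TCP checksum is also computed over a pseudo-header and the Segment header, which this sketch leaves out, and the function names are hypothetical.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement checksum over data (RFC 1071 style)."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]  # add each 16-bit word
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF


def receiver_accepts(data: bytes, checksum_from_sender: int) -> bool:
    """The receiver recomputes the value; a mismatch means the data is dropped."""
    return internet_checksum(data) == checksum_from_sender


payload = bytes.fromhex("0001f203f4f5f6f7")
checksum = internet_checksum(payload)
corrupted = bytes.fromhex("0001f203f4f5f6f6")  # last byte altered in transit

print(hex(checksum))
print(receiver_accepts(payload, checksum))
print(receiver_accepts(corrupted, checksum))
```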
What are the required components of an HTTP request? What are the additional optional components?
The HTTP method and the path are required, and form part of what is known as the start-line or request-line. As of HTTP 1.0, the HTTP version also forms part of the request-line. The Host header is a required component since HTTP 1.1. Parameters, all other headers, and message body are optional. Technically speaking the 'path' portion of the request-line is known as the 'request-URI', and incorporates the actual path to the resource and the optional parameters if present. In practice, most people simply refer to this part of the request-line as the 'path'.
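To make the pieces concrete, the sketch below builds a minimal HTTP/1.1 request as text and splits its request-line back into method, request-URI, and version. The helper names, the host `www.example.com`, and the `q` parameter are placeholders chosen for the example.

```python
from urllib.parse import urlencode


def build_request(method, path, host, params=None):
    """Assemble the required parts: request-line plus the Host header."""
    request_uri = path + ("?" + urlencode(params) if params else "")
    return f"{method} {request_uri} HTTP/1.1\r\nHost: {host}\r\n\r\n"


def parse_request_line(raw_request):
    """Return (method, request-URI, HTTP version) from the first line."""
    request_line = raw_request.split("\r\n", 1)[0]
    method, request_uri, version = request_line.split(" ")
    return method, request_uri, version


raw = build_request("GET", "/search", "www.example.com", {"q": "tcp"})
print(raw)
print(parse_request_line(raw))
```

Note that the optional parameters travel inside the request-URI, which is why most people simply call the whole thing the 'path'.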
In regards to the Transport Layer, what enables end-to-end communication between specific applications on different machines?
The IP address and the port number together are what enables end-to-end communication between specific applications on different machines. The combination of IP address and port number information can be thought of as defining a communication end-point. This communication end-point is generally referred to as a socket.
FOCUS on this -> Learn that IP enables communication between devices.
The Internet Protocol (IP) is a key part of the functionality of the internet. Make sure to form a clear mental model of what it does.
The MAC Addressing system problem?
The MAC Addressing system works well for local networks, where all the devices are connected to a switch that can keep a record of each device's address. In theory, we could conduct inter-network communication just using MAC addresses. For example, we could design routers that kept records of which MAC Addresses could be accessed via other routers on the wider network. In practice however, there's an issue that prevents us from doing this: scale. This approach isn't scalable due to certain characteristics of MAC addresses: - They are physical rather than logical. Each MAC Address is tied (burned in) to a specific physical device - They are flat rather than hierarchical. The entire address is a single sequence of values and can't be broken down into sub-divisions.
Message Authentication Code (MAC)
The MAC field is similar in concept to the checksum fields we've already seen in other PDUs, although there is a difference in implementation as well as overall intention. The checksum field in, say, a TCP Segment is intended for error detection (i.e. to test if some data was corrupted during transport). The intention of the MAC field in a TLS record is to add a layer of security by providing a means of checking that the message hasn't been altered or tampered with in transit.
Preamble and SFD
The Preamble and Start of Frame Delimiter (SFD/ SOF) generally aren't considered part of the actual frame but are sent just prior to the frame as a synchronization measure which notifies the receiving device to expect frame data and then identify the start point of that data. The preamble is seven bytes (56 bits) long and the SFD is one byte (eight bits). Both use a repeated pattern that can be recognised by the receiving device, which then knows that the data following after is the frame data.
What is the Protocol Data Unit (PDU) of UDP?
The Protocol Data Unit (PDU) of UDP is known as a Datagram. Like all the PDUs we've looked at so far it encapsulates data from the layer above into a payload and then adds header information.
What is the Protocol Data Unit (PDU) within the IP Protocol?
The Protocol Data Unit (PDU) within the IP Protocol is referred to as a packet. A packet is comprised of a Data Payload and a Header. Just as with Ethernet Frames, the Data Payload of an IP Packet is the PDU from the layer above (the Transport layer). It will generally be a TCP segment or a UDP datagram. The Header is split into logical fields which provide metadata used in transporting the packet. Again, as with Ethernet Frames, the data in the IP Packet is in bits. The logical separation of those bits into header fields and payload is determined by the set size of each field in bits and the order within the packet.
The Transport Layer Security (TLS) Protocol (History)
The Transport Layer Security (TLS) protocol started life as a protocol called SSL (Secure Sockets Layer). This was a proprietary protocol developed by Netscape. Although it was standardized and renamed as TLS in 1999 by the IETF, the terms are often still used interchangeably. You'll commonly still hear people talk about SSL Certificates rather than referring to them as TLS Certificates or Public Key Certificates. When you hear SSL being mentioned, just bear in mind that the person saying it is probably referring to TLS, unless they are speaking historically. There have been several versions of TLS since 1999, the most recent one being TLS 1.3.
TCP Segment header: WINDOW SIZE field and the various Flag fields
The WINDOW SIZE field is related to Flow Control, which we will look at a bit later on. The Flag fields are one-bit boolean fields. A couple of these fields, URG and PSH, are related to how the data contained in the Segment should be treated in terms of its importance or urgency; we aren't going to go into exactly how these particular flags are used. The SYN, ACK, FIN, and RST flags are used to establish and end a TCP connection, as well as manage the state of that connection; we'll look at these in some more detail below.
So we can use ports to identify specific services running on host machines, but how does that help us with multiplexing and demultiplexing?
The answer is that the source and destination port numbers are included in the Protocol Data Units (PDU) for the transport layer. The name, and exact structure, of these PDUs varies according to the Transport Protocol used, but what they have in common is that they include these two pieces of information. Data from the application layer is encapsulated as the data payload in this PDU, and the source and destination port numbers within the PDU can be used to direct that data to specific processes on a host. The entire PDU is then encapsulated as the data payload in an IP packet. The IP addresses in the packet header can be used to direct data from one host to another.
In regards to the HTTP POST method, how is the data we're sending being submitted to the server since it's not being sent through the URL?
The answer to that is the HTTP body. The body contains the data that is being transmitted in an HTTP message and is optional. In other words, an HTTP message can be sent with an empty body. When used, the body can contain HTML, images, audio and so on. You can think of the body as the letter enclosed in an envelope, to be posted. It's critical to understand that when using a browser, the browser hides a lot of the underlying HTTP request/response cycle from you. Your browser issued the initial POST request, got a response with a Location header, then issued another request without any action from you, then displayed the response from that second request.
It is necessary that such a 'chain of trust' would need to have an end-point, but if no-one is authenticating the Root CAs other than themselves, how do we know we can trust them?
The answer to this is simply their reputation gained through prominence and longevity. Root CAs are essentially a small group of organisations approved by browser and operating system vendors. Ultimately this system still relies on trust, and as such isn't infallible.
Is the TLS Certificate all you need for good authentication?
The certificate on its own isn't much proof of anything, however. Since such certificates are publicly available to anyone, a malicious third-party could easily access one and present it as its own. What's to stop a malicious third-party creating their own key pair and certificate identifying them as, say, a well-known bank? Just as it's possible to create a fake ID card in the real world, it's possible to create a fake digital certificate. How are we to know if a certificate is genuine or not? This is where Certificate Authorities come in.
Data Payload
The data payload field can be between 42 and 1497 bytes in length. It contains the data for the entire Protocol Data Unit (PDU) from the layer above, an IP Packet for example.
Data Payload of a PDU?
The data payload portion of a PDU is simply the data that we want to transport over the network using a specific protocol at a particular network layer. The data payload is the key to the way encapsulation is implemented. The entire PDU from a protocol at one layer is set as the data payload for a protocol at the layer below. For example, an HTTP Request at the Application layer could be set as the payload for a TCP segment at the transport layer.
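One way to picture this nesting is with a toy data structure. The `PDU` class, the header field names, and all addresses and port numbers below are made up for illustration; real headers are packed bits, not dictionaries.

```python
from dataclasses import dataclass


@dataclass
class PDU:
    header: dict      # protocol-specific metadata
    payload: object   # the entire PDU from the layer above


# Application layer: the HTTP request is just data to the layers below.
http_request = "GET / HTTP/1.1\r\nHost: example.com\r\n\r\n"

# Each layer wraps the PDU from the layer above as its own payload.
segment = PDU({"src_port": 51000, "dst_port": 80}, http_request)
packet = PDU({"src_ip": "192.0.2.1", "dst_ip": "192.0.2.2"}, segment)
frame = PDU({"src_mac": "00:00:5e:00:53:01", "dst_mac": "00:00:5e:00:53:02"}, packet)

# Unwrapping layer by layer recovers the original application data.
print(frame.payload.payload.payload == http_request)
```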
The default port number for HTTP?
The default port number for HTTP is port 80. Even though this port number is not always specified, it's assumed to be part of every URL. Unless a different port number is specified, port 80 will be used by default in normal HTTP requests. To use anything other than the default, one has to specify it in the URL.
What character is used between the name and value of a query parameter?
The equal sign =
Header and Trailer of a PDU?
The exact structure of the header and, if included, trailer varies from protocol to protocol, but the purpose of them is the same in each case: to provide protocol-specific metadata about the PDU.
Frame Check Sequence (FCS)
The final four bytes (32 bits) of an Ethernet Frame are the Frame Check Sequence. This is a checksum generated by the device which creates the frame. It is calculated from the frame data using an algorithm such as a cyclic redundancy check. The receiving device uses the same algorithm to generate an FCS and then compares this to the FCS in the sent frame. If the two don't match, then the frame is dropped. Ethernet doesn't implement any kind of retransmission functionality for dropped frames; it is the responsibility of higher level protocols to manage retransmission of lost data if this is a requirement of the protocol.
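The generate-then-compare idea can be sketched with Python's `zlib.crc32`, used here as a stand-in cyclic redundancy check. The helper names and the byte order chosen for the trailer are assumptions for the demo, not a faithful Ethernet implementation.

```python
import zlib


def add_fcs(frame_data: bytes) -> bytes:
    """Sender: append a four-byte CRC computed over the frame data."""
    fcs = zlib.crc32(frame_data)
    return frame_data + fcs.to_bytes(4, "little")


def fcs_matches(frame: bytes) -> bool:
    """Receiver: recompute the CRC and compare it with the trailing four bytes."""
    data, received_fcs = frame[:-4], frame[-4:]
    return zlib.crc32(data).to_bytes(4, "little") == received_fcs


frame = add_fcs(b"hello")
corrupted = bytes([frame[0] ^ 0x01]) + frame[1:]  # flip one bit in transit

print(fcs_matches(frame))      # intact frame is accepted
print(fcs_matches(corrupted))  # mismatch: the frame would be dropped
```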
cipher
The generic term for a technique (or algorithm) that performs encryption. Ciphers are cryptographic algorithms; in other words, they are sets of steps for performing encryption, decryption, and other related tasks.
Application Layer Protocols
The highest level of abstraction because they manage how data is interpreted and displayed to users. The rules for implementing the end-user services provided by a network. These protocols give meaning to the bits sent by lower-level protocols; user and server computers must agree on what the bits mean, and application protocols (like HTTP) offer this. We can perhaps think of Application layer protocols as being the rules for how applications talk to each other at a syntactical level. Different types of applications have different requirements with regards to how they communicate at a syntactical level, and so as a result there are many different protocols which exist at the application layer.
In socket programming or network programming terms, how is the concept of a network socket implemented?
The implementation of this concept involves instantiating socket objects. While implementations vary, many follow the Berkeley sockets API model. Implementations which follow this model define specific functions such as bind(), listen(), accept(), and connect(), among others. You can see examples of this in documentation for programming languages such as Ruby and Python and for environments like Node.js.
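As a minimal sketch of those functions in use, the example below uses Python's standard `socket` module to run a one-shot server and client on the loopback address. The function name `run_echo_once` and the upper-casing behaviour are invented for the demo; binding to port 0 asks the operating system to pick a free port.

```python
import socket
import threading


def run_echo_once(message: bytes) -> bytes:
    # Server side: bind() to an address, then listen() for connections.
    server = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    server.bind(("127.0.0.1", 0))
    server.listen(1)
    host, port = server.getsockname()  # the server's IP address + port number

    def serve():
        conn, _client_addr = server.accept()  # accept() yields a connected socket
        with conn:
            conn.sendall(conn.recv(1024).upper())

    worker = threading.Thread(target=serve)
    worker.start()

    # Client side: connect() to the server's IP address and port.
    client = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    client.connect((host, port))
    client.sendall(message)
    reply = client.recv(1024)

    client.close()
    worker.join()
    server.close()
    return reply


print(run_echo_once(b"hello"))
```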
Internet vs. World Wide Web
The internet is essentially a network of networks. It can be thought of as the infrastructure that enables inter-network communication, both in terms of the physical network and the lower-level protocols that control its use. The World Wide Web, or web for short, is a service that can be accessed via the internet. In simple terms it is a vast information system comprised of resources which are navigable by means of a URL (Uniform Resource Locator). HTTP is closely tied, both historically and functionally, to the web as we know it. It is the primary means by which applications interact with the resources which make up the web.
Network Hops and the `tracert` console command.
The journey of a piece of data on the network isn't direct from the start point to the end point but will consist of several 'hops', or journeys, between nodes on the network. You can think of the nodes as routers that process the data and forward it to the next node on the path. `tracert` is a utility for displaying the route and latency of a path across a network. Running the command should return a list of hops taken for the test data to get from your PC or laptop to the Google server. The values indicated here are the Round-Trip Time (RTT) for each hop.
What is the major benefit to how the data payload implements encapsulation?
The major benefit of this approach is the separation it creates between the protocols at different layers. This means that a protocol at one layer doesn't need to know anything about how a protocol at another layer is implemented in order for those protocols to interact. It creates a system whereby a lower layer effectively provides a 'service' to the layer above it. In other words, a TCP segment isn't really concerned whether its data payload is an HTTP request, an SMTP command, or some other sort of Application layer data. It just knows it needs to encapsulate some data from the layer above and provide the result of this encapsulation to the layer below. This separation of layers provides a certain level of abstraction, which allows us to use different protocols at a certain layer without having to worry about the layers below.
HTTP response
The raw HTTP data returned by the server. The most important parts of an HTTP response are: - status code - headers - message body, which contains the raw response data
Domain Name System (DNS)
DNS maps domain names to IP addresses. It is a distributed database which translates a domain name like www.google.com (the host in http://www.google.com) to an IP address, so that the request can be sent to the right remote server. Stated differently, it keeps track of domain names and their corresponding IP addresses on the Internet. So an address like http://www.google.com might be resolved to an IP address 197.251.230.45. By the way, you can also get to Google's main page by typing the IP address into your browser's address bar.
Clients and Servers
The most common client is an application you interact with on a daily basis called a Web Browser. Examples of web browsers include Internet Explorer, Firefox, Safari and Chrome, including mobile versions. Web browsers are responsible for issuing HTTP requests and processing the HTTP response in a user-friendly manner onto your screen. Web browsers aren't the only clients around, as there are many tools and applications that can also issue HTTP requests. The content you're requesting is located on a remote computer called a server. Servers are nothing more than machines or devices capable of handling inbound requests, and their job is to issue a response back. Often, the response they send back contains relevant data as specified in the request.
What does the netstat utility do in the command prompt?
The netstat utility returns a list of active network connections. The important thing to notice in its output is that the Local Address and Foreign Address are combinations of IP address and port number. As stated earlier, these combinations act as communication end-points or sockets for the transfer of data between applications running on hosts.
Length
The next field is two bytes (16 bits) in length. It is used to indicate the size of the Data Payload.
DSAP, SSAP, Control
The next three fields are all one byte (8 bits) in length. The DSAP and SSAP fields identify the Network Protocol used for the Data Payload. The Control field provides information about the specific communication mode for the frame, which helps facilitate flow control.
Source and Destination MAC address
The next two fields, each six bytes (48 bits) long, are the source and destination MAC addresses. The source address is the address of the device which created the frame (as we'll see later on in this assignment, this can change at various points along the data's journey). The destination MAC address is the address of the device for which the data is ultimately intended. MAC Addresses are a key part of the Ethernet protocol; we'll look at them in more detail shortly.
Data Store
The place or medium where system data is stored. Data stores can also be simple files, key/value stores, document stores and many other variations, as long as it can save data in some format for later retrieval and processing.
FOCUS on this -> Understand that protocols are systems of rules.
The protocols that support network functionality are essentially logical sets of rules that have been designed and engineered to be the way they are. Viewing them as such makes them easier to break down and contextualize.
What is the purpose of the TLS Handshake?
The purpose of the TLS Handshake is to enable the secure exchange of encryption keys between clients and servers. More generally we could say that its purpose is to establish a secure connection.
What is the purpose of the chain of trust?
The purpose of this chain-like structure is the level of security it provides. The private keys of the Root CAs are kept behind many layers of security in order to be kept as inaccessible as possible. As such they don't issue end-user certificates, but leave that up to the Intermediate CAs. Additionally, if the private key of an Intermediate CA somehow became compromised, the Root CA can revoke the certificate for that Intermediate CA, thereby invalidating all of the certificates down the chain from it, and simply issue a new one.
HTTP Status: 302 Found
The requested page has moved temporarily to a new URL. What happens when a resource is moved? The most common strategy is to re-route the request from the original URL to a new URL. The general term for this kind of re-routing is called a redirect. When your browser sees a response status code of 302, it knows that the resource has been moved, and will automatically follow the new re-routed URL in the Location response header.
HTTP Status: 404 Not Found
The server returns this status code when the requested resource cannot be found. Remember, a resource can be anything including audio files, CSS stylesheets, JavaScript files, images etc.
Hypertext Transfer Protocol (HTTP)
The set of rules which provide uniformity to the way resources on the web are transferred between applications. HTTP follows a simple model where a client makes a request to a server and waits for a response. Hence, it's referred to as a request-response protocol. HTTP is nothing more than an agreement in the form of formatted text that dictates how a client and server communicate. HTTP is a text-based protocol. HTTP Requests and Responses involve sending text between the client and server. In order for the protocol to work, the Request and Response must be structured in such a way that both the client and the server can understand them.
Where is the session data stored?
The simple answer is: on the server somewhere. Sometimes, it's just stored in memory, while other times, it could be stored in some persistent storage, like a database or key/value store. Where the session data is actually stored is not too important right now. The most important thing is to understand that the session id is stored on the client, and it is used as a "key" to the session data stored server side. That's how web applications work around the statelessness of HTTP.
HTTP Status Code
The status code is a three-digit number that the server sends back after receiving a request signifying the status of the request. The status text displayed next to the status code provides the description of the code. It is listed under the Status column of the Inspector. The most common response status code you'll encounter is 200, which means the request was handled successfully. Other useful status codes include 302 (Found) and 404 (Not Found). As a web developer, you should know these response status codes and their associated meanings very well.
What is the total latency between two points, such as a client and a server?
The sum of all the delays (propagation, transmission, processing, and queuing). This value is usually given in milliseconds (ms).
What is the purpose of the TCP Three-way Handshake?
The three-way handshake is TCP's way of establishing a connection between a client and server; more precisely, a connection between specific applications/services on the client and server. First, the client requests a connection by sending a synchronize (SYN) message to the server. After receiving the SYN, the server sends a synchronize-acknowledge (SYN ACK) message back to the client. Once the client has received the SYN ACK, it sends an acknowledgement (ACK) back to the server. After this, the connection is established and only then can actual application data begin to be transferred.
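The exchange can be written out as three messages. The sketch below goes slightly beyond the description above by including sequence and acknowledgement numbers (each side acknowledges the other's initial sequence number plus one); the function name, dictionary layout, and initial sequence numbers are arbitrary choices for illustration.

```python
def three_way_handshake(client_isn: int, server_isn: int):
    """Return the three handshake messages in order."""
    # 1. Client asks to open a connection.
    syn = {"flags": "SYN", "seq": client_isn}
    # 2. Server acknowledges the client's SYN and sends its own SYN.
    syn_ack = {"flags": "SYN ACK", "seq": server_isn, "ack": syn["seq"] + 1}
    # 3. Client acknowledges the server's SYN; the connection is now established.
    ack = {"flags": "ACK", "seq": syn_ack["ack"], "ack": syn_ack["seq"] + 1}
    return [
        ("client -> server", syn),
        ("server -> client", syn_ack),
        ("client -> server", ack),
    ]


for direction, message in three_way_handshake(client_isn=100, server_isn=300):
    print(direction, message)
```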
So who exactly are these Certificate Authorities, and why should we trust them?
There are different 'levels' of CA. An 'Intermediate CA' can be any company or body authorised by a 'Root CA' to issue certificates on its behalf.
Implementing a Pipeline Approach
There are different ways of implementing this pipelined approach, such as Go-back-N and Selective Repeat. The exact differences between these implementations aren't too important. With both systems, the sender will implement a 'window' representing the maximum number of messages that can be in the 'pipeline' at any one time. Once it has received the appropriate acknowledgements for the messages in the window, it moves the window on.
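The window idea can be sketched with a heavily simplified, round-based Go-back-N simulation. Everything here is an assumption for illustration: the function name, the `lost_on_first_try` parameter, and the model in which the sender transmits a whole window, then learns how far the receiver got before sliding the window forward.

```python
def go_back_n(messages, window_size, lost_on_first_try):
    """Send messages through a sliding window; return (delivered, sent_log)."""
    base = 0                         # first unacknowledged sequence number
    attempts = [0] * len(messages)   # how often each message has been sent
    sent_log = []                    # sequence numbers in transmission order
    delivered = []                   # what the receiver accepted, in order

    while base < len(messages):
        expected = base  # the receiver only accepts the next in-order message
        window_end = min(base + window_size, len(messages))
        for seq in range(base, window_end):
            attempts[seq] += 1
            sent_log.append(seq)
            lost = seq in lost_on_first_try and attempts[seq] == 1
            if not lost and seq == expected:
                delivered.append(messages[seq])
                expected += 1
            # Anything else is lost or out of order, so the receiver discards it.
        base = expected  # acknowledgements received: move the window on

    return delivered, sent_log


delivered, sent_log = go_back_n(list("abcde"), window_size=3, lost_on_first_try={1})
print(delivered)
print(sent_log)
```

When message 1 is lost, message 2 arrives out of order and is discarded, so the sender goes back and resends from message 1.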
TCP Segment header: SEQUENCE NUMBER and ACKNOWLEDGEMENT NUMBER
These two fields are used together to provide for the other elements of TCP reliability such as In-order Delivery, Handling Data Loss, and Handling Duplication. The precise way in which TCP uses these fields is beyond the scope of this course, but it is essentially a more complex version of the simplified example of the Reliable Protocol we constructed in the previous assignment.
A major characteristic of the communication protocols that are primarily used to provide the functionality for the lower layers in our network system.
They are inherently unreliable. We've seen that protocols such as Ethernet and the Internet Protocol include checksum data as part of their header or trailer so that the data transported as frames and packets can be tested to ensure it hasn't become corrupt during its journey. If the data is corrupt however, these protocols simply discard it (dropping the frame or packet); there is no provision within these protocols for enabling the replacement of lost data. The possibility of losing data and it not being replaced means that the network up to and including the Internet Protocol is effectively an unreliable communication channel.
Where are DNS databases stored?
They are stored on computers called DNS servers. It is important to know that there is a very large world-wide network of hierarchically organized DNS servers, and no single DNS server contains the complete database. If a DNS server does not contain a requested domain name, the DNS server routes the request to another DNS server up the hierarchy. Eventually, the address will be found in the DNS database on a particular DNS server, and the corresponding IP address will be returned so that the request can be sent to it.
What do 3xx level response codes indicate?
They don't indicate an error. Instead they are generally used in relation to redirection, and indicate to the client that it must take some additional action in order to complete the request.
When you're coding or debugging a web application, it's important to establish a mental model of "where" you are when analyzing a piece of code.
This ability to zoom in to the details while also being able to recognize where you are in the larger picture is crucial to piecing together the jigsaw puzzle of web development.
Datagram Transport Layer Security (DTLS)
DTLS is specifically for use with network connections which use UDP rather than TCP at the Transport layer. TLS itself can't be used with UDP; DTLS, which is based on TLS, can.
What is subnetting?
This splitting of a network into parts is referred to as sub-netting. By dividing IP address ranges further, subnets can be split into smaller subnets to create even more tiers within the hierarchy.
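Python's standard `ipaddress` module can show this kind of splitting. The address range 192.168.0.0/24 is just an example network chosen for the sketch.

```python
import ipaddress

network = ipaddress.ip_network("192.168.0.0/24")

# Split the network into two subnets by extending the prefix by one bit.
halves = list(network.subnets(prefixlen_diff=1))

# A subnet can itself be split again to create another tier in the hierarchy.
quarters = list(halves[0].subnets(prefixlen_diff=1))

print([str(subnet) for subnet in halves])
print([str(subnet) for subnet in quarters])

# An address belongs to exactly one subnet at each tier.
print(ipaddress.ip_address("192.168.0.70") in quarters[1])
```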
How does the MAC field implement the data integrity check?
Through the use of a hashing algorithm. It works something like this: 1.) The sender will create what's called a digest of the data payload. This is effectively a small amount of data derived from the actual data that will be sent in the message. The digest is created using a specific hashing algorithm combined with a pre-agreed hash value. The hashing algorithm and hash value to be used will have been agreed as part of the TLS Handshake process when the Cipher Suite is negotiated. 2.) The sender will then encrypt the data payload using the symmetric key (as described earlier in the Encryption section), encapsulate it into a TLS record along with the digest in the MAC field, and pass this record down to the Transport layer to be sent to the other party. 3.) Upon receipt of the message, the receiver will decrypt the data payload using the symmetric key. The receiver will then also create a digest of the payload using the same algorithm and hash value. If this digest matches the one received in the MAC field, this confirms the integrity of the message.
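The digest comparison can be sketched with Python's standard `hmac` and `hashlib` modules. This is an analogy rather than what a TLS implementation does internally: the pre-agreed hash value is modelled as a shared secret key, SHA-256 is picked arbitrarily as the hashing algorithm, the encryption step is left out, and the function names are hypothetical.

```python
import hashlib
import hmac


def make_digest(shared_secret: bytes, payload: bytes) -> bytes:
    """Derive a digest from the payload and the pre-agreed secret value."""
    return hmac.new(shared_secret, payload, hashlib.sha256).digest()


def integrity_confirmed(shared_secret: bytes, payload: bytes, received_digest: bytes) -> bool:
    """Receiver: recompute the digest and compare it with the one received."""
    expected = make_digest(shared_secret, payload)
    return hmac.compare_digest(expected, received_digest)


secret = b"agreed-during-handshake"   # placeholder value for the demo
payload = b"transfer 10 to alice"
digest = make_digest(secret, payload)  # sent alongside the payload

print(integrity_confirmed(secret, payload, digest))
print(integrity_confirmed(secret, b"transfer 99 to mallory", digest))
```

A tampered payload produces a different digest, so the second check fails.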
Pipelining for Performance
To improve the throughput of our protocol, we could send multiple messages one after the other without waiting for acknowledgements. You might be wondering how that impacts the reliability. Well, the receiver still sends acknowledgements, and retransmission can still occur, so our system is still reliable. The difference is that multiple messages are being transferred at any one time. This kind of approach is referred to as 'pipelining'. The advantage of this pipelined approach is that it's a more efficient use of available bandwidth. Instead of wasting lots of time just waiting for acknowledgements, more time is spent actually transmitting data.
TLS Handshake
To securely send messages via HTTP we want both the request and the response to be encrypted in such a way that they can only be decrypted by the intended recipient. The most efficient way to do this is via symmetric key cryptography. If we want to use symmetric keys however, we also need a way to securely exchange the symmetric key. The clever thing about TLS is the way that it uses a combination of symmetric and asymmetric cryptography. The bulk of the message exchange is conducted via symmetric key encryption, but the initial symmetric key exchange is conducted using asymmetric key encryption. The process by which the initial secure connection is set up is conducted during what is known as the TLS handshake. TLS assumes TCP is being used at the Transport layer, and the TLS Handshake takes place after the TCP Handshake. The TLS Handshake is the process by which a client and a server exchange encryption keys.
If we want to create modern networked applications however, there's a number of things that we need beyond what IP can provide. What are two of the most important things we need?
Two of the most important things are a direct connection between applications, and reliable network communication.
User Datagram Protocol (UDP)
UDP provides a lightweight service for connectionless data transfer without acknowledgements, retransmission, or other error correction. It is an alternative to TCP that achieves higher transmission speeds at the cost of reliability: if a segment is dropped, the sender is unaware of the drop, and no retransmission occurs.
URL encoding
URLs are designed to accept only certain characters in the standard 128-character ASCII character set. Reserved or unsafe ASCII characters which are not being used for their intended purpose, as well as characters not in this set, have to be encoded. URL encoding serves the purpose of replacing these non-conforming characters with a % symbol followed by two hexadecimal digits that represent the ASCII code of the character.
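As a small illustration, Python's standard urllib.parse module can perform this encoding. The input string is just an example; safe="" asks quote to encode every reserved character.

```python
from urllib.parse import quote, unquote

original = "hello world & more"

# The space (ASCII 0x20) becomes %20 and the ampersand (ASCII 0x26) becomes %26
encoded = quote(original, safe="")

# unquote reverses the substitution
decoded = unquote(encoded)
```

Here encoded is "hello%20world%20%26%20more", and decoding it gives back the original string.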
The Physical Network.
Underlying everything at the most basic level is a 'physical' network made of tangible pieces such as networked devices, cables, and wires. Even the radio waves used in wireless networks, though we can't touch or see them, exist in the physical realm and are bound by physical laws and rules. These laws and rules determine how data actually gets transported from one place to another in a physical sense. What happens at this level involves real-world limitations and boundaries, such as how fast an electrical signal or light can travel, or the distance a radio wave can reach. These limitations determine the physical characteristics of a network, and these characteristics have an impact on how protocols function further up at the conceptual level. If we want to work with these protocols, it is therefore important to have at least a basic understanding of how a network works at this level.
IP Addresses vs MAC Addresses
Unlike MAC Addresses, IP Addresses are logical in nature. This means that they are not tied to a specific device, but can be assigned as required to devices as they join a network. The IP address that the device is assigned must fall within a range of addresses available to the local network that the device is connected to. This range is defined by a network hierarchy, where an overall network is split into logical subnetworks, with each defined by the range of IP addresses available to it.
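The idea of an address falling within a network's range can be checked with Python's standard ipaddress module. The network and device addresses below are made-up private addresses chosen for illustration.

```python
import ipaddress

# Hypothetical local network: 192.168.1.0 through 192.168.1.255
network = ipaddress.ip_network("192.168.1.0/24")

device = ipaddress.ip_address("192.168.1.42")   # inside the range
outsider = ipaddress.ip_address("10.0.0.7")     # belongs to a different network

device_is_local = device in network
outsider_is_local = outsider in network
```

A /24 network like this one spans 256 addresses, and only the first device falls inside it.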
Schemes vs Protocols
When looking at URL Components, we described the component that precedes the colon and two forward slashes at the start of a URL as the scheme. You'll often hear this URL component incorrectly referred to as the protocol. The source of this confusion is that, although referring to this component as the protocol is technically incorrect, in the context of a URL there is a relationship between the two things in that the scheme identifies which protocol should be used to access the resource. It should be noted that 'protocol' in this sense refers to a 'family' of protocols, rather than a specific protocol version, e.g. HTTP rather than HTTP 1.0 or HTTP 1.1. One more thing to note when discussing schemes and protocols is that the canonical form of a scheme name is lowercase. The convention is to refer to scheme names in lowercase, e.g. http, and protocol names in uppercase, e.g. HTTP.
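For example, Python's urllib.parse.urlsplit separates the scheme from the rest of a URL and returns it in its lowercase canonical form. The helper name scheme_of and the example.com URLs are placeholders.

```python
from urllib.parse import urlsplit

def scheme_of(url: str) -> str:
    """Return the scheme component, i.e. the part before '://'."""
    return urlsplit(url).scheme

secure = scheme_of("https://www.example.com:443/path?x=1")
shouted = scheme_of("HTTP://www.example.com/")  # normalised to lowercase
```

Both calls return a scheme name (https and http), not a protocol version.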
Speculative optimizations
When the browser learns the navigation patterns of the user over time and attempts to predict user actions. This can involve pre-resolving DNS names, or even pre-rendering pages of frequently visited sites. Or the browser can open a TCP connection in anticipation of an HTTP request when a user hovers over a link.
Document-Aware Optimizations
When the browser leverages networking integrated with parsing techniques to identify and prioritize fetching resources. The goal is to more efficiently load a web page by prioritizing certain resources such as CSS layouts and JS which can take the longest amount of time.
Between which two layers does TLS operate?
When thinking about TLS it can be useful to think of it as operating between HTTP and TCP.
Ethernet protocol vs. Internet Protocol
Whereas the Ethernet protocol provides communication between devices on the same local network, the Internet Protocol enables communication between two networked devices anywhere in the world. We can send a message from one device on the internet and it can reach another device on the internet.
Head-of-Line (HOL) blocking with regard to TCP
With TCP, HOL blocking can occur as a result of the fact that TCP provides for in-order delivery of segments. Although this in-order delivery is one aspect of TCP's reliability, if one of the segments goes missing and needs to be retransmitted, the segments that come after it in the sequence can't be processed, and need to be buffered until the retransmission has occurred. This can lead to increased queuing delay which, as we saw in an earlier assignment, is one of the elements of latency.
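The buffering behaviour can be simulated with a toy receiver. This is a simplified sketch, not TCP itself: it assumes segments are numbered 1, 2, 3, ... (real sequence numbers count bytes), and deliver_in_order is a hypothetical helper name.

```python
def deliver_in_order(arrivals):
    """For each arriving segment, list which segments can be handed upward."""
    buffered = set()
    next_expected = 1
    released_per_arrival = []
    for seq in arrivals:
        buffered.add(seq)
        released = []
        # Release segments only while there is no gap in the sequence
        while next_expected in buffered:
            buffered.remove(next_expected)
            released.append(next_expected)
            next_expected += 1
        released_per_arrival.append(released)
    return released_per_arrival

# Segment 2 is lost at first and only arrives after retransmission
timeline = deliver_in_order([1, 3, 4, 2])
```

Segments 3 and 4 arrive on time but sit in the buffer until segment 2 shows up, at which point 2, 3, and 4 are all released together.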
The concept of the internet as a layered system of communication.
Within this system, each layer provides a certain level of functionality or service to the layer above. We looked at some of the protocols at each layer in more detail, and at how the Internet Protocol (IP) essentially provides the inter-network communication services necessary for what we might think of as a minimum viable internet.
Encryption
a process of encoding a message so that it can only be read by those with an authorized means of decoding the message.
Integrity
a process to detect whether a message has been interfered with or faked.
Key parts of the Ethernet Frame to remember for Launch School.
You don't need to memorize all of these fields. We list them here mainly to build a picture of an Ethernet Frame as structured data. The key components to remember are the Source and Destination MAC address and the Data Payload.
The case for UDP.
You might be wondering, if UDP is unreliable then why use it? Why not just use TCP? The advantage that UDP has over TCP is its simplicity. This simplicity provides two things to a software engineer: speed and flexibility. UDP is a connectionless protocol. Applications using UDP at the Transport layer can just start sending data without having to wait for a connection to be established with the application process of the receiver. In addition to this, the lack of acknowledgements and retransmissions means that the actual data delivery itself is faster; once a datagram is sent it doesn't have to be sent again. Latency is less of an issue since without acknowledgements data essentially just flows one way: from sender to receiver. The lack of in-order delivery also removes the issue of Head-of-line blocking (at least at the Transport layer). The specifics of which services to include are left up to the software engineer and can be implemented at the application level, effectively using UDP as a 'base template' to build on top of. An example of such an application would be a voice or video calling application. Another example would be online gaming where the occasional loss of data causing a slight glitch is more acceptable than having significant lag in the gaming experience. While UDP provides a lot of flexibility and freedom, with that freedom comes a certain amount of responsibility. There are various best practices that should be adhered to. For example, it would be expected that your UDP-based application implements some form of congestion avoidance in order to prevent it overwhelming the network.
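The connectionless nature of UDP shows up directly in the sockets API: a sender can call sendto without any prior handshake. The sketch below uses Python's standard socket module on the loopback interface; the helper name udp_round_trip is hypothetical, and delivery is only dependable here because nothing leaves the local machine.

```python
import socket

def udp_round_trip(message: bytes) -> bytes:
    """Send one datagram to a local receiver and return what it received."""
    receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
    try:
        receiver.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
        receiver.settimeout(2.0)         # avoid blocking forever if the datagram is lost
        # No connection setup: the datagram is simply addressed and sent
        sender.sendto(message, receiver.getsockname())
        data, _sender_address = receiver.recvfrom(4096)
        return data
    finally:
        sender.close()
        receiver.close()
```

Note that nothing in this exchange tells the sender whether the datagram arrived; any acknowledgement scheme would have to be built by the application.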
What is a Network?
a communications, data exchange, and resource-sharing system created by linking two or more computers and establishing standards, or protocols, so that they can work together.
What is latency in a physical network?
a measure of the time it takes for some data to get from one point in a network to another point in a network. We can think of latency as a measure of delay. There are actually different types of delay (latency) that go together to determine the overall latency of a network connection.
Vigenere Cipher
a method of encrypting text by applying a series of Caesar ciphers based on the letters of a keyword. A keyword is used along with a tabula recta in order to produce the cipher text. The theory behind this approach is that only those also in possession of the keyword can decrypt the ciphertext.
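A compact sketch of the cipher follows. Instead of looking letters up in a tabula recta, it computes the equivalent shift arithmetically; the function name vigenere is hypothetical, and the code assumes a non-empty keyword made of the letters A to Z.

```python
from itertools import cycle

def vigenere(text: str, keyword: str, decrypt: bool = False) -> str:
    """Shift each letter by the matching keyword letter (A=0, B=1, ...)."""
    shifts = cycle(ord(k) - ord("A") for k in keyword.upper())
    result = []
    for ch in text.upper():
        if not "A" <= ch <= "Z":
            result.append(ch)  # leave spaces and punctuation untouched
            continue
        shift = next(shifts)
        if decrypt:
            shift = -shift
        result.append(chr((ord(ch) - ord("A") + shift) % 26 + ord("A")))
    return "".join(result)

ciphertext = vigenere("ATTACKATDAWN", "LEMON")
plaintext = vigenere(ciphertext, "LEMON", decrypt=True)
```

With the keyword LEMON, ATTACKATDAWN encrypts to LXFOPVEFRNHR, and decrypting with the same keyword recovers the original text.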
How is the end of the headers indicated in HTTP/1.1?
by an empty line.
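In the raw bytes of a message, that empty line appears as two consecutive CRLF sequences. The following sketch splits a hand-written sample response at that point; split_http_message and the sample response are illustrative, not part of any library.

```python
def split_http_message(raw: bytes):
    """Split a raw HTTP/1.1 message into start line, headers, and body."""
    # The first blank line (CRLF CRLF) marks the end of the headers
    head, _separator, body = raw.partition(b"\r\n\r\n")
    lines = head.decode("iso-8859-1").split("\r\n")
    start_line, header_lines = lines[0], lines[1:]
    headers = {}
    for line in header_lines:
        name, _colon, value = line.partition(":")
        headers[name.strip().lower()] = value.strip()
    return start_line, headers, body

raw_response = (
    b"HTTP/1.1 200 OK\r\n"
    b"Content-Type: text/plain\r\n"
    b"Content-Length: 5\r\n"
    b"\r\n"
    b"hello"
)
start_line, headers, body = split_http_message(raw_response)
```

Everything before the blank line is treated as the status line and headers; everything after it is the body.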
Data Packet - Destination Address
the 32-bit IP address of the destination (intended recipient) of the packet
Data Packet - Source Address
the 32-bit IP address of the source (sender) of the packet
What is bandwidth in a physical network?
the amount of data that can be sent in a particular unit of time (typically, a second).
What is Propagation delay?
the amount of time it takes for a message to travel from the sender to the receiver, and can be calculated as the ratio between distance and speed. Propagation delay is basically what was explained by our car analogy earlier.
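The distance-over-speed calculation can be written out directly. The figures below are assumptions for illustration: a 3,000 km link and a signal speed of roughly 200,000 km/s, which is a common approximation for light in optical fibre.

```python
def propagation_delay_ms(distance_km: float, speed_km_per_s: float) -> float:
    """Propagation delay = distance / speed, converted to milliseconds."""
    return distance_km * 1000 / speed_km_per_s

# Illustrative values, not measurements
delay = propagation_delay_ms(distance_km=3000, speed_km_per_s=200_000)
```

Under these assumptions the delay works out to about 15 milliseconds, regardless of how much data is being sent.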
What is Transmission delay?
the journey of data from point A to point B on a network typically won't be made over one single cable. Instead, the data will travel across many different wires and cables that are all inter-connected by switches, routers, and other network devices. Each of these elements within the network can be thought of as an individual 'link' within the overall system. Transmission delay is the amount of time it takes to push the data onto the link. In terms of our driving analogy, you can think of it as the time taken to navigate an intersection or interchange between different roads. We'll explore this idea further with the traceroute program later in this assignment.
Data Packet - ID, Flags, Fragment Offset
these fields are related to fragmentation. If the Transport layer PDU is too large to be sent as a single packet, it can be fragmented, sent as multiple packets, and then reassembled by the recipient.
Data Packet - Protocol
this indicates the protocol used for the Data Payload, e.g. TCP, UDP, etc.
Data Packet - Version
this indicates the version of the Internet Protocol used, e.g. IPv4
Data Packet - Checksum
this is an error checking value generated via an algorithm. The destination device generates a value using the same algorithm and if it doesn't match, it drops the packet. IP doesn't manage retransmission of dropped packets. This is left to the layers above to implement.
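As a sketch of how such a value can be generated and checked, the function below implements the 16-bit ones' complement sum used for the IPv4 header checksum (described in RFC 1071). The function name internet_checksum and the sample header bytes are illustrative.

```python
def internet_checksum(data: bytes) -> int:
    """16-bit ones' complement of the ones' complement sum of the data."""
    if len(data) % 2:
        data += b"\x00"  # pad to a whole number of 16-bit words
    total = 0
    for i in range(0, len(data), 2):
        total += (data[i] << 8) | data[i + 1]
    while total >> 16:
        total = (total & 0xFFFF) + (total >> 16)  # fold carries back in
    return ~total & 0xFFFF

# Sample 20-byte IPv4 header with the checksum field (bytes 10-11) zeroed
header = bytes.fromhex("450000730000400040110000c0a80001c0a800c7")
checksum = internet_checksum(header)

# The receiver runs the same algorithm over the header including the checksum;
# a result of zero means no error was detected
received = header[:10] + checksum.to_bytes(2, "big") + header[12:]
check = internet_checksum(received)
```

If any bit of the received header had been flipped in transit, the final check would be non-zero and the packet would be dropped.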