Technical Concepts

Technical Explanation Questions

1. Clarify 2. Ask for time to think of a structure 3. Explain step-by-step 4. Conclude and discuss

How does Google Docs Work?

1. Clarify: which capability do we want to focus on?
2. Identify capabilities: 1) real-time collaboration, 2) storage, 3) permissioning. Choose real-time collaboration, since that is Google Docs' value proposition.
3. Draw a diagram and walk through a scenario: two clients make POST requests to a server. Say both clients have a document open containing the word "DOG" and try to change it simultaneously. The requests go into a FIFO queue on the server, which maintains the source of truth and adjusts each incoming request against the changes already applied. For example, user 1 inserts "X" at index 2 while user 2 deletes "D". If the deletion is processed first, the document becomes "OG", so the server knows the second request should now be "insert X at index 1".
4. Another scenario: user 1 deletes "DO" and user 2 deletes "D". Once the first request is processed, the second is invalidated (a "no-op", or no operation), because the "D" is already gone.
5. Another scenario: overlapping inserts. The server needs to apply both, since it cannot know whether the users meant to insert at the same position.
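
The index adjustment in the scenarios above can be sketched in code. This is a minimal illustration, not how Google Docs is actually implemented: the names `Op`, `apply`, and `transform` are hypothetical, the tie-break for overlapping inserts is an arbitrary choice, and a delete that spans an earlier insert is deliberately left out of scope.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Op:
    kind: str        # "insert" or "delete"
    index: int       # position in the document
    text: str = ""   # characters to insert (insert only)
    length: int = 0  # number of characters to remove (delete only)


def apply(doc: str, op: Op) -> str:
    """Apply one operation to the server's copy of the document."""
    if op.kind == "insert":
        return doc[:op.index] + op.text + doc[op.index:]
    return doc[:op.index] + doc[op.index + op.length:]


def transform(op: Op, applied: Op) -> Optional[Op]:
    """Adjust `op` for an operation the server has already applied.

    Returns None when nothing is left to do (a no-op).
    """
    if applied.kind == "insert":
        shift = len(applied.text)
        if op.index >= applied.index:
            # Overlapping inserts are both kept; the later request lands
            # after the earlier one (an arbitrary tie-break).
            return Op(op.kind, op.index + shift, op.text, op.length)
        if op.kind == "delete" and op.index + op.length > applied.index:
            raise NotImplementedError("delete spanning an insert is out of scope")
        return op

    start, end = applied.index, applied.index + applied.length
    if op.kind == "insert":
        if op.index <= start:
            return op
        # Inserts inside the deleted range collapse to its start.
        return Op("insert", max(start, op.index - applied.length), op.text)

    # Both are deletes: drop the characters that are already gone.
    overlap = max(0, min(op.index + op.length, end) - max(op.index, start))
    remaining = op.length - overlap
    if remaining == 0:
        return None
    new_index = op.index if op.index <= start else max(start, op.index - applied.length)
    return Op("delete", new_index, length=remaining)
```

With the document "DOG", applying the delete of "D" first and then transforming "insert X at index 2" gives an insert at index 1, matching scenario 3; transforming "delete D" against an already applied "delete DO" returns None, matching scenario 4.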

System Design Framework

1. Define Functional and Non-Functional Requirements (features in scope vs. out of scope) 2. Estimate: Expected Requests Per Second, Number of Users, how much data should be stored 3. Design the service at a high-level (draw a diagram) 4. Articulate the Data Model (e.g., users table with the following fields: SQL or NoSQL?) 5. Scale

Video Conferencing Tool - System Design

1. Functional Requirements: 1) 1:1 calls, 2) group calls, 3) audio, video, and screen share, 4) recording.
2. Non-functional Requirements: 1) Super fast: latency must be low enough to feel instantaneous, 2) high availability, 3) some data loss is acceptable.
3. Design: 1) Use UDP instead of TCP for video transfer; the rest of the communication can use TCP, 2) use WebRTC for 1:1 calls, but WebSocket for group calls, 3) for groups with a large number of users, route all the requests through a server, 4) communication should be encrypted, 5) microservices (many different services: video chat, recordings, storage, authentication, etc.)

Input/Output: what happens when you press a key in the keyboard?

1. The first step is to press a key in the keyboard. Each key has a binary code associated with it. 2. Store the key: a tiny chip / device called keyboard controller will take action as a key has been pressed and will store the key code to a memory called keyboard buffer - a small area in a computer's memory that holds data until it can be processed. 3. Notify the system software: the keyboard controller sends an interrupt request to the system software to read the key code from its buffer. 4. Read from buffer: the system software reads the key code from the keyboard buffer. 5. Send to CPU: system software sends the key code to CPU for further processing. 6. Clear buffer: after the key code is read from the system software, it will be removed from the queue of the keyboard buffer. 7. CPU interprets information, echoes the data to the screen and/or performs the instruction or sends the data to primary storage (RAM).
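
The buffer-and-interrupt flow above can be sketched as a small simulation. It assumes a FIFO keyboard buffer and uses ASCII codes as stand-ins for key codes; `KeyboardController` and `interrupt_handler` are hypothetical names, and real hardware and operating systems are considerably more involved.

```python
from collections import deque


class KeyboardController:
    """Toy keyboard controller with a FIFO buffer."""

    def __init__(self, on_interrupt):
        self.buffer = deque()
        self.on_interrupt = on_interrupt  # called like an interrupt request

    def key_pressed(self, key_code):
        self.buffer.append(key_code)  # step 2: store the key code
        self.on_interrupt(self)       # step 3: notify the system software

    def read(self):
        # Steps 4 and 6: reading a code also removes it from the buffer.
        return self.buffer.popleft()


def make_system_software(screen):
    """Return an interrupt handler that echoes each key to `screen`."""
    def interrupt_handler(controller):
        key_code = controller.read()   # step 4: read from the buffer
        screen.append(chr(key_code))   # steps 5 and 7: process and echo
    return interrupt_handler


screen = []
keyboard = KeyboardController(make_system_software(screen))
for code in (72, 105):  # ASCII codes for "H" and "i"
    keyboard.key_pressed(code)
print("".join(screen))
```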

How does the Internet work?

1. The internet is composed of many complex elements. I could talk about network connections, blockchain technology, specific web services, etc. But the most fundamental feature of the Internet is probably that websites can be accessed by typing a URL in a browser, so this is what I suggest we focus on. Is that what you have in mind too? 2. The Client browser uses the URL (e.g. example.com) to find the website's IP address, which is either stored in local memory or found with a DNS lookup. Let's use a metaphor for a second. A DNS resolver is like a big phone book matching URLs and IP addresses. If you wanted to call "John Smith" on the phone first you would need to find his number in the phone book. Next, the browser uses the IP address and queries the Internet for the website's data. This is like if you dialed John Smith's number, the phone company would make a connection between your phone lines. Then the website's Server sends appropriate data (e.g. an index.html file) back across the Internet. In our metaphor, when John Smith answers and says hello, his voice is translated into an electronic signal that's passed through the phone lines. Finally, the website's data reaches the browser, which then displays a visual interpretation of that data. This is like your phone's speaker turning the electronic signal into John Smith's voice again.
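
The "phone book" lookup with a local cache can be sketched as follows. This is a toy model under stated assumptions: the `Resolver` class is hypothetical, and the address comes from a range reserved for documentation, not from a real DNS record.

```python
class Resolver:
    """Toy DNS resolver: a phone book plus a local cache."""

    def __init__(self, phone_book):
        self.phone_book = phone_book  # hostname -> IP address
        self.cache = {}               # answers remembered locally
        self.lookups = 0              # how often the phone book was consulted

    def resolve(self, hostname):
        if hostname in self.cache:
            return self.cache[hostname]      # found in local memory
        self.lookups += 1
        address = self.phone_book[hostname]  # KeyError for unknown hosts
        self.cache[hostname] = address
        return address


# 192.0.2.10 is a made-up address from a documentation-only range.
resolver = Resolver({"example.com": "192.0.2.10"})
resolver.resolve("example.com")  # first visit: consults the phone book
resolver.resolve("example.com")  # second visit: served from the cache
print(resolver.lookups)
```

In real code, the standard library's `socket.getaddrinfo` performs the actual lookup; the sketch only shows why a repeated visit can skip it.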

Microprocessor

A microprocessor is an integrated circuit, and one of the most complex of its kind. It is a computer processor that incorporates the functions of a central processing unit (CPU) on a single integrated circuit, or a single chip. It is used in a computer system to execute logical and computational tasks so other external circuits, including memory or peripheral ICs, can perform their intended functions. In modern chips, the microprocessor is often more than the CPU: the same package may also contain other processors, such as a graphics processing unit. Peripherals such as sound cards and network cards carry their own dedicated controller chips. So a CPU is part of a microprocessor, but a microprocessor can be more than the CPU.

How does single sign-on work?

Single sign-on (SSO) is an authentication process that allows a user to access multiple applications with one set of login credentials. It usually relies on a central service that orchestrates the sign-on across multiple clients. In the case of Google, this central service is Google Accounts. When a user accesses her first Google product, such as Gmail, she is automatically redirected to Google Accounts to log in. Once she has logged in successfully, Google Accounts gives her several authentication cookies and redirects her back to Gmail. If she then tries to access another Google service, such as YouTube, she is first redirected to the same Google Accounts service, which validates her existing cookies and sends her on to YouTube.
There are three parties involved: the user, the service provider, and the identity provider.
Step 1: The user opens their browser and enters the URL.
Step 2: The request is sent to the web server.
Step 3: The web server generates a SAML authentication request and redirects the browser to the identity provider.
Step 4: If the user is already authenticated with the identity provider, this step is skipped. If not, the identity provider authenticates the user by prompting for a username and password or some other authentication factor.
Step 5: The identity provider generates a SAML response (token) and sends it to the browser.
Step 6: The browser forwards the token to the web server.
Step 7: The web server validates the SAML response.
Step 8: The server returns the secure page to the user.
Once a token is issued, it remains valid for a limited period, for example an entire session or a fixed time window. During that window, the user does not need to authenticate again.

How does OAuth work?

OAuth is an authorization protocol that allows you to approve one application interacting with another on your behalf without giving away your password. OAuth is about authorization, not authentication. Authorization is asking for permission to do things; authentication is proving you are who you claim to be, typically because you know a secret. OAuth doesn't pass authentication data between consumers and service providers; instead, it hands the application an authorization token. A familiar example of OAuth in action is one website asking, "Do you want to log into our website with another website's login?"
How OAuth works
The players in an OAuth transaction:
Resource Owner: the user who authorizes the application to access their account. The application's access to the user's account is limited to the scope of the authorization granted (e.g., read or write access).
Client: the application that wants to access the user's account. Before it may do so, it must be authorized by the user, and the authorization must be validated by the API.
Resource Server: hosts the protected user accounts.
Authorization Server: verifies the identity of the user, then issues access tokens to the application.
From an application developer's point of view, a service's API fulfills both the resource and authorization server roles. We refer to the two roles combined as the Service API.
Step 1: The application requests authorization to access service resources from the user.
Step 2: If the user authorizes the request, the application receives an authorization grant.
Step 3: The application requests an access token from the authorization server by presenting authentication of its own identity and the authorization grant.
Step 4: If the application identity is authenticated and the authorization grant is valid, the authorization server issues an access token to the application. Authorization is complete.
Step 5: The application requests the resource from the resource server and presents the access token for authentication.
Step 6: If the access token is valid, the resource server serves the resource to the application.
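
The six steps can be traced with a small in-memory simulation. This is a sketch under stated assumptions, not a real OAuth implementation: `ServiceAPI`, its methods, and the sample data are hypothetical, nothing goes over the network, and a production server would also handle redirects, token expiry, and constant-time secret comparison.

```python
import secrets


class ServiceAPI:
    """Plays both the authorization server and the resource server."""

    def __init__(self, client_id, client_secret):
        self.clients = {client_id: client_secret}
        self.grants = {}  # authorization grant -> (client_id, scope)
        self.tokens = {}  # access token -> scope
        self.resources = {"profile": {"name": "Alice"}}  # made-up data

    def authorize(self, client_id, scope, user_approves):
        # Steps 1-2: the user approves, the application gets a grant.
        if not user_approves or client_id not in self.clients:
            return None
        grant = secrets.token_urlsafe(16)
        self.grants[grant] = (client_id, scope)
        return grant

    def issue_token(self, client_id, client_secret, grant):
        # Steps 3-4: check the application's identity and the grant.
        if client_id not in self.clients or self.clients[client_id] != client_secret:
            raise PermissionError("unknown application")
        record = self.grants.pop(grant, None)  # a grant is single-use
        if record is None or record[0] != client_id:
            raise PermissionError("invalid authorization grant")
        token = secrets.token_urlsafe(32)
        self.tokens[token] = record[1]
        return token

    def get_resource(self, token, name):
        # Steps 5-6: serve the resource only for a valid token.
        if token not in self.tokens:
            raise PermissionError("invalid access token")
        return self.resources[name]


api = ServiceAPI("photo-app", "app-secret")
grant = api.authorize("photo-app", scope="read", user_approves=True)
token = api.issue_token("photo-app", "app-secret", grant)
print(api.get_resource(token, "profile"))
```

Note that the application never sees the user's password; it only ever holds the grant and the access token.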

Bluetooth

Bluetooth uses radio waves instead of wires and cables to transmit information between electronic devices over short distances. Unlike your cell phone, which uses radio waves to communicate with a cell tower several miles away, the radio waves Bluetooth products use are about 1000 times weaker and only travel small distances between the two communicating devices, typically around 10 meters (about 30 feet) or less. The devices communicate using electromagnetic waves: when your smartphone sends a long string of binary 1s and 0s to your earbuds, it communicates them by assigning a different wavelength to each of the two values. Your smartphone's antenna generates these wavelengths.

How would you explain cloud computing to your grandmother?

Cloud computing is like going out to a restaurant instead of cooking dinner at home. When you cook at home, you have to do everything yourself. You need to use your own plates, pots, and pans. You have to know how many people are coming over and buy the right amount of ingredients. If more people join for dinner than you expected, you'll run out of food! But if fewer people show up, then you'll have leftovers that go to waste. And on top of it all, you have to do all the cooking, set up, and clean up yourself. Before cloud computing existed, running an Internet business was actually pretty similar: You had to buy your own computers or rent dedicated servers from someone else. You had to predict the number of visitors who would come to your website and make sure your servers could handle that traffic. And you had to do a lot of infrastructure and maintenance work yourself—the computer equivalent of washing the dishes. With cloud computing, on the other hand, it's more like going to a big restaurant or cafeteria. Restaurants can handle groups of all sizes because they (usually) have more than enough food and plates to go around. If more friends want to join, the restaurant can move some tables around and give you more room. You're going to pay more on average than cooking yourself, but you only pay for what you eat. Cloud computing providers like AWS, Google Cloud, or Azure are similar. They have nearly unlimited computing power which allows you to worry about your business instead of buying and maintaining computers. As your business grows, you simply pay for what you need. Like fancy restaurants that offer exotic menu items, cloud platforms also offer new technologies that you can't find at home, like machine learning and more. And best of all—no washing dishes!

Technology Stack

Combination of different libraries, frameworks, applications that help you create a mobile app, simple website, or something that needs to scale up -> use the right tool for the right job. Break it down into 3 parts: 1) Front-end: includes tools required to build a user interface for the end user. 2) Back-end: includes server-side runtime along with a database to store user-generated data. 3) API: tools used to connect the front-end to the back-end.

How could you reduce load times?

Compress and optimize your images: Know when to use the appropriate file format for your images. Changing to a different file format can dramatically decrease the file size of an image. Compress images through lossy or lossless compression, and minimize image dimensions.
Compress and optimize your content: Use HTTP compression (e.g., gzip), in which the server compresses each response before sending it, so that less data has to cross the network.
Cache your web pages: Cache web pages and database queries to decrease the strain on your server and speed up page rendering. Caching stores copies of your site's files, minimizing the work the server must do to generate and serve a webpage to a visitor's browser. You can do this at the server level, meaning that your host handles it for you.
Enable browser caching: This lets the browser store a variety of resources, including stylesheets, images, and JavaScript files, so it doesn't have to download the entire page again every time a user visits it.
Use asynchronous loading: Your site is made up of CSS and JavaScript files; asynchronous loading enables multiple files to load at the same time, which can speed up the page's performance.
Choose a performance-optimized hosting solution: The hosting provider you use plays a major role in your website's management and performance. Does your provider offer shared hosting? Is your website sharing its resources with other websites?
Reduce your redirects: Too many redirects on your website can really hurt loading times. Every time a page redirects somewhere else, it prolongs the HTTP request and response process.
Minify CSS, JavaScript, and HTML: Remove unnecessary spaces, characters, comments, and other unneeded elements to reduce the size of the files.
Eliminate unnecessary plugins: Having too many plugins on your site can cause unnecessary bloat that slows it down. Additionally, plugins that are outdated or aren't well maintained can pose a security threat, and even introduce compatibility issues that hamper performance.
Leverage a Content Delivery Network (CDN): A CDN, also referred to as a "content distribution network", is a network of servers that can help improve page loading speed by hosting and delivering copies of your site's static content from servers located across the globe. A CDN works with, rather than in place of, your host. In addition to the server that hosts your primary website, you can use a CDN (e.g., Amazon CloudFront) to distribute copies of your site's files among strategically chosen data centers. Loading the content for a webpage from a server close to each visitor reduces the distance requests have to travel and therefore reduces network latency.
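
To make the HTTP compression tip concrete, here is a minimal sketch of a server-side helper that gzips a response body when the client advertises support for it. The function name `compress_response` and the simplified `Accept-Encoding` check (plain substring match, no quality values) are assumptions for illustration; in practice the web server or framework usually handles this.

```python
import gzip
from typing import Dict, Tuple


def compress_response(body: bytes, accept_encoding: str) -> Tuple[bytes, Dict[str, str]]:
    """Return the (possibly compressed) body and the response headers."""
    headers = {"Content-Type": "text/html; charset=utf-8"}
    # Simplified check: a real server would parse quality values too.
    if "gzip" in accept_encoding.lower():
        body = gzip.compress(body)
        headers["Content-Encoding"] = "gzip"
    headers["Content-Length"] = str(len(body))
    return body, headers


page = b"<li class='item'>Hello, world</li>\n" * 500
compressed, headers = compress_response(page, "gzip, deflate")
print(len(page), "bytes before,", len(compressed), "bytes after")
```

Repetitive markup like this compresses very well; already-compressed assets such as JPEG images gain little, which is why image optimization is a separate tip.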

How can you speed up an API call?

Compression: Use compression techniques to compress the data that is transferred between the server and the client.
Use faster data access strategies: The faster you can retrieve data from the database, the more responsive your Web API will be. If you use an ORM, a lightweight ORM will generally fetch data faster than a heavier, full-featured one.
Use caching: If you have requests that frequently produce the same response, a cached version of the response avoids excessive database queries.
Use asynchronous methods: Implement asynchronous methods to maximize the number of concurrent requests that your Web API can handle at a given point in time.
Prevent abuse: For example, a developer using your API in a local application may accidentally execute a loop, causing a sudden burst of requests that slows down your service. The best way to avoid these problems is to implement a rate-limiting strategy. By measuring the number of transactions per second per IP address or per token (if each client is authorized before accessing the API), you can cut off API clients that make excessive requests.
Use PATCH: Many developers believe PUT and PATCH are essentially the same method, but they update a resource in different ways. PUT replaces the entire resource with the payload it sends, whereas PATCH applies a partial update. The latter has a smaller payload, which can improve performance in some cases.
Design practical endpoints: Create practical, user-focused endpoints for your API. This minimizes the number of calls developers need to make, thereby reducing the cumulative latency of those calls and making your API feel faster.
Stream where applicable: With streaming APIs, the developer makes an initial request, and the server continually sends responses back as new data becomes available. Compare that to the alternative: data becomes available, sits around until the developer makes a request, and is only then sent from the server. Streaming eliminates the repeated requests sent from the developer to the server.
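
One common way to implement the rate limiting described under "Prevent abuse" is a token bucket per client. The sketch below is one possible approach, not the only one; `TokenBucket`, `RateLimiter`, and the chosen limits are hypothetical, and the clock is injectable so the behavior can be checked without waiting.

```python
import time


class TokenBucket:
    """Allows bursts up to `capacity`, refilled at `rate` tokens per second."""

    def __init__(self, rate, capacity, clock):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.tokens = float(capacity)
        self.updated = clock()

    def allow(self):
        now = self.clock()
        # Refill for the time elapsed since the last request.
        elapsed = now - self.updated
        self.tokens = min(self.capacity, self.tokens + elapsed * self.rate)
        self.updated = now
        if self.tokens >= 1:
            self.tokens -= 1
            return True
        return False


class RateLimiter:
    """Keeps one bucket per client key (an IP address or an API token)."""

    def __init__(self, rate, capacity, clock=time.monotonic):
        self.rate = rate
        self.capacity = capacity
        self.clock = clock
        self.buckets = {}

    def allow(self, client_key):
        if client_key not in self.buckets:
            self.buckets[client_key] = TokenBucket(self.rate, self.capacity, self.clock)
        return self.buckets[client_key].allow()


limiter = RateLimiter(rate=5, capacity=10)  # 5 requests/second, bursts of 10
print(limiter.allow("203.0.113.7"))  # made-up client address
```

A request for which `allow` returns False would typically be answered with HTTP status 429 (Too Many Requests).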

Main Memory

Computer memory consists of two basic types: primary or main memory (RAM and ROM) and secondary memory (hard drive, flash drive, etc.). Main memory is where the computer keeps instructions and data so that the CPU can get to them relatively quickly to do its work. The CPU interacts closely with main memory, referring to it for both instructions and data. For the CPU to access the memory at a particular address, it has to be connected to the memory. That is the function of the "memory controller" combined with the "memory bus". You can think of the controller as the traffic cop and the bus as a highway between the CPU and the main memory. Together they allow the CPU either to send out an address and receive the contents of that location in memory (a read), or to send out an address and the bits to be put into that location (a write). The memory controller and bus take care of getting the data to and from the memory, whatever the location, even when that memory is spread across multiple chips, as is usually the case. The name RAM, or "Random Access Memory", comes from the fact that the CPU can access address locations in main memory in any pattern it wants to.
ROM (read-only memory) is a type of memory that retains its contents even without power but cannot be changed (written) during normal operation. ROM contains the programming needed to start a PC, which is essential for boot-up; it performs major input/output tasks and holds programs or software instructions.

Continuous Delivery vs. Continuous Deployment

Continuous Delivery: The practice of automating the entire software release process. The idea is to do Continuous Integration and, in addition, automatically prepare and track a release to production. The desired outcome is that anyone with sufficient privileges to deploy a new release can do so at any time in one or a few clicks. The continuous delivery process typically includes at least one manual step: approving and initiating a deploy to production.
Continuous Deployment: A step up from Continuous Delivery in which every change in the source code is deployed to production automatically, without explicit approval from a developer. A developer's job typically ends at reviewing a pull request from a teammate and merging it into the master branch. A CI/CD service takes over from there, running all tests and deploying the code to production while keeping the team informed about the outcome of every important event.

Development, Staging, Prod Environments

Each server (computer) used to host a website or application needs particular hardware and software dependencies in order to function properly. Once a server is set up for your website, that setup is called the "environment".
Development: When developers first start to build your website, they do so in what we call the dev environment. This could be an internal office computer or a developer's personal computer.
Staging: A copy of your prod environment (your current live website) on a private server. This is a safe place to test any changes you plan to implement, preventing unexpected errors on your live website.
Prod: Where your final website is hosted.

SFTP

FTP: protocol for transferring data between a computer and a remote computer or server over an Internet connection. SFTP: SSH file transfer protocol; secure protocol for transferring files. Still uses a client and server connection to facilitate file transfer, but SFTP is a separate protocol. SFTP requires authentication by the server you're connecting to. SSH acts as a privacy layer for your connection, establishing a secure channel between your local computer and a remote computer or server. In addition, encryption helps ensure your data isn't visible in plain text as it transfers over the Internet.

Why does Gmail search take longer than Google search?

Gmail search is real-time: For most web queries, it's acceptable for search to lag slightly behind the newest content. When users search their inbox, however, they expect the email or chat they sent or received 30 seconds ago to appear at the top of the results. Otherwise, the search results are wrong. Combined with the fact that there are no real "popular" queries, this makes caching much more difficult and latency a harder problem to solve.
Gmail search has no significant caching or hot searches: When you search the web, for the most part you get the same results for your query as anyone else would. This means caching works well for web search. Most search engines have a small "hot index" of the most popular content that can handle the majority of queries and is replicated to many local data centers, giving a low average response time even if the worst case is slow.
Gmail search needs to be 100% accurate: Gmail search results need to be exact matches, whereas web search results are sorted by relevance, so approximations can be made to cut corners.
The total content of the Web may actually be smaller than the total content of everyone's Gmail: This means it could take more servers to hold all the indexes for mail search than for web search. Users only search across their own mail, but servers are still needed to store all the indexes so that every user can do this.
Gmail storage is different: It needs decryption for every query or search.

How would you explain the Internet to your grandmother?

Grandma, do you remember when you saw the telephone for the first time, and how you could call and talk to your brothers and cousins in another city? There was a pole near our house, and wires ran from it to the phones in our house and the neighbors' houses. Similar poles stood in other areas, including the cities you called to talk to your brothers and cousins. Now, imagine that instead of these wired phones, it is computers that are connected. Just like the telephones, computers (yes, those big TV-like creatures) were initially connected to each other through wires. As technology progressed, wired telephones were replaced by mobile phones that didn't need wires to work. Similarly, computers no longer need wires to connect to each other. Just as you could call anyone in any part of the world from your home phone with their phone number, you can reach anyone on their computer through this network (and no, you don't need their phone number to reach them). And just as you could share a message through your voice with a telephone, with computers you can share other information like pictures, audio, video calls, music, etc. So, the Internet is essentially a connection of millions and millions of computers interacting and sharing all these videos, pictures, music, etc.

How does basic authentication work?

HTTP Basic Authentication requires that the server request a username and password from the web client and verify that they are valid by comparing them against a database of authorized users. When basic authentication is declared, the following actions occur:
1. A client requests access to a protected resource.
2. The web server responds with an authentication challenge, which causes the browser to show a dialog box requesting the username and password.
3. The client submits the username and password to the server.
4. The server authenticates the user in the specified realm and, if successful, returns the requested resource.
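
On the wire, the client submits the credentials in an `Authorization` header containing the Base64 encoding of `username:password`. The sketch below shows both sides; the helper names and the in-memory `users` dictionary are assumptions for illustration, and a real server would store password hashes rather than plain passwords and would only accept this scheme over HTTPS, because Base64 is an encoding, not encryption.

```python
import base64
import hmac


def basic_auth_header(username, password):
    """Client side: build the value of the Authorization header."""
    raw = f"{username}:{password}".encode("utf-8")
    return "Basic " + base64.b64encode(raw).decode("ascii")


def check_basic_auth(header, users):
    """Server side: verify the header against a username -> password map."""
    scheme, _, encoded = header.partition(" ")
    if scheme.lower() != "basic":
        return False
    try:
        decoded = base64.b64decode(encoded).decode("utf-8")
    except ValueError:  # covers malformed Base64 and invalid UTF-8
        return False
    username, separator, password = decoded.partition(":")
    expected = users.get(username)
    if not separator or expected is None:
        return False
    # Constant-time comparison avoids leaking information through timing.
    return hmac.compare_digest(password.encode("utf-8"), expected.encode("utf-8"))


header = basic_auth_header("Aladdin", "open sesame")
print(header)
print(check_basic_auth(header, {"Aladdin": "open sesame"}))
```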

CPU

How a CPU works: 1. The control unit fetches the instruction from memory. 2. The control unit decodes the instruction and directs that the necessary data be moved from memory to the arithmetic/logic unit. 3. The arithmetic/logic unit executes the arithmetic or logical instruction. That is, the ALU is given control and performs the actual operation on the data. 4. The arithmetic/logic unit stores the result of this operation in memory or in a register. The control unit eventually directs memory to release the result to an output device or a secondary storage device. The steps are referred to as the instruction cycle of the CPU. Program Counter: register in the CPU. The CPU uses a program counter to keep track of which instruction to fetch next. The counter is the address of the memory location that holds the next instruction to be executed. The program counter is incremented to point to the next instruction after each fetch in the instruction cycle.
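
The instruction cycle and the role of the program counter can be illustrated with a toy machine. The four-instruction set (`LOAD`, `ADD`, `STORE`, `HALT`) and the single accumulator register are invented for this sketch and do not correspond to any real CPU.

```python
def run(program, memory):
    """Execute `program` on a toy accumulator machine and return memory."""
    pc = 0           # program counter: index of the next instruction
    accumulator = 0  # the machine's only register
    while True:
        opcode, operand = program[pc]  # fetch
        pc += 1                        # point at the next instruction
        # Decode and execute.
        if opcode == "LOAD":
            accumulator = memory[operand]
        elif opcode == "ADD":
            accumulator += memory[operand]
        elif opcode == "STORE":
            memory[operand] = accumulator  # store the result in memory
        elif opcode == "HALT":
            return memory
        else:
            raise ValueError(f"unknown opcode: {opcode}")


# Compute memory[2] = memory[0] + memory[1].
program = [("LOAD", 0), ("ADD", 1), ("STORE", 2), ("HALT", None)]
print(run(program, {0: 2, 1: 3, 2: 0}))
```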

Explain the concept of protocol to a 4-year-old child using "ice cream store" as an analogy

Imagine going into an ice-cream store. You are talking to the assistant behind the counter. Your conversation will stay within the following set:
-- You can ask the assistant for information on all the flavors they have available. When you do this, the assistant will give you a list of available flavors.
-- You can request tastings of up to three flavors. The assistant will give you tastings of the requested flavors.
-- You can ask to purchase ice-cream by asking for a flavor and a number of scoops, and specifying cone or cup. The assistant will then tell you the price of your order. When you give the assistant enough money, he will give you the requested ice-cream.
This set of rules defines the ice-cream shop protocol. That is, it tells you and the assistant how you should talk to each other: what you can ask for, and what you cannot ask for. Take another example: if you ask the assistant for a burger, he may be confused, because under the ice-cream shop protocol he doesn't expect customers to ask him for burgers.

Object Oriented Programming

In object-oriented programming, we combine a group of related variables and the functions that operate on them into a unit. We call that unit an object. We refer to the variables as properties and the functions as methods.
Encapsulation: the bundling of data, along with the methods that operate on that data, into a single unit. A class is an example of encapsulation in that it consists of data and methods that have been bundled into a single unit. Encapsulation is used to hide the values or state of a structured data object inside a class, preventing unauthorized parties from accessing them directly.
Abstraction: in our objects, we can hide some of the properties and methods from the outside. This makes the interface of those objects simpler.
Inheritance: a mechanism that allows you to eliminate redundant code by letting one class (the child class) inherit the properties and methods of another (the parent class). Example: Think of HTML elements like text boxes, dropdown lists, and checkboxes. All these elements have a few things in common: they should have properties like hidden and innerHTML and methods like click() and focus(). Instead of redefining all these properties and methods for every type of HTML element, we can define them once in a generic object called HTMLElement and have other objects inherit them.
Polymorphism: the ability of a variable, function, or object to take on multiple forms. Objects of classes belonging to the same hierarchical tree (inherited from a common base class) may have functions bearing the same name, each with different behavior. Example: a base class named Animals with subclasses Horse, Fish, and Bird. The Animals class has a function called Move, which is inherited by all the subclasses. With polymorphism, each subclass may implement the function in its own way. When the Move function is called on an object of the Horse class, the function might respond by displaying "trotting" on the screen. When the same function is called on an object of the Fish class, "swimming" might be displayed. In the case of a Bird object, it may be "flying".
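
The Animals example can be written out as a short sketch. The class and method names follow the example above, adapted to Python naming conventions (`Animal`, `move`); the `describe` method and the animal names are additions for illustration.

```python
class Animal:
    """Base class: data (name) and behavior bundled together."""

    def __init__(self, name):
        self.name = name

    def move(self):
        return "moving"

    def describe(self):
        # Inherited unchanged by every subclass, yet it calls
        # whichever move() the actual object provides.
        return f"{self.name} is {self.move()}"


class Horse(Animal):
    def move(self):
        return "trotting"


class Fish(Animal):
    def move(self):
        return "swimming"


class Bird(Animal):
    def move(self):
        return "flying"


for animal in (Horse("Star"), Fish("Bubbles"), Bird("Sky")):
    print(animal.describe())
```

The loop treats every object as an `Animal` and never checks its concrete type; the same call produces different behavior depending on the object, which is the point of polymorphism.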

World-wide web

Information system where documents and other web resources are identified by Uniform Resource Locators and are accessed via the Internet

Sandbox

Isolated virtual machine in which potentially unsafe software code can execute without affecting network resources or local applications

What is the difference between latency and bandwidth

Latency: measures the round trip time from the browser to the server; determines how fast the data can be transferred from the client to the server and back. The higher your network latency, the longer it will take for a data packet to reach the appropriate destination. Bandwidth: refers to the amount of data that can be transferred over a given time period (theoretically). For instance, if a network has high bandwidth, this means a higher amount of data can be transferred. Unit of measurement: bps, Mbps, Gbps. Bandwidth measures capacity. Throughput: amount of data able to be transferred over a given time period. Provides a practical measurement of the actual delivery of packets; the average throughput of data on a network gives users insight into the number of packets successfully arriving at the correct destination. In most cases, the unit of measurement for network throughput metric is bits per second (bps).
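
A rough back-of-the-envelope model shows how the two quantities combine. The function below is a deliberate simplification that assumes one round trip plus the time to push the bits through the link; it ignores connection setup, congestion control, and packet loss, and the numbers are illustrative only.

```python
def transfer_time(size_bytes, bandwidth_bps, rtt_seconds):
    """Estimated seconds to fetch `size_bytes` over a link.

    Simplified model: one round trip plus size / bandwidth.
    """
    return rtt_seconds + (size_bytes * 8) / bandwidth_bps


# A 1 MB file with a 50 ms round trip:
print(transfer_time(1_000_000, 100e6, 0.05))  # 100 Mbps link
print(transfer_time(1_000_000, 1e9, 0.05))    # 1 Gbps link

# A 1 KB API response over the same two links:
print(transfer_time(1_000, 100e6, 0.05))
print(transfer_time(1_000, 1e9, 0.05))
```

For the small response, the tenfold increase in bandwidth changes almost nothing because latency dominates; for the larger file, bandwidth accounts for most of the time on the slower link.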

Storage: long-term memory

Let's start by looking at what's available on computers. The traditional way of storing digital data is the hard disk drive (HDD), which is a spinning metal plate coated with a layer of magnetic material that stores your information. There's a special arm that reads and writes data to the drive. Then there's the newer solid-state drive (SSD), which has no moving parts but instead stores information in a huge grid of tiny boxes, called "cells." Each tiny cell stores a 0 or 1. This technology is called "flash memory," and it's very common: SSDs, flash drives, and SD cards all use flash memory to store information.
Hard drives vs. solid-state drives
So which form of storage is better? Hard drives are made of moving parts with fragile arms and discs, so they break down faster (even with normal use), make noise, are heavy, and use lots of power. Meanwhile, SSDs have no moving parts, so they're much sturdier, quieter, lighter, and more efficient. Plus, hard drives need to spin the disc and move the arm to find information while SSDs just need to send pulses of electricity, making SSDs a good deal faster than hard drives. In other words, SSDs beat hard drives in almost every way. Hard drives used to be cheaper per byte, but even that advantage is shrinking as SSDs get cheaper every year. So while hard drives were traditionally dominant in computers, SSDs are winning out. You can't even get MacBooks or Microsoft Surfaces with hard drives anymore; they all offer only SSDs.
Meanwhile, phones, tablets, and cameras can only ever use flash memory. (Remember, SSDs are just a special kind of flash memory designed for laptops.) One reason is that you can't make a hard drive small enough to fit in today's mobile devices, since you can only shrink the spinning plates so much. Therefore, smaller devices have had to use flash memory. Plus, flash memory is small, energy-efficient, and resistant to being dropped, all of which are very useful in mobile devices.

Streaming

Method of transmitting a media file in a continuous stream of data that can be processed by the receiving computer before the entire file has been completely sent. The term refers to the delivery method of the data rather than the data itself. Another common delivery method for audio and/or video data is downloading, in which the data ends up being stored on your computer. Streamed data is not stored on your computer, at least not the entire data file all at one time. Ex: if a person downloads a copy of a movie onto her computer so she can watch it again and again, then she is not streaming the movie when she watches it. She does not need to be connected to the Internet to watch the downloaded movie. However, if this same person goes to her Netflix account and clicks a button to begin watching the movie, then Netflix begins to stream the movie across the Internet to her computer. This transmission is more fleeting: at any one point in time, only a small portion of the movie is on her computer.

How does your tower know where your friend's tower is located?

Mobile Switching Center (MSC): central point of a group of cell towers. It holds the main information about your SIM card and transfers calls to the right recipients. The MSC that recorded your SIM card's information in its database is called your home MSC; this information can include the serial number of the SIM, your current location, service plan, PIN code, and telephone number.
To know which cell a subscriber is in within the MSC area, the MSC updates the subscriber's location using a few techniques:
- Time based: update the subscriber location after a certain period
- Location area based: update when the phone crosses a predefined number of towers
- When the phone turns on
If you travel outside the geographical area covered by your home MSC, a new MSC called the foreign MSC will handle your calls. The foreign MSC communicates with your home MSC, which therefore always knows where you are and can correctly direct incoming calls to your phone.
Example: Suppose Emma wants to call John. When Emma dials John's number, the call request arrives at Emma's home MSC. Upon receiving John's number, Emma's MSC forwards the request to John's home MSC. John's home MSC then checks which MSC he is currently in. If John is in his home MSC area, the call request is immediately sent to his current cell location. However, if John is not in his home MSC area, John's home MSC simply forwards the call request to the foreign MSC. The foreign MSC follows the previously explained procedure to locate John's phone, and then establishes the call.

Modem vs. Router

Modem brings the Internet into your home or business. Establishes and maintains a dedicated connection to your Internet Service Provider (ISP) to give you access to the Internet. Router comes in after the modem. Router is what routes or passes your Internet connection to all of your devices. Technically you don't need a router if you only want one of your devices to access the Internet. When you have multiple devices that need to connect to the Internet, then you would need a router. A lot of times your Internet Service Provider (ISP) will provide you with a modem-router combination, so it will be a modem with a built-in wireless router in one physical device.

Scaling

Multiple servers, load balancing, replication; CDNs, caching, rate-limiting; database sharding, advanced replication; regional strategies, NoSQL

SSH (secure shell)

Network protocol that allows one computer to securely connect to another computer over an unsecured network like the Internet. It differs from a VPN in that SSH connects to a particular computer, while a VPN connects to a network. SSH sends your data through an encrypted tunnel.

Moore's Law

Observation made by Gordon Moore that the number of transistors in a dense integrated circuit (IC) doubles about every two years, while the cost of computers is halved. In practice, Moore's Law is taken to mean that we can expect the speed and capability of our computers to increase every couple of years while we pay less for them.

Jenkins

Open-source automation tool built for continuous integration purposes. Used to build and test your software continuously, making it easier for developers to integrate changes to the project. How does Jenkins work? First, a developer commits code. Second, Jenkins picks up the changed code and triggers a build. Third, tests are performed. Fourth, the outputs are made available in the Jenkins dashboard. Automatic notifications can also be sent to the developers.

ETL (Extract, Transform, Load)

Process that extracts data from different source systems (e.g., databases, spreadsheets, other sources), then transforms the data (e.g., applying calculations, concatenations, etc.), and finally loads the data into the data warehouse system.
- Extract: data is pulled from a source and passed to staging. The staging area acts as a buffer between the data warehouse and the source data and is used for data cleansing and organization.
- Transform: data cleaning and organizing. All of the data from multiple sources is normalized and converted to a single system format, improving data quality and compliance.
- Load: data is sent to the warehouse. Depending on your business needs, data can be loaded in batches or all at once.
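The three ETL stages can be sketched in a few lines. This is a minimal illustration only: the CSV text stands in for a source system, an in-memory SQLite database stands in for the data warehouse, and the table name `sales` and the cleaning rules are assumptions made up for the example.

```python
import csv
import io
import sqlite3

# Hypothetical source data with inconsistent formatting and one bad row.
RAW_CSV = """name,amount
 alice ,10.5
BOB,3
carol,not-a-number
"""


def extract(text):
    """Extract: read rows from the source (here, CSV text)."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: normalize names and drop rows whose amount is not numeric."""
    clean = []
    for row in rows:
        try:
            amount = float(row["amount"])
        except ValueError:
            continue  # cleansing step: skip rows that cannot be converted
        clean.append((row["name"].strip().title(), amount))
    return clean


def load(rows, conn):
    """Load: write the cleaned rows into the warehouse table in one batch."""
    conn.execute("CREATE TABLE IF NOT EXISTS sales (name TEXT, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW_CSV)), conn)
result = conn.execute("SELECT name, amount FROM sales ORDER BY name").fetchall()
print(result)
```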

Compression

Reduction in the number of bits needed to represent data. Media, sound, and images are all typically stored in some compressed format. Lossless compression (rarer): rearranges the data to take up less space without giving up any fidelity at all (e.g., instead of storing numbers, we store the deltas between them). Lossy compression (more common): rearranges data to take up less space, but does not reproduce the data exactly; the result is just qualitatively very close.
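The "store the deltas" idea can be sketched as follows. The sample readings and the function names are illustrative assumptions; the point is that the deltas are small numbers (which need fewer bits) and that decoding recovers the original exactly, which is what makes it lossless.

```python
def delta_encode(values):
    """Keep the first value, then store only the change from one value to the next."""
    if not values:
        return []
    deltas = [values[0]]
    for previous, current in zip(values, values[1:]):
        deltas.append(current - previous)
    return deltas


def delta_decode(deltas):
    """Rebuild the original values by adding the deltas back up."""
    values = []
    running = 0
    for index, delta in enumerate(deltas):
        running = delta if index == 0 else running + delta
        values.append(running)
    return values


readings = [1000, 1002, 1003, 1003, 1007]   # hypothetical data
encoded = delta_encode(readings)            # [1000, 2, 1, 0, 4]
assert delta_decode(encoded) == readings    # nothing was lost
```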

Document Object Model (DOM)

Represents the content of an XML or HTML document as a tree structure so that programs can read, access, and change the document's structure, style, and content. Using DOM functions like getElementsByTagName, we can access individual elements. The DOM is a programming interface (API) and can be used with programming languages like JavaScript. Your browser takes the HTML sent from the server and converts it into the DOM; any JavaScript we write then interacts with the DOM as an API for the HTML. In fact, the DOM is language-agnostic, and we can use other programming languages to interact with it as well.
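Because the DOM is language-agnostic, the same idea can be shown outside the browser. The sketch below uses Python's standard-library `xml.dom.minidom`, which offers a `getElementsByTagName` method; the tiny document is made up for the example.

```python
from xml.dom.minidom import parseString

# Parse a small, well-formed document into a DOM tree.
doc = parseString("<html><body><p>Hello</p><p>World</p></body></html>")

# Read: find every <p> element and collect its text.
paragraphs = doc.getElementsByTagName("p")
texts = [p.firstChild.data for p in paragraphs]

# Update: change the text node of the first paragraph through the DOM.
paragraphs[0].firstChild.data = "Goodbye"
updated = doc.documentElement.toxml()

print(texts)
print(updated)
```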

Lossless Compression Techniques

Run-Length Encoding: compression technique that reduces repeated or redundant information; takes advantage of the fact that there are often runs of identical values in files. Ex: in the Pac-Man image, there are 7 yellow pixels in a row. Instead of encoding redundant data (yellow pixel, yellow pixel, yellow pixel, etc.), we can just say "there are 7 yellow pixels in a row" by inserting an extra byte that specifies the length of the run, and then we can eliminate the redundant data behind it. We need to preface all pixels with their run length. In some cases, this actually adds data, but on the whole, we've dramatically reduced the number of bytes we need to encode this image. This is lossless compression because we don't lose anything: the decompressed data is identical to the original before compression.
Dictionary Coders: another lossless technique; blocks of data are replaced by more compact representations. To do this, we need a dictionary that stores the mapping from codes to data. Ex: we are going to use pixel pairs (6 bytes long). In our example, there are only four pairings: White-Yellow, Black-Yellow, Yellow-Yellow, White-White. These are the data blocks in our dictionary that we want to generate compact codes for. These blocks occur at different frequencies, and we want the most common block to be replaced by the most compact representation.
Huffman: one of the most efficient methods of compression; popular for text compression.
- Frequency Counting: first, lay out all the possible blocks and their frequencies.
- Tree Building: a Huffman tree follows the same structure as a normal binary tree, containing nodes and leaves. Each Huffman leaf contains two values, the block and its corresponding frequency. To build the tree, we work through our table of frequencies and blocks, repeatedly combining the least frequent entries so that the blocks with the highest frequencies end up nearest the top of the tree. A Huffman tree helps us assign and visualize the new bit value assigned to existing blocks. If we start at the root node, we can traverse the tree by using 1 to move to the right and 0 to move to the left. The position of a leaf node relative to the root node determines its new bit value. There is no way to have conflicting codes because each path down the tree is unique.
- Character Encoding: encode the blocks from the initial file and write the encoded bytes to a new file. We must also save the code dictionary; we will need to attach it to the front of the image data.
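Both techniques can be sketched briefly. The pixel row, the block frequencies, and the two-letter block names ("YY" for Yellow-Yellow, and so on) are illustrative assumptions, not the actual Pac-Man data. The Huffman function builds the tree bottom-up by repeatedly merging the two least frequent entries, which is the usual way to get the most frequent blocks nearest the root.

```python
import heapq
from collections import Counter
from itertools import groupby


def rle_encode(pixels):
    """Run-length encoding: store (run length, value) instead of repeating the value."""
    return [(len(list(run)), value) for value, run in groupby(pixels)]


def rle_decode(runs):
    """Expand each (run length, value) pair back into the original sequence."""
    return [value for count, value in runs for _ in range(count)]


def huffman_codes(blocks):
    """Assign shorter bit strings to more frequent blocks using a Huffman tree."""
    frequencies = Counter(blocks)
    # Heap entries are (frequency, tie_breaker, tree). A tree is either a block
    # (leaf) or a two-item list [left, right] (internal node).
    heap = [(freq, i, block) for i, (block, freq) in enumerate(frequencies.items())]
    heapq.heapify(heap)
    if len(heap) == 1:
        return {heap[0][2]: "0"}
    tie_breaker = len(heap)
    while len(heap) > 1:
        freq1, _, left = heapq.heappop(heap)
        freq2, _, right = heapq.heappop(heap)
        heapq.heappush(heap, (freq1 + freq2, tie_breaker, [left, right]))
        tie_breaker += 1

    codes = {}

    def walk(node, prefix):
        if isinstance(node, list):
            walk(node[0], prefix + "0")   # 0 = move left
            walk(node[1], prefix + "1")   # 1 = move right
        else:
            codes[node] = prefix

    walk(heap[0][2], "")
    return codes


row = ["W", "Y", "Y", "Y", "Y", "Y", "Y", "Y", "B"]      # hypothetical pixel row
runs = rle_encode(row)                                   # [(1, 'W'), (7, 'Y'), (1, 'B')]

pairs = ["YY"] * 4 + ["WY"] * 2 + ["BY"] + ["WW"]        # hypothetical frequencies
codes = huffman_codes(pairs)
encoded_bits = "".join(codes[p] for p in pairs)
print(runs, codes, len(encoded_bits))
```

With these made-up frequencies, the most common pair gets a 1-bit code and the rarest pairs get 3-bit codes, so the eight pairs fit in 14 bits plus the dictionary.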

Protocol

Set of rules about how two people (or machines) should talk to one another. Set of rules or procedures for transmitting data between electronic devices, such as computers.

Continuous Integration

Software development practice where everyone on the engineering team continuously integrates small code changes back into the codebase. After each change, a suite of tests runs automatically and checks the code for bugs or errors. In short: the practice of merging code changes back into the main branch as quickly and as often as possible. Before continuous integration: two developers would work independently on their own features for a couple of weeks or months, then integrate their work together. This would result in many merge conflicts ("merge hell"). With continuous integration: as soon as the first developer has something that works even a little bit (her feature is not done, but the code works and doesn't break things), she submits it to the shared repository. The other developer pulls down the code with the latest changes from the first developer, makes his changes, and submits them. The first developer can then pull the code with the second developer's latest changes and work off that. Now they're working on these things together, which reduces the number of conflicts.

How do emails work?

Summary: Emails are routed to user accounts via several mail servers. The servers route each message to its final destination and store it so that users can pick it up once they connect to the email infrastructure. When you click send, the message is transmitted from your computer toward the server associated with the recipient's address, typically passing through several other servers before it reaches the recipient's mailbox.
Example: Adam is trying to send an email to Greg. Adam's email is [email protected], while Greg's email is [email protected]. What happens when Adam sends an email to Greg:
1. The email is first sent to an outgoing SMTP server (in this case, the Gmail.com mail server), whose job is to transport emails. Adam's computer tells the Gmail.com mail server that it wants to initiate a mail transfer to [email protected].
2. The SMTP server contacts the DNS server (the Internet's phone book) to translate the domain in the recipient's email address into a computer-friendly IP address in order to locate the recipient's mail server and deliver the message.
3. The SMTP server (Gmail.com mail server) uses SMTP (Simple Mail Transfer Protocol) to transmit the message to the recipient's mail server (in this case, the Hotmail.com mail server), also known as an MTA. The message may need to be routed from server to server via SMTP until it reaches the recipient's email server (there can be many hops along the way). This relies on TCP/IP: once an email is sent, TCP breaks it down into packets, and each packet carries the source and destination addresses. IP routes the packets to the intended destination. Routers across the Internet examine the addresses in each packet to work out the most efficient route to the destination server and forward the packets to the next router. Several factors go into how email packets are routed, such as traffic volume on any given network. Once the packets have arrived at the recipient's email server, TCP recombines them into the format in which the email was sent (one that the recipient can read).
4. Once the mail arrives at the recipient's mail server (Hotmail.com mail server), the recipient's MTA decides exactly where to put the mail. It forwards the email to the incoming mail server (MDA, mail delivery agent), which is tasked with storing the mail until the user retrieves it. To retrieve email from an MDA, a supporting protocol must be used (POP or IMAP).
5. The recipient then receives a new email notification and fetches the mail, usually using a client that works via POP or IMAP.
Protocols: a number of different protocols are used here. They are designed to allow a mail client to communicate with the mail server to do things like retrieve mail, send mail, organize mail, etc.
- SMTP: protocol mail servers use to communicate with each other about sending mail, rejecting mail if they don't know who the recipient is, etc. Carries the transmission details of an email message and is used specifically for outgoing mail.
- MTA (Mail Transfer Agent): server that uses SMTP to deliver emails.
- IMAP (Internet Message Access Protocol): leaves emails on the server while caching (temporarily storing) emails locally. After the client connects to the email server, it fetches whatever content you requested. This is cached locally, so you can work on your device. Once you make changes to your email, the server processes and saves these changes, then disconnects. The biggest aspect to remember is that all changes with IMAP happen on the server. You aren't downloading local copies of all your messages; you're using the email client to manage the email stored on the server. The only information stored on your device (unless you explicitly download something) is cached copies for efficiency. Advantages: allows multiple clients to manage the same inbox, which is in line with how most people use email today (accessible from multiple devices); saves local storage space by not requiring your computer to download all messages.
- POP (Post Office Protocol), largely outdated today: application-layer protocol that retrieves email from a remote mail server for access by the host machine. POP downloads emails from the server for permanent local storage. The email client first connects to the email server. Once it's successfully connected, it grabs all the mail on the server and stores it locally on your device so you can access it in your email client. Finally, it deletes the mail in question from the email server before disconnecting. This means that the messages then only exist on the device you downloaded them to. Note that while POP deletes mail from the server by default, a lot of POP setups allow you to leave copies of your email on the server. This can be useful if you're worried about losing your mail, but if your mail provider doesn't offer much server space, it can cause you to run out quickly. Advantages: mail is stored locally, so it's always accessible even without an internet connection; it saves server storage space since old messages are deleted from the server automatically. Disadvantages: not designed for checking email from multiple devices (ex: if you delete an email on one device, that deletion doesn't sync to the server, so other devices will still have that message); downloading every message from your POP account can use up a lot of space on your device.
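The sending side of this flow can be sketched with Python's standard library. The addresses, host name, and helper names below are placeholders invented for the example (they are not the addresses from the scenario above), and the `send` function is defined but deliberately not called, since it needs a real outgoing SMTP server and credentials.

```python
import smtplib
from email.message import EmailMessage


def build_message(sender, recipient, subject, body):
    """Create the message the mail client hands to the outgoing SMTP server."""
    msg = EmailMessage()
    msg["From"] = sender
    msg["To"] = recipient
    msg["Subject"] = subject
    msg.set_content(body)
    return msg


def send(msg, host, port=587, username=None, password=None):
    """Submit the message to an outgoing SMTP server (hypothetical host; not called here)."""
    with smtplib.SMTP(host, port) as server:
        server.starttls()                 # encrypt the connection before logging in
        if username:
            server.login(username, password)
        server.send_message(msg)


msg = build_message("adam@example.com", "greg@example.org", "Hello", "Hi Greg!")

# The part after "@" is what the sending server looks up in DNS to find
# the recipient's mail server.
recipient_domain = msg["To"].split("@")[1]
print(recipient_domain)
```

Retrieval on the recipient's side would use a separate protocol (POP or IMAP), which is why sending and fetching are configured separately in most mail clients.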

Cellular Data

Technology that lets you connect wirelessly using cell towers that transmit and receive radio signals. There are usually two ways you can connect to the internet: WiFi network or cellular data. While both of these use radio waves, WiFi covers a limited area and cellular data, on the other hand, lets you connect as long as you're in a geographical area covered by your mobile network carrier. Does WiFi use cellular data? The simple answer is no. Though they both use radio frequencies, their connections are independent of each other. So, you don't have to worry about running out of your data allotment if you're connected to the internet via WiFi.

Registers: Temporary Storage Areas

Temporary storage areas for instructions or data. They are not a part of memory; rather they are special additional storage locations that offer the advantage of speed.

How does DNS lookup work?

The Resolving name server queries a Root server, which points to the appropriate TLD server. Let's use our phone book metaphor again. When calling John Smith, we would start our search for his number in the phonebook by deciding between the yellow pages for business numbers versus white pages for personal numbers. This is what the Root name server does for us. The Resolving name server then queries the appropriate TLD server, which points to the Authoritative name server. This would be like using the category and alphabetical sorting of the phonebook to find the specific page that lists all the "Smiths." The Resolving name server then queries the appropriate Authoritative name server, which will provide the website's IP address. There could be a lot of "John Smiths" listed in the phonebook, so you would use street addresses to determine the exact phone number for the specific John Smith you're trying to call.
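The three-step chain can be sketched as a toy resolver. Everything here is a stand-in: the dictionaries play the role of the root, TLD, and authoritative name servers, the server labels are invented, and the IP address comes from a range reserved for documentation. A real resolver sends network queries and caches the answers.

```python
# Hypothetical in-memory stand-ins for the three kinds of name servers.
ROOT_SERVER = {"com": "tld-com-server", "org": "tld-org-server"}
TLD_SERVERS = {"tld-com-server": {"example.com": "ns-example-server"}}
AUTHORITATIVE_SERVERS = {"ns-example-server": {"www.example.com": "192.0.2.10"}}


def resolve(hostname):
    """Follow root -> TLD -> authoritative, the way a resolving name server would."""
    labels = hostname.split(".")
    tld = labels[-1]                     # e.g. "com"
    domain = ".".join(labels[-2:])       # e.g. "example.com"

    tld_server = ROOT_SERVER[tld]                           # step 1: root points to the TLD server
    authoritative = TLD_SERVERS[tld_server][domain]         # step 2: TLD points to the authoritative server
    return AUTHORITATIVE_SERVERS[authoritative][hostname]   # step 3: authoritative returns the IP


print(resolve("www.example.com"))
```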

Integrated Circuit

The basic idea was to take a complete circuit, with all its many components and the connections between them, and recreate the whole thing in microscopically tiny form on the surface of a piece of silicon.

Clock Speed

The complete sequence of fetching an instruction, decoding it, getting the data ready, executing the operation, and moving the data back is known as a cycle. A typical processor performs billions of cycles per second (e.g., 2.4 billion). This measure is referred to as the clock speed and is expressed these days in gigahertz (GHz), where giga means a billion and hertz means cycles per second.

Video Resolution

The total number of pixels in a given video frame. The higher the number of pixels in a given frame, the better the quality of the video.

Broadcasting

The use of electromagnetic waves to send information in all directions; distribution of audio or video content to a dispersed audience via any electronic mass communications medium, but typically one using the electromagnetic spectrum (radio waves), in a one-to-many model

Analog vs. Digital

The world is basically analog. Ex: sound is just pressure waves in the air, and the shape of the wave is kept the same as it travels from the air to your ear. That is the fundamental quality of analog processes: the signal moves from step to step but keeps its shape. Analog signals use a continuous range of values to represent information; digital signals, on the other hand, use discrete 0s and 1s. An analog signal uses a given property of the medium to convey the signal's information, such as electricity moving through a wire. In an electrical signal, the voltage, current, or frequency of the signal may be varied to represent the information. To make a signal digital, we reduce it to a series of numbers (digitization). The process of analog-to-digital conversion samples a waveform at a consistent rate, measures the level of each sample, and assigns it a numerical value (sample = the computer takes a "snapshot" of the signal every few microseconds). Digital-to-analog conversion (audio playback): make a pattern of electricity that closely follows the original signal, then amplify that electricity and feed it into the speaker.
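Analog-to-digital conversion can be sketched as two steps: take snapshots at a fixed rate (sampling), then round each snapshot to one of a limited number of levels (quantization). The sine wave, the sample rate, and the bit depth below are deliberately tiny, illustrative assumptions; real audio uses tens of thousands of samples per second and 16 or more bits.

```python
import math


def sample_and_quantize(freq_hz, sample_rate_hz, n_samples, bits):
    """Sample a sine wave at a fixed rate and map each sample to an integer level."""
    levels = 2 ** bits
    samples = []
    for n in range(n_samples):
        t = n / sample_rate_hz                               # time of this snapshot
        amplitude = math.sin(2 * math.pi * freq_hz * t)      # "analog" value in [-1, 1]
        level = round((amplitude + 1) / 2 * (levels - 1))    # nearest of the available levels
        samples.append(level)
    return samples


# One cycle of a 1 Hz wave, 8 snapshots per second, 3 bits (8 levels) per sample.
samples = sample_and_quantize(freq_hz=1, sample_rate_hz=8, n_samples=8, bits=3)
print(samples)
```

The output is just a list of small integers; the continuous wave has been reduced to a series of numbers that can be stored as bits.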

How would you explain the Internet to a child?

Tree:
- The computer in front of you would be a single leaf
- The leaf is connected to a branch with a bunch of other leaves on it (small network, LAN)
- That small branch is connected to a larger branch, which is connected to other computers around the world (global networks)
If they wonder how one computer or leaf can find another leaf on the other side of the tree, you can tell them that each leaf has its own address, like a mailbox (IP address), so that the leaves can find each other.

Frames Per Second

Unit that measures display device performance in video capture, playback, and video games. FPS is used to measure frame rate (the number of images consecutively displayed each second) and is a common metric when discussing video quality. The human brain can only process about 10 to 12 separate images per second; frame rates faster than this are perceived as motion. The greater the FPS, the smoother the video motion appears. Full-motion video is usually 24 FPS or greater.

What is the Internet?

Vast network that connects computers all over the world

MapReduce

Way of structuring your computation that allows it to easily be run on lots of machines; programming model that processes and analyzes huge data sets in parallel by splitting the work across a cluster of machines. How does it work? It forces you to break up what you're trying to do into three stages: Map, Shuffle, Reduce.
Example: count the number of times the words "Cat" and "Dog" appear in 4 given documents with 2 computers.
- Map: first, divide the documents among the different computers. Assign the first 2 documents to the first computer and the last 2 documents to the second computer. Each computer counts the number of times that a particular word appears in its assigned documents. No communication between the machines is allowed during the map phase; each machine just processes the documents it's given.
- Shuffle: second, the shuffle/sort stage sorts the keys (the words you are looking for in the documents) so that each Reduce machine gets all the values for the same key (group all the "Cat" keys together and send them to one machine, and group all the "Dog" keys together and send them to the other machine).
- Reduce: take the results from the Map stage and combine them to get the final result you are looking for. You can use the same two machines as in the Map stage or two new machines; it does not matter.
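The Cat/Dog example can be sketched on a single machine, with plain function calls standing in for the two computers. The four document strings are made up for the illustration, and the map step here emits a (word, 1) pair per occurrence rather than a per-machine subtotal, which is a common way to write it.

```python
from collections import defaultdict

# Four hypothetical documents; the first two go to machine 1, the last two to machine 2.
DOCUMENTS = ["cat dog cat", "dog", "cat cat", "dog dog cat"]
TARGETS = {"cat", "dog"}


def map_phase(docs):
    """Map: each machine independently emits (word, 1) for every target word it sees."""
    pairs = []
    for doc in docs:
        for word in doc.split():
            if word in TARGETS:
                pairs.append((word, 1))
    return pairs


def shuffle_phase(all_pairs):
    """Shuffle: group the values by key so each key ends up in one place."""
    grouped = defaultdict(list)
    for key, value in all_pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into the final count."""
    return key, sum(values)


machine_1 = map_phase(DOCUMENTS[:2])
machine_2 = map_phase(DOCUMENTS[2:])
grouped = shuffle_phase(machine_1 + machine_2)
counts = dict(reduce_phase(key, values) for key, values in grouped.items())
print(counts)
```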

Operating Systems (bridge between hardware and application layer on top)

What makes all parts of a computer system work together. One of the first programs that is run when a computer is turned on; all subsequent programs are launched by the OS. Operating systems are just programs, but special privileges on the hardware let them run and manage other programs. The OS handles such critical tasks as scheduling which program to run, reading from and writing to files (in storage), communicating over networks, and accessing peripherals. Every modern OS consists of two fundamentally different parts: the kernel and everything else. As the name suggests, the kernel is central and is where all the really hard stuff happens, such as switching between different programs or writing data to an output device. A lot of OS design and development work has gone into figuring out what to put into the kernel. No matter where the line is drawn, the programs that end users run on top of the operating system are only allowed to access memory in what is known as user space (sometimes called user land), which is kept separate from kernel space. Only code executing inside the kernel can access kernel memory.

How does your mobile phone work?

When you speak on your phone, your voice is picked up by your phone's microphone. The phone turns your voice into a digital signal; the digital signal contains your voice in the form of zeros and ones. An antenna inside the phone takes these zeros and ones and transmits them in the form of electromagnetic waves. Electromagnetic waves carry the zeros and ones by altering the wave characteristics, such as the amplitude, frequency, phase, or combinations of these. For example, in the case of frequency, zero and one are transmitted by using low and high frequencies respectively. In cellular technology, a geographic area is divided into hexagonal cells, with each cell having its own tower and frequency slot. Generally, these cell towers are connected through wires, specifically optical fiber cables, laid under the ground or the ocean to provide national or international connectivity. The electromagnetic waves produced by your phone are picked up by the tower in your cell, which converts them into high-frequency light pulses. These light pulses are carried to the base transceiver box, located at the base of the tower, for further signal processing. After processing, your voice signal is routed towards the destination tower. Upon receiving the pulses, the destination tower radiates the signal outwards in the form of electromagnetic waves, and your friend's phone then receives it. This signal undergoes a reverse process, and your friend hears your voice.

Lossy Compression Techniques

Removing unnecessary or less important information, especially information that human perception is not good at detecting.
Ex: Lossy audio compressors encode different frequency bands at different precisions. This is also why you sound different on a cellphone vs. in person. As the signal quality or bandwidth gets worse, compression algorithms remove more data, further reducing precision, which is why Skype calls sometimes sound like robots talking. Almost all audio compression codecs are lossy as opposed to lossless. This data reduction is not considered to be a big detriment to sound quality, provided the removed data is deemed inaudible to the vast majority of listeners. Anything beyond the range of human hearing can be removed without us even noticing. In addition, sounds with a higher frequency are more difficult to hear if they're surrounded by louder, lower-frequency sounds. Compression exploits this phenomenon through something called "masking," in which a louder, lower-frequency sound covers up the loss of a higher, quieter one.
Ex: Lossy compressed image formats (e.g., JPEGs): human perception is good at detecting sharp contrasts like the edges of objects, but our perception isn't so great with subtle color variations. Transform encoding: type of encoding used for JPEG images; averages out the color in small blocks of the image to create an image that has far fewer colors than the original. Chroma subsampling: takes into account that the human eye perceives changes in brightness more sharply than changes in color, and thus drops or averages some color information while maintaining brightness information.
Ex: Video compression (e.g., MPEG-4): videos are really just long sequences of images, but video compression can do more because between frames, a lot of pixels are going to be the same (temporal redundancy). We don't need to re-transmit those pixels every frame of the video; we can just copy patches of data forward. When there are small pixel differences, most video formats send data that encodes just the difference between patches, taking advantage of inter-frame similarity. Video compression is performed by a video codec that uses one or more compression algorithms. The fanciest video compression formats go one step further: they find patches that are similar between frames and not only copy them forward, but can also apply simple effects to them like a shift or rotation. They can also lighten or darken a patch between frames.
Text Compression: typically works by finding similar strings within a text file and replacing those strings with a temporary binary representation to make the overall file size smaller. Unlike the techniques above, it has to be lossless, since every character matters.
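The "send only the difference between frames" idea can be sketched with two tiny frames. This is a simplified illustration: the frames are short lists of made-up pixel values, the diff is per pixel rather than per patch, and the diff itself is exact. Real codecs work on blocks and add lossy steps on top.

```python
def frame_delta(previous_frame, current_frame):
    """Record only the pixels that changed, as (position, new value) pairs."""
    return [
        (index, new)
        for index, (old, new) in enumerate(zip(previous_frame, current_frame))
        if old != new
    ]


def apply_delta(previous_frame, delta):
    """Rebuild the current frame by copying the previous one forward and patching it."""
    frame = list(previous_frame)
    for index, value in delta:
        frame[index] = value
    return frame


frame_1 = [0, 0, 5, 5, 0, 0]     # hypothetical pixel values
frame_2 = [0, 0, 5, 6, 0, 0]     # only one pixel differs from frame_1

delta = frame_delta(frame_1, frame_2)          # one entry instead of six pixels
assert apply_delta(frame_1, delta) == frame_2
print(delta)
```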

