IDS 200 Exam 1


Table

The set of records of a particular type, like photos or users

brute force attack

An attack on passwords or encryption that tries every possible password or encryption key.
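As a rough illustration of "tries every possible password", the sketch below enumerates every lowercase string up to a small length and counts the attempts. The function name `brute_force`, the secret `"cab"`, and the four-character limit are all hypothetical choices for the demo.

```python
from itertools import product
import string

def brute_force(check, alphabet=string.ascii_lowercase, max_len=4):
    """Try every candidate up to max_len; return (match, attempts made)."""
    attempts = 0
    for length in range(1, max_len + 1):
        # product() yields every combination of the given length, in order
        for combo in product(alphabet, repeat=length):
            attempts += 1
            guess = "".join(combo)
            if check(guess):
                return guess, attempts
    return None, attempts

secret = "cab"  # hypothetical password, for illustration only
found, attempts = brute_force(lambda guess: guess == secret)
print(found, attempts)
```

The search space grows as (alphabet size)^(length), which is why longer passwords drawn from larger character sets take so much longer to break.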

ROM

(Read-Only Memory) One of two basic types of memory. ROM contains only permanent information put there by the manufacturer. Information in ROM cannot be altered, nor can the memory be dynamically allocated by the computer or its operator.

RDBMS

(Relational Database Management System) A software application that contains tools to manage data, answer queries, create user-friendly forms for data entry, and generate printed reports.
Pros:
- Efficient when data is stable
- Fields are homogeneous for individual objects
- New fields are not added
- Data can be stored to optimize query processing speed
Cons:
- Not flexible: difficult to change the database schema (i.e., add fields) while maintaining relationships between tables
- Data doesn't always fit neatly into a relational schema
- Designed to run on single servers, which caps scale

ACID & Performance

- As you might guess, strict adherence to ACID database properties can inhibit performance
- In a very large system with lumps of popular data, users will sometimes have to wait on operations
  - Can't comment on a post until another user has finished their comment
  - Deadlock: two users simultaneously attempt to post on each other's profiles, creating their own profile entry
- How much does it matter if a user post on one server isn't immediately echoed to all others?

Security vs Usability

- Because perfect security is impossible, managers must balance usability against security:
  - The cost of added inconvenience for system users (i.e., reduced usage up to abandonment)
  - The added system expenses of security
  - The expected damage caused by all attack attempts over time
- At the ideal point, managers minimize the sum of security costs and attack damage

Accuracy leads to Revenues

- Better user experience -> more time using
  - More stuff you want to see
  - Fewer posts from peripheral friends
- Pay-per-impression advertising:
  - More time using -> more ads viewed
  - More ads viewed -> more $$$
- Pay-per-click advertising:
  - More time using -> more ads viewed
  - Better ad-user match -> more clicks per view
  - More clicks on ads -> more $$$

key field

A field in a record that uniquely identifies instances of that record so that it can be retrieved, updated, or sorted

Freshness and aggregation

- More recent data should be emphasized
  - Better reflection of current interests and interactions with friends
  - But smaller sample size -> estimation errors
- Facebook isn't going to update the feed sequence formula for every new interaction
  - User histories can be aggregated into single variables for simpler & faster calculations
  - Aggregate history is usable for ad content

Encryption

- Storing passwords as plain text in databases is a clear and massive security risk
  - Hackers could potentially download data batches containing associated {id, password} pairs
  - With encryption, they would only get the masked data

Caching and Replication

- Worst case for Facebook's network: content is very popular and bandwidth-intensive
  - Celebrity interviews
  - Video clips from major sporting events
- However, most viewers won't watch content as it happens but staggered over time
- Two-part solution:
  - Caching: make popular content fast to reach
  - Replication: copy video segments to servers around the world to prevent a bottleneck at the origin server

Access

- Ensuring you can:
  - View your data
  - Update your data

Security

- Preventing outsiders from:
  - Viewing your data
  - Tampering with your data
  - Blocking your access to your data

tera

10^12 (trillion) typical hard drive

peta

10^15 (quadrillion) one day's Facebook traffic

exa

10^18 (quintillion) all data of a large business

zetta

10^21 (sextillion) total annual internet traffic

Kilo

10^3 (thousand) Short text file

Mega

10^6 (million) big image

giga

10^9 (billion) long video

key-value database

A NoSQL database model that stores data as a collection of key-value pairs in which the value component is unintelligible to the DBMS
- Every value (record, object) has a key, just like the unique key field in an RDBMS table
- The database doesn't enforce any structure on records in terms of fields or formats
- Features:
  - Flexible: changing an object definition has no effect on other objects
  - Scalable: can easily span storage devices, enabling very large databases
  - Speed: system can be optimized for non-ACID cases
  - Query limits: no table JOINs
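A minimal sketch of the key-value idea, assuming an in-memory dictionary stands in for the database. The class `KeyValueStore` and the keys `user:1` and `photo:7` are hypothetical. The store only sees opaque bytes; the application decides how to encode and decode each value.

```python
import json

class KeyValueStore:
    """Toy in-memory key-value store: values are opaque bytes to the store."""

    def __init__(self):
        self._data = {}

    def put(self, key: str, value: bytes) -> None:
        self._data[key] = value

    def get(self, key: str):
        # Lookup is by key only; there is no way to query inside a value
        return self._data.get(key)

    def delete(self, key: str) -> None:
        self._data.pop(key, None)

store = KeyValueStore()
# Two records with completely different fields: the store enforces no schema
store.put("user:1", json.dumps({"name": "Yvonne"}).encode())
store.put("photo:7", json.dumps({"size": 2048, "poster": "user:1", "tags": ["beach"]}).encode())

photo = json.loads(store.get("photo:7"))
print(photo["poster"])
```

Finding the poster's name takes a second `get("user:1")` issued by the application, which is the practical meaning of "no table JOINs".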

Hard Disk Drive

A non-volatile computer storage device containing magnetic disks or platters rotating at high speeds. Readable and writable.

strong password

A password that is difficult to break. Strong passwords should contain uppercase and lowercase letters, numbers, and punctuation symbols.

weak password

A password that is short in length (fewer than 15 characters), uses a common word (princess), a predictable sequence of characters (abc123), or personal information (Braden).

Responsive Design

A way to provide content so that it adapts appropriately to the size of the display on any device. Uses percentages to choose size of content.

Consistency

Actions are processed in sequence and according to any existing database rules

Phishing

An attack that sends an email or displays a Web announcement that falsely claims to be from a legitimate enterprise in an attempt to trick the user into surrendering private information

SQL Injection

An attack that targets SQL servers by injecting commands to be executed by the database.
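A small sketch of how such an injection can work, using Python's built-in `sqlite3` module with an in-memory database. The `users` table, the two login functions, and the attack string are hypothetical examples, not taken from any real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE users (name TEXT, password TEXT)")
conn.execute("INSERT INTO users VALUES ('alice', 's3cret')")

def login_unsafe(name, password):
    # User input is pasted into the SQL text, so it can change the query itself
    query = (
        "SELECT COUNT(*) FROM users "
        f"WHERE name = '{name}' AND password = '{password}'"
    )
    return conn.execute(query).fetchone()[0] > 0

def login_safe(name, password):
    # Placeholders keep user input as data, never as SQL commands
    query = "SELECT COUNT(*) FROM users WHERE name = ? AND password = ?"
    return conn.execute(query, (name, password)).fetchone()[0] > 0

attack = "' OR '1'='1"  # turns the WHERE clause into a condition that is always true
print(login_unsafe("alice", attack))
print(login_safe("alice", attack))
```

With the unsafe version, the attacker's text becomes part of the command; with the parameterized version, it is compared as an ordinary (wrong) password.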

Atomicity

Any set of related database operations either {all happen} or {all don't happen}, e.g., changing one data item plus all its backups.
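The all-or-nothing behavior can be sketched with `sqlite3` transactions. The `accounts` table, the balances, and the `transfer` helper are hypothetical. The second update violates a rule (no negative balances), so the first update is rolled back as well.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE accounts (name TEXT PRIMARY KEY, "
    "balance INTEGER CHECK (balance >= 0))"
)
conn.executemany("INSERT INTO accounts VALUES (?, ?)", [("yvonne", 100), ("zooey", 50)])
conn.commit()

def transfer(amount):
    """Move money from yvonne to zooey; both updates happen or neither does."""
    try:
        with conn:  # commits if the block succeeds, rolls back if it raises
            conn.execute(
                "UPDATE accounts SET balance = balance + ? WHERE name = 'zooey'", (amount,)
            )
            conn.execute(
                "UPDATE accounts SET balance = balance - ? WHERE name = 'yvonne'", (amount,)
            )
        return True
    except sqlite3.IntegrityError:
        return False

def balances():
    return dict(conn.execute("SELECT name, balance FROM accounts"))

ok_small = transfer(30)    # both updates apply
ok_large = transfer(500)   # second update breaks the CHECK rule, so nothing applies
print(ok_small, ok_large, balances())
```

The CHECK constraint also gives a small example of consistency: the database rule is enforced no matter which operation tries to break it.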

ACID

Atomicity, Consistency, Isolation, Durability

Auto Size

Automatic growth in depth or width of a text frame dependent on the amount of text it contains.

Dynamic RAM (DRAM)

Cheap, obsolescent

Thick Client

Client machines relatively powerful, like complete desktops

thin client

Client machines relatively weak, like terminals without permanent storage

Facebook feed

Content stream:
- Posts (friends & friends of friends)
- Ads (Facebook & others)

Big Data

Data quantities requiring special tools or systems to manage

Replication

Decentralization:
- Multiple copies of data are stored at separate server groups worldwide
- When necessary, requests can be forwarded to other server groups
Advantages of decentralization:
- Less latency
- Locally important data can be made more accessible (e.g., local news or language)
- System failure only affects the local system
- Smaller database size facilitates updates

DASD

Direct Access Storage Device
- Mix of sequential, indexed, and direct access
- Examples: hard drives, optical drives

Facebook Revenue streams

Direct sources:
- Ads (per click or per impression)
- Premium site content (e.g., original programming)
- Fees from site-enabled transactions
Indirect sources:
- More users
- Higher engagement

Edge Rank

EdgeRank is Facebook's algorithm that is used to determine the ranking of various fanpages and to what extent messages from those fanpages are shown (including with fans) in the timeline. The higher the EdgeRank of your page, the more people/fans will see your posts.

In-Memory Grids/Database Systems

Essentially, an in-memory grid is the same RAM as in your computer, but there's a lot more of it
- The system can operate at electronic speeds instead of physical speeds
- Up to literally a million times faster
- Backup: volatile memory is erased when power is lost, presenting a major reliability concern
- Although NVRAM systems are in development, backup is usually done on hard drive systems

Cloud System

Essentially, the operational requirements are broken into segments and allocated to computers
- Cloud details are hidden from users for security and ease of use

Synchronous RAM (SDRAM)

- Faster than ordinary dynamic RAM
- Much more expensive
- DDR levels indicate speeds relative to DRAM

Haystack

Initial problem: Facebook has lots of pictures and two options: hard disk storage (cheap but slow) or going through another CDN (expensive). Solution: restructure the retrieval process. Instead of multiple database calls per image, embed the metadata in the image URL and arrange photos in albums, which runs much faster.

non-volatile memory

Memory stored on a chip that does not lose data when the power is turned off. For example, ROM.

Matching feed to user preferences

Like Netflix, Facebook relies primarily on preferences revealed via user actions
- Many users will falsely claim preferences in line with an idealized version of themselves
- Don't ask, just observe
With matching preferences, the basic rule is to provide content similar to past interactions:
- Users whose posts you've commented on
- If you watch videos, you'll get more videos
- Also captured in profile dimensions

Hybrid Database System

Most organizational data is historic and rarely used, so instant response time is less important
- Rarely used data is stored on hard drives
- Frequently used data is stored in RAM for fast access
- Backups may be synchronous or asynchronous
Example: Oracle In-Memory Database Cache
- Hybrid system with RAM & hard drives
- Balances cost and speed

Isolation

No database operations affect others directly; each is executed the same regardless of any other operations occurring in parallel

flash memory

Non-volatile random access memory (NVRAM)
- Very slow for performing computations
- Used for storage & retrieval
- Example: USB drives

Hadoop

Open-source software framework that enables distributed parallel processing of huge amounts of data across many inexpensive computers. A large-scale number-crunching system.

Security Threats

- Physical: the physical facilities and devices on which data are stored
- Personnel: the employees entrusted with maintaining adherence to security policies
- Software: the computerized rules for providing system security
(Perfect security is impossible)

Thick applications

Processing mostly happens on the client

Thin Application

Processing mostly happens on the server, results returned over the network

Static RAM (SRAM)

RAM chips that retain information without the need for refreshing, as long as the computer's power is on. They are more expensive than traditional DRAM.
- Doesn't require constant refreshing
- Faster & less power consumption than DRAM
- Commonly used for high-performance CPU caches
  - L1/L2 for a particular CPU/core
  - L3 shared among CPUs/cores

RAM

Random Access Memory; temporary memory. RAM is expandable and resides on the motherboard.
- Data is accessed in approximately the same time regardless of its location
- Faster access but much more expensive
- Examples: main memory, USB drives, solid-state drives

Row (record)

Represents a particular thing of the table's type, like a specific photo or user

Query

Retrieve data (systems typically retrieve data more often than they write it)

SAM

Sequential Access Memory; storage accessed sequentially, such as a cassette tape or a CD
- Requires an initial "seek" delay to find the starting point
- Used to be cheap but slow: used for storage, not computations
- Surviving example: tape drives for extreme conditions

Striping

Splits data, instructions, and information across multiple drives in the array. Increases speed for reads and writes
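A toy sketch of striping, assuming each "drive" is just a byte string in memory and chunks are dealt out round-robin. The function names and the 4-byte chunk size are hypothetical; real arrays use much larger stripes and read the drives in parallel.

```python
def stripe(data: bytes, n_drives: int, chunk: int = 4):
    """Deal fixed-size chunks of data across the drives in round-robin order."""
    drives = [bytearray() for _ in range(n_drives)]
    for start in range(0, len(data), chunk):
        drive_index = (start // chunk) % n_drives
        drives[drive_index] += data[start:start + chunk]
    return [bytes(d) for d in drives]

def unstripe(drives, total_len: int, chunk: int = 4) -> bytes:
    """Reassemble the original data by reading chunks back in the same order."""
    out = bytearray()
    offsets = [0] * len(drives)
    turn = 0
    while len(out) < total_len:
        d = turn % len(drives)
        out += drives[d][offsets[d]:offsets[d] + chunk]
        offsets[d] += chunk
        turn += 1
    return bytes(out)

original = b"ABCDEFGHIJ"
drives = stripe(original, n_drives=3)
print(drives)
print(unstripe(drives, len(original)))
```

Because each drive holds only part of the file, several drives can work on one read or write at the same time, which is where the speed gain comes from.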

volatile memory

Storage (such as RAM chips) that is wiped clean when power is cut off from a device.

Map-Reduce

a technique for harnessing the power of thousands of computers working in parallel
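The pattern can be sketched on a single machine with a word count, the usual teaching example. The three phases below (map, shuffle, reduce) run one after another here; in a real cluster, the map and reduce calls would be spread across many computers. All function names and the sample documents are hypothetical.

```python
from collections import defaultdict

def map_phase(document):
    """Map: turn one document into (word, 1) pairs, independently of all others."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle: group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups

def reduce_phase(key, values):
    """Reduce: combine the grouped values for one key into a single result."""
    return key, sum(values)

documents = ["the cat sat", "the dog sat", "the end"]
mapped = [pair for doc in documents for pair in map_phase(doc)]
counts = dict(reduce_phase(key, values) for key, values in shuffle(mapped).items())
print(counts)
```

Each `map_phase` call needs only its own document and each `reduce_phase` call needs only its own key, so neither step requires coordination between workers.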

Durability

The database is robust against failures

Asymmetric Encryption (Two-Key Encryption)

The essential idea is that Yvonne and Zooey each have two "keys" to lock & unlock messages
- Public & private keys (the names refer to usage):
  - Encrypt with the public key, decrypt with the private key
  - Encrypt with the private key, decrypt with the public key
- RSA encryption algorithm: keys are derived from large primes
- Each user:
  - Shares the public key
  - Keeps the private key private
- This model extends to a pool of n users, requiring 2n total keys rather than n^2 - n for unique pairs
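A toy RSA sketch of the two-key idea, assuming tiny textbook primes (61 and 53) and a numeric message. These values are for illustration only; real keys use primes hundreds of digits long plus padding schemes omitted here. The helper names `apply_key` and `keys_needed` are hypothetical.

```python
from math import gcd

# Key generation from two (tiny, insecure) primes
p, q = 61, 53
n = p * q                      # modulus shared by both keys
phi = (p - 1) * (q - 1)
e = 17                         # public exponent, must share no factor with phi
assert gcd(e, phi) == 1
d = pow(e, -1, phi)            # private exponent: modular inverse of e (Python 3.8+)

public_key, private_key = (e, n), (d, n)

def apply_key(message: int, key) -> int:
    """Lock or unlock a number with either key: message^exponent mod n."""
    exponent, modulus = key
    return pow(message, exponent, modulus)

message = 65
# Encrypt with the public key, decrypt with the private key
ciphertext = apply_key(message, public_key)
recovered = apply_key(ciphertext, private_key)
# Encrypt with the private key, decrypt with the public key (the basis of signatures)
signature = apply_key(message, private_key)
verified = apply_key(signature, public_key)

def keys_needed(users: int):
    """Return (two-key total, pairwise total) using the counts given above."""
    return 2 * users, users**2 - users

print(recovered, verified, keys_needed(10))
```

The key-count comparison shows why the public/private model scales: the first number grows linearly with the pool of users, the second quadratically.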

Caching

The local storage of frequently needed files that would otherwise be obtained from an external source.
- Keep the stuff you use a lot where it's easy to reach
- "Hot" data: used often/recently
- "Cold" data: not used often/recently
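One common way to keep hot data close is a least-recently-used (LRU) cache, sketched below. LRU is an assumed eviction policy chosen for illustration; the class `LRUCache` and the stand-in `slow_origin` function are hypothetical.

```python
from collections import OrderedDict

class LRUCache:
    """Keep the most recently used items; evict the coldest when full."""

    def __init__(self, capacity, fetch):
        self.capacity = capacity
        self.fetch = fetch            # slow call to the external source
        self.items = OrderedDict()    # ordered from coldest to hottest
        self.misses = 0

    def get(self, key):
        if key in self.items:
            self.items.move_to_end(key)      # mark as recently used ("hot")
            return self.items[key]
        self.misses += 1
        value = self.fetch(key)              # cache miss: go to the origin
        self.items[key] = value
        if len(self.items) > self.capacity:
            self.items.popitem(last=False)   # drop the least recently used ("cold")
        return value

def slow_origin(name):
    # Stand-in for an expensive request to an external server
    return f"contents of {name}"

cache = LRUCache(capacity=2, fetch=slow_origin)
for name in ["a", "b", "a", "c", "b"]:
    cache.get(name)
print(cache.misses, list(cache.items))
```

Only misses reach the origin, so the more often the same items are requested, the more work the cache saves.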

Columns (fields)

The various attributes stored regarding each record in a table, such as:
- Photos: size, location, poster
- User: name, password, profile picture
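The table, row, column, and key field vocabulary can be tied together with a short `sqlite3` sketch. The `users` table, its column names, and the sample values are hypothetical; the password column holds placeholder strings only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Table: the set of all user records. Columns: the attributes kept for each user.
conn.execute("""
    CREATE TABLE users (
        user_id INTEGER PRIMARY KEY,   -- key field: uniquely identifies each row
        name TEXT,
        password TEXT,
        profile_picture TEXT
    )
""")

# Insert: each row (record) is one specific user
conn.executemany(
    "INSERT INTO users (user_id, name, password, profile_picture) VALUES (?, ?, ?, ?)",
    [(1, "Yvonne", "placeholder-1", "yvonne.jpg"), (2, "Zooey", "placeholder-2", "zooey.jpg")],
)

# Modify: change data within one record, located by its key field
conn.execute("UPDATE users SET profile_picture = ? WHERE user_id = ?", ("new.jpg", 2))

# Delete: remove one record
conn.execute("DELETE FROM users WHERE user_id = ?", (1,))

# Query: retrieve data
rows = conn.execute("SELECT user_id, name, profile_picture FROM users").fetchall()
print(rows)
```

Every modify and delete above locates its record through `user_id`, which is the role the key field plays.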

Hot data

Used often/Recently

Server

a computer or computer program that manages access to a centralized resource or service in a network.

denial of service attack (DDoS)

a cyber attack in which an attacker sends a flood of data packets to the target computer, with the aim of overloading its resources

spear phishing

a phishing expedition in which the emails are carefully designed to target a particular person or organization

Adaptive Design

a process that adjusts content to the screen size of a device used to access a webpage

NoSQL

Databases that aren't as strict as relational databases (a table might allow flexible fields for different records, or individual fields can be composite rather than atomic). Typically more efficient for storage, but you also typically lose the ability to do cross-table queries.
- Flexibility: easy to add or remove data fields
- Accommodates different data sizes or structures
- Auto-sharding: distributed operation across servers

Modify

change data within a record

Clickbait

content whose main purpose is to attract attention and encourage visitors to click on a link to a particular web page

encryption algorithm

Converts plain text to scrambled gibberish using a fixed pattern. Key point: reversible! Related: hashing (scrambling into groups, not necessarily reversible or with a 1-to-1 correspondence).

Metadata

data that describes other data

Cold Data

data that is rarely accessed and therefore stored on an organization's slowest storage option

Facebook ACID Solution

Facebook has in many cases chosen a policy of "eventual consistency": the system will ultimately be consistent, but if a person is commenting while you are watching a video, you may not be able to view the comment initially.

Content Management System (CMS)

information systems that support the management and delivery of documents including reports, web pages, and other expressions of employee knowledge

Redundant Arrays of Independent Disks (RAID)

Involves using parallel disks that contain redundant elements of data and applications. If one disk fails, the lost data are automatically reconstructed from the redundant components stored on the other disks.
- A group of physically independent hard drives
- Single I/O interface
- Allows operation as a single logical unit

Mirroring

making copies of files on different disks; doesn't inherently add time but it will add to the system load, so the RAID system can be much slower if busy

parity

Including error-checking data for any file. Writes can be substantially slower when combined with mirroring.
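One common form of parity is a bytewise XOR across blocks, sketched below as an assumption about how the error-checking data might be computed. It shows how a lost block can be rebuilt from the surviving blocks plus the parity block. The block contents and the helper `xor_blocks` are hypothetical.

```python
from functools import reduce

def xor_blocks(blocks):
    """XOR equal-length blocks together, byte position by byte position."""
    return bytes(reduce(lambda a, b: a ^ b, column) for column in zip(*blocks))

# Three data blocks, as if stored on three separate drives
data_blocks = [b"ABCD", b"EFGH", b"IJKL"]
parity = xor_blocks(data_blocks)     # stored on a fourth drive

# Suppose the drive holding the second block fails
survivors = [data_blocks[0], data_blocks[2], parity]
rebuilt = xor_blocks(survivors)      # XOR of everything else recovers the lost block
print(rebuilt)
```

The extra cost on writes is visible here: changing any data block means the parity block has to be recomputed and written too.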

read & write memory

memory that can be read and written.

Insert (table or record)

Add a new table to the database or a new record to a table

multifactor authentication

the use of two or more types of authentication credentials in conjunction to achieve a greater level of security

delete (table or record)

Remove a table from the database or a record from a table

Hashing

transforming plaintext of any length into a short code called a hash
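A short sketch using SHA-256 from Python's standard `hashlib` module, chosen here as one example of a hash function. The helper name `short_code` is hypothetical. Inputs of very different lengths produce codes of the same length, and the same input always produces the same code.

```python
import hashlib

def short_code(text: str) -> str:
    """Return the SHA-256 hash of text as 64 hexadecimal characters."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()

short = short_code("abc")
long = short_code("abc" * 10_000)   # 30,000 characters of input

print(len(short), len(long))        # both codes have the same fixed length
print(short == short_code("abc"))   # same input, same hash
print(short == long)                # different input, different hash
```

A stored password hash can be checked by hashing the attempt and comparing codes, without keeping the plain text. For real password storage, a salted and deliberately slow function such as `hashlib.pbkdf2_hmac` is generally preferred over a single fast hash.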

