SQL/databases/servers/OS/VM/MongoDB

¡Supera tus tareas y exámenes ahora con Quizwiz!

operating system

the software that supports a computer's basic functions, such as scheduling tasks, executing applications, and controlling peripherals

delay-line memory

think of tube filled with mercury, send a pressure wave down (= 1), at any point there can be many pressure waves(data) memorized in the tube!

creating pattern to use with LIKE keyword

think regular expression! i.e. a% = start with a %a = end with a %or% = or in any position _r% r in second position a_% start with a and at least 2 characters in length

having clause

to filter the results grouped by a GROUP BY clause. Can only reference aggregate functions or column that are also specified in the GROUP BY clause.

union vs union all

union all allows duplicates

update

update certain entries, make sure to use where clause to stop ALL records in table from being updated UPDATE table_name SET column1 = value1, column2 = value2, etc. WHERE condition;

Count(*)

use with group by to just count number of entries grouped together

system call

way for programs to interact with kernel

page split

when you insert a record into a full page (especially an index page) and you have to split the records between a new page and the old page to balance the index again expensive!

Transact-SQL (T-SQL)

the primary means of programming and managing SQL Server. It exposes keywords so that you can create and manage databases and their components and monitor and manage the server itself. programming extension from Microsoft that adds features to SQL has local variables, mathematical operation functions, etc. passes turing completeness test ! SQL does not

common SQL commands

SELECT - extracts data from a database UPDATE - updates data in a database DELETE - deletes data from a database INSERT INTO - inserts new data into a database CREATE DATABASE - creates a new database ALTER DATABASE - modifies a database CREATE TABLE - creates a new table ALTER TABLE - modifies a table DROP TABLE - deletes a table CREATE INDEX - creates an index (search key) DROP INDEX - deletes an index **not case sensitive

union syntax

SELECT City FROM Customers UNION SELECT City FROM Suppliers combines results from two select statements

join example syntax

SELECT Orders.OrderID, Customers.CustomerName, Orders.OrderDateFROM Orders INNER JOIN Customers ON Orders.CustomerID=Customers.CustomerID;

EXISTS syntax

SELECT column_name(s) FROM table_name WHERE EXISTS(SELECT column_name FROM table_name WHERE condition);

a database can have multiple

tables, each with its own name

alias

temporary name for table/column use AS keyword

EXISTS keyword

tests for existence of any record in a subquery

where keyword

the keyword that specifies the criteria that records must match to be included in the results

operators in where clause

=, >, <, >=, <=, not equal, between(within a range), like, in(specify multiple values for a column, think tuple)

relational database

A group of database tables that is connected or linked by a defined relationship that ties the information together (i.e. rows are records, and columns are fields)...data is stored as TABLES You can imagine for a medical patient, their data is spread over multiple tables (one for phone #, allergies, etc.)

bind variable

A placeholder for SQL literal values, such as numbers or character strings (like if u want to accept inputs/parameters at run time)

database vs table

A table is an object inside a database. ... database is a collection of several components like tables, indexes, stored procedures and so on. A table is a two dimensional structure that contains several columns and rows. It is contains all the data in form of several records.

SQL wildcards

A wildcard character can be used to substitute for any other character(s) in a string. In SQL server... % (zero or more characters) _ (single character) [] (represents any single character within brackets) --> h[oa]t ^ (represents any character not in brackets) -> h[^oa]t - (range of characters, like c[a-b]t)

how to alter table in database

ALTER TABLE table_name OPERATION (i.e. ADD columnname datatype) or DROP COLUMN col_name

where conditional keywords

AND, OR, NOT (for further specifying what types of entries you want to select, etc.)

SQL join

An SQL command that temporarily joins data from more than one table in order to allow simultaneous access to both tables (INNER) JOIN: Returns records that have matching values in both tables LEFT (OUTER) JOIN: Returns all records from the left table, and the matched records from the right table RIGHT (OUTER) JOIN: Returns all records from the right table, and the matched records from the left table FULL (OUTER) JOIN: Returns all records when there is a match in either left or right table

SQL injection

An attack that targets SQL servers by injecting commands to be manipulated by the database (i.e. user input, you input 1=1 which is always true, which affects what values are selected if there is an OR statement) Also OR = is always true (may user input this) Also 105; DROP TABLE suppliers (creates a new batch, a group of SQL statements separated by semicolons...) use parameters to prevent injection (via @ symbol)

SQL contraints

Constraints can be specified when the table is created with the CREATE TABLE statement, or after the table is created with the ALTER TABLE statement. specifies rules for data that goes into table common ones include: NOT NULL - Ensures that a column cannot have a NULL value UNIQUE - Ensures that all values in a column are different PRIMARY KEY - A combination of a NOT NULL and UNIQUE. Uniquely identifies each row in a table FOREIGN KEY - Uniquely identifies a row/record in another table CHECK - Ensures that all values in a column satisfies a specific condition DEFAULT - Sets a default value for a column when no value is specified INDEX - Used to create and retrieve data from the database very quickly

CASE syntax

CASE WHEN condition 1 THEN result 1 WHEN condition2 THEN result2 WHEN conditionN THEN resultN ELSE result END;

stored procedure syntax

CREATE PROCEDURE procedure_name @param1_name data type, #param2_name datatype) AS operation (i.e. SELECT * FROM CUSTOMERS WHERE City = @City) GO; Then to run it use EXEC procedure_name @param1 = value1, @param2 = value2

CREATE UNIQUE INDEX

CREATE UNIQUE INDEX index_name ON table_name (column1, column2, ...); no duplicate values allowed

delete syntax

DELETE FROM table_name WHERE some_column=some_value; or just DELETE FROM table_name(deletes all rows without deleting table itself)

how to delete table in database

DROP TABLE tablename TRUNCATE TABLE tablename deletes data inside table

insert into select syntax

INSERT INTO table2SELECT * FROM table1WHERE condition;

SQL server ISNULL()

ISNULL(column_name, alternative value) allows you to return alternate value when expression is null

connection between server and database

If you want your web site to be able to store and retrieve data from a database, your web server should have access to a database-system that uses the SQL language.

returning only certain # of records

MySQL uses SELECT column_name(s) FROM table_name WHERE condition LIMIT number; versus SQL server syntax: SELECT TOP number|percent column_name(s) FROM table_name WHERE condition;

AS keyword

One way to name a column is by using the AS keyword, although it is optional, it enhances readability

steps to opening a db connection

Open a connection to the database using the database driver. Open a TCP socket for CRUD operations Perform CRUD operations over the socket. Close the connection. Close the socket very expensive if all these steps have to happen all the time!

INSERT INTO SELECT

SELECTs data from one table and INSERTs it INTO an existing table, rows in target table are unaffected

optical disk

Secondary storage device using lasers to read (and write) data to a reflective surface. For storing files to be distributed or transferred or for backup of important files (i.e. CD and DVD)

types of pooling

Session pooling: One of the connections in the pool is assigned to a client until the timeout is reached. Transaction pooling: Similar to session polling, it gets a connection from the pool. It keeps it until the transaction is done. If the same client wants to run another transaction, it has to wait until it gets another transaction assigned to it. Statement pooling: Connection is returned to the pool as soon as the first query is completed.

MySQL schema vs database

Synonymous But usually, schema=collection of tables while database =collection of schemas, ao database schema represents logical configuration of database

ALL operator

The ALL operator returns true if all of the subquery values meet the condition. SELECT column_name(s) FROM table_name WHERE column_name operator ALL(SELECT column_name FROM table_name WHERE condition); i.e.SELECT ProductName FROM Products WHERE ProductID = ALL (SELECT ProductID FROM OrderDetails WHERE Quantity > 99); so like a conditional return (may not always return anything, such as when ANY or ALL returns false)

ANY operator

The ANY operator returns true if any of the subquery values meet the condition. SELECT column_name(s)FROM table_name WHERE column_name operator ANY(SELECT column_name FROM table_name WHERE condition); so like a conditional return (may not always return anything, such as when ANY or ALL returns false)

DEFAULT

The DEFAULT constraint is used to provide a default value for a column. The default value will be added to all new records IF no other value is specified.

Stored procedure

basically a function, so you don't have to rewrite queries

virtual machine

a computer existing within another computer, physical computer is host, virtual is guest, VM uses host system resources

server

a computer or computer program that manages access to a centralized resource or service in a network It is a software or hardware device that receives requests over a network and responds, such as by sending requested info from a resource

floppy disk

a flexible removable magnetic disk, typically encased in hard plastic, used for storing data.

MS SQL Server

a microsoft database server that hosts relational databases...has a database engine (controls data storage) and server operating system (memory, I/O management, job scheduling, data locking...), also has network interface layer to facilitate request/response interactions with database servers

virtual memory

because finite amount of RAM, need way to prevent running out...uses hard disk space for additional memory allocation **INDIRECTION, MAPPING VIRTUAL TO PHYSICAL program has its own address space, broken into pages where each page is contiguous range of addresses pages mapped onto physical memory but some go to hard drive rather than RAM if not necessary...what's mapped to the hard drive changes (oldest thing on RAM is moved to hard drive when sth new needs RAM) if shared memory is needed, can map both program addresses to same RAM memory location! otherwise, program addresses are isolated (mapped to diff RAM spaces, not necessarily contiguous for any given program)

null field

blank field (use IS NULL or IS NOT NULL to test for nullness)

ACID

key properties of a relational db atomicity - all queries must succeed, if one fails all should rollback consistency isolation - can my inflight transaction see changes made by other transactions? durability

how to solve index fragmentation

1) rebuild it (will need more disk space as you are copying the original, and then deleting the original) 2) reorganize it (will require lots of log space due to logging, but won't require more disk space as the pages are reordered to be contiguous)

database connection pool

As scale of app increases, opening and closing a DB connection for every query really adds up and takes too much time, so a connection pool of opened connections is maintained and these connections get passed between operations, saving a lot of resources (though keeping connections open also takes up a lot of resources...) if each connection = its own thread, then Too few connections in the pool will result in high latency as each operation waits for a connection to open up. But too many connections to the pool can also result in high latency as each connection thread is being run concurrently by the system. The time it takes for many threads to complete concurrently is typically higher than the time it takes a smaller number of threads to run sequentially due to context switching while running concurrently When a new request to access data from the backend service comes in, the pool manager checks if the pool contains any unused connection and returns one if available. If all the connections in the pool are active, then a new connection is created and added to the pool by the pool manager. When the pool reaches its maximum size, all new connections are queued until a connection in the pool becomes available.

AUTO INCREMENT

Auto-increment allows a unique number to be generated automatically when a new record is inserted into a table. Often this is the primary key field that we would like to be created automatically every time a new record is inserted.

Non-Relational Databases

Developed to handle large data sets of data that is not easily organized into tables, columns, and rows i.e. image for medical patients, all their info is stored as a document about that patient vs many tables in relational databases Types: document database (stores data similar to JSON objects, each document contains pairs of fields/values) key-value databases (every item is a key-value) wide-column stores: stores data in tables/rows/dynamic columns, but better than relational cuz not every row has same # columns or even same column name for every row...like 2D key-value database graph database: store data in nodes/edges, edges are for relationships **mongoDB is a popular NoSQL database

guid vs pk

GUIDs can be considered as global primary keys. Local primary keys are used to uniquely identify records within a table. On the other hand, GUIDs can be used to uniquely identify records across tables, databases, and servers.

CREATE VIEW

In SQL, a view is a virtual table based on the result-set of an SQL statement. A view contains rows and columns, just like a real table. The fields in a view are fields from one or more real tables in the database. You can add SQL functions, WHERE, and JOIN statements to a view and present the data as if the data were coming from one single table. CREATE VIEW view_name AS SELECT column_name(s) FROM table_name WHERE condition i.e. easy visualization

ORM

Object Relational Mapping...allows u to interact with ur database using language of choice instead of SQL....so SQLAlchemy is an ORM... benefits: abstracts away the database system (MySQL vs PostgreSQL), leverages language fluency, more features cons: learning pain sometimes, configuration can be difficult

peripheral

Peripheral management controls peripheral devices by sending them commands in their own computer language (i.e. device connected to main computer)

hypervisor (virtual machine manager [VMM])

Software that runs on a physical computer and manages one or more virtual machine operating systems.

CASE statement

The CASE statement goes through conditions and returns a value when the first condition is met (like an IF-THEN-ELSE statement). So, once a condition is true, it will stop reading and return the result. If no conditions are true, it returns the value in the ELSE clause. If there is no ELSE part and no conditions are true, it returns NULL

helpful mathematical methods

The COUNT(col_name) function returns the number of rows that matches a specified criterion. The AVG(col_name) function returns the average value of a numeric column. The SUM(col_name) function returns the total sum of a numeric column.

group by

The GROUP BY statement groups rows that have the same values into summary rows, like "find the number of customers in each country". The GROUP BY statement is often used with aggregate functions (COUNT, MAX, MIN, SUM, AVG) to group the result-set by one or more columns. creates frequency distributions! i.e. SELECT COUNT(CustomerID), Country FROM Customers GROUP BY Country; will group records with the same country and count each of their customer ID's

insert into statement

The INSERT INTO statement is used to insert new records in a table. INSERT INTO table_name VALUES (value1,value2,value3,...); *inserts for all columns, make sure order of values match order of columns INSERT INTO table_name (column1,column2,column3,...) VALUES (value1,value2,value3,...); *inserts only some values for a given entry

order by keyword

The ORDER BY keyword is used to sort the result-set by one or more columns. The ORDER BY keyword sorts the records in ascending order by default. To sort the records in a descending order, you can use the DESC keyword. SELECT column_name, column_name FROM table_name ORDER BY column_name ASC|DESC.... will do alphabetical order by default for strings

SELECT INTO

The SELECT INTO statement selects data from one table and inserts it into a new table. SELECT * INTO newtable [IN externaldb] FROM table1; SELECT column_name(s) INTO newtable [IN externaldb] FROM table1;

dialect

The dialect of the database is simply a term that defines the specific features of the SQL language that are available when accessing that database...it is how you communicate with a specific type of database

primary key

The field selected as a unique identifier for the database (uniquely defines each record)...primary keys must contain unique values are cannot have null values

SQL (Structured Query Language)

a popular language to run operations on databases (inherently for relational databases: NoSQL is for non-relational)

random access memory

access any aspect of memory at any time (in contrast to cyclic or delay line memory, where you have to wait for ur data to show up in the cycle)

unique vs primary key constraint

both primary key and unique guarantee uniqueness primary key automatically has unique contraint can have many unique contraints per table, only 1 primary key per table

SQL keywords are not

case sensitive

how does splitting hashtables/databases across machines to save memory work?

certain keys go to certain machines which can make it unbalanced however, as long as load is balanced, you can easily parallelize while knowing exactly which data is on which machine

transaction

collection of queries one unit of work it is atomic either all operations will succeed or be rolled back start transaction and commit (remember INFO 2300)

mongoDB organization

collections (equivalent of relational tables), and each record is called a document documents use JSON like formatting ex. db.collection.insertOne({name: "sue", age: 25})

shrink database

compact database pages aka move records into empty spaces on pages

metadata

container holding many features of a database (including on tables and child objects)

CREATE DATABASE databasename

create a new database

CREATE INDEX

create index on table (like back of book) for easier searching (speeds up select and where statements, slows insert/updates) essentially creates copy of table but indexed/sorted as specified CREATE INDEX index_name ON table_name (column1, column2, ...);

CREATE TABLE table_name ( column_1 data_type, column_2 data_type, column_3 data_type);

creates a new table in database

DROP DATABASE databasename

delete database

isolation read phenomena

dirty read - you read a value that another transaction hasn't committed...issue with updates non-repeatable reads - you read the same value twice, the second time after an update was committed (or similar situation), so ur second read is different...then your reads were not repeatable phantom reads - reading new inserted values before you meant to lost updates!

database collation

how data is sorted and compared in a database. Collation provides the sorting rules, case, and accent sensitivity properties for the data in the database. For example, when you run a query using the ORDER BY clause, collation determines whether or not uppercase letters and lowercase letters are treated the same.

sql server fill factor

how full each index page will be upon creation too full and you'll definitely need to add pages when records get inserted too low and you risk wasting space, especially if you have very few additions to the index, also bad read performance because more pages to read mostly reads = higher fill factor is probably good mostly writes = lower fill factor is better

index fragmentation

kind of like external fragmentation: pages are out of order (logical order of index pages does not match physical order) sometimes used to refer to internal fragmentation: each page has lots of empty space (microsoft calls this page density issue instead) as you add records to indices, you would need to add more pages eventually, but they may not be contiguous with the original index pages or if you delete a record, you would leave empty space on a page (microsoft classifies this as page density not index fragmentation) this leads to poor performance because internal fragmentation leads to more pages than necessary (ideally have all pages be 100% full, not <100%), so this is more disk reads/writes! slower! also poor performance because external fragmentation = random disk access rather than sequential, which is often slower for many hard drives

how low page density (internal fragmentation, sometimes classified under index fragmentation, but not by microsoft) degrades performance

leads to more levels in tree index = longer traversal times less relevant records get cached, as only a certain number of pages can fit in a cache more page reads required may result in a different query plan being chosen as the disk I/O increases and the cost of this plan is worse and worse

construct

like a word/syntactical thingy

CHECK keyword

limit value range that can be in column

magnetic core memory

magnetic donuts(cores) with wire loop, magnetized in certain directions to store 1's and 0's...donuts arranged into grids

hard disk

magnetic surface, allowing read/write heads to operate on it (sequential memory access though cuz disks are spinning)

solid state

no moving parts, uses integrated circuits (tiny chip!)

kernel vs operating system

operating system = package of datasoftware managing resources of a system kernel is the important program in the OS, connection between hardware and software

sharded database

physical sharding = splitting a database across multiple physical instances logical sharding = splitting up a table into multiple tables horizontal sharding = moving rows from one table into multiple tables vertical sharding = storing different columns of the same logical table into different tables while sharding is complex and queries can be harder to manage, it can speed up queries bc u can search each shard in parallel, rather than searching all rows in one monolithic table...then merge the results!

mysqli object oriented vs procedural

prefer object oriented over procedural (series of steps) php is moving towards object oriented

isolation levels

read uncomitted = no isolation, any change from outside is visible to transaction read committed = each query only sees committed stuff repeatable read = each query in transaction only sees committed updates at start of transaction [reading SAME VERSION within a transaction] serializable

foreign key

relates two tables together foreign key is a field/collection of fields in one table that refers to primary key of another table table with foreign key = child table, table with candidate key = referenced or parent table The FOREIGN KEY constraint is used to prevent actions that would destroy links between tables. The FOREIGN KEY constraint also prevents invalid data from being inserted into the foreign key column, because it has to be one of the values contained in the table it points to.

MIN(col_name) and MAX(col_name)

returns smallest or biggest field in a column works for strings (A = smallest)

order by several columns

selects all customers from the table, sorted by the A and the B column. This means that it orders by A, but if some rows have the same A, it orders them by B

storing objects in SQL databases

serialize the object as JSON, XML, or even Pickle (depending on what database accepts), then send that to database (will be stored as a string, but then you can retrieve and rebuild object w/ the string)

server vs mainframe

server manages resources and responds to requests on the network meanwhile mainframe is a workhorse that controls and handles a large userbase (meant for large-scale data processing or data storage) >> high volume of transactions

SQL comments

single line -- multiline is /* and */

database schema

skeleton structure representing the logical view of entire database, showing how data is organized and relations among them

database driver

software that connects a given application program to a particular DBMS mySQLi is the new improved driver (i = improved) implements protocol for connecting to database

odbc

standard api for accessing database management services


Conjuntos de estudio relacionados

Macroeconomics Module One Exam (Ch. 1-5)

View Set

Communication Accommodation Theory: Chapter 31

View Set

Install DC Motors and Generators

View Set

Muscles of Facial Expression (region, origin, insertion, action, & innervation)

View Set

Chapter 8 Recombinant DNA Technology

View Set