Database chapter 1-5
Data cube
- Multidimensional arrays, like stacked tables - Supports pivoting, rollup, drilldown of data
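A rough illustration of rollup and drilldown on a tiny cube. The dimension names (product, city, quarter), the sales figures, and the `rollup` helper are all made up for this sketch; real OLAP tools do this at much larger scale.

```python
from collections import defaultdict

# Hypothetical cube cells: (product, city, quarter) -> sales amount.
DIMENSIONS = ("product", "city", "quarter")
cube = {
    ("Pen", "Boston", "Q1"): 5.0,
    ("Pen", "Boston", "Q2"): 6.0,
    ("Pen", "Denver", "Q1"): 7.5,
    ("Notebook", "Boston", "Q1"): 12.0,
}

def rollup(cells, keep):
    """Aggregate the cube down to the dimensions named in `keep`."""
    positions = [DIMENSIONS.index(name) for name in keep]
    totals = defaultdict(float)
    for key, amount in cells.items():
        totals[tuple(key[i] for i in positions)] += amount
    return dict(totals)

# Rollup: summarize over product and quarter, keeping only city.
by_city = rollup(cube, ("city",))
# Drilldown: add the product dimension back to see more detail.
by_product_city = rollup(cube, ("product", "city"))

print(by_city)
print(by_product_city)
```

Going from `by_city` to `by_product_city` is the drilldown direction; the reverse is rollup.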
Characteristics of a conceptual database model
- Supports many different user views - Does not depend on the model used by a database management system
Design tools
• Data dictionary - may be freestanding or part of the system catalog • Project management software - graphs, charts, document control, communication • Diagramming tools - E-R, UML diagrams • CASE (Computer-Aided Software Engineering) tools - system analysis, project management, design
NewSQL
-stronger data consistency model - new distributed architectures - cloud computing
What is a record or row?
A row of data is each individual entry that exists in a table
The key elements of a Data Dictionary Attribute Name
A unique identifier
Syntax Create Database
The basic syntax of the CREATE DATABASE statement is as follows: CREATE DATABASE DatabaseName;
Primary key
Candidate key actually used for identifying entities and accessing records
Star schema
Central fact table - observed data • Dimension tables - data about attributes
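A minimal sketch of a star schema using Python's built-in sqlite3 module. The table names (`fact_sales`, `dim_product`, `dim_store`) and the data are invented for illustration only.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Dimension tables: data about attributes.
    CREATE TABLE dim_product (product_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE dim_store   (store_id   INTEGER PRIMARY KEY, city TEXT);
    -- Fact table: observed data (amount) plus a key into each dimension.
    CREATE TABLE fact_sales (
        product_id INTEGER REFERENCES dim_product(product_id),
        store_id   INTEGER REFERENCES dim_store(store_id),
        amount     REAL
    );
    INSERT INTO dim_product VALUES (1, 'Pen'), (2, 'Notebook');
    INSERT INTO dim_store   VALUES (10, 'Boston'), (20, 'Denver');
    INSERT INTO fact_sales  VALUES (1, 10, 5.0), (2, 10, 12.0), (1, 20, 7.5);
""")

# Join the fact table to one dimension and total the observed values.
sales_by_city = conn.execute("""
    SELECT s.city, SUM(f.amount)
    FROM fact_sales AS f
    JOIN dim_store AS s ON f.store_id = s.store_id
    GROUP BY s.city
    ORDER BY s.city
""").fetchall()
print(sales_by_city)
```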
SQL Process
Classic query engine handles all non-SQL queries, but SQL query engine won't handle logical files
Semi-structured
Collection of nodes, each with data, with different schemas • Node contains a descriptive
The 5 V's of big data Value
Competitive advantage
External level
Consists of many user models or views • Has external records - records seen by users • May include calculated or virtual data • Described in external schemas • Used to create user interface
Uses of databases
Consumer websites • Search engines • Travel reservations • Online banking • Health care • Libraries
SQL Commands CREATE
Creates a new table, a view of a table, or other object in database
SQL Commands INSERT
Creates a record
Syntax DROP DATABASE statement
DROP DATABASE DatabaseName;
DML
Data Manipulation Language
DCL
Data control language
The 5 V's of big data Velocity
Data is generated at great speed • Speed needed for organizing, storing, processing
DQL
Data query language
The key elements of a Data Dictionary Attribute Type
Defines what type of data is allowable in a field
Attributes
Defining properties or qualities of entity type • Represented by oval on E-R diagram • Domain - set of allowable values for attribute • Attribute maps entity set to domain • May have null values for some entity instances - no mapping to domain for those instances • May be multi-valued - use double oval on E-R diagram • May be composite - use oval for composite attribute, with ovals for components • May be derived - use dashed oval
SQL Commands DROP
Deletes an entire table, a view of a table or other object in the database
SQL Commands DELETE
Deletes records
What is SQL?
Different dialects • MS SQL Server uses T-SQL • Oracle uses PL/SQL • MS Access version of SQL is called JET SQL
People in Integrated Database Environment
End users see a view of data • Casual users - use query language • Naive users - use programs • Secondary users - use database output
NOT NULL Constraint
Ensures that a column cannot have NULL value.
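A small sketch of NOT NULL in action, using SQLite through Python's sqlite3 module. The `Student` table and its columns are assumed for the example; other DBMSs report the violation with their own error types.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE Student (stuId TEXT PRIMARY KEY, lastName TEXT NOT NULL)"
)
conn.execute("INSERT INTO Student VALUES ('S1001', 'Smith')")

# Inserting NULL into a NOT NULL column is rejected by the DBMS.
try:
    conn.execute("INSERT INTO Student VALUES ('S1002', NULL)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM Student").fetchone()[0]
print(rejected, row_count)
```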
Database management phase
Ensuring database security • Monitoring performance • Tuning and reorganizing • Keeping current on database improvements
Logical level
Entire information structure of database • "community view" as seen by DBA • Collection of logical records • Derived from conceptual model • All entities, attributes, relationships represented • Includes all record types, data item types, relationships, constraints, semantic information, security and integrity information • Relatively constant over time • Described in logical schema • Used to create logical record interface
Conceptual model
Entities, entity sets, attributes, relationships. Often represented as E-R, EE-R or UML diagram.
What is a field?
Every table is broken up into smaller entities called fields. A field is a column in a table that is designed to maintain specific information about every record in the table
Three-level Database Architecture
External, logical, internal
Purpose of E-R Model
Facilitates database design • Express logical properties of mini-world of interest within enterprise - Universe of Discourse • Conceptual level model • Not limited to any particular DBMS • E-R diagrams used as design tools • A semantic model - captures meanings
Query tool example
Find the names of all students enrolled in ART103A
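One way this query might be written in SQL, run here through Python's sqlite3 module. The `Student` and `Enroll` tables, their column names, and the sample rows are assumptions made for the sketch.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (stuId TEXT PRIMARY KEY, lastName TEXT, firstName TEXT);
    CREATE TABLE Enroll  (stuId TEXT, classNumber TEXT,
                          PRIMARY KEY (stuId, classNumber));
    INSERT INTO Student VALUES ('S1001', 'Smith', 'Tom'),
                               ('S1002', 'Chin', 'Ann'),
                               ('S1010', 'Burns', 'Edward');
    INSERT INTO Enroll  VALUES ('S1001', 'ART103A'),
                               ('S1010', 'ART103A'),
                               ('S1002', 'CSC201A');
""")

# Join students to their enrollments and keep only the requested class.
names = conn.execute("""
    SELECT s.lastName, s.firstName
    FROM Student AS s
    JOIN Enroll AS e ON s.stuId = e.stuId
    WHERE e.classNumber = ?
    ORDER BY s.lastName
""", ("ART103A",)).fetchall()
print(names)
```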
NoSQL
For real-time queries and row-level inserts, updates, deletes • Horizontal scaling with replication and distribution over servers • Flexible schema, weaker concurrency model, simple interface, parallel processing • Types: key-value pairs, column-oriented, document-oriented, graph-oriented
Advanced applications of databases
Geographic information systems • Software development • Scientific research • Decision support systems • Customer relationship management
SQL Commands GRANT
Gives a privilege to a user
Graph-oriented systems
Graph consists of nodes, properties, and edges
The 5 V's of big data Volume
Huge amount of data from wide array of sources
Logical data independence
Immunity of external models to changes in the logical model • Occurs at the user interface level
Physical data independence
Immunity of logical model to changes in internal model • Occurs at logical interface level
Internal level
Implementation level • Includes data structures, file organizations used by DBMS • Depends on DBMS used • Described in internal schema • Used to create stored record interface with operating system • Operating system creates physical files and physical record interface, below DB
The integrated database environment database
Large repository of data • Shared resource, used by many departments and applications of an enterprise • Contains several different record types • Managed by database administrator - DBA
Big table systems
Map indexed by a row key, column key, and timestamp
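A toy in-memory model of such a map, only to make the three-part index concrete. The `put` and `get_latest` helpers and the sample keys are hypothetical and do not reflect any particular system's API.

```python
# The whole "table" is one map: (row key, column key, timestamp) -> value.
table = {}

def put(row_key, column_key, timestamp, value):
    """Store one version of a cell."""
    table[(row_key, column_key, timestamp)] = value

def get_latest(row_key, column_key):
    """Return the value with the highest timestamp for this cell, or None."""
    versions = [
        (ts, value)
        for (r, c, ts), value in table.items()
        if r == row_key and c == column_key
    ]
    if not versions:
        return None
    return max(versions, key=lambda pair: pair[0])[1]

put("com.example", "contents:html", 1, "version 1")
put("com.example", "contents:html", 3, "version 3")
put("com.example", "contents:html", 2, "version 2")
latest = get_latest("com.example", "contents:html")
print(latest)
```

Keeping the timestamp in the key is what lets several versions of the same cell coexist.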
Logical model of database-intension
Metadata, data about data • Record types, data item types, data aggregates • Schema - stored in system catalog
SQL Commands ALTER
Modifies an existing database object, such as a table
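The data definition commands CREATE, ALTER, and DROP can be tried together in a short sketch. It uses SQLite via Python's sqlite3 module; the `Class` table and its columns are assumed, and ALTER syntax differs between DBMSs.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

def table_names():
    """List the user tables currently defined in the database."""
    rows = conn.execute(
        "SELECT name FROM sqlite_master WHERE type = 'table' ORDER BY name"
    )
    return [row[0] for row in rows]

# CREATE: define a new table.
conn.execute("CREATE TABLE Class (classNumber TEXT PRIMARY KEY, facId TEXT)")
after_create = table_names()

# ALTER: modify the existing table by adding a column.
conn.execute("ALTER TABLE Class ADD COLUMN room TEXT")
columns = [row[1] for row in conn.execute("PRAGMA table_info(Class)")]

# DROP: delete the entire table.
conn.execute("DROP TABLE Class")
after_drop = table_names()

print(after_create, columns, after_drop)
```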
SQL Commands UPDATE
Modifies records
The 5 V's of big data Veracity
Must ensure correctness of data
Database is a resource because
Operational data has value • Database incurs cost • Professionally managed by DBA
Planning and design stage
Preliminary planning • Identifying user requirements • Developing and maintaining the data dictionary • Designing the conceptual model - may use E-R or UML diagram • Choosing a DBMS • Developing the logical model - writes schema • Developing the physical model
Reporting tool example
Print a report showing each class number, the ID and name of the faculty member teaching the class, and the IDs and names of all the students in that class
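A possible sketch of such a report: a four-table join in SQL, grouped into report lines in Python. The table layouts, column names, and sample rows are assumptions; a real reporting tool would add headings and page formatting.

```python
import sqlite3
from itertools import groupby

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE Student (stuId TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE Faculty (facId TEXT PRIMARY KEY, name TEXT);
    CREATE TABLE Class   (classNumber TEXT PRIMARY KEY, facId TEXT);
    CREATE TABLE Enroll  (stuId TEXT, classNumber TEXT);
    INSERT INTO Student VALUES ('S1001', 'Smith'), ('S1002', 'Chin');
    INSERT INTO Faculty VALUES ('F101', 'Adams'), ('F105', 'Tanaka');
    INSERT INTO Class   VALUES ('ART103A', 'F101'), ('CSC201A', 'F105');
    INSERT INTO Enroll  VALUES ('S1001', 'ART103A'), ('S1002', 'ART103A'),
                               ('S1002', 'CSC201A');
""")

rows = conn.execute("""
    SELECT c.classNumber, f.facId, f.name, s.stuId, s.name
    FROM Class AS c
    JOIN Faculty AS f ON c.facId = f.facId
    JOIN Enroll  AS e ON e.classNumber = c.classNumber
    JOIN Student AS s ON s.stuId = e.stuId
    ORDER BY c.classNumber, s.stuId
""").fetchall()

# One heading line per class, then one indented line per enrolled student.
report_lines = []
for (class_number, fac_id, fac_name), group in groupby(rows, key=lambda r: r[:3]):
    report_lines.append(f"Class {class_number} taught by {fac_id} {fac_name}")
    for row in group:
        report_lines.append(f"    {row[3]} {row[4]}")
print("\n".join(report_lines))
```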
Information
Processed data, useful for decision-making • Example: formatted report using database
Data
Raw facts • Example: printout of tables as they are stored, without headings saying what they mean
RDBMS
Relational Database Management System; the basis for SQL and for modern relational database systems such as MS SQL Server, IBM DB2, Oracle, MySQL, and Microsoft Access
SQL Commands SELECT
Retrieves certain records from one or more tables
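The record-level commands INSERT, UPDATE, DELETE, and SELECT can be seen side by side in a short sketch. It uses SQLite via Python's sqlite3 module, and the `Faculty` table and its rows are invented for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE Faculty (facId TEXT PRIMARY KEY, name TEXT, dept TEXT)")

# INSERT creates records.
conn.execute("INSERT INTO Faculty VALUES ('F101', 'Adams', 'Art')")
conn.execute("INSERT INTO Faculty VALUES ('F105', 'Tanaka', 'CSC')")

# UPDATE modifies records that match the WHERE condition.
conn.execute("UPDATE Faculty SET dept = 'Math' WHERE facId = 'F105'")

# DELETE removes records that match the WHERE condition.
conn.execute("DELETE FROM Faculty WHERE facId = 'F101'")

# SELECT retrieves the records that remain.
remaining = conn.execute("SELECT facId, name, dept FROM Faculty").fetchall()
print(remaining)
```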
Key-value pairs
Schema-less; uses associative array of <key, value> pairs
A sample database
Simple university database • Keeps information about: students • faculty • classes - links faculty to their classes • enrollment - links students to their classes
The 5 V's of big data Variety
Source data has many different formats
SQL
Structured Query Language, a computer language for storing, manipulating, and retrieving data stored in a relational database
SQL Commands REVOKE
Takes back privileges granted to a user
What is a table?
The data in an RDBMS is stored in database objects called tables. A table is a collection of related data entries and consists of columns and rows
Basic syntax of USE statement is as follows:
USE DatabaseName;
Examples of Integrated Database Environment
University database • Both users and applications go through DBMS • Applications produce standard output
Data Dictionary
also called a Data Definition Matrix; provides detailed information about the business data, such as standard definitions of data elements, their meaning, and allowable values
Resource
an asset that has value and incurs cost
Column level constraints
are applied only to one column, whereas table level constraints are applied to the whole table.
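A hedged sketch of the difference, using SQLite via Python's sqlite3 module. The `Enroll` table, its columns, and the `try_insert` helper are assumptions for illustration; the NOT NULL and CHECK clauses sit on single columns, while the composite PRIMARY KEY is declared for the table as a whole.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE Enroll (
        stuId       TEXT NOT NULL,                                 -- column level
        classNumber TEXT NOT NULL,                                 -- column level
        grade       TEXT CHECK (grade IN ('A', 'B', 'C', 'D', 'F')),  -- column level
        PRIMARY KEY (stuId, classNumber)                           -- table level
    )
""")
conn.execute("INSERT INTO Enroll VALUES ('S1001', 'ART103A', 'A')")

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO Enroll VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

duplicate_ok = try_insert(("S1001", "ART103A", "B"))  # repeats the composite key
bad_grade_ok = try_insert(("S1002", "ART103A", "Z"))  # fails the CHECK on grade
valid_ok = try_insert(("S1002", "ART103A", "B"))      # satisfies every constraint
print(duplicate_ok, bad_grade_ok, valid_ok)
```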
Constraints
are the rules enforced on the data columns of a table.
Superkey
attribute or set of attributes that uniquely identifies an entity
Secondary key
attribute or set of attributes used for accessing records, but not necessarily unique
Big data technologies Hadoop
batch-oriented retrieval of large amounts of data • Fault tolerance by dividing large files into blocks and replicating the blocks across nodes
Alternate key
candidate key not used for primary key
DBMS, Database management system
controls access to the database • Has facilities to: set up database structure; load the data; retrieve requested data and format it for users; hide sensitive data; accept and perform updates; handle concurrency; perform backup and recovery • Used by both users and applications
DBMS uses a ____________ _____________, with at least two parts
data sublanguage
Basic Symbols for E-R Diagram Relationship
diamond
Real world
enterprise in its environment • Mini-world - part of the world represented in the database
Recursive relationship
entity set relates to itself
Data occurrences
extension • The database itself • Data instances • Files
Role
function that an entity plays in a relationship
Data sublanguage may be embedded in a
host language - a general-purpose programming language, such as C, C++, C#
The key elements of a Data Dictionary Optional/Required
indicates whether information is required in an attribute before a record can be saved
What is column?
is a vertical entity in a table that contains all information associated with a specific field in a table
CREATE TABLE
is the keyword telling the database system what you want to do.
The SQL DROP DATABASE statement
is used to drop an existing database in SQL schema
The SQL USE statement
is used to select any existing database in SQL schema
Composite key
key with more than one attribute
Basic Symbols for E-R Diagram Link
line
Basic Symbols for E-R Diagram Attribute
oval
Basic Symbols for E-R Diagram Entity
rectangle
Candidate key
superkey such that no proper subset of its attributes is also a superkey (minimal superkey - no unnecessary attributes)
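The minimality test can be sketched in a few lines of Python. The sample rows and the helper names `is_superkey` and `is_candidate_key` are made up for this example. Note the simplifying assumption: uniqueness is checked against one sample of data, whereas a real key must hold for every possible instance of the entity set.

```python
from itertools import combinations

# Hypothetical student records; two students share a last name.
rows = [
    {"stuId": "S1001", "email": "tom@example.edu", "lastName": "Smith"},
    {"stuId": "S1002", "email": "ann@example.edu", "lastName": "Chin"},
    {"stuId": "S1010", "email": "ed@example.edu", "lastName": "Smith"},
]

def is_superkey(attrs, records):
    """True if the values of `attrs` are distinct for every record."""
    seen = {tuple(record[a] for a in attrs) for record in records}
    return len(seen) == len(records)

def is_candidate_key(attrs, records):
    """True if `attrs` is a superkey and no proper subset of it is one."""
    if not is_superkey(attrs, records):
        return False
    return not any(
        is_superkey(subset, records)
        for size in range(1, len(attrs))
        for subset in combinations(attrs, size)
    )

print(is_superkey(("stuId", "lastName"), rows))       # superkey, but not minimal
print(is_candidate_key(("stuId", "lastName"), rows))
print(is_candidate_key(("stuId",), rows))
print(is_candidate_key(("email",), rows))
```

Here both `stuId` and `email` qualify as candidate keys; whichever one is chosen becomes the primary key and the other is an alternate key.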
Data represented as
tables
Foreign key
term used in the relational model (but not in the E-R model) for an attribute that is the primary key of a table and is used to establish a relationship, usually with another table, where it also appears as an attribute
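A brief sketch of a foreign key at work, using SQLite via Python's sqlite3 module. The `Faculty` and `Class` tables are assumed for the example. SQLite checks foreign keys only after `PRAGMA foreign_keys = ON`; many other DBMSs enforce them by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite-specific: turn enforcement on
conn.executescript("""
    CREATE TABLE Faculty (facId TEXT PRIMARY KEY, name TEXT);
    -- facId is the primary key of Faculty and appears again in Class.
    CREATE TABLE Class (
        classNumber TEXT PRIMARY KEY,
        facId       TEXT REFERENCES Faculty(facId)
    );
    INSERT INTO Faculty VALUES ('F101', 'Adams');
    INSERT INTO Class   VALUES ('ART103A', 'F101');
""")

# A class that points at a faculty member who does not exist is rejected.
try:
    conn.execute("INSERT INTO Class VALUES ('CSC201A', 'F999')")
    orphan_rejected = False
except sqlite3.IntegrityError:
    orphan_rejected = True

class_count = conn.execute("SELECT COUNT(*) FROM Class").fetchone()[0]
print(orphan_rejected, class_count)
```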
Project management software
tools to plan and manage projects, especially with many people
Data definition language (DDL)
used to define the database
Data manipulation language (DML)
used to process the database
Application programmers
write programs for other users
When you have multiple databases in your SQL Schema, then before starting your operation
you need to select the database where all the operations will be performed.
Stages in Database Design
• Analyze user environment • Develop conceptual data model • Choose a DBMS • Develop logical model, by mapping conceptual model to DBMS • Develop physical model • Evaluate physical model • Perform tuning, if indicated • Implement physical model
Entity
• Object that exists and that can be distinguished from other objects • Can be person, place, event, object, concept in the real world • Can be physical object or abstraction • Entity instance is a particular person, place, etc. • Entity type is a category of entities • Entity set is a collection of entities of same type - must be well-defined • Entity type forms intension of entity - permanent definition part • Entity instances form extension of entity - all instances that fulfill the definition at the moment • In E-R diagram, rectangle represents entity set, not individual entities
Entity-relationship model
• A semantic model, captures meanings • Conceptual level model • Proposed by Peter Chen in 1970s • Entities: real-world objects about which we collect data • Attributes: describe the entities • Relationships: associations among entities • Entity set: set of entities of the same type • Relationship set: set of relationships of same type • Relationship sets may have descriptive attributes • Represented by E-R diagrams