CSCI 103 Chapter 11 - Final
How do databases manage large amounts of data efficiently?
Databases - organize the data in a specific ways - store in multiple lists (tables) Database programs are designed specifically to manage large amounts of data accurately as it is updated and manipulated
How do databases make our lives easier?
Databases can: - manage large amounts of data efficiently - enable information sharing - promote data integrity
What do databases turn data into?
Databases turn data into information
What fields are needs in my database?
Each field should describe a unique piece of data, it cannot combine two pieces of data. This allows for better filtering and sorting.
Records :
Groups data of related fields
In retail, different departments of the retailer are responsible for each type of product and track the products they sell in different ____________?
In retail, different departments of the retailer are responsible for each type of product and track the products they sell in different databases.
Field size determines what?
It determines the maximum number of characters that can be used.
How to include data dictionary details in a database:
It is used to create the tables - In Microsoft Access it can be created in Datasheet View and Design View
When is a list not sufficient for organizing data?
LISTS ARE NOT SUFFICIENT FOR: - complex information - when multiple people need to access information
Modern systems provide ___________ for creating queries
Modern systems provide wizards for creating queries
Most databases focus on (how many) specific operational aspects of business operations?
Most databases focus on one specific operational aspects of business operations
Due to the capturing of data of multiple time periods, this enables an analysis of the what?
OF THE PAST don't be dumb on this question
Can you add/delete fields from the database?
Once relationships are established and data is entered, it is difficult to add or delete fields.
What does enforcing referential integrity prevent?
Orphan records
Is the market share of relational databases large or small?
Relational databases have the largest market share
Retrieving data from a database can generate what?
Retrieving data from a database can generate reports from data in tables or from queries
How are data tables created?
Step 1= Input unique field names Step 2= Define the data type Step 3= Set a maximum field size Step 4= Set a default value if necessary
Data warehouse data is (what kind of data)
TIME-VARIANT DATA (this means that the data doesn't all pertain to one period in time)
The focus in storing data in a warehouse / database is on ___________ (real life example below)
The focus in storing data is on one specific aspect of the operation
If two source databases contain very similar information, is the format of the data the SAME or DIFFERENT?
The format of the data is most likely different for each database
What are alphabetic and numeric checks?
These checks confirm that only text or numbers are entered in fields
What are the functions of a database?
They - allow users to extract subset of data from the database - output the data in a meaningful and presentable format
What is the important about databases?
They help you interact more effectively with the system AND help get the information for which you are looking for
To handle smaller amounts of data, what do warehouses use?
They use data marts
Can data be imported, exported, and converted to and from other formats?
YES, data can be imported, exported, and converted into different formats.
Do databases keep information in a centralized location?
Yes
Can adjustments be made to the report (retrieved data)?
Yes adjustments, such as grouping and compiling summary reports, can be made to the report.
Is the data in a data warehouse organized the same way as data in a normal database?
Yes, both a data warehouse and normal database organize data the same way.
Should business owners examine current data in light of historical data?
Yes, business ppl should examine the present in light of historical data
Can data be transferred from a database to another software application?
Yes, it can by exporting the data.
Field constraint:
a property that must be satisfied
The most common output of retrieving data from a database is :
a viewable or printable report
Binary large object (BLOB):
an unstructured data encoded in binary form
The data being captured is
contained in database's files.
Current and historical data can be contained by
data warehouses duh
Referential integrity:
each value in the foreign table has a corresponding value in the primary table
The data stored in a data warehouse can contain information from .......
from multiple databases
Can data from the past be used to make projections about the future?
if you get this wrong drop out
A field
is a space allocated for a particular item of information. In database systems, fields are the smallest units of information you can access.
Databases make data _________ :
more meaningful and more useful
Data cube pic
on laptop lol
DATA WAREHOUSE DO NOT CAPTURE DATA ONLY FROM ____
one time period!!!!!!!!!
Normalization ensures that data is:
organized efficiently
Exporting :
putting data into a format that another application can also understand
Data mart :
small slices of the data warehouse
why are data marts created?
so that companies can analyze a related set of data that are grouped together, but separated out from the main body of data in the warehouse
Clickstream data :
software used on company websites can capture information about each click that users make as they navigate through the site
Databases promote data integrity because :
the centralization ensure integrity and data only needs to be updated in one place
Since the warehouse holds information from many databases, the data for many programs and procedures might need to be ________ and ________.
the data for many programs and procedures might need to be extracted and reformatted due to the different formats of different databases
Data integrity :
the data is accurate and reliable
How to think of data warehouse / data markets ? (think it out don't guess GSnyde)
the grocery store sells baby powder that they get from the warehouse. the warehouse has baby powder, the store is just a smaller version of the warehouse. (same thing just a real life version)
Structure Query Language (SQL)
the most popular language used when displaying a subset of data in a database
Normalization is:
the process of ensuring that data is organize efficiently. - reduces data redundancy - separates data into tables
Field names must be _________
unique within a table
Warehouses are _________ , while markets pertain to _ _______ component
warehouses have enterprise-wide depth markets pertain to a single component of the business
How are data warehouses populated with data?
with the use of : - internal sources - external sources - clickstream data
Internal sources include: regarding populating data warehouses
(company's databases & tools) sales, billing, inventory, and customer databases spreadsheets & other analysis tools contain data that can be loaded into the data warehouse.
External sources include: regarding populating data warehouses
(data provided by vendors, suppliers, etc.) vendors and suppliers often provide data regarding product specifications, shipment methods & dates, billing information, etc.
Fields :
- A space designed for one particular item of information - Store each category of information - Display data as columns - Identify data by a field name - Smallest units of information that you can access in a database system
What are the problem with lists?
- Data redundancy occurs when there is a repetition of data - Would require the updating of multiple lists - Data inconsistency happens when not all of the duplicated data is updated properly - not in a centralized location
Tables :
- Groups data of related records - Common subject
Databases are NOT needed for managing all types of data such as:
- Lists, they are adequate for simple tasks - World - Excel
Different types of relational databases:
- One-to-many: a record appears once in one table and many times in a related table - One-to-one: for each record in one table there is only one record in a related table - Many-to-many: records in one table can be related to multiple records in a related table
How do databases make information sharing possible?
- Only one file is maintained (data centralization) - Centralized database becomes a shared source of information - No files reconcile with each other - Controlled access increases security
What are common validation rules?
- Range check - Completeness check - Consistency check - Alphabetic and numeric checks
What are the three major types of databases?
- Relational - Object-oriented - Multidimensional
Common Types of DATA:
- Text: letters, symbols, or combinations of up to 255 characters (can be phone numbers / zip codes too) - Memo: long blocks of text, up to 65,535 characters - Numeric: can be used in calculations or currency - Date/Time: in standard notation. Data may be used in calculations for counting the number of days/ length of time. - Calculated: results of a calculation; often involves date/time and numeric data - Object: pictures, charts, or files from another Windows-based application - Hyperlink: alphanumeric data stored as a hyperlink address to a web page, an email address, or an existing file.
What is a good primary key?
- a set of UNIQUE characters - should not violate privacy concerns
Different ways to view data in a database :
- all data displayed at once - one record at a time - display individual records - from only one table - from related tables
What are the advantages of multidimensional databases?
- customized to provide information to a variety of users - process data fast - critical for larger databases
Characteristics of Query Language :
- has it's own vocabulary and sentence structure - easier to use than full-blown programming languages - similar to ^^
Data warehouses :
- large-scale collection of data **even though data comes from multiple databases it is stored in one place** - data comes from multiple databases - contains and organized data in one place - consolidate information from various systems to present enterprise-wide view of operations
Data cube:
- measure attribute: main type of data that cube is tracking - feature attributes: describe measure attribute in meaningful way
What are the disadvantages of databases?
- more time consuming and expensive to set up/administer - need to be careful in database design - administrator is responsible for designing. constructing, and maintaining databases (needed for larger databases) - Review of database must be ongoing to ensure a smooth flow of data
Relational database:
- organizes data in table format - logically groups similar data into a relation - links data between tables through relationships on common keys - need to keep data in related tables, synchronized
Multidimensional database:
- stored data can be analyzed from different perspectives (dimensions) - relational database has only two dimensions (fields and records) - multidimensional databases organize in cube format
Object-oriented database:
- stores data in objects rather than tables - contain methods for processing or manipulating data - can store more types of data than relational databases - can access data faster - unstructured data includes audio&video clips, pictures, and EXTRA large documents - structured data is analytical data - based on complex models for manipulating data
How can one ensure that only valid data is entered into a field?
- validation is the process of ensuring that data meets specified guidelines - validation rule is defined in data dictionary - specified in field properties for each field - violations result in error message
Four operations of a DBMS =
1. Creating databases and entering data 2. Viewing and sorting data 3. Querying (extracting) data 4. Outputting data
the THREE stages of staging are :
1. Extraction of the data from source databases 2. Transformation (reformatting) of the data 3. Storage of the data in the data warehouse
What are two ways to display records that match particular criteria? (Extracting / Querying Data)
1. Use a filter: - temporarily displays records that match criteria - results cannot be saved - ONLY can be applied to fields in one table 2. Create a query: - retrieves data from a particular subset - data can be extracted from one or multiple tables
Online analytical processing (OLAP)
= software that provides standardized tools for viewing and manipulating data
What is a completeness check?
A completeness check ensures that all required fields have been completed
What is a data dictionary?
A data dictionary: - is like a map of the database - defines the features of the field - need to define for each field before data entry - attributes include field name, data type, description, properties, and field size
What is a database?
A database is a collection of related data that can be - stored - sorted - organized - queried
How are databases created?
A database management system (DBMS) is specially designed software used to capture and analyze data
What is a database query?
A database query is a way of retrieving information that defines a particular subset of data.
A good primary key doesn't have to ___________________.
A good primary key doesn't have to represent something
What is a range check?
A range check is data that falls within range of values
Simple level: (data warehousing / storing)
At the simple level, - data is retrieved as needed - popular/most common in small databases, simple enterprises, and a single database.
How can one create a database with a DBMS?
By describing the data that is to be captured and by using a data dictionary.
How does one display a subset of data in a database?
By using a Query language
Clickstream data is most commonly used to see what should be _____ on the company website
Clickstream data is most commonly used to see what should be changed on the company website
What is the term data warehousing?
Data Warehousing is the storing of data at different database levels.
How can data get into the database?
Data can be : - directly keyed into the database - imported from other files - saves time - reduces data error - must match the format of the database exactly - filters are often applied - nonconforming data is flagged
The database files are referred to as the _____________________?
Data dictionary (schema)
Data in the data warehouse is organized by _____________
Data in the data warehouse is organized by *SUBJECT*
How is data stored in databases?
Data is stored in fields, records, and tables.
Data must be _______ before being entered into a data warehouse
Data must be staged