IS: Chapter 6
A data mart is a smaller, decentralized data warehouse that stores a subset of the organizational data and/or serves a specific population of users.
True
A DBMS simplifies how end users work with databases by separating the logical and physical views of the data.
True
Every record in a relational table should contain at least one key field.
True
Web mining is the discovery of useful patterns on the Web.
True
Hadoop is an open source software framework that enables distributed parallel processing of huge amounts of data across inexpensive computers.
True
Key Field
Uniquely identifies each record.
Text Mining
Extracts key elements and patterns from unstructured data (mostly text files), which accounts for about 80% of an organization's useful information.
Referential Integrity Rules
Used by relational databases to ensure that relationships between coupled tables remain consistent.
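A minimal sketch of referential integrity using Python's built-in sqlite3 module. The supplier and part tables, their columns, and the sample rows are assumptions made up for illustration. The point is that a row pointing at a nonexistent supplier is rejected.

```python
import sqlite3

# Hypothetical schema: every part row must reference an existing supplier.
conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is on
conn.execute("CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE part ("
    " part_id INTEGER PRIMARY KEY,"
    " name TEXT,"
    " supplier_id INTEGER REFERENCES supplier(supplier_id))"
)
conn.execute("INSERT INTO supplier VALUES (1, 'Acme')")
conn.execute("INSERT INTO part VALUES (10, 'Bolt', 1)")  # valid: supplier 1 exists


def try_insert_orphan():
    """Return True if the database rejects a part whose supplier does not exist."""
    try:
        conn.execute("INSERT INTO part VALUES (11, 'Nut', 99)")  # supplier 99 is missing
    except sqlite3.IntegrityError:
        return True
    return False


orphan_rejected = try_insert_orphan()
part_count = conn.execute("SELECT COUNT(*) FROM part").fetchone()[0]
```

Only the valid part row remains in the table, so the two tables stay consistent.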
Forecasting (Data Mining Types of Information)
Uses a series of existing values to forecast future values.
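One simple way to picture forecasting is a moving average. This is only an illustrative technique chosen here, not necessarily what a data mining tool uses; the function name and the sales figures are made up.

```python
def moving_average_forecast(values, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    if window <= 0 or len(values) < window:
        raise ValueError("need at least `window` observations")
    recent = values[-window:]
    return sum(recent) / window


monthly_sales = [100, 110, 120, 130]  # made-up figures
next_month = moving_average_forecast(monthly_sales)  # mean of 110, 120, 130
```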
Can you leave one of the supplier's entries empty?
Yes
Fields
(Columns) Store data representing an attribute.
Database
Collection of related files containing records on people, places, or things.
Clustering (Data Mining Types of Information)
Discovering as yet unclassified groupings.
Data mining is more ____ than OLAP.
Discovery driven
Sequences (Data Mining Types of Information)
Events linked over time.
What kind of relationship does the Figure demonstrate?
Example
Each relationship table can have only one foreign key.
False
In linking databases to the Web, the role of the application server is to host the DBMS.
False
Data Mining
Finds hidden patterns and relationships in large databases and infers rules from them to predict future behavior.
Entity
Generalized category representing a person, place, or thing about which we store and maintain information.
A one-to-one relationship between two entities is symbolized in a diagram by a line that ends:
In two short marks
A table that links two tables that have a many-to-many relationship is often called a(n):
Intersection relation.
An example of Poor DB Design
Make sure all the data is independent
In-Memory Computing
Main memory (RAM) is the primary data store and the hard drive is the secondary store. Data held in memory is temporary.
Structure Mining (Type of Web Mining)
Mines Web site structural elements, such as links.
Content Mining (Type of Web Mining)
Mines content of Web sites.
Usage Mining (Type of Web Mining)
Mines user interaction data gathered by Web servers.
The process of streamlining data to minimize redundancy and awkward many-to-many relationships is called:
Normalization
Associations (Data Mining Types of Information)
Occurrences linked to a single event.
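A rough sketch of finding associations by counting which items show up together in the same purchase (the single event). The baskets and the helper name pair_counts are invented for illustration.

```python
from collections import Counter
from itertools import combinations

# Made-up purchase baskets; each set is one event (a single checkout).
baskets = [
    {"chips", "cola"},
    {"chips", "cola", "salsa"},
    {"chips", "salsa"},
    {"cola", "bread"},
]


def pair_counts(transactions):
    """Count how often each pair of items occurs in the same transaction."""
    counts = Counter()
    for items in transactions:
        for pair in combinations(sorted(items), 2):  # sorted so pairs have a stable order
            counts[pair] += 1
    return counts


counts = pair_counts(baskets)
# Share of all baskets that contain both chips and cola.
chips_cola_support = counts[("chips", "cola")] / len(baskets)
```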
Primary Key
One field in each table. Cannot be duplicated. Provides a unique identifier for all information in any row.
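A small sketch, assuming a hypothetical student table, of how a database refuses a duplicate primary key. It uses Python's built-in sqlite3 module.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# student_id is the primary key, so no two rows may share a value.
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Ana')")

try:
    conn.execute("INSERT INTO student VALUES (1, 'Ben')")  # same key as Ana
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

rows = conn.execute("SELECT student_id, name FROM student").fetchall()
```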
Relational database tables may have:
One-to-one: marriage, president and country. One-to-many: professors and a class. Many-to-many: requires a join table.
Relational Database
Organizes data into two-dimensional tables (relations) with columns and rows.
Classifications (Data Mining Types of Information)
Patterns describing a group an item belongs to.
A field identified in a table as holding the unique identifier of the table's records is called the:
Primary Key
Normalization
Process of streamlining complex groups of data to (a) minimize redundant data elements. (b) minimize awkward many-to-many relationships. (c) Increase stability and flexibility.
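A rough illustration of point (a), minimizing redundant data. The flat rows, field names, and the normalize helper are all made up. The supplier's city is repeated in the flat data, and after the split it is stored once per supplier.

```python
# Made-up flat records: the supplier's city is repeated on every part row.
flat_rows = [
    {"part": "Bolt", "supplier": "Acme", "supplier_city": "Boston"},
    {"part": "Nut", "supplier": "Acme", "supplier_city": "Boston"},
    {"part": "Gear", "supplier": "Zenith", "supplier_city": "Denver"},
]


def normalize(rows):
    """Split flat rows into a supplier table and a part table linked by supplier_id."""
    suppliers = {}  # supplier name -> {"supplier_id": ..., "city": ...}
    parts = []
    for row in rows:
        name = row["supplier"]
        if name not in suppliers:
            suppliers[name] = {
                "supplier_id": len(suppliers) + 1,
                "city": row["supplier_city"],
            }
        # Each part keeps only the supplier's key, not its city.
        parts.append({"part": row["part"], "supplier_id": suppliers[name]["supplier_id"]})
    return suppliers, parts


suppliers, parts = normalize(flat_rows)
```

If Acme moves, only one supplier record has to change instead of every part row.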
The tool that enables users to view the same data in different ways using multiple dimensions is:
OLAP
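A toy sketch of viewing the same data along different dimensions. Real OLAP tools do far more; the sales facts and the rollup helper are assumptions for illustration only.

```python
from collections import defaultdict

# Made-up sales facts with three dimensions: product, region, quarter.
sales = [
    {"product": "Bolt", "region": "East", "quarter": "Q1", "units": 10},
    {"product": "Bolt", "region": "West", "quarter": "Q1", "units": 5},
    {"product": "Nut", "region": "East", "quarter": "Q2", "units": 7},
    {"product": "Bolt", "region": "East", "quarter": "Q2", "units": 3},
]


def rollup(facts, *dimensions):
    """Total the units for each combination of the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact["units"]
    return dict(totals)


by_product = rollup(sales, "product")                    # one dimension
by_region_quarter = rollup(sales, "region", "quarter")   # two dimensions
```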
Web Mining
Discovery and analysis of useful patterns and information from the Web. Example: the recommendation system on Amazon.
Bridge/Entity Table
Relates to each table. Usually used in order to break up a many-to-many relationship.
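A sketch of a bridge table using Python's built-in sqlite3 module. The student, course, and enrollment tables are hypothetical. The enrollment table holds two foreign keys and turns one many-to-many relationship into two one-to-many relationships.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT);
CREATE TABLE course (course_id INTEGER PRIMARY KEY, title TEXT);
-- Bridge table: one row per (student, course) pairing.
CREATE TABLE enrollment (
    student_id INTEGER REFERENCES student(student_id),
    course_id INTEGER REFERENCES course(course_id),
    PRIMARY KEY (student_id, course_id)
);
INSERT INTO student VALUES (1, 'Ana'), (2, 'Ben');
INSERT INTO course VALUES (10, 'Databases'), (20, 'Networks');
INSERT INTO enrollment VALUES (1, 10), (1, 20), (2, 10);
""")

# One student, many courses.
courses_for_ana = [row[0] for row in conn.execute(
    "SELECT c.title FROM course c "
    "JOIN enrollment e ON e.course_id = c.course_id "
    "WHERE e.student_id = 1 ORDER BY c.title"
)]

# One course, many students.
students_in_databases = [row[0] for row in conn.execute(
    "SELECT s.name FROM student s "
    "JOIN enrollment e ON e.student_id = s.student_id "
    "WHERE e.course_id = 10 ORDER BY s.name"
)]
```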
Attributes
Specific characteristic of each entity.
Rows
Store data for separate records, or tuples.
Businesses use ___ tools to search and analyze unstructured data sets, such as e-mails and memos.
Text Mining
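A very small sketch of the idea, counting frequent terms across unstructured memos. Actual text mining tools go well beyond word counts; the memos, the stopword list, and the term_counts helper are invented for illustration.

```python
import re
from collections import Counter

# Made-up memos standing in for unstructured text.
memos = [
    "Shipment delayed again, customer upset about delay.",
    "Customer asked for refund after shipment delay.",
    "Refund approved for customer.",
]

STOPWORDS = frozenset({"for", "about", "after", "again"})  # tiny illustrative list


def term_counts(documents):
    """Count lowercase words across all documents, skipping stopwords."""
    counts = Counter()
    for text in documents:
        for word in re.findall(r"[a-z]+", text.lower()):
            if word not in STOPWORDS:
                counts[word] += 1
    return counts


counts = term_counts(memos)
most_common_term = counts.most_common(1)[0]
```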
Which of the following best illustrates the relationship between entities and attributes?
The entity CUSTOMER with the attribute ADDRESS.