Chapter 6
Big data
A collection of large, complex data sets, including structured and unstructured data, which cannot be analyzed using traditional database methods and tools
___________ maintains information about various types of objects, events, people, and places.
A database
Which of the following is correct in reference to a database?
A database can support many logical views.
Foreign key
A primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables
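A minimal sketch of how a foreign key links two tables, using Python's built-in sqlite3 module. The customer and orders tables, their columns, and the sample rows are all hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database; table and column names are illustrative assumptions.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customer (
        customer_id INTEGER PRIMARY KEY,   -- primary key of customer
        name        TEXT NOT NULL
    );
    CREATE TABLE orders (
        order_id    INTEGER PRIMARY KEY,
        -- foreign key: the customer primary key appearing as an attribute here
        customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
        total       REAL NOT NULL
    );
    INSERT INTO customer VALUES (1, 'Ada'), (2, 'Grace');
    INSERT INTO orders VALUES (100, 1, 25.0), (101, 1, 40.0), (102, 2, 15.5);
""")

# The shared customer_id column provides the logical relationship for the join.
rows = conn.execute("""
    SELECT c.name, o.order_id, o.total
    FROM orders AS o
    JOIN customer AS c ON c.customer_id = o.customer_id
    ORDER BY o.order_id
""").fetchall()

for name, order_id, total in rows:
    print(name, order_id, total)
```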
Extraction, transformation, and loading (ETL)
A process that extracts information from internal and external databases, transforms it using a common set of enterprise definitions, and loads it into a data warehouse
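The three ETL steps can be sketched roughly as follows. This is an illustration under stated assumptions, not a production pipeline: the two source lists, their field names, the sales_fact table, and the in-memory SQLite "warehouse" are all invented for the example.

```python
import sqlite3

# Hypothetical source data: an internal and an external system that
# describe the same kind of sale with different field names and units.
internal_sales = [{"cust": "ada", "amount_usd": "25.00"}]
external_sales = [{"customer_name": "GRACE", "amt_cents": 1550}]

def extract():
    """Extract: pull raw records from each source."""
    return internal_sales, external_sales

def transform(internal, external):
    """Transform: map both sources onto one common definition
    (customer name in title case, amount in US dollars)."""
    rows = []
    for r in internal:
        rows.append((r["cust"].title(), float(r["amount_usd"])))
    for r in external:
        rows.append((r["customer_name"].title(), r["amt_cents"] / 100))
    return rows

def load(conn, rows):
    """Load: write the unified rows into the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales_fact (customer TEXT, amount_usd REAL)"
    )
    conn.executemany("INSERT INTO sales_fact VALUES (?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")
load(warehouse, transform(*extract()))
loaded = warehouse.execute(
    "SELECT customer, amount_usd FROM sales_fact ORDER BY customer"
).fetchall()
print(loaded)
```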
Information cleansing or scrubbing
A process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information
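As a rough sketch of cleansing, the function below fixes inconsistent spacing and capitalization, discards incomplete or incorrect records, and drops duplicates. The records, the field names, and the five-digit zip rule are assumptions made for the example.

```python
# Hypothetical customer records with typical quality problems.
records = [
    {"name": " ada lovelace ", "zip": "02139"},  # inconsistent spacing and case
    {"name": "Ada Lovelace", "zip": "02139"},    # duplicate once normalized
    {"name": "Grace Hopper", "zip": ""},         # incomplete: zip is missing
    {"name": "Alan Turing", "zip": "2139"},      # incorrect: zip is not 5 digits
]

def cleanse(rows):
    """Return normalized rows, discarding bad or duplicate records."""
    seen = set()
    clean = []
    for row in rows:
        # Fix: collapse extra whitespace and normalize capitalization.
        name = " ".join(row["name"].split()).title()
        zip_code = row["zip"].strip()
        # Discard: a missing name or a zip that is not exactly 5 digits.
        if not name or not (len(zip_code) == 5 and zip_code.isdigit()):
            continue
        # Discard: a record already seen after normalization.
        key = (name, zip_code)
        if key in seen:
            continue
        seen.add(key)
        clean.append({"name": name, "zip": zip_code})
    return clean

cleaned = cleanse(records)
print(cleaned)
```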
Relational database management system
Allows users to create, read, update, and delete data in a relational database
Data-driven website
An interactive website kept constantly updated and relevant to the needs of its customers using a database
How would the airline industry use business intelligence?
Analyze popular vacation locations with current flight listings
Structured query language
Asks users to write lines of code to answer questions against a database
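For instance, the question "which products cost more than 10?" might be written as the SQL statement below, run here through Python's sqlite3 module. The product table and its rows are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# Illustrative table; names and prices are made up for the example.
conn.execute("CREATE TABLE product (name TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?)",
    [("pen", 1.5), ("lamp", 30.0), ("desk", 120.0)],
)

# The SQL text is the "lines of code" that ask the question of the database.
expensive = conn.execute(
    "SELECT name FROM product WHERE price > ? ORDER BY price", (10,)
).fetchall()
print(expensive)
```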
Classification
Assigns records to one of a predefined set of classes
Record
Collection of related data elements
Data dictionary
Compiles all of the metadata about the data elements in the data model
Data mart
Contains a subset of data warehouse information
Database management system
Creates, reads, updates, and deletes data in a database while controlling access and security
Attributes
Customer name, distributor ID, product price
Entity
Customer, distributor, product
What is the ultimate outcome of a data warehouse?
Data marts
Data visualization
Describes technologies that allow users to see or visualize data to transform information into a business perspective
Data quality audits
Determine the accuracy and completeness of an organization's data
Estimation
Determines values for an unknown continuous variable behavior or estimated future value
Affinity grouping
Determines which things go together
Variety
Different forms of structured and unstructured data. Data from spreadsheets and databases as well as from email, videos, photos, and PDFs, all of which must be analyzed.
Primary key
Field (or group of fields) that uniquely identifies a given record in a table
Dynamic information
Includes data that change based on user actions
Static information
Includes fixed data incapable of change in the event of a user action
What is static information?
Includes fixed data incapable of change in the event of a user action
______ can present the results of large data analysis looking for patterns and relationships that monitor changes in variables over time.
Infographics
Which of the following occurs when a system produces incorrect, inconsistent, or duplicate data?
Information integrity issue
Complete
Is a value missing from the information? Example: Is the address complete including street, city, state, and zip code?
Which of the following would not be considered part of the unique characteristics of high-quality information?
Is aggregate information in agreement with detailed information?
Consistent
Is aggregate or summary information in agreement with detailed information? Example: Do all total columns equal the true total of the individual item?
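A consistency check of this kind can be sketched in a few lines: compare a reported total against the sum of its detail lines. The amounts and the half-cent tolerance are assumptions for illustration.

```python
import math

# Hypothetical invoice: detail lines and the total reported in a summary.
line_items = [19.99, 5.01, 25.00]
reported_total = 50.00

# Compare with a small tolerance because the amounts are floats.
is_consistent = math.isclose(sum(line_items), reported_total, abs_tol=0.005)
print(is_consistent)
```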
Unique
Is each transaction and event represented only once in the information? Example: Are there any duplicate customers?
Timely
Is the information current with respect to business needs? Example: Is information updated weekly, daily, or hourly?
Accurate
Is there an incorrect value in the information? Example: Is the name spelled correctly?
Data model
Logical data structures that detail the relationships among data elements using graphics or pictures
Database
Maintains information about various types of objects (inventory), events (transactions), people (employees), and places (warehouses)
What is the practice of gathering data and ensuring that it is uniform, accurate, consistent, and complete, including such entities as customers, suppliers, products, employees, and other critical entities that are commonly integrated across organizational systems?
Master data management
Transactional information
Packaging slip, airline ticket, sales receipt, hotel reservation, restaurant receipt, shipping invoice
How would the technology industry use business intelligence?
Predict hardware failures
Analytical information
Product sales, sales projections, future growth, trends, quarterly sales forecast, industry growth perspective, market trend analysis
Metadata
Provides details about data
Clustering
Segments a heterogeneous population of records into a number of more homogeneous subgroups
Dynamic website information
Stored in a dynamic catalog, or an area of a website that stores information about products in a database
Entity
Stores information about a person, place, thing, transaction, or event
Relational database model
Stores information in the form of logically related two-dimensional tables
____________ has a defined length, type, and format and includes numbers, dates, or strings such as customer address.
Structured data
____________ asks users to write lines of code to answer questions against a database.
Structured query language
Velocity
The analysis of streaming data as it travels around the internet. Analysis of social media messages spreading globally is necessary.
Information cube
The common term for the representation of multidimensional information
Attribute
The data elements associated with an entity
Content creator
The person responsible for creating the original website content
Content editor
The person responsible for updating and maintaining website content
Data mining
The process of analyzing data to extract information not offered by the raw data alone
Volume
The scale of data. Includes enormous volumes of data. Massive volume created by machines and networks. Big data tools are necessary to analyze zettabytes and brontobytes.
Data element
The smallest or basic unit of information
Veracity
The uncertainty of data, including biases, noise, and abnormalities. Uncertainty or untrustworthiness of data. Data must be meaningful to the problem being analyzed. Must keep data clean and implement processes to keep dirty data from accumulating in systems.
In terms of big data, what includes the uncertainty of data, including biases, noise, and abnormalities?
Veracity
___________ analyzes unstructured data associated with websites to identify consumer behavior and website navigation.
Web analytics
A(n) ________ integrity constraint does not allow someone to create an order for a nonexistent customer.
relational
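The nonexistent-customer case can be sketched with SQLite through Python's sqlite3 module. The customer and orders tables are hypothetical. The sketch assumes a SQLite build with foreign key support, and it turns enforcement on explicitly because SQLite leaves it off by default.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is on for the connection.
conn.execute("PRAGMA foreign_keys = ON")

# Illustrative schema: every order must reference an existing customer.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"
)
conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (100, 1)")  # valid: customer 1 exists

try:
    # Customer 999 does not exist, so the constraint should reject this order.
    conn.execute("INSERT INTO orders VALUES (101, 999)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(rejected, order_count)
```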