Ch 6 MIS
Record
- A collection of related data elements
Data warehouse
- A logical collection of information, gathered from many different operational databases, that supports business analysis activities and decision-making tasks. The primary purpose of a data warehouse is to aggregate information throughout an organization into a single repository for decision-making purposes.
Information cleansing or scrubbing
- A process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information
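A minimal sketch of cleansing, assuming a made-up list of customer records and two made-up rules (a name is required, a zip code must be five digits). It fixes what can be fixed (spacing, capitalization) and discards the rest:

```python
# Hypothetical raw customer records; names and values are made up for illustration.
raw_records = [
    {"name": " alice smith ", "zip": "80202"},
    {"name": "BOB JONES", "zip": "8020"},   # incorrect: zip code too short
    {"name": "", "zip": "80301"},           # incomplete: missing name
    {"name": "Cara Lee", "zip": "80014"},
]

def cleanse(records):
    """Fix what can be fixed and discard what cannot."""
    clean = []
    for rec in records:
        name = rec["name"].strip().title()  # fix inconsistent spacing and case
        zip_code = rec["zip"].strip()
        if not name:                        # discard incomplete records
            continue
        if not (zip_code.isdigit() and len(zip_code) == 5):  # discard incorrect zip codes
            continue
        clean.append({"name": name, "zip": zip_code})
    return clean

cleaned = cleanse(raw_records)
print(cleaned)
```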
Cluster analysis
- A technique used to divide an information set into mutually exclusive groups such that the members of each group are as close to one another as possible and the different groups are as far apart as possible. Example: create target-marketing strategies based on zip codes.
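One way to see the idea is a tiny one-dimensional k-means routine. The notes do not name an algorithm, so k-means, the starting centers, and the order values below are all assumptions made for illustration:

```python
def k_means_1d(values, centers, iterations=10):
    """Assign each number to its nearest center, then move each center to its group's mean."""
    groups = []
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for v in values:
            nearest = min(range(len(centers)), key=lambda i: abs(v - centers[i]))
            groups[nearest].append(v)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return groups, centers

# Hypothetical average order values for customers in six zip codes.
order_values = [12, 15, 14, 80, 85, 90]
groups, centers = k_means_1d(order_values, centers=[0.0, 100.0])
print(groups)
print(centers)
```

Each value lands in exactly one group (mutually exclusive), low spenders end up together, and the two group centers sit far apart.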
Statistical analysis
- Performs such functions as information correlations, distributions, calculations, and variance analysis
- Forecast: predictions made on the basis of time-series information
- Time-series information: time-stamped information collected at a particular frequency
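A rough sketch of a forecast from time-series information. A simple moving average is assumed here as the forecasting method, and the weekly sales figures are made up:

```python
# Hypothetical time-series information: unit sales collected once per week.
weekly_sales = [100, 110, 120, 130]

def moving_average_forecast(series, window=3):
    """Forecast the next value as the mean of the last `window` observations."""
    recent = series[-window:]
    return sum(recent) / len(recent)

forecast = moving_average_forecast(weekly_sales)
print(forecast)  # 120.0
```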
Association detection
- Reveals the relationship between variables along with the nature and frequency of the relationships
Database
- Maintains information about various types of objects (inventory), events (transactions), people (employees), and places (warehouses)
Primary key
A field (or group of fields) that uniquely identifies a given entity in a table
Entity
A person, place, thing, transaction, or event about which information is stored (the rows in the table)
Foreign key
A primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables
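A small sketch of primary and foreign keys using Python's built-in sqlite3 module. The table and column names (customer, orders, customer_id) are hypothetical:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER REFERENCES customer(customer_id),"
    " total REAL)"
)
conn.execute("INSERT INTO customer VALUES (1, 'Alice')")
conn.execute("INSERT INTO orders VALUES (100, 1, 25.0)")

# The foreign key orders.customer_id links each order back to one customer row.
rows = conn.execute(
    "SELECT o.order_id, c.name FROM orders o JOIN customer c ON o.customer_id = c.customer_id"
).fetchall()
print(rows)  # [(100, 'Alice')]
```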
ETL (extraction, transformation, and loading)
A process that extracts information from internal and external databases, transforms the information using a common set of enterprise definitions, and loads the information into a data warehouse
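The three ETL steps can be sketched as follows. The two source systems, their field names, and the sales_fact table are invented for illustration, and an in-memory SQLite database stands in for the data warehouse:

```python
import sqlite3

# Extract: hypothetical rows from two source systems that format data differently.
sales_system = [{"cust": "alice", "amount_usd": "25.50"}]
web_system = [{"customer_name": "BOB", "amount_cents": 2000}]

def transform(sales_rows, web_rows):
    """Map both sources onto one common definition: (customer, amount in dollars)."""
    unified = [(r["cust"].title(), float(r["amount_usd"])) for r in sales_rows]
    unified += [(r["customer_name"].title(), r["amount_cents"] / 100) for r in web_rows]
    return unified

# Load: write the transformed rows into the stand-in warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales_fact (customer TEXT, amount REAL)")
warehouse.executemany("INSERT INTO sales_fact VALUES (?, ?)", transform(sales_system, web_system))
loaded = warehouse.execute("SELECT customer, amount FROM sales_fact ORDER BY customer").fetchall()
print(loaded)  # [('Alice', 25.5), ('Bob', 20.0)]
```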
Database management system (DBMS)
Allows users to create, read, update, and delete data in a relational database.
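A quick sketch of the four operations (create, read, update, delete), using SQLite through Python's sqlite3 module as the example DBMS. The employee table is hypothetical:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE employee (id INTEGER PRIMARY KEY, name TEXT, dept TEXT)")

db.execute("INSERT INTO employee VALUES (1, 'Dana', 'Sales')")           # create
after_create = db.execute("SELECT name, dept FROM employee").fetchall()  # read
db.execute("UPDATE employee SET dept = 'Marketing' WHERE id = 1")        # update
after_update = db.execute("SELECT name, dept FROM employee").fetchall()
db.execute("DELETE FROM employee WHERE id = 1")                          # delete
after_delete = db.execute("SELECT name, dept FROM employee").fetchall()

print(after_create, after_update, after_delete)
```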
Data-driven website
An interactive website kept constantly updated and relevant to the needs of its customers using a database
Web mining
Analyzes unstructured data associated with websites to identify consumer behavior and website navigation
Text mining
Analyzes unstructured data to find trends and patterns in words and sentences
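A very simple form of text mining is counting which words show up most often. The comments and the tiny stop-word list below are made up, and real text-mining tools go well beyond word counts:

```python
from collections import Counter
import re

# Hypothetical customer comments (unstructured text).
comments = [
    "Shipping was slow but support was helpful",
    "Slow shipping again",
    "Helpful support, fast refund",
]

STOP_WORDS = {"was", "but", "again"}  # tiny illustrative stop-word list

def word_frequencies(texts):
    """Count how often each non-stop word appears across all texts."""
    words = []
    for text in texts:
        words += [w for w in re.findall(r"[a-z]+", text.lower()) if w not in STOP_WORDS]
    return Counter(words)

freq = word_frequencies(comments)
print(freq.most_common(4))
```

Words such as "shipping" and "slow" appearing together repeatedly hint at a trend worth a closer look.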
Information Quality
Business decisions are only as good as the quality of the information used to make the decisions. You never want to find yourself using technology to help you make a bad decision faster.
The problem: data rich, information poor
Businesses face a data explosion as digital images, email in-boxes, and broadband connections double by 2010. The amount of data generated is doubling every year. Some believe it will soon double monthly.
The business benefits of data warehousing
Data warehouses extend the transformation of data into information. In the 1990s, executives became less concerned with day-to-day business operations and more concerned with overall business functions. The data warehouse provided the ability to support decision making without disrupting day-to-day operations.
Common forms of data-mining analysis capabilities include
Cluster analysis, association detection, and statistical analysis
Data dictionary
Compiles all of the metadata about the data elements in the data model
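Many DBMSs can report their own metadata. As a rough illustration, SQLite's `PRAGMA table_info` lists each column of a table, which is enough to build a bare-bones data dictionary. The product table is hypothetical:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("CREATE TABLE product (product_id INTEGER PRIMARY KEY, name TEXT NOT NULL, price REAL)")

# PRAGMA table_info returns one row per column: (cid, name, type, notnull, dflt_value, pk).
data_dictionary = [
    {"column": name, "type": col_type, "required": bool(notnull), "primary_key": bool(pk)}
    for _cid, name, col_type, notnull, _default, pk in db.execute("PRAGMA table_info(product)")
]
for entry in data_dictionary:
    print(entry)
```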
Structured data
Data already in a database or a spreadsheet
Unstructured data
Data that does not exist in a fixed location; it can include text documents, PDFs, voice messages, and emails
Multidimensional analysis
Databases contain information in a series of two-dimensional tables. In a data warehouse and data mart, information is multidimensional: it contains layers of columns and rows.
- Dimension: a particular attribute of information
- Cube: common term for the representation of multidimensional information
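A cube can be pictured as a lookup keyed by one value per dimension. The sketch below assumes three made-up dimensions (product, region, quarter) and shows how fixing some dimensions "slices" the cube:

```python
from collections import defaultdict

# Hypothetical sales facts with three dimensions: product, region, quarter.
facts = [
    ("Widget", "East", "Q1", 100),
    ("Widget", "West", "Q1", 150),
    ("Gadget", "East", "Q1", 200),
    ("Widget", "East", "Q2", 120),
]

# The "cube": one cell per (product, region, quarter) combination.
cube = defaultdict(int)
for product, region, quarter, units in facts:
    cube[(product, region, quarter)] += units

def slice_total(cube, product=None, region=None, quarter=None):
    """Sum the cells that match the given dimension values; None means 'all'."""
    return sum(
        units
        for (p, r, q), units in cube.items()
        if product in (None, p) and region in (None, r) and quarter in (None, q)
    )

widget_total = slice_total(cube, product="Widget")
east_q1_total = slice_total(cube, region="East", quarter="Q1")
print(widget_total)   # 370
print(east_q1_total)  # 300
```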
Physical view
Deals with the physical storage of information on a storage device
Data-driven website advantages
- Easy to manage content
- Easy to store large amounts of data
- Easy to eliminate human errors
Transactional information
Encompasses all of the information contained within a single business process or unit of work, and its primary purpose is to support the performing of daily operational tasks
Analytical information
Encompasses all organizational information, and its primary purpose is to support the performing of managerial analysis tasks
Logical view
Focuses on how individual users logically access information to meet their own particular business needs
A well-designed database should:
- Handle changes quickly and easily
- Provide users with different views
- Have only one physical view
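One loose way to picture "one physical view, many logical views" is a single stored table with several SQL views on top of it. SQL views are used here only as an analogy, and the employee table and view names are hypothetical:

```python
import sqlite3

db = sqlite3.connect(":memory:")
# One stored table stands in for the single physical view.
db.execute("CREATE TABLE employee (name TEXT, dept TEXT, salary REAL)")
db.executemany(
    "INSERT INTO employee VALUES (?, ?, ?)",
    [("Dana", "Sales", 50000.0), ("Eli", "Sales", 60000.0), ("Fay", "IT", 70000.0)],
)

# Two logical views over the same stored data, each tailored to a different user.
db.execute("CREATE VIEW directory AS SELECT name, dept FROM employee")
db.execute(
    "CREATE VIEW payroll_by_dept AS SELECT dept, SUM(salary) AS total FROM employee GROUP BY dept"
)

directory = db.execute("SELECT * FROM directory ORDER BY name").fetchall()
payroll = db.execute("SELECT * FROM payroll_by_dept ORDER BY dept").fetchall()
print(directory)
print(payroll)
```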
The solution: business intelligence
Improving the quality of business decisions has a direct impact on costs and revenue. BI enables business users to receive data for analysis that is:
- Reliable
- Consistent
- Understandable
- Easily manipulated
Database advantages from a business perspective include
- Increased flexibility
- Increased scalability and performance
- Reduced information redundancy
- Increased information integrity (quality)
- Increased information security
Complete
Is a value missing from the information? Ex. Is the address complete, including street, city, state, and zip code?
Unique
Is each transaction and event represented only once in the information? Ex. Are there any duplicate customers?
Consistent
Is summary information in agreement with detailed information? Ex. Do all total columns equal the true total of the individual items?
Timely
Is the information current with respect to business needs? Ex. Is information updated weekly, daily, or hourly?
Accurate
Is there an incorrect value in the information? Ex. Is the name spelled correctly? Is the dollar amount recorded properly?
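Some of these quality checks can be automated. The sketch below covers three of them (complete, unique, consistent) on made-up customer records. The field names and the rules themselves are assumptions:

```python
# Hypothetical customer records with deliberate quality problems.
customers = [
    {"id": 1, "name": "Alice", "street": "1 Main St", "city": "Denver", "state": "CO", "zip": "80202"},
    {"id": 2, "name": "Bob", "street": "2 Oak Ave", "city": "Denver", "state": "CO", "zip": ""},
    {"id": 1, "name": "Alice", "street": "1 Main St", "city": "Denver", "state": "CO", "zip": "80202"},
]

REQUIRED = ("street", "city", "state", "zip")

def incomplete_ids(records):
    """Complete: flag records with any missing address part."""
    return [r["id"] for r in records if any(not r[field] for field in REQUIRED)]

def duplicate_ids(records):
    """Unique: flag ids that appear more than once."""
    seen, dupes = set(), set()
    for r in records:
        (dupes if r["id"] in seen else seen).add(r["id"])
    return sorted(dupes)

def totals_agree(summary_total, detail_amounts):
    """Consistent: does the summary equal the sum of the detail lines?"""
    return summary_total == sum(detail_amounts)

print(incomplete_ids(customers))       # [2]
print(duplicate_ids(customers))        # [1]
print(totals_agree(60, [10, 20, 30]))  # True
```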
Data models
Logical data structures that detail the relationships among data elements using graphics or pictures.
Performance
Measures how quickly a system performs a certain process or transaction
Databases offer several security features
- Password: provides authentication of the user
- Access level: determines who has access to the different types of information
- Access control: determines types of user access, such as read-only access
Metadata
Provides details about the data
Scalability
Refers to how well a system can adapt to increased demands
Attribute (field, column)
The data elements associated with an entity.
Data redundancy
The duplication of data, or storing the same information in multiple places. Inconsistency is one of the primary problems with redundant information.
Data mining
The process of analyzing data to extract information not offered by the raw data alone
Data element (data field)
The smallest or basic unit of information (customer's name, address, email, and so on)
Market basket analysis
Analyzes such items as websites and checkout scanner information to detect customers' buying behavior and predict future behavior by identifying affinities among customers' choices of products and services.
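A bare-bones sketch of the idea: count how often pairs of products appear in the same basket. The baskets are invented, and pair counting with a "support" ratio is one common starting point rather than the only method:

```python
from collections import Counter
from itertools import combinations

# Hypothetical checkout-scanner baskets.
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "eggs"},
    {"bread", "butter", "eggs"},
]

pair_counts = Counter()
for basket in baskets:
    # Sort so each pair is always counted under the same key.
    pair_counts.update(combinations(sorted(basket), 2))

def support(pair):
    """Share of baskets that contain both items in the pair."""
    return pair_counts[tuple(sorted(pair))] / len(baskets)

top_pair = pair_counts.most_common(1)
print(top_pair)                      # [(('bread', 'butter'), 3)]
print(support(("bread", "butter")))  # 0.75
```

Bread and butter turn up together in three of four baskets, which is the kind of affinity a store could act on.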
Structured query language (SQL)
Asks users to write lines of code to answer questions against a database.
Classification
Assigns records to one of a predefined set of classes
Data mart
Contains a subset of data warehouse information.
Estimation
Determines values for an unknown continuous variable behavior or estimated future value
Affinity groups
Determines which things go together
Business-critical integrity constraint
Enforces business rules vital to an organization's success; often requires more insight and knowledge than relational integrity constraints.
Information granularity
Extent of detail within the information (fine and detailed or coarse and abstract)
Query-by-example (QBE) tool
Helps users graphically design the answer to a question against a database
Real-time information
Immediate, up-to-date information
Information timeliness
An aspect of information that depends on the situation
Information integrity
Measures the quality of information
Real-time system
Provides real-time information in response to requests.
Relational integrity constraint
Rules that enforce basic and fundamental information-based constraints.
Integrity constraint
Rules that help ensure the quality of information
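Both kinds of integrity constraint can be sketched in SQLite. The foreign key below plays the role of a relational integrity constraint, and the 20 percent discount cap is a made-up business rule standing in for a business-critical one:

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite checks foreign keys only when this is on
db.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY)")
db.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id),"  # relational rule
    " discount_pct REAL CHECK (discount_pct <= 20))"                   # hypothetical business rule
)
db.execute("INSERT INTO customer VALUES (1)")

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        db.execute("INSERT INTO orders VALUES (?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

results = [
    try_insert((100, 1, 10.0)),   # valid order
    try_insert((101, 99, 10.0)),  # customer 99 does not exist
    try_insert((102, 1, 50.0)),   # discount above the assumed 20% cap
]
print(results)  # [True, False, False]
```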
Clustering
Segments a heterogeneous population of records into a number of more homogeneous subgroups
Data mining tools
Use a variety of techniques to find patterns and relationships in large volumes of information