MGMT 382 CH. 6
Data Visualization
describes technologies that allow users to see or visualize data to transform information into a business perspective
Data Quality Audits
determine the accuracy and completeness of an organization's data
Dirty Data
erroneous or flawed data
Data Cleansing or Scrubbing
Process that weeds out and fixes or discards inconsistent, incorrect, or incomplete data.
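A minimal Python sketch of the idea, assuming customer records arrive as dictionaries with hypothetical `name` and `email` fields. It fixes inconsistent formatting, discards incomplete records, and discards duplicates; the field names and rules are illustrative only.

```python
def cleanse(records):
    """Normalize formatting, then drop incomplete and duplicate records."""
    seen_emails = set()
    clean = []
    for rec in records:
        # Fix inconsistent formatting (extra spaces, mixed case).
        name = (rec.get("name") or "").strip().title()
        email = (rec.get("email") or "").strip().lower()
        if not name or not email:
            continue  # incomplete record: discard
        if email in seen_emails:
            continue  # duplicate record: discard
        seen_emails.add(email)
        clean.append({"name": name, "email": email})
    return clean


raw = [
    {"name": " ada lovelace ", "email": "ADA@example.com"},
    {"name": "Ada Lovelace", "email": "ada@example.com"},  # duplicate after normalizing
    {"name": "Grace Hopper", "email": ""},                 # incomplete
]
cleaned = cleanse(raw)
print(cleaned)
```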
Business-Critical Integrity Constraints
enforce business rules vital to an organization's success and often require more insight and knowledge than relational integrity constraints. They tend to mirror the very rules by which an organization achieves success.
Source Data
identifies the primary location where data is collected
Data Validation
includes the tests and evaluations used to determine compliance with data governance policies to ensure correctness of data
Business Advantages of a Relational Database
increased flexibility, increased scalability and performance, reduced information redundancy, increased information integrity, increased information security
Distributed Computing
processes and manages algorithms across many machines in a computing environment
Metadata
provides details about data
Data aggregation
the collection of data from various sources for the purpose of data processing
Data Redundancy
the duplication of data, or the storage of the same data in multiple places
Data Stewardship
the management and oversight of an organization's data assets to help provide business users with high-quality data that is easily accessible in a consistent manner
Master Data Management (MDM)
the practice of gathering data and ensuring that it is uniform, accurate, consistent, and complete, including such entities as customers, suppliers, products, sales, employees, and other critical entities that are commonly integrated across organizational systems
Data latency
the time it takes for data to be stored or retrieved
Query-by-example (QBE)
tool that helps users graphically design the answer to a question against a database
Business Intelligence Dashboards
track corporate metrics such as critical success factors and key performance indicators and include advanced capabilities such as interactive controls, allowing users to manipulate data for analysis
Competitive Monitoring
where a company keeps tabs on its competitors' activities on the web using software that automatically tracks all competitor website activities such as discounts and new products
Foreign Key
Primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between two tables.
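A small sketch using Python's built-in `sqlite3` module, assuming two hypothetical tables, `customer` and `orders`. The `customer_id` column is the primary key of `customer` and appears in `orders` as a foreign key, which links the two tables and lets the database reject an order for a customer that does not exist.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")
conn.execute(
    "CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
conn.execute(
    """CREATE TABLE orders (
           order_id INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
           total REAL NOT NULL)"""
)
conn.execute("INSERT INTO customer VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (100, 1, 25.0)")

# An order pointing at a nonexistent customer violates the foreign key.
try:
    conn.execute("INSERT INTO orders VALUES (101, 99, 10.0)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

# The foreign key is what makes the join between the two tables meaningful.
rows = conn.execute(
    "SELECT o.order_id, c.name FROM orders o "
    "JOIN customer c ON c.customer_id = o.customer_id"
).fetchall()
print(rejected, rows)
```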
Data Point
Individual item on a graph or chart.
Dirty Data Problems
- Duplicate data
- Misleading data
- Incorrect data
- Non-formatted data
- Data that violates business rules
- Non-integrated data
- Inaccurate data
The four primary reasons for low-quality information
1. Online customers intentionally enter inaccurate data to protect their privacy.
2. Different systems have different data entry standards and formats.
3. Data-entry personnel enter abbreviated data to save time or erroneous data by accident.
4. Third-party and external data contains inconsistencies, inaccuracies, and errors.
Answers to Tough Business Questions using BI
* Where has the business been? (Historical perspective offers important variables for determining trends and patterns)
* Where is the business now? (Looking at the current business situation allows managers to take effective action to solve issues before they grow out of control)
* Where is the business going? (Setting strategic direction is critical for planning and creating solid business strategies)
Primary Key
A field (or group of fields) that uniquely identifies a given record in a table
Genesis Block
First block created in a blockchain.
Real-time system
Provides real-time data in response to requests
Data Governance
refers to the overall management of the availability, usability, integrity, and security of company data
Data Steward
responsible for ensuring the policies and procedures are implemented across the organization and acts as a liaison between the MIS department and the business
ETL (extraction, transformation, and loading) or Integration Layer
A process that extracts information from internal and external databases, transforms the information using a common set of enterprise definitions, and loads the information into a data warehouse
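The three steps can be sketched in a few lines of Python. Everything here is hypothetical: two made-up source systems that record the same facts with different field names and units, a `transform` function that applies one common definition (customer name in title case, amount in dollars), and a plain list standing in for the data warehouse.

```python
# Extract: rows pulled from two hypothetical source systems.
sales_system = [{"cust": "ada", "amount_usd": "25.00"}]
web_store = [{"customer_name": "GRACE", "cents": 1999}]


def transform(sales_rows, web_rows):
    """Map both source formats onto one common (customer, amount_usd) shape."""
    unified = []
    for row in sales_rows:
        unified.append((row["cust"].title(), float(row["amount_usd"])))
    for row in web_rows:
        unified.append((row["customer_name"].title(), row["cents"] / 100))
    return unified


# Load: append the transformed rows to the (stand-in) warehouse table.
warehouse = []
warehouse.extend(transform(sales_system, web_store))
print(warehouse)
```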
Blockchain
A type of distributed ledger technology consisting of data structure blocks that may contain data or programs, with each block holding batches of individual transactions and the results of any executables. Each block contains a time stamp and a link to a previous block.
Immutability
Ability for a blockchain ledger to remain a permanent, indelible, and unalterable history of transactions.
Five Common Characteristics of High-Quality Data
Accurate - Is there an incorrect value in the data? Ex: Is the name spelled correctly? Is the dollar amount recorded properly?
Complete - Is a value missing from the data? Ex: Is the address complete, including street, city, state, and zip code?
Consistent - Is aggregate or summary data in agreement with detailed data? Ex: Do all total columns equal the true total of the individual items?
Timely - Is the data current with respect to business needs? Ex: Is data updated weekly, daily, or hourly?
Unique - Is each transaction and event represented only once in the data? Ex: Are there any duplicate customers?
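Three of these characteristics (complete, consistent, unique) are easy to check mechanically. The sketch below assumes hypothetical order records with `id`, `zip`, `line_items`, and `total` fields; accuracy and timeliness usually need outside knowledge and are not covered here.

```python
orders = [
    {"id": 1, "zip": "47907", "line_items": [10.0, 15.0], "total": 25.0},
    {"id": 2, "zip": "", "line_items": [5.0, 5.0], "total": 12.0},
    {"id": 1, "zip": "47907", "line_items": [10.0, 15.0], "total": 25.0},
]


def audit(rows):
    """Return (id, problem) pairs for simple data-quality violations."""
    issues = []
    seen_ids = set()
    for row in rows:
        if not row["zip"]:
            issues.append((row["id"], "incomplete"))    # a value is missing
        if abs(sum(row["line_items"]) - row["total"]) > 0.005:
            issues.append((row["id"], "inconsistent"))  # total disagrees with detail
        if row["id"] in seen_ids:
            issues.append((row["id"], "duplicate"))     # record appears more than once
        seen_ids.add(row["id"])
    return issues


report = audit(orders)
print(report)
```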
Record
Collection of related data elements. Each record in an entity occupies one row in its respective table.
Data Mart
Contains a subset of data warehouse data
Database Management System (DBMS)
Creates, reads, updates, and deletes data in a database while controlling access and security. - Managers send requests to the DBMS, and the DBMS performs the actual manipulation of the data in the database.
Attributes (columns or fields)
Data elements associated with an entity.
Blocks
Data structures containing a hash, previous hash, and data.
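A toy illustration of that structure in Python, not a real blockchain implementation. Each block is a dictionary with `hash`, `previous_hash`, and `data`; the helper names (`block_hash`, `make_block`, `is_valid`) are made up for this sketch, and time stamps are left out to keep it short. Because each block's hash covers the previous block's hash, changing an earlier block breaks the chain, which is what makes the ledger effectively immutable.

```python
import hashlib
import json


def block_hash(previous_hash, data):
    """Hash the block contents together with the previous block's hash."""
    payload = json.dumps({"previous_hash": previous_hash, "data": data}, sort_keys=True)
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


def make_block(previous_hash, data):
    return {
        "previous_hash": previous_hash,
        "data": data,
        "hash": block_hash(previous_hash, data),
    }


def is_valid(chain):
    """Check every stored hash and every link to the previous block."""
    for i, block in enumerate(chain):
        if block["hash"] != block_hash(block["previous_hash"], block["data"]):
            return False
        if i > 0 and block["previous_hash"] != chain[i - 1]["hash"]:
            return False
    return True


genesis = make_block("0" * 64, "genesis block")
second = make_block(genesis["hash"], "Ada pays Grace 5")
chain = [genesis, second]

valid_before = is_valid(chain)
genesis["data"] = "tampered"   # alter history
valid_after = is_valid(chain)  # the stored hash no longer matches
print(valid_before, valid_after)
```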
Physical View of Data
Deals with the physical storage of data on a storage device.
Transactional Data
Encompasses all of the data contained within a single business process or unit of work, and its primary purpose is to support daily operational tasks. Ex: Sales Receipt, Airline Ticket, Packing Slip
Analytical Data
Encompasses all organizational information, and its primary purpose is to support the performing of managerial analysis tasks Ex: Product statistics, Sales Projections, Future Growth, Trends
Logical View of Data
Focuses on how individual users logically access data to meet their own particular business needs.
Hash
Function that converts an input of letters and numbers into an encrypted output of fixed length. Hashes are the links in the blockchain.
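The fixed-length property is easy to see with Python's standard `hashlib` module. SHA-256 is used here only as an example (it is the hash function Bitcoin uses); strictly speaking it produces a one-way digest rather than reversible encryption.

```python
import hashlib


def sha256_hex(text):
    """Return the SHA-256 digest of text as a hexadecimal string."""
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


short_digest = sha256_hex("a")
long_digest = sha256_hex("a" * 10_000)

# Inputs of very different sizes produce outputs of the same length.
print(len(short_digest), len(long_digest))
```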
Real-time data
Immediate, up-to-date data.
Reasons Business Analysis Is Difficult from Operational Databases
Inconsistent Data Definitions = Every department had its own method for recording data, so when trying to share information, data did not match and users did not get the data they really needed.
Lack of Data Standards = Managers needed to perform cross-functional analysis using data from all departments, which differed in granularities, formats, and levels.
Poor Data Quality = The data, if available, were often incorrect or incomplete. Therefore, users could not rely on the data to make decisions.
Inadequate Data Usefulness = Users could not get the data they needed; what was collected was not always useful for intended purposes.
Ineffective Direct Data Access = Most data stored in operational databases did not allow users direct access; users had to wait to have their queries or questions answered by MIS professionals who could code SQL.
Data warehouse
Logical collection of data, gathered from many different operational databases, that supports business analysis activities and decision-making tasks.
Database
Maintains data about various types of objects (inventory), events (transactions), people (employees), and places (warehouses).
Data integrity
Measure of the quality of data
Dataset
Organized collection of data
Ledger
Records classified and summarized transactional data
Data Granularity
Refers to the extent of detail within the data (fine and detailed or coarse and abstract)
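As a small illustration with made-up numbers, the same sales data can be held at a fine granularity (one amount per day) or rolled up to a coarse granularity (one amount per month):

```python
from collections import defaultdict

# Fine granularity: one row per day (hypothetical figures).
daily_sales = [("2024-01-03", 120.0), ("2024-01-17", 80.0), ("2024-02-05", 50.0)]

# Coarse granularity: roll the daily rows up to monthly totals.
monthly = defaultdict(float)
for day, amount in daily_sales:
    month = day[:7]  # "YYYY-MM"
    monthly[month] += amount

monthly_sales = dict(monthly)
print(monthly_sales)
```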
Proof-of-work
Requirement to define an expensive computer calculation, also called mining, that needs to be performed in order to create a new group of trustless transactions (blocks) on the distributed ledger or blockchain.
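A simplified sketch of the "expensive calculation": keep trying nonce values until the SHA-256 hash of the block data plus the nonce starts with a required number of zeros. The `mine` function and the `difficulty` parameter are illustrative names, and real networks use a far harder target than three hex zeros.

```python
import hashlib


def mine(data, difficulty=3):
    """Search for a nonce whose hash starts with `difficulty` zero hex digits."""
    prefix = "0" * difficulty
    nonce = 0
    while True:
        digest = hashlib.sha256(f"{data}{nonce}".encode("utf-8")).hexdigest()
        if digest.startswith(prefix):
            return nonce, digest
        nonce += 1


nonce, digest = mine("block of transactions")
print(nonce, digest)
```

Finding the nonce takes many attempts, but anyone can verify it with a single hash, which is what makes the work easy to check and costly to fake.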
Relational Integrity Constraints
Rules that enforce basic and fundamental information-based constraints
Integrity Constraints
Rules that help ensure the quality of data
Immutable
Simply means unchangeable
Data Element (Data Field)
Smallest or basic unit of data
Entity (Table)
Stores data about a person, place, thing, transaction, or event.
Relational Database Model
Stores data in the form of logically related two-dimensional tables.
Type:
Transactional and Analytical
4 Primary Traits of the Value of Data
Type, Quality, Timeliness, Governance
Proof-of-stake
Way to validate transactions and achieve a distributed consensus
Database Reflection
While a database only has one physical view, it can easily support multiple logical views that provide for flexibility.
Identity Management
a broad administrative area that deals with identifying individuals in a system (such as a country, a network, or an enterprise) and controlling their access to resources within that system by associating user rights and restrictions with the established identity
Data artist
a business analytics specialist who uses visual tools to help people understand complex data
Data Lake
a storage repository that holds a vast amount of raw data in its original format until the business needs it
Data map
a technique for establishing a match, or balance, between the source data and the target data warehouse
Bitcoin
a type of digital currency in which a record of transactions is maintained and new units of currency are generated by the computational solution of mathematical problems, and which operates independently of a central bank.
Relational Database Management System
allows users to create, read, update, and delete data in a relational database
Data-driven decision management
approach to business governance that values decisions that can be backed up with verifiable data
Structured Query Language (SQL)
asks users to write lines of code to answer questions against a database
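For example, the business question "how many products are in each category, and what is their average price?" can be written as a few lines of SQL. The sketch below runs the query through Python's built-in `sqlite3` module against a hypothetical `product` table with made-up rows.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE product (name TEXT, category TEXT, price REAL)")
conn.executemany(
    "INSERT INTO product VALUES (?, ?, ?)",
    [("Pen", "Office", 1.5), ("Desk", "Furniture", 150.0), ("Chair", "Furniture", 85.0)],
)

# The SQL itself: count the items and average the price per category.
query = """
SELECT category, COUNT(*) AS items, AVG(price) AS avg_price
FROM product
GROUP BY category
ORDER BY category
"""
result = conn.execute(query).fetchall()
print(result)
```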
Comparative Analysis
can compare two or more data sets to identify patterns and trends
Data cube
common term for the representation of multidimensional information
Data Dictionary
compiles all of the metadata about the data elements in the data model
Raw Data
data that has not been processed for use
Business Rule
defines how a company performs certain aspects of its business and typically results in either a yes/no or true/false answer
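Because a business rule resolves to yes/no, it maps naturally onto a function that returns `True` or `False`. The rule below (ship only if the account is in good standing and the order is within the credit limit) is a made-up example, as are the function and parameter names.

```python
def may_ship_order(order_total, credit_limit, account_in_good_standing):
    """Hypothetical rule: ship only within the credit limit and in good standing."""
    return account_in_good_standing and order_total <= credit_limit


print(may_ship_order(250.0, 1000.0, True))   # within limit, good standing
print(may_ship_order(2500.0, 1000.0, True))  # over the credit limit
```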
Data models
logical data structures that detail the relationships among data elements using graphics or pictures
Data Visualization Tools
move beyond Excel graphs and charts into sophisticated analysis techniques such as controls, instruments, maps, time-series graphs, and more
Data integrity Issues
occur when a system produces incorrect, inconsistent, or duplicate data
Data Gap analysis
occurs when a company examines its data to determine if it can meet business expectations, while identifying possible data gaps or where missing data might exist
Data Inconsistency
occurs when the same data element has different values
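A short sketch of detecting this, assuming two hypothetical systems (a CRM and a billing system) that each store an email address per customer ID:

```python
# The same data element (customer email) stored in two hypothetical systems.
crm = {"C001": "ada@example.com", "C002": "grace@example.com"}
billing = {"C001": "ada@example.com", "C002": "g.hopper@example.com"}

# Customers present in both systems whose values disagree.
inconsistent = sorted(
    cid for cid in crm.keys() & billing.keys() if crm[cid] != billing[cid]
)
print(inconsistent)
```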
Analysis Paralysis
occurs when the user goes into an emotional state of over-analyzing (or over-thinking) a situation so that a decision or action is never taken, in effect paralyzing the outcome
Infographics
present the results of data analysis, displaying the patterns, relationships, and trends in a graphical format