Chapter 5: Data & Knowledge Management
Historical data
(In data warehouses) used for identifying trends, forecasting, and making comparisons over time.
Instance
(of an entity class) is the representation of a particular entity.
Attribute
A particular characteristic or quality of a particular entity.
Data Standardization
A business rules engine that ensures that data conforms to quality rules.
Primary Key
A field that uniquely identifies a record.
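A minimal sketch of how a primary key enforces uniqueness, using Python's built-in sqlite3 module. The table name `student` and its columns are hypothetical and chosen only for illustration.

```python
import sqlite3

# In-memory database; the table and column names are assumptions for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO student VALUES (1, 'Ada')")

try:
    # Reusing the key value 1 violates the primary-key constraint.
    conn.execute("INSERT INTO student VALUES (1, 'Grace')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

# Only the first record was stored.
row_count = conn.execute("SELECT COUNT(*) FROM student").fetchone()[0]
```

Because the second insert reuses an existing key value, the database refuses it, so each key value still points to exactly one record.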
Field
A group of logically related characters (e.g., a word, small group of words, or identification number).
Record
A group of logically related fields (e.g., student in a university database).
Database
A group of logically related files.
File
A group of logically related records.
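The field, record, file, and database definitions nest inside one another. A rough sketch of that hierarchy with plain Python structures follows; all names and values are made up for illustration.

```python
# A record is a group of related fields (hypothetical student data).
record_1 = {"student_id": 1001, "name": "Ada Lopez", "major": "MIS"}
record_2 = {"student_id": 1002, "name": "Ben Chen", "major": "Finance"}

# A file is a group of related records.
student_file = [record_1, record_2]
course_file = [{"course_id": "MIS101", "title": "Intro to IS"}]

# A database is a group of related files.
database = {"students": student_file, "courses": course_file}

# Drill down: database -> file -> record -> field.
field_value = database["students"][0]["name"]
```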
Data Warehouse
A repository of historical data organized by subject to support decision makers in the organization.
DBMS (Database Management System)
A set of programs that provide users with tools to add, delete, access, and analyze data stored in one location. Minimizes the following problems: data redundancy, data isolation, and data inconsistency. Maximizes: data security, data integrity, and data independence. DBMSs provide all users with access to all the data that they need to do their jobs.
Data Security
Allows users to access only data that they are authorized to access.
Query by Example (QBE)
Allows users to fill out a grid or template to construct a sample or description of the data they want.
Structured Query Language (SQL)
Allows users to perform complicated searches by using relatively simple statements or keywords.
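As an illustration of a search expressed with a few SQL keywords (SELECT, FROM, WHERE, ORDER BY), here is a small sketch run through Python's built-in sqlite3 module. The `student` table, its columns, and the sample rows are assumptions made for the example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE student (student_id INTEGER PRIMARY KEY, name TEXT, major TEXT, gpa REAL)"
)
conn.executemany(
    "INSERT INTO student VALUES (?, ?, ?, ?)",
    [(1, "Ada", "MIS", 3.8), (2, "Ben", "Finance", 3.1), (3, "Cy", "MIS", 3.5)],
)

# Find MIS majors with a GPA of at least 3.4, sorted by name.
honor_students = conn.execute(
    "SELECT name FROM student WHERE major = ? AND gpa >= ? ORDER BY name",
    ("MIS", 3.4),
).fetchall()
```

The query describes what data is wanted, not how to retrieve it; the database decides how to carry out the search.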
Difficulties in Managing Data
The amount of data increases exponentially (the data deluge). Data are scattered and collected by many individuals using various methods and devices. Data come from many sources. Data degrade over time (data rot). Data security, quality, and integrity are critical.
Data Governance
An approach to managing data and information across an entire organization.
Intellectual Capital (or intellectual assets)
Another term often used for knowledge.
Data Isolation
Applications cannot access data associated with other applications. Data can be accessed by one program or system only.
Identifiers
Attributes that are unique to an entity instance. Every entity instance has them.
Relational Database Model
Based on the concept of two-dimensional tables.
ER Diagrams
Consist of entities, attributes, and relationships. Entities are drawn as boxes and relationships as diamonds.
Matching or Linking
Compares data so that similar, but slightly different records can be aligned. Matching may use "fuzzy logic" to find duplicates in the data.
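One simple way to approximate this kind of matching is string similarity scoring. The sketch below uses `difflib.SequenceMatcher` from the Python standard library as a stand-in for "fuzzy logic"; the sample records and the 0.85 threshold are assumptions, and real matching tools typically use more elaborate rules.

```python
from difflib import SequenceMatcher

def similarity(a: str, b: str) -> float:
    """Return a ratio in [0, 1]; 1.0 means the normalized strings are identical."""
    return SequenceMatcher(None, a.lower().strip(), b.lower().strip()).ratio()

THRESHOLD = 0.85  # assumed cutoff; tune for real data

# Hypothetical name-and-address records.
records = [
    "Jon Smith, 12 Main St",
    "John Smith, 12 Main St.",
    "Mary Jones, 4 Elm Ave",
]

# Compare every pair once and keep the pairs that look like the same entity.
likely_duplicates = [
    (a, b)
    for i, a in enumerate(records)
    for b in records[i + 1:]
    if similarity(a, b) >= THRESHOLD
]
```

Here the two "Smith" records differ by only a couple of characters, so they are flagged as a likely duplicate pair, while the "Jones" record is not.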
Knowledge Management System Cycle
Create knowledge, capture knowledge, refine knowledge, store knowledge, manage knowledge, disseminate knowledge.
Examples of Data Sources
Credit card swipes, RFID tags, digital video surveillance, e-mails, radiology scans, blogs, and sales transactions (at stores and in e-commerce).
Tacit Knowledge
Cumulative store of subjective or experiential learning. Examples: experiences, insights, expertise, know-how, trade secrets, understanding, skill sets, and learning.
Entity-Relationship (ER) Modeling
Database designers plan the database design in a process called...
Data Model
Diagram that represents the entities in the database and their relationships.
Byte
A group of eight bits that represents a single character (e.g., a letter, number, or symbol).
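A quick sketch of the bit and byte relationship, assuming ASCII encoding, where each character fits in one byte (other encodings, such as UTF-8, may use more than one byte for some characters).

```python
char = "A"
encoded = char.encode("ascii")     # a bytes object holding one byte
bits = format(encoded[0], "08b")   # the byte written as eight binary digits

print(len(encoded), bits)          # prints: 1 01000001
```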
Benefits of Data Warehousing
End users can access data quickly and easily via Web browsers because the data are located in one place. End users can conduct extensive analysis with data in ways that may not have been possible before. End users have a consolidated view of organizational data.
Data Integrity
Ensures that data is correct and cannot be corrupted by unauthorized users.
Geocoding
For name and address data. Corrects data to U.S. and worldwide postal standards.
Entity Classes
Groups of entities of a certain type.
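The entity class, instance, attribute, and identifier terms map loosely onto a class and its objects. The following is an analogy sketched in Python, not a database definition; the `Student` class and its values are hypothetical.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class Student:          # entity class: all students
    student_id: int     # identifier: unique to each instance
    name: str           # attribute
    major: str          # attribute

# Two instances of the Student entity class.
ada = Student(1001, "Ada Lopez", "MIS")
ben = Student(1002, "Ben Chen", "Finance")

# Identifiers distinguish one instance from another.
identifiers = {s.student_id for s in (ada, ben)}
```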
Online Transaction Processing (OLTP)
In contrast to OLAP, typically involves a database, where data from business transactions are processed online as soon as they occur.
Knowledge
Information that is contextual, relevant, and actionable.
Data Profiling
Initially assesses the data to understand its quality challenges.
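A very small sketch of what an initial profiling pass might compute: row count, missing values per field, and exact duplicate rows. The sample records and the `profile` helper are hypothetical; commercial profiling tools report far more than this.

```python
# Hypothetical customer records with typical quality problems.
records = [
    {"name": "Ada", "zip": "30301"},
    {"name": "", "zip": "30301"},      # missing name
    {"name": "Ben", "zip": None},      # missing zip
    {"name": "Ada", "zip": "30301"},   # exact duplicate of the first row
]

def profile(rows):
    """Summarize basic quality indicators for a list of dict records."""
    fields = list(rows[0].keys())
    # Count empty or None values per field.
    missing = {f: sum(1 for r in rows if not r.get(f)) for f in fields}
    # Rows that repeat an earlier row exactly.
    distinct = {tuple(r[f] for f in fields) for r in rows}
    return {
        "rows": len(rows),
        "missing": missing,
        "duplicate_rows": len(rows) - len(distinct),
    }

report = profile(records)
```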
Online Analytical Processing (OLAP)
Involves the analysis of accumulated data by end users (usually in a data warehouse).
Monitoring
Keeps track of data quality over time, reports variations in the quality of data, and auto-corrects the variations based on pre-defined business rules.
Normalization
Method for analyzing and reducing a relational database to its most streamlined form for minimum redundancy, maximum data integrity, and best processing performance. When data are normalized, attributes in the table depend only on the primary key. A systematic way of ensuring that a database structure is suitable for general-purpose querying and free of certain undesirable characteristics (insertion, update, and deletion anomalies) that could lead to a loss of data integrity.
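To make the idea concrete, here is a sketch that splits a flat order list, where customer details repeat on every row, into two tables in which each attribute depends only on its table's key. The data and field names are invented for illustration, and plain dictionaries stand in for database tables.

```python
# Hypothetical denormalized rows: the customer name repeats on every order.
orders_flat = [
    {"order_id": 1, "customer_id": 10, "customer_name": "Ada", "item": "Laptop"},
    {"order_id": 2, "customer_id": 10, "customer_name": "Ada", "item": "Mouse"},
    {"order_id": 3, "customer_id": 11, "customer_name": "Ben", "item": "Monitor"},
]

# Customer table: customer_name depends only on the key customer_id.
customers = {
    row["customer_id"]: {"customer_name": row["customer_name"]}
    for row in orders_flat
}

# Order table: keyed by order_id; the customer is referenced by key only.
orders = {
    row["order_id"]: {"customer_id": row["customer_id"], "item": row["item"]}
    for row in orders_flat
}

# The name is now stored once, so a single update reaches every order.
customers[10]["customer_name"] = "Ada Lopez"
names_on_orders = [
    customers[order["customer_id"]]["customer_name"] for order in orders.values()
]
```

In the flat version the same change would have to be applied to every matching row, and missing one row would produce an update anomaly.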
Explicit Knowledge
Objective, rational, technical knowledge that has been documented. Examples: policies, procedural guides, reports, products, strategies, goals, core competencies.
Secondary keys
Other fields that have some identifying information but typically do not identify the record with complete accuracy.
Data Mining
Process of searching for valuable business information in a large database, data warehouse, or data mart.
Knowledge Management (KM)
Process that helps organizations manipulate important knowledge that is part of the organization's memory, usually in an unstructured format.
Master Data Management
Process that spans all of an organization's business processes and applications.
Knowledge Management Systems (KMSs)
Refer to the use of information technologies to systematize, enhance, and expedite intra-firm and inter-firm knowledge management.
Data Independence
Separation of data from the computer programs.
Master Data
Set of core data that span all enterprise information systems.
Data Mart
Small data warehouse, designed for the end-user needs in a strategic business unit (SBU) or a department.
Best Practices
The most effective and efficient ways of doing things.
Data Redundancy
The same data stored in many places.
Data Cube
Three dimensions: customer, product, and time.
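A rough sketch of a three-dimensional cube keyed by customer, product, and time, with a helper that totals the cube along one dimension. The sales figures and the `roll_up` function are hypothetical; OLAP tools provide this kind of aggregation at much larger scale.

```python
from collections import defaultdict

# Hypothetical sales facts: (customer, product, quarter, amount).
sales = [
    ("Acme", "Widget", "Q1", 100),
    ("Acme", "Widget", "Q2", 150),
    ("Acme", "Gadget", "Q1", 80),
    ("Zenith", "Widget", "Q1", 60),
]

# Each cell of the cube is addressed by one value from each dimension.
cube = defaultdict(int)
for customer, product, quarter, amount in sales:
    cube[(customer, product, quarter)] += amount

def roll_up(cube, axis):
    """Total the cube by the dimension at position `axis` (0, 1, or 2)."""
    totals = defaultdict(int)
    for key, amount in cube.items():
        totals[key[axis]] += amount
    return dict(totals)

by_product = roll_up(cube, 1)   # totals per product
by_quarter = roll_up(cube, 2)   # totals per quarter
```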
Data Inconsistency
Various copies of the data do not agree.
Entity
A person, place, thing, or event about which information is maintained. A record generally describes an entity.
Bit
A binary digit: a "0" or a "1".