IS312 - Ch3 - Data and Knowledge Mgt
Data Governance
An approach to managing information across an entire organization, based on a formal set of unambiguous rules for creating, collecting, handling, and protecting its information.
Sarbanes-Oxley Act of 2002 requires that:
(1) public companies evaluate and disclose the effectiveness of their internal financial controls (2) independent auditors for these companies agree to this disclosure.
6 Levels of the Data Hierarchy IN ORDER
1) Bit (binary digit): the smallest unit of data a computer can process; it consists only of a 0 or a 1. 2) Byte: a group of eight bits that represents a single character (letter, number, or symbol). 3) Field: a column of data containing a logical grouping of characters into a word or a small group of words (e.g., last name, social security number, etc.). 4) Record: a logical grouping of related fields in a row (e.g., student's name, the courses taken, the date, and the grade). 5) Data File: a logical grouping of related records, also called a table; similar in appearance to a spreadsheet in Excel, with multiple columns and multiple rows. 6) Database: a logical grouping of related data files (aka database tables).
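The six levels above can be sketched in Python. This is only an illustration: the names (record_1, grades_table, database) and the sample values are made up for the example and are not from the chapter.

```python
# Bit and byte: one ASCII character occupies one byte, i.e. eight bits.
char = "A"
bits = format(ord(char), "08b")  # the eight bits of the character's code

# Field: one named piece of data. Record: a group of related fields (one row).
record_1 = {"last_name": "Lee", "course": "IS312", "grade": "A"}
record_2 = {"last_name": "Cruz", "course": "IS312", "grade": "B"}

# Data file (table): a group of related records.
grades_table = [record_1, record_2]

# Database: a group of related data files (tables).
database = {
    "grades": grades_table,
    "students": [{"last_name": "Lee"}, {"last_name": "Cruz"}],
}

print(bits, len(grades_table), list(database))
```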
6 Steps of the Knowledge Mgt System (KMS)
1) Create knowledge: Knowledge is created as people determine new ways of doing things or develop know-how. Sometimes external knowledge is brought in. 2) Capture knowledge: New knowledge must be identified as valuable and be represented in a reasonable way. 3) Refine knowledge: New knowledge must be placed in context so that it is actionable. This is where tacit qualities (human insights) must be captured along with explicit facts. 4) Store knowledge: Useful knowledge must then be stored in a reasonable format in a knowledge repository so that other people in the organization can access it. 5) Manage knowledge: Like a library, the knowledge must be kept current. It must be reviewed regularly to verify that it is relevant and accurate. 6) Disseminate knowledge: Knowledge must be made available in a useful format to anyone in the organization who needs it, anywhere and anytime.
3 things that Database Mgt Systems MINIMIZE
1) Data Redundancy 2) Data Isolation 3) Data Inconsistency
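A small Python sketch of why these three problems tend to appear together. The two "files" and the customer values are hypothetical; a plain dictionary stands in for the single shared store a DBMS would provide.

```python
# Two departments keep their own copy of the same customer
# (redundancy), and neither application can see the other's file (isolation).
billing_file = {"C001": {"name": "Lee", "phone": "555-0100"}}
shipping_file = {"C001": {"name": "Lee", "phone": "555-0100"}}

# Only billing is updated, so the two copies now disagree (inconsistency).
billing_file["C001"]["phone"] = "555-0199"
copies_disagree = billing_file["C001"]["phone"] != shipping_file["C001"]["phone"]

# With one shared store there is a single copy to update,
# and every application reads the same value.
customers = {"C001": {"name": "Lee", "phone": "555-0100"}}
customers["C001"]["phone"] = "555-0199"
billing_view = customers["C001"]
shipping_view = customers["C001"]
views_agree = billing_view["phone"] == shipping_view["phone"]

print(copies_disagree, views_agree)
```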
3 things that Database Mgt Systems MAXIMIZE
1) Data Security 2) Data Integrity 3) Data Independence
3 Types of data sources
1) Internal Data Sources (e.g., corporate databases and company documents) 2) Personal Data Sources (e.g., personal thoughts, opinions, and experiences) 3) External Data Sources (e.g., commercial databases, government reports, and corporate Web sites).
Using Big Data
1) Making Big Data available to stakeholders 2) Conducting experiments 3) Micro-segmentation of customers 4) Creating new business models 5) Analyzing more data than ever before; no need to sample small populations.
7 Parts of the Relational Database Model / Database Mgt Sys (DBMS)
Based on the concept of two-dimensional tables; usually designed as a number of related tables, each of which contains records (listed in rows) and attributes (listed in columns). 1) Data Model: a diagram that represents entities in the database and their relationships. 2) Entity: a person, place, thing, or event (e.g., a customer, an employee, or a product). 3) Record: generally describes an entity; an instance of an entity refers to each row in a relational table. 4) Attribute: each characteristic or quality of a particular entity. 5) Primary Key: a field in a database that uniquely identifies each record so that it can be retrieved, updated, and sorted. 6) Secondary Key: a field that has some identifying information but typically does not identify the record with complete accuracy, and therefore cannot serve as the Primary Key. 7) Foreign Key: a field (or group of fields) in one table that uniquely identifies a row of another table. It is used to establish and enforce a link between two tables.
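The three kinds of keys can be tried out with Python's built-in sqlite3 module. This is a minimal sketch, not the textbook's example: the student and enrollment tables, their columns, and the sample rows are assumptions made up for illustration, and the index on major stands in for a secondary key.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# SQLite only enforces foreign keys when this pragma is switched on.
conn.execute("PRAGMA foreign_keys = ON")

# student_id is the primary key: it uniquely identifies each record.
conn.execute("""
    CREATE TABLE student (
        student_id INTEGER PRIMARY KEY,
        last_name  TEXT NOT NULL,
        major      TEXT
    )
""")
# major acts as a secondary key: useful for lookups, but not unique.
conn.execute("CREATE INDEX idx_student_major ON student (major)")

# enrollment.student_id is a foreign key that points at one row in student.
conn.execute("""
    CREATE TABLE enrollment (
        enrollment_id INTEGER PRIMARY KEY,
        student_id    INTEGER NOT NULL REFERENCES student (student_id),
        course        TEXT NOT NULL
    )
""")

conn.execute("INSERT INTO student VALUES (1, 'Lee', 'IS')")
conn.execute("INSERT INTO student VALUES (2, 'Cruz', 'IS')")
conn.execute("INSERT INTO enrollment VALUES (10, 1, 'IS312')")

# The primary key returns exactly one record; the secondary key may return several.
by_primary_key = conn.execute(
    "SELECT last_name FROM student WHERE student_id = 1"
).fetchall()
by_secondary_key = conn.execute(
    "SELECT last_name FROM student WHERE major = 'IS' ORDER BY last_name"
).fetchall()

# The link is enforced: an enrollment for a nonexistent student is rejected.
try:
    conn.execute("INSERT INTO enrollment VALUES (11, 99, 'IS312')")
    foreign_key_enforced = False
except sqlite3.IntegrityError:
    foreign_key_enforced = True

print(by_primary_key, by_secondary_key, foreign_key_enforced)
```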
Big Data
DIVERSE, HIGH-VOLUME, HIGH-VELOCITY information assets that require new forms of processing to enable enhanced decision making, insight discovery, and process optimization. Can come from untrusted sources, can be dirty, and can be volatile. NoSQL (not only SQL) databases organize this data. Can use distributed processing.
Managing Big Data
First Step: integrate information silos into a database environment and develop data warehouses for decision making. Second Step: make sense of the organization's proliferating data. Information Silo: an information system that does not communicate with other, related information systems in an organization.
Data Warehouse (and data mart)
Repository of historical data (a data mart is a mini warehouse). Uses Online Analytical Processing (OLAP). Data is: Integrated - Data are collected from multiple systems and then integrated around subjects. Time variant - Data warehouses and data marts maintain historical data (i.e., data that include time as a variable). Nonvolatile - Data warehouses and data marts are nonvolatile; that is, users cannot change or update the data. Multidimensional - Typically the data warehouse or mart uses a multidimensional data structure. Recall that RELATIONAL DATABASES store data in TWO DIMENSIONAL TABLES.
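A rough Python sketch of what "multidimensional" means here: each fact is located by several dimensions at once (product, region, and year in this made-up example), and an OLAP-style roll-up totals the measure along any one of them. The data and the roll_up helper are hypothetical, not part of any OLAP product.

```python
from collections import defaultdict

# Each fact: (product, region, year, units_sold). Time is one of the
# dimensions, which is what makes the data time variant.
sales_facts = [
    ("laptop", "west", 2022, 100),
    ("laptop", "east", 2022, 150),
    ("laptop", "west", 2023, 120),
    ("phone", "west", 2023, 80),
]

def roll_up(facts, dimension_index):
    """Total units_sold along one dimension (0=product, 1=region, 2=year)."""
    totals = defaultdict(int)
    for fact in facts:
        totals[fact[dimension_index]] += fact[3]
    return dict(totals)

by_product = roll_up(sales_facts, 0)
by_region = roll_up(sales_facts, 1)
by_year = roll_up(sales_facts, 2)

print(by_product, by_region, by_year)
```

The facts are only read, never updated, which mirrors the nonvolatile property.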
Generic Data Warehouse Environment
Source Systems: systems that provide a source of organizational data. Data Integration: reflects the growing number of ways that source system data can be handled. Typically organizations need to Extract, Transform, and Load (ETL) data from source systems into a data warehouse or data mart. Storing the Data: a variety of architectures can be used to store decision-support data; the most common architecture is one central enterprise data warehouse, without data marts. Metadata: data maintained about the data within the data warehouse (e.g., database, table, and column names; refresh schedules; and data-usage measures). Data Quality: the quality of the data in the warehouse must meet users' needs. If it does not, users will not trust the data and ultimately will not use it. Some of the data can be improved with data-cleansing software, but the better, long-term solution is to improve the quality at the source system level. Governance: to ensure that BI is meeting their needs, organizations must implement governance to plan and control their BI activities. Governance requires that people, committees, and processes be in place. Users: there are many potential BI users, including IT developers; frontline workers; analysts; information workers; managers and executives; and suppliers, customers, and regulators. Common Examples of Source Systems Include: operational/transactional systems, enterprise resource planning (ERP) systems, Web site data, third-party data (e.g., customer demographic data), and operational databases.
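The Extract, Transform, and Load (ETL) step can be sketched in a few lines of Python. Everything here is an assumption made for illustration: an inline CSV string stands in for a source system, an in-memory SQLite database stands in for the warehouse, and the cleansing rules (trim and lowercase names, skip rows with no amount) are just examples of data cleansing.

```python
import csv
import io
import sqlite3

# Hypothetical export from a source system, with some dirty rows.
SOURCE_CSV = """customer,amount,date
 alice ,10.50,2023-01-05
BOB,20.00,2023-01-06
alice,,2023-01-07
"""

def extract(text):
    """Extract: read raw rows from the source system export."""
    return list(csv.DictReader(io.StringIO(text)))

def transform(rows):
    """Transform: cleanse names and drop rows with a missing amount."""
    clean = []
    for row in rows:
        if not row["amount"]:
            continue  # unusable row, left out of the warehouse
        clean.append(
            (row["customer"].strip().lower(), float(row["amount"]), row["date"])
        )
    return clean

def load(rows, conn):
    """Load: append the cleansed rows to the warehouse table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL, sale_date TEXT)"
    )
    conn.executemany("INSERT INTO sales VALUES (?, ?, ?)", rows)
    conn.commit()

warehouse = sqlite3.connect(":memory:")
load(transform(extract(SOURCE_CSV)), warehouse)
loaded = warehouse.execute(
    "SELECT customer, amount FROM sales ORDER BY sale_date"
).fetchall()
print(loaded)
```

As the Data Quality note says, cleansing during the transform step is a stopgap; fixing the data at the source system is the better long-term solution.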
Data File
a collection of logically related records.
Master Data
a set of core data (e.g., customer, product, employee, vendor, geographic location, etc.) that spans the enterprise information systems.
Master Data Mgt
a strategy for data governance involving a process that spans all organizational business processes and applications, providing companies with the ability to store, maintain, exchange, and synchronize a consistent, accurate, and timely single version of the truth for the company's master data.