IS chapter 3

benefits of data warehousing

-access data quickly and easily via web browsers because they are located in one place -conduct extensive analysis with data in ways that may not have been possible before -have a consolidated view of organizational data

characteristics of high quality info include

-accuracy -completeness -consistency -uniqueness -timeliness

data warehouse

-captures organizational-level data and information -data are cleaned -data do not change -used for mining the data for historical trends and patterns

master data management

a strategy for data governance involving a process that spans all organizational business processes and applications, providing companies with the ability to store, maintain, exchange, and synchronize a consistent, accurate, and timely "single version of the truth" for the company's master data.

relational database model

based on the concept of two-dimensional tables; it is usually designed with a number of related tables, each of which contains records (listed in rows) and attributes (listed in columns)
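
A minimal sketch of the idea, assuming Python's built-in sqlite3 module and a hypothetical student/enrollment schema that is not taken from the chapter. Each inserted row is a record, each column is an attribute, and the two tables are related through a shared column:

```python
import sqlite3

# Hypothetical schema for illustration: two related two-dimensional tables.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE student (student_id INTEGER PRIMARY KEY, last_name TEXT)")
conn.execute("CREATE TABLE enrollment (student_id INTEGER, course TEXT)")

# Each inserted row is one record; each column is one attribute.
conn.executemany("INSERT INTO student VALUES (?, ?)", [(1, "Lopez"), (2, "Chen")])
conn.executemany(
    "INSERT INTO enrollment VALUES (?, ?)",
    [(1, "IS 101"), (1, "ACCT 201"), (2, "IS 101")],
)

# Relate the tables through the shared student_id column.
rows = conn.execute(
    "SELECT s.last_name, e.course FROM student s "
    "JOIN enrollment e ON s.student_id = e.student_id "
    "ORDER BY s.last_name, e.course"
).fetchall()
print(rows)
```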

variety

big data formats change rapidly and can include satellite imagery, broadcast audio streams, digital music files, and web page content

what is the order of the data hierarchy

bit, byte, field, record, data file, database (smallest to largest)
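
One way to picture the hierarchy is to build each level from the one below it. The sketch below uses plain Python values; the names and sample data are hypothetical and only illustrate the nesting:

```python
# Bit -> byte: eight bits encode one character (ASCII "A" is 01000001).
bits = "01000001"
byte_char = chr(int(bits, 2))

# Byte -> field: characters grouped into a meaningful word.
field = "Lopez"

# Field -> record: related fields describing one entity.
record = {"last_name": field, "course": "IS 101", "grade": "A"}

# Record -> data file (table): a collection of related records.
data_file = [record, {"last_name": "Chen", "course": "IS 101", "grade": "B"}]

# Data file -> database: a collection of related files (tables).
database = {
    "grades": data_file,
    "courses": [{"course": "IS 101", "title": "Intro to IS"}],
}
```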

explicit

can be articulated and written

volume

the sheer amount of data; big data sets are extremely large

database management system

-a set of programs that provide users with tools to create and manage a database -used to store, delete, access, and analyze data -provides security -maintains a data dictionary (metadata)

The Big Data Institute (TBDI) defines big data as data sets that

-exhibit variety -include structured, unstructured, and semi-structured data -are generated at high velocity with an uncertain pattern -do not fit neatly into traditional structured relational databases -can be captured, processed, transformed, and analyzed in a reasonable amount of time only by sophisticated information systems.

sources of error of information include

-intentionally inaccurate information to protect privacy -different entry standards and formats -abbreviated or erroneous information by accident or to save time -external information contains inconsistencies, inaccuracies and errors

normalization

-minimum redundancy -maximum data integrity -best processing performance
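
A rough sketch of what reducing redundancy looks like, using hypothetical order data in plain Python rather than a real DBMS. The customer's city repeats on every order in the flat version; after the split it is stored once per customer:

```python
# Hypothetical unnormalized rows: the customer's city repeats on every order.
flat_orders = [
    {"order_id": 1, "customer_id": 10, "customer_city": "Austin", "item": "pen"},
    {"order_id": 2, "customer_id": 10, "customer_city": "Austin", "item": "ink"},
    {"order_id": 3, "customer_id": 11, "customer_city": "Boise", "item": "pad"},
]

# Customer attributes move to their own table, keyed by customer_id.
customers = {row["customer_id"]: {"city": row["customer_city"]} for row in flat_orders}

# Orders keep only attributes that depend on order_id, plus the customer_id link.
orders = [
    {"order_id": r["order_id"], "customer_id": r["customer_id"], "item": r["item"]}
    for r in flat_orders
]
```

A change of city now touches one customer entry instead of every order row, which is where the integrity benefit comes from.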

data warehouse and data marts characteristics

-organized by business dimension or subject -use online analytical processing (OLAP) -integrated -time variant -nonvolatile -multidimensional

what does big data consist of

-traditional enterprise data -machine generated / sensor data -social data -images captured by billions of devices located throughout the world

data mart

A low-cost, scaled-down version of a data warehouse that is designed for the end-user needs in a strategic business unit (SBU) or a department.

knowledge management KM

A process that helps organizations identify, select, organize, disseminate, transfer, and apply information and expertise that are part of the organization's memory and that typically reside within the organization in an unstructured manner.

Master Data

A set of core data, such as customer, product, employee, vendor, geographic location, and so on, that spans an enterprise's information systems.

storing the data

A variety of architectures can be used to store decision-support data and the most common architecture is one central enterprise data warehouse, without data marts.

integrated

Data are collected from multiple systems and then integrated around subjects.

unstructured data

Data that do not reside in a fixed location, such as text documents, PDFs, voice messages, and emails.

nonvolatile

Data warehouses and data marts are nonvolatile—that is, users cannot change or update the data.

time variant

Data warehouses and data marts maintain historical data (i.e., data that include time as a variable).

create knowledge

Knowledge is created as people determine new ways of doing things or develop know-how. Sometimes external knowledge is brought in.

disseminate knowledge

Knowledge must be made available in a useful format to anyone in the organization who needs it, anywhere and anytime.

manage knowledge

Like a library, the knowledge must be kept current. It must be reviewed regularly to verify that it is relevant and accurate.

Capture Knowledge

New knowledge must be identified as valuable and be represented in a reasonable way.

Refine Knowledge

New knowledge must be placed in context so that it is actionable. This is where tacit qualities (human insights) must be captured along with explicit facts.

Knowledge Management Systems

Refer to the use of modern information technologies - the Internet, intranet, extranets, databases - to systematize, enhance, and expedite intrafirm and interfirm knowledge management.

source systems

Systems that provide a source of organizational data.

tacit knowledge

The cumulative store of subjective or experiential learning, which is highly personal and hard to formalize.

data quality

The quality of the data in the warehouse must meet users' needs

bit (binary digit)

The smallest unit of data stored in a computer. A bit can have the value of 0 or 1.

users

There are many potential BI users, including IT developers; frontline workers; analysts; information workers; managers and executives; and suppliers, customers, and regulators.

governance

To ensure that BI is meeting their needs, organizations must implement governance to plan and control their BI activities. Governance requires that people, committees, and processes be in place.

multidimensional

Typically the data warehouse or mart uses a multidimensional data structure. Recall that relational databases store data in two-dimensional tables.
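
To make the contrast with two-dimensional tables concrete, here is a small sketch of a three-dimensional structure in plain Python. The dimensions (product, region, quarter), the sales figures, and the `roll_up` helper are all hypothetical:

```python
from collections import defaultdict

# Hypothetical sales facts keyed by three business dimensions.
sales = {
    ("nuts", "East", "Q1"): 50,
    ("nuts", "West", "Q1"): 60,
    ("bolts", "East", "Q1"): 40,
    ("nuts", "East", "Q2"): 70,
}

def roll_up(cube, axis):
    """Total the measure along one dimension (0=product, 1=region, 2=quarter)."""
    totals = defaultdict(int)
    for key, amount in cube.items():
        totals[key[axis]] += amount
    return dict(totals)

by_product = roll_up(sales, 0)
by_quarter = roll_up(sales, 2)
```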

issues with big data

-untrusted data sources -big data is dirty -big data changes, especially in data streams

store knowledge

Useful knowledge must then be stored in a reasonable format in a knowledge repository so that other people in the organization can access it.

data file

a collection of logically related records

field

a column of data containing a logical grouping of characters into a word or a small group of words (e.g., last name, Social Security number)

data model

a diagram that represents entities in the database and their relationships

what is big data according to gartner.com

diverse, high-volume, high-velocity information assets that require new forms of processing to enable enhanced decision making, insight discovery, and process optimization

primary key

a field in a database that uniquely identifies each record so that it can be retrieved, updated, and sorted

foreign key

a field or group of fields in one table that uniquely identifies a row of another table; it is used to establish and enforce a link between two tables
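
A small sketch of a foreign key enforcing a link, assuming Python's built-in sqlite3 module and hypothetical customer/orders tables. Note that SQLite checks foreign keys only after `PRAGMA foreign_keys = ON`:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id))"
)
conn.execute("INSERT INTO customer VALUES (1, 'Lopez')")
conn.execute("INSERT INTO orders VALUES (100, 1)")  # valid: customer 1 exists

try:
    # Customer 99 does not exist, so the link cannot be established.
    conn.execute("INSERT INTO orders VALUES (101, 99)")
    rejected = False
except sqlite3.IntegrityError:
    rejected = True
```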

secondary key

a field that has some identifying information but typically does not identify the record with complete accuracy and therefore cannot serve as the primary key

byte

a group of 8 bits that represents a single character

database

a logical grouping of related data files (also known as database tables)

record

a logical grouping of related fields in a row (e.g., a student's name, the courses taken, the date)

data warehouse

a repository of historical data that are organized by subject to support decision makers in the organization

data governance

an approach to managing information across an entire organization, involving a formal set of unambiguous rules for creating, collecting, handling, and protecting its information

information silos

an information system that does not communicate with other related information systems in an organization

data are normalized when

attributes in each table depend only on the primary key

data file

a logical grouping of related records, also called a table; it looks similar to an Excel spreadsheet, with multiple columns and rows

entity

a person, place, thing, or event

external data sources

commercial databases, government reports, and corporate web sites

internal data sources

corporate databases and company documents

the KMS cycle consists of six steps; what are they

create knowledge, capture knowledge, refine knowledge, store knowledge, manage knowledge, disseminate knowledge

metadata

data maintained about the data within the data warehouse (e.g., database, table, and column names; refresh schedules; and data-usage measures).
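
As a rough illustration of table and column metadata, the sketch below asks SQLite to describe a hypothetical `sales` table through Python's built-in sqlite3 module. A real warehouse would track far more (refresh schedules, usage measures), which this does not attempt:

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, amount REAL, sale_date TEXT)")

# sqlite_master stores table names; PRAGMA table_info describes each column.
tables = [r[0] for r in conn.execute("SELECT name FROM sqlite_master WHERE type = 'table'")]
columns = [(r[1], r[2]) for r in conn.execute("PRAGMA table_info(sales)")]
```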

a DBMS minimizes the following problems

data redundancy, data isolation, data inconsistency

a DBMS maximizes the following

data security, data integrity, data independence from applications

what does DBMS stand for

database management system

big data is dirty

dirty data refers to inaccurate, incomplete, incorrect, duplicate, or erroneous data
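
A simple sketch of flagging dirty records, written in plain Python. The sample records and the three rules in `find_dirty` (missing email, repeated email, age outside 0 to 120) are hypothetical quality rules chosen only for illustration:

```python
records = [
    {"id": 1, "email": "a@example.com", "age": 34},
    {"id": 2, "email": None, "age": 29},             # incomplete
    {"id": 3, "email": "a@example.com", "age": 34},  # duplicate of id 1 by email
    {"id": 4, "email": "d@example.com", "age": -5},  # inaccurate
]

def find_dirty(rows):
    """Return {id: reason} for rows failing simple hypothetical quality rules."""
    problems, seen_emails = {}, set()
    for row in rows:
        if row["email"] is None:
            problems[row["id"]] = "incomplete"
        elif row["email"] in seen_emails:
            problems[row["id"]] = "duplicate"
        elif not 0 <= row["age"] <= 120:
            problems[row["id"]] = "inaccurate"
        if row["email"] is not None:
            seen_emails.add(row["email"])
    return problems

dirty = find_dirty(records)
```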

attribute

each characteristic or quality of a particular entity

what do ER diagrams consist of

entities, attributes, and relationships

database designers plan the database design in a process called

entity-relationship (ER) modeling

big data changes

especially in data streams: Organizations must be aware that data quality in an analysis can change, or the data itself can change, because the conditions under which the data are captured can change.

social data

examples are customer feedback comments; microblogging sites such as Twitter; and social media sites such as Facebook, YouTube, and LinkedIn.

traditional enterprise data

examples are customer information from customer relationship management systems, transactional enterprise resource planning data, Web store transactions, operations data, and general ledger data

Machine-generated data

examples are smart meters; manufacturing sensors; sensors integrated into smartphones, automobiles, airplane engines, and industrial machines; and trading system data

managing big data

first step: integrate information silos into a database environment and develop data warehouses for decision making; second step: make sense of the proliferating data

NoSQL database

a type of database that many organizations are turning to; it can manipulate structured as well as unstructured data, including inconsistent or missing data, providing an alternative for firms that have more and different kinds of data (big data) in addition to the traditional structured data that fit neatly into the rows and columns of a relational database
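
The sketch below imitates the document style used by some NoSQL databases with plain Python dictionaries and the standard json module. It is not a real NoSQL client, and the customer documents are hypothetical; the point is only that records need not share one fixed set of columns:

```python
import json

# Hypothetical customer documents: fields vary, and some are missing entirely.
documents = [
    {"name": "Lopez", "email": "lopez@example.com", "tags": ["vip"]},
    {"name": "Chen", "reviews": [{"stars": 5, "text": "Great"}]},
    {"name": "Okafor"},
]

# Each document serializes independently; no shared table schema is required.
stored = [json.dumps(doc) for doc in documents]

# Queries must tolerate missing fields, here via dict.get with a default.
emails = [json.loads(s).get("email", "unknown") for s in stored]
```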

cardinality

the maximum number of times an instance of one entity can be associated with an instance of another entity

modality

the minimum number of times an instance of one entity can be associated with an instance of another entity

what is online analytical processing abbreviated as

OLAP

common examples of source systems include

operational/transactional systems, enterprise resource planning (ERP) systems, Web site data, third-party data (e.g., customer demographic data), and operational databases

what's the problem with big data

organizations collect more data than they can hope to analyze and use

personal data sources

personal thoughts, opinions, and experiences

data rot

refers primarily to problems with the media on which the data are stored. Over time, temperature, humidity, and exposure to light can cause physical problems with storage media and thus make it difficult to access the data.

data integration

reflects the growing number of ways that source system data can be handled. Typically, organizations need to extract, transform, and load (ETL) data from source systems into a data warehouse or data mart.
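
The three ETL steps can be sketched in a few lines. This example assumes Python's built-in sqlite3 module stands in for the warehouse; the source rows, the `transform` function, and the `sales` table are hypothetical:

```python
import sqlite3

# Extract: rows as they might arrive from a hypothetical source system.
source_rows = [
    {"cust": " Lopez ", "amount": "19.99", "date": "2024-01-05"},
    {"cust": "CHEN", "amount": "5.00", "date": "2024-01-06"},
]

def transform(row):
    """Clean one row: trim and title-case the name, convert amount to cents."""
    return (row["cust"].strip().title(), round(float(row["amount"]) * 100), row["date"])

# Load: write the cleaned rows into a warehouse table (in-memory SQLite here).
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales (customer TEXT, amount_cents INTEGER, sale_date TEXT)")
warehouse.executemany("INSERT INTO sales VALUES (?, ?, ?)", [transform(r) for r in source_rows])

loaded = warehouse.execute("SELECT * FROM sales ORDER BY sale_date").fetchall()
```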

federal regulations of managing data

the Sarbanes-Oxley Act of 2002 requires that -public companies evaluate and disclose the effectiveness of their financial controls -independent auditors for these companies agree to this disclosure

What does SQL stand for

structured query language

tacit

knowledge that is difficult to encode and cannot be fully written down

explicit knowledge

the more objective, rational, and technical types of knowledge

velocity

the rate at which data flow into an organization, which is rapidly increasing; it is critical because it increases the speed of the feedback loop between a company and its customers

how are ER relationships described

by their cardinality and modality

clickstream data

those data that visitors and customers produce when they visit a Web site and click on hyperlinks

what can big data reveal

valuable patterns, trends, and information that were previously hidden, for example: -spotting business trends more rapidly and accurately -tracking the spread of disease -tracking crime -detecting fraud

characteristics of big data

volume, velocity, variety

how are database relationships established

with a primary key in one table that is referenced by a foreign key in a related table

