Chapter 6 MIS


entity examples

Examples: Customers, employees, parts, suppliers

in memory computing is enabled

High-speed processors, multicore processing, falling computer memory prices

Data collected by OLTP (on-line transaction processing)

Lacks tools for fast retrieval, data analysis, and does not contain historical data

Hadoop

Open-source software framework from Apache, designed for big data. Breaks a data task into sub-problems, distributes the processing to many inexpensive computer processing nodes, and combines the results into a smaller data set that is easier to analyze. Can process large quantities of any kind of data (structured transactional data, complex data, loosely structured data, unstructured audio and video data).

Data quality problems can be caused by

Redundant and inconsistent data produced by multiple systems; data input errors

Which of the following is not a step a firm might take to make sure they have a high level of data quality?

Using in-memory computing

Foreign key

a "look-up" field in a database that allows users to find related information in another database table

Data definition

a DBMS capability that specifies the structure and content of the database.

Entity

a general category representing a person, place, thing about which we store and maintain information; "subject" of the data

Field

a group of related bytes or characters representing an attribute (characteristic) about an entity (subject of the data); an individual element of data

Database

a group of related files about a specific entity (subject); ex. HR database

File

a group of related records about an entity (subject of the data); ex. Employee Benefits file, Employee Payroll file, Employee Job History file

An entity-relationship diagram

a methodology for documenting a database illustrating the relationship between various entities in the database.

In a database, each row represents __________.

a record

The select, project, and join operations

enable data from two different tables to be combined and only selected attributes to be displayed to create a report
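
The three operations can be sketched with Python's built-in sqlite3 module; the PART and SUPPLIER tables and all column names below are hypothetical examples, not from any real schema.

```python
import sqlite3

# Hypothetical PART and SUPPLIER tables held in an in-memory database.
con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, supplier_name TEXT)")
con.execute("CREATE TABLE part (part_id INTEGER PRIMARY KEY, part_name TEXT, supplier_id INTEGER)")
con.execute("INSERT INTO supplier VALUES (1, 'Acme'), (2, 'Globex')")
con.execute("INSERT INTO part VALUES (10, 'Bolt', 1), (11, 'Nut', 2)")

# join:    JOIN combines the two tables on supplier_id
# select:  WHERE keeps only the rows for the part named 'Bolt'
# project: the column list displays only part_name and supplier_name
report = con.execute(
    "SELECT part_name, supplier_name "
    "FROM part JOIN supplier ON part.supplier_id = supplier.supplier_id "
    "WHERE part_name = 'Bolt'"
).fetchall()
```

The single result row combines data from both tables while showing only the two projected attributes.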

Sequences:

events linked over time

Hadoop

is used to process large quantities of structured and transactional data

Database administration

responsible for defining and organizing the structure and content of the database, and maintaining the database.

Traditional programming

separates the data from the program code (the operations or actions that act on them)

Which of the following best describes a data manipulation language?

A data manipulation language is associated with a database management system that end users and programmers use to manipulate data in the database.

Which of the following best describes a data quality audit?

A data quality audit is a structured survey of the accuracy and level of completeness of the data in an information system

Characteristics of high quality information include:

Accurate, complete, relevant, and timely

Business Intelligence Infrastructure

Array of tools for obtaining useful information from internal and external systems and big data

Which of the following functions of an organization is responsible for information policy, as well as for data planning, data dictionary development, and monitoring data usage in the firm?

Data administration

Which of the following can be used to automatically enforce consistency among different sets of data?

Data cleansing

Big Data

Datasets with volumes so large they are beyond the ability of a typical relational DBMS to capture, store, and analyze. Often unstructured or semi-structured data from Internet and networked services and applications. Billions or trillions of records that accumulate more rapidly than traditional data. Provide more patterns and insights than smaller datasets.

data driven website characteristics

Improves access to, and updating of, information. Useful for e-commerce sites, news sites, forums and discussion groups, and subscription services.

a key field

In a database, __________ is used to uniquely identify each record for retrieval or manipulation.

Sentiment analysis

Mines online text comments or in e-mail to measure customer sentiment
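
A toy sketch of the idea: score comment text by counting words from small positive and negative word lists. Real sentiment mining uses far richer models; the word lists and comments here are invented for illustration.

```python
# Invented keyword lists for a minimal sentiment score.
POSITIVE = {"great", "love", "excellent", "happy"}
NEGATIVE = {"bad", "slow", "broken", "unhappy"}

def sentiment(comment: str) -> int:
    """Positive-word count minus negative-word count."""
    words = comment.lower().split()
    return sum(w in POSITIVE for w in words) - sum(w in NEGATIVE for w in words)

comments = ["Great service, love the product",
            "Delivery was slow and the box broken"]
scores = [sentiment(c) for c in comments]   # one score per comment
```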

Analytical Tools for business intelligence

OLAP, Data Mining, Text Mining, Web Mining

Relational database model:

Organizes data into two-dimensional tables (relations) with columns and rows. One table for each entity, e.g., CUSTOMER, SUPPLIER, PART, SALES.

Analytic Platforms

Preconfigured hardware-software systems, such as IBM's Netezza and the Oracle Big Data Appliance, designed for query processing and analytics on very large datasets. Use both relational and non-relational technology to analyze large data sets. Also include in-memory systems and NoSQL DBMS.

Normalization

Process of streamlining complex groups of data to minimize redundant data elements, minimize awkward many-to-many relationships, and increase stability and flexibility.
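
The redundancy-removal step can be sketched in plain Python: a flat order list repeats supplier data on every row, and splitting it into separate SUPPLIER and ORDER relations removes the repetition. All field names are hypothetical.

```python
# A denormalized list: supplier_name is repeated on every order row.
flat_orders = [
    {"order_id": 1, "supplier_id": "S1", "supplier_name": "Acme",   "part": "Bolt"},
    {"order_id": 2, "supplier_id": "S1", "supplier_name": "Acme",   "part": "Nut"},
    {"order_id": 3, "supplier_id": "S2", "supplier_name": "Globex", "part": "Gear"},
]

# Supplier attributes move to their own relation, keyed by supplier_id...
suppliers = {r["supplier_id"]: r["supplier_name"] for r in flat_orders}

# ...and each order keeps only supplier_id as a foreign key.
orders = [{"order_id": r["order_id"], "supplier_id": r["supplier_id"], "part": r["part"]}
          for r in flat_orders]
```

Each supplier's name is now stored exactly once, so a name change touches one row instead of many.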

Cloud Databases

Relational database engines provided by cloud computing services such as Amazon. Pricing is based on usage. Reduce investment in hardware and software. Appeal to Web-focused businesses and to small or medium-sized businesses seeking lower costs than developing and hosting in-house databases.

Referential integrity rules

Rules used by relational databases to ensure that relationships between coupled tables remain consistent
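
Enforcement can be sketched with sqlite3, where foreign-key checking must first be switched on with a PRAGMA; the tables are hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("PRAGMA foreign_keys = ON")   # SQLite enforces FKs only when enabled
con.execute("CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY)")
con.execute("CREATE TABLE part (part_id INTEGER PRIMARY KEY, "
            "supplier_id INTEGER REFERENCES supplier(supplier_id))")
con.execute("INSERT INTO supplier VALUES (1)")
con.execute("INSERT INTO part VALUES (10, 1)")       # supplier 1 exists: allowed

try:
    con.execute("INSERT INTO part VALUES (11, 99)")  # no supplier 99: rejected
    violation_blocked = False
except sqlite3.IntegrityError:
    violation_blocked = True                         # relationship kept consistent
```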

Data Mart

Smaller version of a data warehouse that contains a subset of data, usually for a single aspect of a firm's business. Used by a single department or function. More limited in scope than a data warehouse and customized to support decision making for a particular end-user group.

Database Management System (DBMS)

Software for creating, storing, organizing, and accessing data from a database; separates the logical and physical views of the data.

data definition capabilities:

Specify the structure and content of the database. Used to create the database tables and to define the characteristics of the fields in each table.

Information policy

States organization's rules for organizing, managing, storing, sharing information

Data dictionary

Stores definitions of data elements and their characteristics. A sample entry:
Name of data item: SSN
Description: Social Security Number
Size: 11 bytes
Type: alphanumeric
Format: xxx-xx-xxxx
Default value/range of allowable values: e.g., $10-12/hr for a specific job classification
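
A data-dictionary entry like the SSN example can be sketched as a Python dict, with a small check that a value conforms to the stored format; the helper function is an invented illustration.

```python
# One data-dictionary entry, modeled as a nested dict.
data_dictionary = {
    "SSN": {
        "description": "Social Security Number",
        "size": "11 bytes",
        "type": "alphanumeric",
        "format": "xxx-xx-xxxx",
    }
}

def conforms(value: str) -> bool:
    """True if value matches the dictionary's format: 'x' = digit, '-' = literal."""
    fmt = data_dictionary["SSN"]["format"]
    return len(value) == len(fmt) and all(
        c.isdigit() if f == "x" else c == f for c, f in zip(value, fmt)
    )

ok = conforms("123-45-6789")
bad = conforms("123456789")
```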

data warehouse characteristics

Supports business analysis activities, managerial decision making, and other decision-making tasks. Stores 3-10 years of historical data. Can be accessed but not altered. Is regularly updated and cleansed. Requires heavy-duty processing power and storage capacity. Uses queries to analyze past data in order to spot trends and patterns.

Online Analytical Processing (OLAP)

Supports multidimensional data analysis, enabling users to view the same data in different ways using multiple dimensions. Enables users to obtain online answers to ad hoc questions fairly rapidly.
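
The "same data, different dimensions" idea can be sketched in plain Python: one set of sales facts rolled up along two different dimensions. The data and dimension names are invented.

```python
# Invented sales facts with two dimensions (region, product) and one measure.
sales = [
    {"region": "East", "product": "Bolt", "units": 100},
    {"region": "East", "product": "Nut",  "units": 50},
    {"region": "West", "product": "Bolt", "units": 70},
]

def rollup(facts, dimension):
    """Total units along one dimension of the cube."""
    totals = {}
    for f in facts:
        totals[f[dimension]] = totals.get(f[dimension], 0) + f["units"]
    return totals

by_region = rollup(sales, "region")    # one view of the data
by_product = rollup(sales, "product")  # the same data, viewed another way
```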

database administration

The __________ functions of an organization are responsible for defining and maintaining a database. These functions are performed by a database design and management group

Which of the following statements about the power of a relational DBMS is false?

The relational database has become the primary method for organizing and maintaining data in information systems because it is so rigidly controlled.

Record

a group of related fields; a collection of attributes (characteristics) about an entity (subject of the data). There will be one record for every entity in the file (one record for every employee, part, etc.)

In a database, each column represents __________.

an attribute or a field

Data-driven web site

an interactive web site that serves as an interface to a database & is kept constantly updated and relevant to the needs of its customers/users

Data Mining

analyzes large pools of data to find hidden patterns and relationships in large databases, data warehouses or data marts and infers rules from them to predict future behavior and guide decision-making

Hadoop

breaks a big data problem down into sub-problems, distributes them among up to thousands of inexpensive computer processing nodes, and then combines the result into a smaller data set that is easier to analyze.
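
The split/distribute/combine pattern can be sketched as a toy MapReduce word count in a single process; real Hadoop distributes the sub-problems across many nodes, and the chunk data here is invented.

```python
from collections import Counter

def map_phase(chunk: str) -> Counter:
    """Sub-problem: count the words in one chunk of the data."""
    return Counter(chunk.split())

def reduce_phase(partials) -> Counter:
    """Combine the partial counts into one smaller result set."""
    total = Counter()
    for p in partials:
        total += p
    return total

chunks = ["big data big", "data big"]   # the big task, split into chunks
word_counts = reduce_phase(map_phase(c) for c in chunks)
```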

Poor data quality

can create a major obstacle to successful customer relationship management as well as serious operational and financial problems

Data manipulation language

commands used to add, delete, change and retrieve data from the database

intrusion detection system

a computer program that senses when another computer is attempting to scan its disk or otherwise gain access to it

Objects

data and actions that can be performed on the data (methods)

David creates a central database by extracting, transforming, and loading metadata from various internal and external sources of information. He plans to use this database for executing various functions such as intelligence gathering, strategic planning, and analysis. This central repository of information is referred to as a

data warehouse

Clustering:

discovering "as yet unclassified" groupings

Text Mining

discovery of patterns and relationships from large sets of unstructured data allows businesses to extract key elements from large unstructured data sets, discover patterns & relationships, and summarize the information

Primary key

field that uniquely identifies a given record (row) in a table
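
Uniqueness enforcement can be sketched with sqlite3: a second row reusing an existing primary-key value is rejected. The PART table is hypothetical.

```python
import sqlite3

con = sqlite3.connect(":memory:")
con.execute("CREATE TABLE part (part_id INTEGER PRIMARY KEY, part_name TEXT)")
con.execute("INSERT INTO part VALUES (10, 'Bolt')")

try:
    con.execute("INSERT INTO part VALUES (10, 'Nut')")  # duplicate part_id
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True   # each row keeps a unique identifier
```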

physical view:

how data are actually structured and organized; where is the data actually located? There is only 1 physical view

Attribute

specific characteristic of an entity Examples: supplier name, supplier street address, part number, zip code, employee last name

Data Warehouse

stores current and historical data of potential interest to organizational decision makers; Gathered from many different operational databases

Data quality audit

structured survey of the accuracy and completeness of data

A non-relational database management system

system for working with large quantities of structured and unstructured data that would be difficult to analyze with a relational model.

Online analytical processing (OLAP) is best defined as

the capability for manipulating and analyzing large volumes of data from multiple perspectives

Data Hierarchy

the structure and organization of data (involves bits, characters/bytes, fields, records, files, databases)

Object Oriented programming and databases

tie the data and program code together in objects and then manipulate the objects to create a program

Key field

used to identify individual records

Report generation

users can define report formats

Forecasting

uses a series of existing values to forecast future values
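
One simple way to do this, sketched here with invented sample data, is to fit a least-squares trend line to the existing values and project it one step ahead.

```python
def forecast_next(series):
    """Fit y = slope*x + intercept to the series and predict the next point."""
    n = len(series)
    xs = range(n)
    x_mean = sum(xs) / n
    y_mean = sum(series) / n
    slope = (sum((x - x_mean) * (y - y_mean) for x, y in zip(xs, series))
             / sum((x - x_mean) ** 2 for x in xs))
    intercept = y_mean - slope * x_mean
    return slope * n + intercept   # predicted value at the next time step

next_value = forecast_next([10, 12, 14, 16])   # a perfectly linear series
```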

An information policy

would specify that only selected members of the payroll and human resources department would have the right to change sensitive employee data, such as an employee's salary and social security number, and that these departments are responsible for making sure that such employee data are accurate.

Database administration

Database design and management group responsible for defining and organizing the structure and content of the database, and maintaining the database

Non-Relational Databases

Developed to handle large sets of data that are not easily organized into tables, columns, and rows

Web Mining

Discovery and analysis of useful patterns and information from the Web

Why is a relational DBMS so powerful?

Relational database products are available as cloud computing services.

In-Memory Computing -

Relies on a computer's main memory (RAM) for data storage. Accessing data in primary memory eliminates bottlenecks in retrieving and reading data from hard-disk-based databases, dramatically shortening query response times. Lowers processing costs by optimizing the use of memory and accelerating processing performance.

Data administration

Responsible for specific policies and procedures through which data can be managed as a resource

NoSQL databases: Non-relational DBMS

Use a more flexible data model. Don't require extensive structuring. Can manage unstructured data, such as social media and graphics.

Which of the following is not a step a firm might take to make sure they have a high level of data quality?

Using data mining

data dictionary

automated or manual tool for storing and organizing information about the data maintained in a database.

Data cleansing:

detects and corrects incorrect, incomplete, improperly formatted, and redundant data
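
A minimal cleansing pass, with an invented customer list: standardize spacing and capitalization, then drop the redundant duplicates that remain.

```python
# Raw records with formatting problems and duplicates.
raw = [" ada lovelace ", "Grace Hopper", "ADA LOVELACE", "Grace Hopper"]

cleaned = []
seen = set()
for name in raw:
    name = " ".join(name.split()).title()   # fix improper formatting
    if name not in seen:                    # remove redundant records
        seen.add(name)
        cleaned.append(name)
```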

Organizations perform data quality audits to __________.

determine accuracy and level of completeness of data

Logical view:

how end users view data; what data does an individual user need? There can be more than 1 logical view

The organization's rules for sharing, disseminating, acquiring, standardizing, classifying, and inventorying information are specified by

information policies

Normalization

involves the process of creating small, stable, yet flexible data structures from complex groups of data when designing a relational database.

Data mining

is a type of intelligence gathering that uses statistical techniques to explore records in a data warehouse, hunting for hidden patterns and relationships that are undetectable in routine reports.

Structure mining

mines Web site structural elements, such as links

Content mining

mines content of Web sites

Usage mining

mines user interaction data gathered by Web servers

Associations:

occurrences linked to a single event

relational database

organizes its data in the form of tables and represents relationships using foreign keys.

Classifications:

patterns describing a group an item belongs to

