DSS Chapter 6

database server

A computer in a client/server environment that is responsible for running a DBMS to process SQL statements and perform database management tasks

database administration

The database design and management group responsible for defining and organizing the structure and content of the database, and for maintaining it. Large companies also use this function.

Data Mining

Finds hidden patterns and relationships in large databases and infers rules from them to predict future behavior. It is discovery driven.

Databases and the Web 1. Web server 2. Application servers or CGI 3. Database server

Firms use the web to make information from their internal databases available to customers and partners. What makes this possible? Web interfaces provide familiarity to users and savings over redesigning legacy systems.

- Redundant and inconsistent data produced by multiple systems - Data input errors

How are data quality problems caused?

Amazon Relational Database Service

Offers MySQL, Microsoft SQL Server, and Oracle Database engines

1. Software for database querying and reporting 2. Multidimensional data analysis (OLAP) 3. Data mining

Once data are gathered, tools are required for consolidating and analyzing them, and for using the insights to improve decision making

byte

a group of bits that represents a single character, which can be a letter, a number, or another symbol
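
As a small illustration of the byte card above, the sketch below shows one character as its pattern of eight bits. It assumes ASCII encoding, and the helper name `char_to_bits` is hypothetical, introduced only for this example.

```python
def char_to_bits(ch: str) -> str:
    """Return the 8-bit pattern for one ASCII character (assumed encoding)."""
    if len(ch) != 1:
        raise ValueError("expected exactly one character")
    raw = ch.encode("ascii")      # one ASCII character becomes exactly one byte
    return format(raw[0], "08b")  # show that byte as eight binary digits (bits)


letter_bits = char_to_bits("A")   # a letter
digit_bits = char_to_bits("7")    # a number stored as a character
print(letter_bits, digit_bits)
```

Each returned string has eight positions, one per bit, which is why a byte can stand for a single letter, digit, or symbol.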

Data quality audit

a survey and/or sample of files to determine the accuracy and completeness of data in an information system. It can survey entire data files, samples from data files, or end users for their perceptions of data quality.

data dictionary

automated or manual file storing definitions of data elements and their characteristics

project

creates a subset consisting of columns in a table; permits the user to create new tables containing only the desired information

select

creates a subset of all records meeting stated criteria

clustering

discovering as yet unclassified groupings

querying and reporting

1. Data manipulation a. Structured Query Language (SQL) b. Microsoft Access query-building tools 2. Report generation, e.g., Crystal Reports

relational database

Minimizing the number of times a piece of information appears in the database does two things: a. reduces the possibility of error b. simplifies the process of updating the database

key field

A field in a record that uniquely identifies instances of that record so that it can be retrieved, updated, or sorted

field

A grouping of characters into a word, a group of words, or a complete number, such as a person's name or age.

Common Gateway Interface (CGI)

A specification for processing data on a web server; used as an alternative to an application server

1. customer behavior 2. weather patterns

Big datasets offer more patterns and insights than smaller datasets, for example in (2 examples)

1. Data Warehouse 2. Data marts 3. Hadoop 4. In-memory computing 5. Analytical platforms

Business Intelligence Infrastructure: an array of tools for obtaining useful information from internal and external systems and big data

join

Combines relational tables to provide the user with more information than is available from individual tables

big data

Data sets with volumes so huge that they are beyond the ability of a typical relational DBMS to capture, store, and analyze. The data are often unstructured or semi-structured and come in massive quantities from the Internet and other sources.

Non-relational database management systems

Database management systems for working with large quantities of structured and unstructured data that would be difficult to analyze with a relational model ("NoSQL"). They handle large data sets that are not easily organized into tables, columns, and rows. They use a more flexible data model and don't require extensive structuring. They can manage unstructured data, such as social media and graphics. E.g. Amazon's SimpleDB, MetLife's use of MongoDB

Data Warehouse

Database that stores current and historical data that may be of interest to decision makers. Consolidates and standardizes data from many systems, including operational and transactional databases. Data can be accessed but not altered.

web mining

Discovery and analysis of useful patterns and information from the web, e.g. to understand customer behavior, evaluate a website, or quantify the success of marketing. Content mining: mines the content of websites. Structure mining: mines website structural elements, such as links. Usage mining: mines user interaction data gathered by web servers.

Sentiment analysis

Mines text comments online or in email to measure customer sentiment

Hadoop

Open-source software framework for big data. Breaks a data task into sub-problems, distributes the processing to many inexpensive computer processing nodes, then combines the results into a smaller data set that is easier to analyze. Key services: Hadoop Distributed File System (HDFS) and MapReduce. You have probably used it to find the best airfare on the internet, get directions, do a search on Google, or connect on Facebook.
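
The split, distribute, and combine idea on the Hadoop card can be sketched in plain Python. This is not the Hadoop API; it only imitates the MapReduce pattern on one machine, and the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical.

```python
from collections import defaultdict


def map_phase(chunk):
    """Map step: turn one chunk of text into (word, 1) pairs."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Group all values that share the same key."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce step: combine each group into one smaller result."""
    return {key: sum(values) for key, values in groups.items()}


def word_count(chunks):
    mapped = []
    for chunk in chunks:  # in a real cluster, each chunk would go to a different node
        mapped.extend(map_phase(chunk))
    return reduce_phase(shuffle(mapped))


counts = word_count(["big data big", "data lake"])
print(counts)
```

The result is a much smaller data set (one count per word) than the input text, which mirrors the "combines the results into a smaller data set" step.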

Analytic Platforms

Preconfigured hardware-software systems designed for query processing and analytics. Use both relational and non-relational technology to analyze large data sets. Include in-memory systems and NoSQL DBMS. E.g. IBM PureData System for Analytics, which integrates database, server, and storage components. Data lakes

1. Pricing based on usage 2. Appeal to small or medium-sized businesses

Relational database engines provided by cloud computing services

in-memory computing

Relies on the computer's main memory (RAM) for data storage; users access data stored in the system's primary memory. Eliminates bottlenecks in retrieving and reading data. Dramatically shortens query response times. Enabled by high-speed processors and multicore processing. Lowers processing costs.

Data administration

Responsible for the specific policies and procedures through which data can be managed as an organizational resource; a large organization needs this function. Its responsibilities include developing information policy, planning for data, overseeing logical database design and data dictionary development, and monitoring how information systems specialists and end-user groups use data.

database management systems (DBMS)

Software for creating, storing, organizing, and accessing data from a database. Separates the logical and physical views of the data. Logical view: how end users view data. Physical view: how data are actually structured and organized. Examples: Microsoft Access, DB2, Oracle Database, Microsoft SQL Server, MySQL

attributes

Specific characteristics of each entity. E.g., the entity SUPPLIER has the attributes name and address; the entity PART has the attributes description, unit price, and supplier.

normalization

Streamlining complex groupings of data to minimize redundant data elements and awkward many-to-many relationships; the process of creating small, stable data structures from complex groups of data when designing a relational database
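
A minimal sketch of the normalization idea, assuming made-up supplier and part data: a flat list that repeats each supplier's address on every part row is split into two smaller structures so that each address is stored once. The function name `normalize` and all field names are hypothetical.

```python
# Flat, redundant data: the supplier address is repeated on every part row.
flat_rows = [
    {"part": "Door latch", "unit_price": 22.0, "supplier": "Supplier A", "supplier_address": "1 First St"},
    {"part": "Compressor", "unit_price": 70.0, "supplier": "Supplier A", "supplier_address": "1 First St"},
    {"part": "Door seal", "unit_price": 6.0, "supplier": "Supplier B", "supplier_address": "9 Ninth Ave"},
]


def normalize(rows):
    """Split flat rows into a supplier table and a part table."""
    suppliers = {}  # supplier name -> address, stored once per supplier
    parts = []      # each part keeps only a reference to its supplier
    for row in rows:
        suppliers[row["supplier"]] = row["supplier_address"]
        parts.append({
            "part": row["part"],
            "unit_price": row["unit_price"],
            "supplier": row["supplier"],
        })
    return suppliers, parts


suppliers, parts = normalize(flat_rows)
print(suppliers)
print(parts)
```

After the split, changing a supplier's address means updating one entry instead of every part row for that supplier.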

text mining

Unstructured data (mostly text files) accounts for 80 percent of an organization's useful information. Text mining allows businesses to extract key elements from, discover patterns in, and summarize large unstructured data sets. Businesses turn to it for tasks such as analyzing customer calls.

1. Associations 2. Sequences 3. Classifications 4. Clustering 5. Forecasting

What are the types of information obtainable from data mining?

database

a collection of related files containing records on people, places, or things. Databases are at the heart of information systems because they keep track of the people, places, and things that a business must deal with on a continuing, often instant basis.

record

a group of related fields, such as a student's identification number (ID), the course taken, the date, and the grade

file

a group of records of the same type

Data Manipulation Language (DML)

a language associated with a database management system that end users and programmers use to manipulate data in the database. It is used to add, change, delete, and retrieve the data in the database. It contains commands that permit end users and programming specialists to extract data from the database to satisfy information requests and develop applications.
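
The four data manipulation actions named above (add, change, delete, retrieve) can be sketched with SQL statements run through Python's built-in `sqlite3` module. The `student` table and its rows are assumptions made for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")  # throwaway in-memory database

# Data definition (not DML): set up a hypothetical table to work on.
conn.execute("CREATE TABLE student (id INTEGER PRIMARY KEY, name TEXT, grade TEXT)")

# Add
conn.execute("INSERT INTO student (id, name, grade) VALUES (1, 'Ana', 'B')")
conn.execute("INSERT INTO student (id, name, grade) VALUES (2, 'Raj', 'C')")

# Change
conn.execute("UPDATE student SET grade = 'A' WHERE id = 1")

# Delete
conn.execute("DELETE FROM student WHERE id = 2")

# Retrieve
rows = conn.execute("SELECT id, name, grade FROM student").fetchall()
print(rows)
```

Only the `INSERT`, `UPDATE`, `DELETE`, and `SELECT` statements are data manipulation; the `CREATE TABLE` statement belongs to data definition.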

data lake

a repository (a place where things are stored) for raw unstructured or structured data that for the most part have not yet been analyzed; the data can be accessed in many ways. Associated with large analytic platforms.

query

a request for information from a database

data cleansing

also known as data scrubbing. Consists of activities for detecting and correcting data in a database that are incorrect, incomplete, improperly formatted, or redundant. Enforces consistency.

3Vs 1. extreme volume 2. wide variety 3. velocity

Big data are characterized by the
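
A minimal sketch of data cleansing on a hypothetical customer list. It fixes formatting, drops incomplete or malformed records, and removes duplicates. The rules used here (an email must contain "@", duplicates are matched by email, bad records are dropped rather than repaired) are assumptions chosen for the example, and `cleanse` is a hypothetical name.

```python
def cleanse(records):
    """Return records with consistent formatting and no duplicates."""
    seen_emails = set()
    cleaned = []
    for rec in records:
        # Fix improper formatting: collapse extra spaces, standardize case.
        name = " ".join(rec.get("name", "").split()).title()
        email = rec.get("email", "").strip().lower()
        # Detect incomplete or incorrect records (assumed rule: drop them).
        if not name or "@" not in email:
            continue
        # Detect redundant records (assumed rule: same email means same person).
        if email in seen_emails:
            continue
        seen_emails.add(email)
        cleaned.append({"name": name, "email": email})
    return cleaned


raw = [
    {"name": "  ana  lopez", "email": "ANA@Example.com "},
    {"name": "Ana Lopez", "email": "ana@example.com"},  # duplicate of the first
    {"name": "", "email": "x@example.com"},             # incomplete
    {"name": "Raj", "email": "not-an-email"},           # malformed
]
clean = cleanse(raw)
print(clean)
```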

distributed databases

databases stored in multiple physical locations. E.g. Google's Spanner cloud service

referential integrity rules

ensure the relationships between coupled tables remain consistent
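
A sketch of a referential integrity rule being enforced, using Python's built-in `sqlite3` module. The `supplier` and `part` tables and their rows are hypothetical. SQLite only checks foreign keys once `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default

conn.execute("CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")
conn.execute(
    """CREATE TABLE part (
           part_id INTEGER PRIMARY KEY,
           description TEXT NOT NULL,
           supplier_id INTEGER NOT NULL REFERENCES supplier(supplier_id)
       )"""
)

conn.execute("INSERT INTO supplier VALUES (1, 'Supplier A')")
conn.execute("INSERT INTO part VALUES (100, 'Door latch', 1)")  # supplier 1 exists


def try_insert_orphan_part():
    """Try to add a part whose supplier does not exist."""
    try:
        conn.execute("INSERT INTO part VALUES (101, 'Door seal', 999)")
    except sqlite3.IntegrityError:
        return False  # the database rejected the inconsistent row
    return True


accepted = try_insert_orphan_part()
part_count = conn.execute("SELECT COUNT(*) FROM part").fetchone()[0]
print(accepted, part_count)
```

The second insert fails because no supplier 999 exists, so the two tables stay consistent with each other.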

sequences

events linked over time

columns

fields in a relational database are also called

entity

generalized category representing person, place or thing on which we store information

foreign key

a field in one table that matches the primary key of another table; essentially a lookup field, e.g. to find data about the supplier of a specific part

Structured Query Language (SQL)

the most prominent data manipulation language today; the standard data manipulation language for relational database management systems

associations

occurrences linked to a single event

select, join, project

operations of relational DBMS
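
The three operations can be sketched as SQL queries run through Python's built-in `sqlite3` module. The `supplier` and `part` tables, their columns, and their rows are hypothetical data made up for this example.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    CREATE TABLE supplier (supplier_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE part (
        part_id INTEGER PRIMARY KEY,
        description TEXT,
        unit_price REAL,
        supplier_id INTEGER
    );
    INSERT INTO supplier VALUES (1, 'Supplier A'), (2, 'Supplier B');
    INSERT INTO part VALUES
        (137, 'Door latch', 22.0, 1),
        (150, 'Door seal', 6.0, 2),
        (152, 'Compressor', 70.0, 1);
    """
)

# select: a subset of rows (records) meeting stated criteria
selected = conn.execute(
    "SELECT * FROM part WHERE part_id IN (137, 150) ORDER BY part_id"
).fetchall()

# project: a subset of columns
projected = conn.execute(
    "SELECT description, unit_price FROM part ORDER BY part_id"
).fetchall()

# join: combine two tables through their common data element, supplier_id
joined = conn.execute(
    """SELECT p.part_id, p.description, s.name
       FROM part AS p
       JOIN supplier AS s ON p.supplier_id = s.supplier_id
       WHERE p.part_id IN (137, 150)
       ORDER BY p.part_id"""
).fetchall()

print(selected, projected, joined, sep="\n")
```

The last query uses all three operations at once: it joins the tables, selects two parts, and projects three columns.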

relational database

organizes data into two-dimensional tables (relations) with columns and rows. It is the most common type of database today. It can relate data stored in one table to data in another as long as the two tables share a common data element.

classifications

patterns describing a group an item belongs to

1. One-to-one relationship 2. One-to-many relationship 3. Many-to-many relationship (requires a "join table" or intersection relation that links the two tables to join information)

Relationships that database tables may have:

bit

represents the smallest unit of data a computer can handle

tuples

rows or records in a relational database

report generator

software designed to take data from a source such as a database and use the data to produce a report in a polished format

data definition

specifies the structure of the content of a database

information policy

states the organization's rules for organizing, managing, storing, and sharing information

data mart

subset of a data warehouse (smaller and decentralized) that is highly focused and isolated for a specific population of users

Online Analytical Processing (OLAP)

supports multidimensional data analysis, enabling users to view the same data in different ways using multiple dimensions. Each aspect of information (product, pricing, cost, region, or time period) represents a different dimension. E.g., comparing sales in the East in June versus May and July. Enables users to obtain online answers to ad hoc questions such as these fairly rapidly.
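
A minimal sketch of viewing the same data along different dimensions, in plain Python rather than a real OLAP product. The sales figures, the field names, and the helper `roll_up` are all hypothetical.

```python
from collections import defaultdict

# Each fact has three dimensions (product, region, month) and one measure (units).
sales = [
    {"product": "Nuts", "region": "East", "month": "May", "units": 40},
    {"product": "Nuts", "region": "East", "month": "June", "units": 55},
    {"product": "Nuts", "region": "East", "month": "July", "units": 50},
    {"product": "Bolts", "region": "East", "month": "June", "units": 30},
    {"product": "Nuts", "region": "West", "month": "June", "units": 20},
    {"product": "Bolts", "region": "West", "month": "June", "units": 25},
]


def roll_up(facts, *dimensions):
    """Total the units for every combination of the chosen dimensions."""
    totals = defaultdict(int)
    for fact in facts:
        key = tuple(fact[d] for d in dimensions)
        totals[key] += fact["units"]
    return dict(totals)


by_region_month = roll_up(sales, "region", "month")  # e.g. East in June vs May and July
by_product = roll_up(sales, "product")               # same data, different view

print(by_region_month)
print(by_product)
```

Calling `roll_up` with a different set of dimensions answers a different ad hoc question from the same underlying facts.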

row

the actual information about a single supplier that resides in a table; rows are also called records or tuples

primary key

unique identifier for all the information in any row of a database table; it cannot be duplicated

Entity Relationship Diagram

used to clarify table relationships in a relational database

forecasting

uses series of values to forecast future values
