MIS ch 6
data artist
a business analytics specialist who uses visual tools to help people understand complex data.
big data
a collection of large, complex data sets, including structured and unstructured data, which cannot be analyzed using traditional database methods and tools.
record
a collection of related data elements
primary key
a field (or group of fields) that uniquely identifies a given record in a table. For example, using Steve Smith's unique ID allows a manager to search the database to identify all information associated with this customer.
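A minimal sketch of the primary key idea, using Python's built-in sqlite3 module. The table name, column names, and ID values are hypothetical and chosen only for illustration.

```python
import sqlite3

# Hypothetical table and values, for illustration only.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO customer (customer_id, name) VALUES (?, ?)",
    [(101, "Steve Smith"), (102, "Steve Smith")],  # same name, different keys
)

# The primary key picks out exactly one record, even when names repeat.
rows = conn.execute(
    "SELECT customer_id, name FROM customer WHERE customer_id = ?", (101,)
).fetchall()

# Reusing an existing key value is rejected by the database.
try:
    conn.execute("INSERT INTO customer (customer_id, name) VALUES (101, 'Someone Else')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True
```

Searching by name would return both Steve Smith records; searching by the key returns only one.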
data warehouse
a logical collection of information, gathered from many operational databases, that supports business analysis activities and decision-making tasks. The primary purpose of a data warehouse is to combine information, specifically strategic information, from throughout an organization into a single repository so that the people who need that information can make decisions and undertake business analysis.
information integrity
a measure of the quality of information
foreign key
a primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables
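As a rough sketch of how a foreign key links two tables, the example below stores a customer's primary key inside an orders table and then joins the two. The table names, columns, and values are assumptions made for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is on
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER REFERENCES customer(customer_id),"  # the foreign key
    " total REAL)"
)
conn.execute("INSERT INTO customer VALUES (101, 'Steve Smith')")
conn.execute("INSERT INTO orders VALUES (1, 101, 25.0)")
conn.execute("INSERT INTO orders VALUES (2, 101, 40.0)")

# The shared customer_id value is the logical relationship between the tables.
order_rows = conn.execute(
    "SELECT c.name, o.order_id, o.total"
    " FROM orders o JOIN customer c ON c.customer_id = o.customer_id"
    " ORDER BY o.order_id"
).fetchall()
```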
Extraction, transformation, and loading (ETL)
a process that extracts information from internal and external databases, transforms it using a common set of enterprise definitions, and loads it into a data warehouse.
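The three ETL steps can be sketched in a few lines of Python. The two source systems, their field names, and the warehouse table below are hypothetical; a real ETL pipeline would read from actual databases and apply the organization's own definitions.

```python
import sqlite3

# Extract: two hypothetical source systems that name and format fields differently.
sales_system = [{"cust": "Steve Smith", "amount_usd": "25.00"}]
web_system = [{"customer_name": "ANA LOPEZ", "amount_cents": 4000}]

def transform(record):
    """Map a source record onto one shared definition: (customer_name, amount_usd)."""
    if "cust" in record:
        name, amount = record["cust"], float(record["amount_usd"])
    else:
        name, amount = record["customer_name"], record["amount_cents"] / 100
    return (name.title(), amount)

# Load: write the transformed rows into a single warehouse table.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE sales_fact (customer_name TEXT, amount_usd REAL)")
warehouse.executemany(
    "INSERT INTO sales_fact VALUES (?, ?)",
    [transform(r) for r in sales_system + web_system],
)
loaded = warehouse.execute(
    "SELECT customer_name, amount_usd FROM sales_fact ORDER BY customer_name"
).fetchall()
```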
information cleansing or scrubbing
a process that weeds out and fixes or discards inconsistent, incorrect, or incomplete information.
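A small sketch of cleansing, assuming made-up customer records and simple rules (trim and normalize text, discard records with a missing name, an invalid email, or a duplicate email). Real cleansing rules depend on the organization's data.

```python
def cleanse(records):
    """Fix what can be fixed, discard what cannot; return (clean, discarded)."""
    clean, discarded, seen = [], [], set()
    for rec in records:
        name = (rec.get("name") or "").strip().title()   # fix inconsistent spacing and case
        email = (rec.get("email") or "").strip().lower()
        if not name or "@" not in email:                  # incomplete or incorrect: discard
            discarded.append(rec)
        elif email in seen:                               # duplicate of an earlier record: discard
            discarded.append(rec)
        else:
            seen.add(email)
            clean.append({"name": name, "email": email})
    return clean, discarded

raw = [
    {"name": "  steve smith ", "email": "Steve@Example.com"},
    {"name": "Steve Smith", "email": "steve@example.com"},   # duplicate
    {"name": "", "email": "nobody@example.com"},             # missing name
    {"name": "Ana Lopez", "email": "not-an-email"},          # invalid email
]
clean, discarded = cleanse(raw)
```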
data-driven website
an interactive website kept constantly updated and relevant to the needs of its customers using a database. Data-driven capabilities are especially useful when a firm needs to offer large amounts of information, products, or services.
structured query language
asks users to write lines of code to answer questions against a database
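To show what "lines of code" means here, the sketch below asks a business question (which artist sold the most albums in a given month) as an SQL query run through Python's sqlite3 module. The table, columns, and sales figures are invented for illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE album_sale (artist TEXT, sale_month TEXT, units INTEGER)")
conn.executemany(
    "INSERT INTO album_sale VALUES (?, ?, ?)",
    [("Artist A", "2024-01", 120), ("Artist B", "2024-01", 300),
     ("Artist A", "2024-01", 250), ("Artist B", "2024-02", 900)],
)

# The business question, written as SQL: total units per artist for one month,
# sorted so the best seller comes first.
query = """
    SELECT artist, SUM(units) AS total_units
    FROM album_sale
    WHERE sale_month = ?
    GROUP BY artist
    ORDER BY total_units DESC
    LIMIT 1
"""
top_artist = conn.execute(query, ("2024-01",)).fetchone()
```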
data dictionary
compiles all of the metadata about the data elements in the data model.
data mart
contains a subset of data warehouse information. To distinguish between data warehouses and data marts, think of data warehouses as having a more organizational focus and data marts as having a functional focus.
database management system (DBMS)
creates, reads, updates, and deletes data in a database while controlling access and security. Managers send requests to the DBMS, and the DBMS performs the actual manipulation of the data in the database.
attributes
data elements associated with an entity, such as customer name, distributor ID, or product price. Attributes are also called columns or fields.
physical view of information
deals with the physical storage of information on a storage device.
business rule
defines how a company performs certain aspects of its business and typically results in either a yes/no or true/false answer. Stating that merchandise returns are allowed within 10 days of purchase is an example of a business rule.
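The 10-day return rule can be written as a function that gives a true/false answer. This is only a sketch; the function name is hypothetical, and treating day 10 itself as still allowed is an assumption.

```python
from datetime import date

RETURN_WINDOW_DAYS = 10  # from the example rule: returns allowed within 10 days

def return_allowed(purchase_date, return_date):
    """Business rule as a yes/no answer. Assumes day 10 itself still counts."""
    days_since_purchase = (return_date - purchase_date).days
    return 0 <= days_since_purchase <= RETURN_WINDOW_DAYS

on_time = return_allowed(date(2024, 3, 1), date(2024, 3, 11))   # 10 days later
too_late = return_allowed(date(2024, 3, 1), date(2024, 3, 12))  # 11 days later
```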
data visualization
describes technologies that allow users to see or visualize data to transform information into a business perspective. Data visualization is a powerful way to simplify complex data sets by placing data in a format that is easily grasped and understood far quicker than the raw data alone. It also helps prevent analysis paralysis.
business critical integrity constraints
enforce business rules vital to an organization's success and often require more insight and knowledge than relational integrity constraints. Consider a supplier of fresh produce to large grocery chains such as Kroger. The supplier might implement a business-critical integrity constraint stating that no product returns are accepted after 15 days past delivery. That would make sense because of the chance of spoilage of the produce. Business-critical integrity constraints tend to mirror the very rules by which an organization achieves success.
data scientist
extracts knowledge from data by performing statistical analysis, data mining, and advanced analytics on big data to identify trends, market changes, and other relevant information.
advanced analytics
focuses on forecasting future trends and producing insights using sophisticated quantitative methods, including statistics, descriptive and predictive data mining, simulation, and optimization. Advanced analytics uses data patterns to make forward-looking predictions to explain to the organization where it is headed.
logical view of information
focuses on how individual users logically access information to meet their own particular business needs.
structured data
has a defined length, type, and format and includes numbers, dates, or strings such as Customer Address. Structured data is typically stored in a traditional system such as a relational database or spreadsheet. Machine-generated data is created by a machine without human intervention; machine-generated structured data includes sensor data, point-of-sale data, and web log data. Human-generated data is data that humans, in interaction with computers, generate; human-generated structured data includes input data, click-stream data, and gaming data.
query by example tool
helps users graphically design the answer to a question against a database
dynamic information
includes data that change based on user actions. For example, static websites supply only information that will not change until the content editor changes the information. Dynamic information changes when a user requests information. Examples include movie ticket availability, airline prices, and restaurant reservations.
static information
includes fixed data incapable of change in the event of a user action.
data models
logical data structures that detail the relationships among data elements by using graphics or pictures.
database
maintains information about various types of objects (inventory), events (transactions), people (employees), and places (warehouses).
data analysis tools
move beyond Excel graphs and charts into sophisticated analysis techniques such as controls, instruments, maps, time-series graphs, and more.
unstructured data
not defined, does not follow a specified format, and is typically free-form text such as emails, Twitter tweets, and text messages. Machine-generated unstructured data: satellite images, scientific atmosphere data, and radar data. Human-generated unstructured data: text messages, social media data, and emails.
information integrity issues
occur when a system produces incorrect, inconsistent, or duplicate data
infographics
present the results of data analysis, displaying the patterns, relationships, and trends in a graphical format. Infographics are exciting and quickly convey a story users can understand without having to analyze numbers.
distributed computing
processes and manages algorithms across many machines in a computing environment. Big data tools use distributed computing to store and analyze data across databases stored around the globe.
real time systems
provide real-time information in response to requests. Many organizations use real-time systems to uncover key corporate transactional information. The growing demand for real-time information stems from organizations' need to make faster and more effective decisions, keep smaller inventories, operate more efficiently, and track performance more carefully.
metadata
provides details about data. For example, metadata for an image could include its size, resolution, and date created. Metadata about a text document could contain document length, date created, author's name, and summary. Each data element is given a description, such as Customer Name; metadata is provided for the type of data (text, numeric, alphanumeric, date, image, binary value) and descriptions of potential predefined values such as a certain area code; and finally the relationship is defined.
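One way to picture metadata for data elements is a small data dictionary held in code. The sketch below is illustrative only: the class, field names, and area code values are assumptions, not a standard format.

```python
from dataclasses import dataclass, field

@dataclass
class ElementMetadata:
    description: str                                     # e.g. "Customer Name"
    data_type: str                                       # text, numeric, date, ...
    allowed_values: list = field(default_factory=list)   # predefined values, if any
    related_to: str = ""                                 # relationship to another element

# A tiny, hypothetical data dictionary: metadata keyed by data element name.
data_dictionary = {
    "customer_name": ElementMetadata("Customer Name", "text"),
    "area_code": ElementMetadata(
        "Customer area code", "numeric",
        allowed_values=["303", "720"], related_to="customer_name",
    ),
}

def is_valid(element, value):
    """Check a value against the predefined values recorded in the metadata."""
    allowed = data_dictionary[element].allowed_values
    return not allowed or value in allowed
```

For instance, `is_valid("area_code", "303")` passes while `is_valid("area_code", "999")` does not, because the metadata lists the permitted area codes.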
data governance
refers to the overall management of the availability, usability, integrity, and security of company data.
relational integrity constraints
rules that enforce basic and fundamental information-based constraints. For example, a relational integrity constraint would not allow someone to create an order for a nonexistent customer, provide a markup percentage that was negative, or order zero pounds of raw materials from a supplier.
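The three examples above can be sketched as database constraints in SQLite: a foreign key blocks orders for nonexistent customers, and CHECK constraints block a negative markup or a zero-pound order. The schema and values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when this is on
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    "CREATE TABLE orders ("
    " order_id INTEGER PRIMARY KEY,"
    " customer_id INTEGER NOT NULL REFERENCES customer(customer_id),"
    " markup_pct REAL CHECK (markup_pct >= 0),"
    " pounds REAL CHECK (pounds > 0))"
)
conn.execute("INSERT INTO customer VALUES (1, 'Example Customer')")

def try_insert(row):
    """Return True if the row is accepted, False if a constraint rejects it."""
    try:
        conn.execute("INSERT INTO orders VALUES (?, ?, ?, ?)", row)
        return True
    except sqlite3.IntegrityError:
        return False

results = {
    "valid order": try_insert((1, 1, 10.0, 50.0)),
    "nonexistent customer": try_insert((2, 99, 10.0, 50.0)),
    "negative markup": try_insert((3, 1, -5.0, 50.0)),
    "zero pounds": try_insert((4, 1, 10.0, 0.0)),
}
```

Only the first row is accepted; the database itself refuses the other three.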
integrity constraints
rules that help ensure the quality of information. The database design needs to consider integrity constraints. The two types are (1) relational integrity constraints and (2) business-critical integrity constraints.
entity
stores information about a person, place, thing, transaction, or event. Examples include customer, distributor, and product.
relational database model
stores information in the form of logically related two-dimensional tables. Managers need to query or search for the answers to business questions such as which artist sold the most albums during a certain month.
master data management
the practice of gathering data and ensuring that it is uniform, accurate, consistent, and complete, including such entities as customers, suppliers, products, sales, employees, and other critical entities that are commonly integrated across organizational systems
data mining
the process of analyzing data to extract information not offered by the raw data alone. Data mining can also begin at a summary information level (coarse granularity) and progress through increasing levels of detail (drilling down) or the reverse (drilling up). Companies use data-mining techniques to compile a complete picture of their operations, all within a single view, allowing them to identify trends and improve forecasts.
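Drilling down and drilling up can be pictured as summarizing the same records at different levels of granularity. The sales records and level names below are made up for illustration.

```python
from collections import defaultdict

# Hypothetical sales records.
sales = [
    {"region": "West", "store": "Denver", "amount": 100},
    {"region": "West", "store": "Boulder", "amount": 50},
    {"region": "East", "store": "Boston", "amount": 70},
    {"region": "West", "store": "Denver", "amount": 30},
]

def summarize(records, *levels):
    """Total the amounts, grouped by the given levels of detail."""
    totals = defaultdict(int)
    for rec in records:
        key = tuple(rec[level] for level in levels)
        totals[key] += rec["amount"]
    return dict(totals)

by_region = summarize(sales, "region")           # coarse granularity (summary)
by_store = summarize(sales, "region", "store")   # drilling down to more detail
```

Going from `by_region` to `by_store` is drilling down; going the other way is drilling up.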
data element/data field
the smallest or basic unit of information. Data elements can include a customer's name, address, email, discount rate, preferred shipping method, product name, quantity ordered, and so on
business intelligence dashboards
track corporate metrics such as critical success factors and key performance indicators and include advanced capabilities such as interactive controls, allowing users to manipulate data for analysis.
data mining tools
use a variety of techniques to find patterns and relationships in large volumes of information that predict future behavior and guide decision making. Data mining enables companies to determine relationships among such internal factors as price, product positioning, or staff skills, and external factors such as economic indicators, competition, and customer demographics. With data mining, a retailer could use point-of-sale records of customer purchases to send targeted promotions based on an individual's purchase history. By mining demographic data from comment or warranty cards, the retailer could develop products and promotions to appeal to specific customer segments.
data mining tools
1) classification: assign records to one of a predefined set of classes; 2) estimation: determine values for an unknown continuous variable behavior or estimated future value; 3) affinity grouping: determine which things go together; 4) clustering: segment a heterogeneous population of records into a number of more homogeneous subgroups
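As one illustration of these techniques, affinity grouping ("which things go together") can be sketched by counting how often pairs of products appear in the same purchase. The baskets are invented, and simple pair counting is a simplification of what real data mining tools do.

```python
from collections import Counter
from itertools import combinations

# Hypothetical point-of-sale baskets.
baskets = [
    {"chips", "salsa", "soda"},
    {"chips", "salsa"},
    {"bread", "butter"},
    {"chips", "dip", "salsa"},
]

# Count every pair of items bought together in the same basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

most_common_pair, times_together = pair_counts.most_common(1)[0]
```

Here chips and salsa appear together most often, which a retailer might use to place or promote the two items together.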