ITSCM 280, Exam 3 Study Guide

Réussis tes devoirs et examens dès maintenant avec Quizwiz!

For example, the following expression uses the AVERAGEIF function to calculate the sum of values in the range A1:A100, but only those cells whose value is greater than 50:

AVERAGEIF(A1:A100, ">50")

To calculate the average of values for cells that satisfy specified criteria, use:

AVERAGEIF(range, criteria, [average_range])

The syntax of the AVERAGEIFS function is:

AVERAGEIFS(average_range, range1, criteria1, [range2], [criteria2], ...)

What are the nine characteristics of quality information?

Accessible, Accurate, Complete, Economical, Relevant, Reliable, Secure, Timely, Verifiable

Accurate Information

Accurate information is error free. In some cases, inaccurate information is generated because inaccurate data is fed into the transformation process from data to information. This is commonly called garbage in, garbage out.

Solver

An Excel add-in that searches for the optimal solution to a problem involving several variables.

Array formula

An Excel formula that performs multiple calculations within a single step.

AVERAGEIFS function

An Excel function that calculates the average of values within a range that meet multiple criteria.

For example, the following expression counts the number of cells in the range D5:D789 whose value equals "Facebook":

COUNTIF(D5:D789, "Facebook")

To count the number of cells within a range that satisfy specified criteria, use:

COUNTIF(range, criteria)

The syntax of the COUNTIFS function is:

COUNTIFS(range1, criteria1, [range2], [criteria2], ...)

AVERAGEIF function

Calculates the average of values for cells that match a specified criteria.

SUMIF function

Calculates the sum of values for cells that match a specified criteria.

Changing cells

Cells whose values will change within each scenario

If you can not see your PivotTable menu what do you do? (EX7-31)

Click on top of your PivotTable. Once you click off the table, the menu will disappear.

Joining:

Combining two or more tables through common data attributes to form a new table with only the unique data attributes. - This is the true key and power of relational databases

Complete Information

Complete information contains all the important facts. For example, an investment report that does not include all important costs is not complete.

Nodes

Computers that perform one or more types of tasks within the system

COUNTIF function

Counts the number of cells that match a specified criteria.

Scenerio Manager

Creates scenarios of several what-if analyses within the worksheet.

Variable expenses

Expenses that change in proportion to the volume of production. In other words: (An expense that the per each doesn't change but the total overall value will go up or down depending on sales.)

Mixed expense

Expenses which are part variable and part fixed.

Components Required for Effective BI and Analytics:

For an organization to get real value form its BI.... it needs... - Existence of a solid data management program, including data governance --- Data governance defines the roles, responsibilities, and processes for ensuring that data can be trusted and used by the entire organization - Creative data scientists that understand the company - Management team --- Must have a strong commitment to data-driven decision making

NoSQL database

Provides a means to store and retrieve data that is modeled using some means other than the simple two-dimensional tabular relations used in relational databases

Data

Raw facts such as an employee number or total hours worked in a week.

Conditional average

Refers to the AVERAGEIF function, which averages the values in a range that match criteria you specify.

Conditional count

Refers to the COUNTIF function, which calculates the number of cells in a range that match criteria you specify.

Conditional sum

Refers to the SUMIF function, which adds the values in a range that match criteria you specify.

Relevant Information

Relevant information is important to the decision maker. Information showing that lumber prices might drop is probably not relevant to a computer chip manufacturer.

Reliable Information

Reliable information can be trusted by users. In many cases, the reliability of the information depends on the reliability of the data-collection method. In other instances, reliability depends on the source of the information. A rumor from an unknown source that oil prices might go up may not be reliable.

By default Values are summarized by which function? (EX7-31)

SUM function

For example, the following expression uses the SUMIF function to calculate the sum of values in the range A1:A100, but only those cells whose value is greater than 50:

SUMIF(A1:A100, ">50")

To calculate the sum of values for cells that satisfy specified criteria, use:

SUMIF(range, criteria, [sum_range])

The syntax of the SUMIFS function is:

SUMIFS(sum_range, range1, criteria1, [range2], [criteria2], ...)

Add-in

Software that adds commands and features to Microsoft Office applications.

Data lake

Takes a "store everything" approach to big data, saving all the data in its raw and unaltered form. - Also called an enterprise data hub - Raw data is available when users decide just how they want to use the data - Only when the data is accessed for a specific analysis is it extracted from the data lake

ETL (extraction, transformation, and loading) process

Takes data from a variety of sources, edits and transforms it into the format used in the data warehouse, then loads this data into the warehouse. In other words: (A data handling process that takes data from a variety of sources, edits and transforms it into the format used in the data warehouse, and then loads this data into the warehouse.) - This process is essential in ensuring the quality of the data.

Genetic algorithm

Technique that employs a natural selection-like process to find approximate solutions to optimization and search problems. In other words: (An approach to solving problems based on the theory of evolution; uses the concept of survival of the fittest to find approximate solutions to optimization and search problems.)

Knowledge

The awareness and understanding of a set of information and the ways that information can be made useful to support a specific task or reach a decision.

Result cell are:

The cells that contain the outcome of formulas involving input cells.

Input cells are:

The cells that contain values that are used in formulas of a what-if analysis

Business Analytics (BA)

The extensive use of data and quantitative analysis to support fact-based decision making within organizations

Analytics

The extensive use of data and quantitative analysis to support fact-based decision making within organizations.

Value field

The field containing a summary statistic in the PivotTable.

Moving a PivotTable:

You can move any PivotTable to a new location in the workbook. To move a PivotTable within its current worksheet, select the entire PivotTable range and then drag the table to a new location, being careful not to overwrite other content in the process. To move a PivotTable to a different worksheet, click anywhere within the PivotTable to select it, and then click the Move PivotTable button in the Actions group on the PivotTable Tools Analyze tab. The Move PivotTable dialog box opens so you can choose the new location for the PivotTable. You can create a new worksheet for the PivotTable or choose an existing worksheet in the workbook. You can also select the entire PivotTable and then use the Cut and Paste buttons in the Clipboard group on the Home tab to move the PivotTable within the workbook.

True or False: the field order in the ROWS area determines the PivotTable organization? (EX7-33)

True

True or False; Solver answer report can be displayed with subtotals.

True

Optimization

Used every day in an organization to allocate scare resources in a such a manner as to minimize costs and maximize profits.

HLOOKUP function

Uses row_index_num instead of column num. Horizontal Lookup across in rows

For vertical lookup tables, use the function:

VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup=TRUE])

To perform an approximate match lookup with a vertical lookup table, you use the same VLOOKUP function used for exact match lookups:

VLOOKUP(lookup_value, table_array, col_index_num, [range_lookup=TRUE])

What do you use to change the formatting of a field inside a PivotTable?

Value Field Settings

Result values are:

Values in a data table that come from formulas applied to one or more input value.

Example of Data Warehouse

Walmart uses separate warehouse for Walmart and Sam's club, UPS has 16.3 million packages per day, GE uses one for data from sensors on jet engines

Y = a + b(x)

Y = is the overall total A = total fixed cost B = per each variable cost X = level of activity

Slicers are available in Excel tables. Can you insert a Slicer in a PivotTable? (EX7-48)

YES! and they are awesome!

Two-variable data table

contains a reference to the result cell at the intersection of the first row and column. Let's you view the relationship between two input cells and one result cell.

DBMS is responsible for:

providing access to a database

Data consists of:

raw facts (Such as employee number, daily hours worked, inventory part number)

Another way of filtering PivotTable and PivotCharts is with a __________.

slicer

Concurrency control deals with:

the situation in which two or more users or applications need to access the same record at the same time

The goal of Business Intelligence (BI) is:

to get the most value out of information and present the results of analysis in an easy to understand manner.

Self-service analytics

- Includes training, techniques, and processes that empower end users to work independently to access data from approved sources to perform their own analyses using an endorsed set of tools - Encourages nontechnical users to make decisions based on facts and analyses rather than intuition

What are some of the characteristics of data warehouse?

- Large - Multiple sources - Historical - Cross organizational access and analysis - Supports various types of analyses and reporting

A database management system (DBMS) is a group of application programs that:

- Manipulate the database - Provide an interface between the database and its users and other application programs

Activating Solver

- On the Data tab, confirm whether Solver appears in the Analyze group. If it appears, Solver is already active. If it does not appear, continue with these steps. - On the ribbon, click the File tab, and then click Options in the navigation bar. - Click Add-ins in the left pane, click the Manage arrow, and then click Excel Add-ins. - Click Go to open the Add-Ins dialog box. - Click the Solver Add-in check box, and then click OK. - Follow the remaining prompts to install Solver, if it is not already installed.

Setting Solver's Objective and Variable Cells

- On the Data tab, in the Analyze group, click the Solver button. - In the Set Objective box, specify the cell whose value you want to set to match a specific objective. - Click the Max, Min, or Value Of option button to maximize the objective cell, minimize the objective cell, or set the objective cell to a specified value, respectively. - In the By Changing Variable Cells input box, specify the changing cells.

Creating a Scenario Summary Report or a Scenario PivotTable Report

- On the Data tab, in the Forecast group, click the What-If Analysis button, and then click Scenario Manager. - Click Summary. - Click the Scenario summary or Scenario PivotTable report option button. - Select the result cells to display in the report. - Click OK.

To insert a PivotChart:

- On the Insert tab, in the Charts group, click the PivotChart button. - Specify whether to insert the PivotChart in a new worksheet or at a cell location within an existing worksheet. - Drag fields from the field list on the PivotChart Fields pane and drop them on the Filters, Legend (Series), Axis (Categories), or Values area boxes.

Saving and Loading a Solver Model

- Open the Solver Parameters dialog box. - Click Load/Save. - Select an empty range containing the number of cells specified in the dialog box, and then click Save. - Select the range containing the saved model, and then click Load.

Big Data Uses, Examples:

- Retail organizations monitor social networks to engage brand advocates, identify brand adversaries - Advertising and marketing agencies track comments on social media - Hospitals analyze medical data and patient records - Consumer product companies monitor social networks to gain insight into consumer behavior - Financial service organizations use data to identify customers who are likely to be attracted to increasingly targeted and sophisticated offers

Every PivotTable includes the following four primary areas:

- Rows area —displays category values from one or more fields arranged in separate rows - Columns area —displays categories from one or more fields arranged in separate columns - Values area —displays summary statistics for one or more fields at each intersection of each row and column category - Filters area —contains a filter button that limits the PivotTable to only those values matching specified criteria

To add a PivotChart to an existing PivotTable:

- Select a PivotTable from the workbook. - On the PivotTable Tools Analyze tab, in the Tools group, click the PivotChart button. - Select the PivotChart chart type. - If necessary, move the PivotChart to a different worksheet than the PivotTable.

Data scientists are individuals who combine:

- Strong business insight - A deep understanding of analytics - A healthy appreciation of the limitations of their data, tools, and techniques to deliver real improvements

Extremely large and complex data collections

- Terabytes or more - Traditional data management software, hardware, and analysis processes are incapable of dealing with them

Primary Key

A field/attribute or set of fields/attributes that uniquely identifies the record. - Must be unique and can not be null

A two-variable data table lets you:

view the relationship between two input cells and one result cell.

Veracity

Key characteristics of big data: measure of the quality of the data

Velocity

Key characteristics of big data: rate of which new data is being generated

Value

Key characteristics of big data: worth of the data in decision making

What are the four main categories of NoSQL?

Key-value, column, document, and graph.

The syntax of the MATCH function is:

MATCH(lookup_value, lookup_array, [match_type=1])

The syntax of the MAXIFS function is:

MAXIFS(max_range, range1, criteria1, [range2], [criteria2], ...)

The syntax of the MINIFS function is:

MINIFS(min_range, range1, criteria1, [range2], [criteria2], ...)

Projecting:

Manipulating data to eliminating columns in a table. - Think of this as pulling out just the fields that you would like

Selecting:

Manipulating data to eliminating rows according to certain criteria. - Think of this as filtering the records through your data

Write Once Read Many (WORM)

Model simplifies concurrency issues and improves data throughput. A file is created written into the system is then closed. No changes can be made.

Mike chooses a DaaS arrangement for his small business because it reduces costs. Why is DaaS less costly than the traditional alternative?

No in-house database installation, maintenance, or monitoring.

For example, the following expression returns a value of TRUE if cell A1 equal 2 or if cell B1 equals 4 or if cell C1 equals 10:

OR(A1=2, B1=4, C1=10)

To return a true value if any one of multiple conditions are true, use:

OR(logical1, [logical2], [logical3], ...)

Steaming access

Optimized for batch processing of entire files as a continuous stream of data.

Scenario analysis

Process for predicting future value based on certain potential events (our Module #8 - session #2)

ACID properties

Properties (atomicity, consistency, isolation, durability) that guarantee relational database transactions are processed reliably and ensure the integrity of data in the database.

Example of in-memory database

Japanese telecommunications company KDDI for mobile service - 40 million customers

SQL databases conform to ACID properties:

- Atomicity - Consistency - Isolation - Durability

In-memory database (IMDB)

- A database management system that stores the entire database in random access memory (RAM) - Provides access to data at rates much faster than storing data on some form of secondary storage - Enables the analysis of big data and other challenging data-processing applications - Performs best on multiple multicore CPUs

The following key assumptions must be satisfied when using linear regression on a set of data:

- A linear relationship between the independent (X) and dependent (Y) variables must exist - Errors in the prediction of the value of Y are distributed in a manner that approaches the normal distribution curve - Errors in the prediction of the value of Y are all independent of one another

Regression analysis

- A mathematical technique for predicting the value of a dependent variable based on a single independent variable and the linear relationship between the two - Consists of finding the best-fitting straight line through a set of observations of the dependent and independent variables In other words: (A method for determining the relationship between a dependent variable and one or more independent variables.)

Text analysis

- A process for extracting value from large quantities of unstructured text data such as consumer comments, social media postings, review - Used to recognize patterns, sentiment analysis

Monte Carlo Simulation

- A simulation that enables you to see a spectrum of thousands of possible outcomes, considering not only the many variable involved, but also the range of potential values for each of those variables. - Example: financial planning retiree's nest egg

Linear programming

- A technique for finding the optimum value (largest or smallest) of a linear expression that is calculated based on a the value of a set of decision variables that are subject to a set of constraints. - Example: Microsoft Excel add-in Solver (our Module #8 - session 8.3)

Difference from SQL:

- Ability to spread data over multiple servers so that each server contains only a subset of the total data - Can handle a variety of different data found in Big Data - Do not require a predefined schema - Data structures are more flexible and can provide improved access speed and redundancy - NoSQL does not conform to true ACID properties when processing transactions

What are the advantages of NoSQL databases?

- Ability to spread data over multiple servers so that each server contains only a subset of the total data - Do not require a predefined schema - Data structures are more flexible and can provide improved access speed and redundancy

Without data and the ability to process it:

- An organization could not successfully complete most business activities - Provides timely, accurate, and relevant information build on the data

Binding constraint

A constraint that must be included in the Solver model and is a limiting factor in arriving at a solution.

Nonbinding constraint

A constraint that needs not be included in the Solver model and is not a limiting factor in arriving at the solution.

Scenerio

A defined collection of changing cells used to perform a what-if analysis.

Examples showing how data mining can be used:

- Based on past responses to promotional mailings, identify those consumers most likely to take advantage of future mailings - Examine retail sales data to identify seemingly unrelated products that are frequently purchased together - Monitor credit card transactions to identify likely fraudulent requests for authorization - Use hotel booking data to adjust room rates so as to maximize revenue - Analyze demographic data and behavior data about potential customers to identify those who would be the most profitable customers to recruit - Study demographic data and the characteristics of an organization's most valuable employees to help focus future recruiting efforts - Recognize how changes in an individual's DNA sequence affect the risk of developing common diseases such as Alzheimer's or cancer

To Filter a PivotTable with a Timeline Slicer:

- Click anywhere in the PivotTable to select it. - On the Insert tab, in the Filters group, click the Timeline button. - Click the check boxes of the fields contains date values which you want to create timelines. - Click OK. - Format the size, position, and appearance of the timeline. - Select intervals within the timeline to filter the PivotTable.

To create a PivotTable:

- Click anywhere within a data range or Excel table. - On the Insert tab, in the Tables group, click the PivotTable button; or on the Table Tools Design tab, in the Tools group, click the Summarize with PivotTable button. - Specify whether to insert the PivotTable in a new worksheet or at a cell location within an existing worksheet. - Click OK. - Drag fields from the field list on the PivotTable Fields pane and drop them on the Filters, Columns, Rows, or Values area boxes.

To apply a Slicer to Multiple PivotTables:

- Click the PivotTable slicer to select it. - On the Slicer Tools Options tab, in the Slicer group, click the Report Connections button. - Click the check boxes for all the PivotTables to be associated with the slicer. - Click OK.

Considerations when building a database:

- Content: what data should be collected? cost? - Access: what data should be provided to which users and when? - Logical structure: how should data be arranged so that it makes sense? - Physical organization: where should data be physically located? - Response time: how quickly must the data be updated and retrieved so it can be viewed by the users? - Archiving: how long to store? - Security: how can data be protected?

Some of the fundamental characteristics of relational models include:

- Data is organized into relations - Rows represent entities and columns represent attributes - Rows uniquely identified by a primary key - Column table data -- Integer number, decimal number, date, text, etc. -- Constrained to be certain type, length, or to have a value between two limits - Primary and foreign keys enable table relationships - User queries perform operations on the database

Database as a service (DAAS):

- Database stored on a service provider's servers - Database accessed by service subscriber over the Internet - Database administration handled by the service provider - DaaS advantage: -- Eliminates the installation, maintenance, and monitoring of in-house databases In other words: (An arrangement where the database is stored on a service provider's servers and accessed by the service subscriber over a network, typically the Internet, with the database administration handled by the service provider.)

Traditional approach to data management

- Each distinct operational system used data files dedicated to that system - Separate systems which would lead to redundant data

Defining a Scenerio

- Enter the data values in the worksheet for the scenario. - On the Data tab, in the Forecast group, click the What-If Analysis button, and then click Scenario Manager. - Click Add in the Scenario Manager dialog box. - In the Scenario name box, type a name for the scenario. - In the Changing cells box, specify the changing cells. - Click OK. - In the Scenario Values dialog box, specify values for each input cell, and then click Add. - Click OK.

Business analytics can be used to:

- Gain a better understanding of current business performance - Reveal new business patterns and relationships - Explain why certain results occurred - Optimize current operations - Forecast future business results

Using self-service analytics, users can:

- Gather insights - Analyze trends - Uncover opportunities and issues - Accelerate decision making by rapidly creating reports, charts, dashboards, and documents

What are some of the challenges with big data?

- How to choose what subset of the data to store? - Where and how to store the data? - How to find the nuggets of data that are relevant to the decision making at hand? - How to derive value from the relevant data? - How to identify which data needs to be protected from unauthorized access?

Setting Constraints on the Solver Solution

- In the Solver Parameters dialog box, click Add. - Enter the cell reference of the cell or cells containing the constraint. - Select the constraint type (<=, =, >=, int, bin, or dif). - Enter the constraint value in the Constraint box. - Click OK to add the constraint and return to the Solver Parameters dialog box. - Repeat for each constraint you want to add to the model.

Creating a One-Variable Data Table

- In the upper-left cell of the table, enter a formula that references the input cell. - In either the first row or the first column of the table, enter input values. - For input values in the first row, enter formulas referencing result cells in the table's first column; for input values in the first column, enter formulas referencing result cells in the table's first row. - Select the table (excluding any row or column headings). - On the Data tab, in the Forecast group, click the What-If Analysis button, and then click Data Table. - If the input values are in the first row, enter the cell reference to the input cell in the Row input cell box; if the input values are in the first column, enter the cell reference to the input cell in the Column input cell box. - Click OK.

Creating a Two-Variable Data Table

- In the upper-left cell of the table, enter a formula that references the result cell. - In the first row and first column of the table, enter input values. - Select the table (excluding any row or column headings). - On the Data tab, in the Forecast group, click the What-If Analysis button, and then click Data Table. - Enter the cell reference to the first row input values in the Row input cell box, and then enter the cell reference to the first column input values in the Column input cell box. - Click OK.

Data dictionary

A detailed description of the data stored in the database

A PivotChart has four primary areas:

- The Axis (Category) area displays categories that each data series is plotted against. - The Legend (Series) area breaks up the data values into separate data series. - The Values area contains the data values that are plotted on the PivotChart. - The Filters area contains a filter button that limits the PivotChart to only those values satisfying specified criteria.

Video analysis

- The process of obtaining information or insights from video footage. - Identify trends ad patterns - Example: airports security for congestion of passengers

Security management ensures data is protected against access by:

- Unauthorized users - Physical damage - Operating system failure - Simultaneous updating

What are the five characteristics of big data?

- Volume - Velocity (speed) - Value - Variety - Veracity (quality)

Data node communicates with name node and send back:

- block reports every 6 hours - informs the name node of which blocks are on that data node - and heartbeats are sent every 3 seconds which let the name node know that the data node is still available.

As with the VLOOKUP function, searches can be done as exact matches or approximate matches depending on the following match_type values:

- match_type = 1—(default ) An approximate match lookup that finds the largest value less than or equal to lookup_value. The values must be arranged in ascending order. - match_type = 0—An exact match lookup that finds an exact match to lookup_value. - match_type = −1—An approximate match lookup that finds the smallest value greater than or equal to lookup_value. The values must be arranged in descending order.

Database functions include:

1) Provide a user view of the database 2) Add and modify data 3) Store and retrieve data 4) Manipulate the data and generate reports 5) Provide security management 6) Provide database backup and recovery services

The three most commonly used data mining techniques are:

1.) Association analysis: a specialized set of algorithms sorts through data and forms statistical rules about relationships among the items 2.) Neutral computing: historical data is examined for patterns that are then used to make predictions 3.) Case-based reasoning: historical if-then-else cases are used to recognize patterns

What are the three types of nodes?

1.) Data node store the actual file data 2.) Name node contains file system metadata 3.) Client node makes requests to the file system as needed to support user applications

What are the two most important components of Hadoop?

1.) Hadoop Distributed File System (HDFS): low-level distributed file processing system that can be used directly for data storage 2.) MapReduce: programming model that supports processing large data sets

Data model

A diagram of data entities and their relationships - Understanding of specific business problem

Benefits gained through use of high-quality data

1.) Improves decision making - Using high-quality data in decision making - Removes guesswork and risk taking 2.) Increases customer satisfaction - Leads to high customer satisfaction - Bad data causes unfavorable data errors 3.) Increases sales - Provides accurate consumer targeting and communications - Enables successful up-sell and cross-sell suggestions 4.) Improves innovation - Improves worker efficiency, product and/or service quality, or the customer experience 5.) Raises productivity - Employees can focus on the core mission 6.) Ensures compliance - Ensures compliance with requirements - Example: HIPAA,

Approximate match lookups can reference a vertical or horizontal lookup table. A vertical lookup table for approximate match lookups must be organized as follows:

1.) The first column of the lookup table stores the values that will be compared to the lookup value. 2.) The values in the first column must be sorted in ascending order. 3.) The first entry in first column contains the lowest possible comparison value. 4.) Each subsequent entry in the first column contains the lower end of the range that the lookup value is compared to.

Successful social media marketing is all about getting the user engaged. An engaged user is more likely to turn into a loyal customer. There are four general types of social media engagement with increasing levels of user involvement:

1.) The user has viewed the social media post. 2.) The user has indicated approval or "like" of the post. 3.) The user has taken the time and effort to comment on the post. 4.) The user has shared the post with others.

Data mining

A BI analytics tool used to explore large amounts of data for hidden patterns to predict future trends and behaviors for use in decision making

Evolutionary method

A Solver method used for complicated expressions involving discontinuous functions.

GRG Nonlinear method

A Solver method used for complicated expressions involving nonlinear functions.

Simplex LP method

A Solver method used for simple linear expressions.

Answer report

A Solver report summarizing the results of a successful solution.

Sensitivity/ Limits report

A Solver report that displays the mathematical aspects of a successful solution.

Cost-Volume-Profit (CVP) Analysis

A branch of financial analysis that studies the relationship between expenses, sales volume, and profitability.

Information

A collection of data organized and processed so that it has additional value beyond the value of the individual facts.

Data Definition Language (DDL)

A collection of instructions and commands used to define and describe data and relationships in a specific database

Record

A collection of related data fields (row) In other words: (A collection of attributes about a specific entity.)

File

A collection of similar entities.

MapReduce program

A composite program that consists of a Map procedure that performs filtering and sorting and a Reduce method that performs a summary operation.

A constraint is:

A condition that limits the solution to a set of possible values.

MAXIFS function

A function that calculates the maximum value in a range that meets multiple criteria.

MINIFS function

A function that calculates the minimum value in a range that meets multiple criteria.

COUNTIFS function

A function that counts the number of cells within a range that meet multiple criteria.

INDEX function

A function that returns the value from a data range at the intersection of a specific row and column.

IFS function

A function that tests for multiple conditions without nesting.

A PivotChart is:

A graphical representation of PivotTable. Like PivotTables, PivotCharts include interactive tools that you can use to easily add and remove fields from different sections of the chart. This lets you explore data from a wide variety of viewpoints.

Conversation funnel

A graphical representation that summarizes the steps a consumer takes in making the decision to buy a product and become a customer

Database Management System (DBMS)

A group of programs used to access and manage a database as well as provide an interface between the database and its users and other application programs.

Data warehouse

A large database that collects business information from many sources in the enterprise in support of management decision making. In other words: (A large database that holds business information from many sources in the enterprise, covering all aspects of the company's processes, products, and customers.)

OR function

A logical function that returns TRUE if either part of the condition is true. In other words: (A logical function that returns a TRUE value if ANY of the logical conditions are true and a FALSE value if NONE of the logical conditions are true.)

AND function

A logical function that returns the word TRUE if both arguments are true. In other words: (A logical function that returns a TRUE value if ALL of the logical conditions are true and a FALSE value if any of the logical conditions are false.)

An approximate match lookup is:

A lookup in which the lookup value is compared to ranges of values rather than a single, specific value. In other words: (Uses the value of true in the logic of the syntax of the function or leave the Range_lookup blank.)

Field/Attribute

A name, number, or combination of characters that describes an aspect of a business object or activity (column) In other words: (A characteristic of an entity.)

Local optimum

A nearby highpoint or local solution to a Solver problem.

Entity

A person, place, or thing for which data is collected, stored, and maintained. (tables)

Data Lifecycle Management (DLM)

A policy-based approach to managing the flow of an enterprise's data, from its initial acquisition or creation and storage to the time when it becomes outdated and is deleted.

Descriptive Analysis:

A preliminary data processing stage used to identify patterns in the data and answer questions about who, what, where, when and to what extent. 1.) Visual Analytics - Word Cloud - Conversion Funnel 2.) Regression Analysis

Iterative process

A process used by Solver that starts with an initial solution that it uses as a basis to calculate new solutions.

Data table

A range of cells that shows the resulting values when one or more input values are varied in a formula; in a chart, a grid containing the chart data. In other words: (An Excel table that displays the results from several what-if analyses. Consists of input and result cells.)

An objective cell is:

A result cell that is maximized, minimized, or set to a specific value.

Predictive Analysis:

A set of techniques used to analyze current data to identify future probabilities and trends, as well make predictions about the future 1.) Time Series Analysis 2.) Data Mining

Relational Database Model

A simple but highly useful way to organize data into collections of two-dimensional tables called relations

Cross-Industry Process for Data Mining (CRISP-DM)

A six-phase structured approach for the planning and execution of a data mining project

Database Administrator (DBA)

A skilled and trained IS professional who holds discussions with business users to define their data needs; applies database programming languages to craft a set of databases to meet those needs; tests and evaluates databases; implements changes to improve the performance of databases; and assures that data is secure from unauthorized access.

Structured Query Language (SQL)

A special-purpose programming language for accessing and manipulating data stored in a relational database. - Originally, from IBM - "Structured English Query Language"

Data Manipulation Language (DML)

A specific language, provided with a DBMS. - Allows users to access and modify the data, to make queries, and to generate reports

Data mart

A subset of a data warehouse that is used by small- and medium-sized businesses and departments within large companies to support decision making. - A specific area in the data mart might contain greater detailed data than the data warehouse

Hadoop Distributed File System (HDFS)

A system used for data storage that divides the data into subsets and distributes the subsets onto different servers for processing.

Query by Example (QBE)

A visual approach to developing database queries or requests

Word cloud

A visual depiction of a set of words that have been grouped together because of the frequency of their occurrence

Database

A well designed, organized, and carefully managed collection of data. In other words: (a collection of integrated and related files (tables))

For example, the following expression returns the value TRUE only if A1 equals 2 and B1 equals 4 and C1 equals 10:

AND(A1=2, B1=4, C1=10)

To return a true value only if all conditions are true, use

AND(logical1, [logical2], [logical3], ...)

Database approach to data management

An approach to data management where multiple information systems share a pool of related data. - Information systems share a pool of related data - Offers the ability to share data and information resources - A database management system (DBMS) is required - Meaning... need an interface between the database and the user

Foreign Key

An attribute in one table that refers to the primary key in another table - This is the power of a database on how the tables "talk" to each other

Data Steward

An individual responsible for the management of critical data elements, including identifying and acquiring new data sources; creating and maintaining consistent reference data and master data definitions; and analyzing data for quality and reconciling data issues.

Data scientists:

An individual who combines strong business acumen, a deep understanding of analytics, and a healthy appreciation of the limitations of data, tools, and techniques to deliver real improvements in decision making. - View a situation from many angles - Determine what data and tools can help further an understanding of the situation - Often work in a team setting with business managers and specialists - Are highly inquisitive

A variable cell is:

An input cell that changes so that the objective cell can meet its defined goal.

Data management

An integrated set of functions that defines the processes by which data is obtained, certified fit for use, stored, secured, and processed in such a way as to ensure that the accessibility, reliability, and timeliness of the data meet the needs of the data users within an organization.

A PivotTable is:

An interactive table that groups and summarizes data in a concise tabular format. In other words: (An interactive table that summarizes data from a data range or an Excel table in a concise tabular format. Key word is summarize.)

Hadoop

An open-source software framework including several software modules that provide a means for storing and processing extremely large data sets. De facto standard for most Big Data storage and processing - Java-based framework for distributing and processing very large data sets across clusters of computers

Break-even analysis

Another term for Cost-Volume-Profit (CVP) analysis.

Benefits achieved from BI and Analytics:

BI and analytics are used to achieve a number of benefits: - Detect fraud: MetLife example - Improve forecasting: Kroger example - Increase sales: Auto manufacturers - Optimize operations: Chevron mix of products - Reduce costs: Coca-Cola

Enterprise data modeling

Data modeling done at the level of the entire enterprise. - Needs of the whole company, roadmap for building the information system In other words: (A data model that identifies the data entities and data attributes of greatest interest to the organization along with their associated standard data definitions, data length and format, domain of valid values, and any business rules for their use.)

Entity Relationship (ER) Diagram

Data models that use basic graphical symbols to show the organization of and relationships between data. - Like a flow chart - visually seeing the flow of information

High volume

Default block sizes is 64 MB and can be configured to even larger values.

Data governance

Defines the roles, responsibilities, and processes for ensuring that data can be trusted and used by an entire organization. In other words: (The core component of data management; it defines the roles, responsibilities, and processes for ensuring that data can be trusted and used by the entire organization, with people identified and in place who are responsible for fixing and preventing issues with data.)

Scenarios can be displayed in what two reports?

Excel Table or PivotTable

Variety

Key characteristics of big data: data comes in a variety of formats

Schema

Description that defines the logical and physical structure of the database by identifying the tables, attributes in each table, and the relationships between attributes and tables.

Fault tolerance

Designed to replicate data across many different devices so that when one fails, data is still available from another device with the idea that hardware errors will be experienced.

To drill down a PivotTable:

Double-click any cell within the table. Excel then opens a new sheet displaying the records used to calculate that PivotTable value.

For horizontal lookup tables, use the function:

HLOOKUP(lookup_value, table_array, col_index_num, [range_lookup=TRUE])

To test one condition against two possible outcomes, use:

IF(logical_test, value_if_true, [value_if_false])

To test for multiple possible outcomes, use:

IF(logical_test1, value_if_true1, logical_test2, value_if_true2, ...)

Choosing a Recommended PivotTable:

If you're not sure what to include in a PivotTable or how to structure it, you can use the Recommend PivotTables tool. To use the tool, select any cell within a data range or Excel table, and then click the Recommended PivotTables button in the Tables group on the Insert tab. A gallery of PivotTable layouts suitable for that data opens. Choose the one you find most useful and relevant. The PivotTable is inserted on a new sheet in the workbook using the layout you selected.

SUMIFS function

In Excel, a function that adds values in a range that meet multiple criteria.

MATCH function

In Excel, the function that returns the position number of an item in a range or table.

Drilling down

In Excel, the process in data analysis in which the data view goes from the most general to the most specific. In other words: (From summary data to displaying the detail records responsible for the summary calculation.)

Slack

In Solver, the difference between the value in the cell and the value at the limit of the constraint, showing how close the constraint came to be a binding constraint.

Simulation

Involves using a model expressed in the form of a computer program to emulate the dynamic responses of a real-world system to various inputs.

Choosing a Report Layout:

In addition to changing the structure of a PivotTable by moving fields into the different areas, you can change the overall table layout. PivotTables have possible three report layouts: - Compact Form —(the default layout) places all fields from the Rows area in a single worksheet columns and indents values to distinguish nested fields from other fields. - Outline Form —places each field in the Rows area in its own column and includes subtotals above every field category group. - Tabular Form —places each field in the Rows area in its own column and includes subtotals below every group. To switch between PivotTable layouts, click the Report Layout button from the Layout group on the PivotTable Tools Design tab. Choose the PivotTable layout that presents your data in the informative and effective format.

Text and Video analysis

Involves various techniques to view text and video to glean insights and data relevant to decision making

Time series analysis

Is the use of statistical methods to analyze time series data and extract meaningful statistics and characteristics about the data. - Sequence of well-defined data points measured at uniform time intervals over a long period of time. - Used for forecasting and monitoring - Example: hourly temperature in Malibu dating back to 1976

Fixed expense

In financial analysis, an expense that must be paid regardless of sales volume. In other words: (An expense that doesn't change in total as more or less units are produced or sold. Example: manager's salary)

Business Intelligence (BI)

Includes a wide range of applications, practices, and technologies for the extraction, transformation, integration, visualization, analysis interpretation, and presentation of data to support improved decision making

Economical Information

Information should also be relatively economical to produce. Decision makers must always balance the value of information with the cost of producing it.

Accessible Information

Information should be easily accessible by authorized users so they can obtain it in the right format and at the right time to meet their needs.

Secure Information

Information should be secure from access by unauthorized users.

Verifiable Information

Information should be verifiable. This means that you can check it to make sure it is correct, perhaps by checking many sources for the same information.

Volume

Key characteristics of big data: huge size of data

Global optimum

The overall high point or optimal solution to a Solver problem.

Break-even point

The point where total revenue equals total expenses. Net Income would be zero.

Visual Analytics

The presentation of data in a pictorial or graphical format.

Data cleansing/scrubbing

The process of detecting and then correcting or deleting incomplete, incorrect, inaccurate, irrelevant records that reside in a database (compact & repair).

Data Normalization

The process of organizing the data in a relational database to eliminate data redundancy (all data is stored in only one place) and to ensure the data dependencies make sense (only storing related data in a table).

Domain

The range of allowable values for a data attribute.

Data item

The specific value of an attribute - Found in the field for that one record

Big Data

The term used to describe data collections that are so enormous (terabytes or more) and complex (from sensor data to social media data) that traditional data management software, hardware, and analysis processes are incapable of dealing with them.

Timely Information

Timely information is delivered when it is needed. Knowing last week's weather conditions will not help when trying to decide what coat to wear today.

Refresh

To update a PivotTable so it reflects changes to the underlying data.

=COUNTA(Media[Site]) What does site mean? (EX7-18)

[Site] is a structured reference referring to a column or field name.

A DBMS can produce:

a wide variety of documents, reports, and other output that can help organizations achieve their goals

The process of defining relationships among data to create useful information requires:

knowledge (Which is the awareness and understanding of a set of information and the ways in which it can be made useful to support a specific task or reach a decision)

Another type of slicer is a timeline slicer, which:

filters a PivotTable to include only those records from a specified date interval. For example, you can limit the PivotTables only to those results between January and April or for an interval of specific years.

Break-even point formula

fixed expenses / sale price per unit - variable expenses per unit

Data must be organized in a meaningful way to transform it into useful:

information (Such as total sales for the month, inventory levels reaching zero)

Hadoop uses several types of __________.

nodes

One-variable-data table (EX8-2)

performs several what-if analyses by specifying one input cell and any number of result cells.


Ensembles d'études connexes

Health Assessment Ch 15 Head and Neck

View Set

Essentials of Networking 10,11,12

View Set

Miro-Economics Mixon Final Troy University

View Set