WPC 300 Midterm

¡Supera tus tareas y exámenes ahora con Quizwiz!

Data Models

1. Descriptive 2. Explanatory 3. Predictive 4. Prescriptive

Database Management System (DBMS)

1. Helps us create the structural rules that our data will adhere to 2. helps load data into framework we've already established 3. provides additional support such as tracking users and providing log in credentials performing maintenance and routine backups, as well as a host of additional admin. tasks that protect/secure data

problem finding

1. Tell an interesting and complete story • The problem you address should be meaningful • Solution could be reused for related problems • Assumptions, boundaries 2. Find an appropriate solution framework • Break down the problem into pieces • Iterative process (agile vs. waterfall) • Identify appropriate analytical/modeling techniques 3. Routinize the procedure • Documentation • Next similar question can be solved quickly • Build a system (macros, codes, programs)

Typical IT infrastrcuture

Data sources - Operational systems - Flat files ↓ Staging Area ↓ Warehouse - summary data - metadata - raw data ↓ Data Marts - Purchasing - Sales - inventory ↓ Users - analytics - reporting - mining

Logical functions

IF, OR, And, NOT

control group

In an experiment, the group that is not exposed to the treatment; contrasts with the experimental group and serves as a comparison for evaluating the effect of the treatment.

Time functions

TIME, HOUR, MINUTE, SECOND, NOW, YEAR, MONTH, DAY, DATE, DATEIF

Split a cell

To divide one cell into two or more cells.

Cloud Computing

Types: public, private, and hybrid - Types of cloud services: --- Infrastructure as a service (IaaS) --- Platform as a service (PaaS) --- Serverless computing --- Software as a service (SaaS) • Drawback: concerns of security, reliability

data transformation

clean data into proper format or structure for the purpose of querying and analysis

4 squares of data

collection cleaning communication analyses

Meta Data

data that describes other data

Data Extraction

extract data from primary/secondary source

Data load

load data into final target database, more specifically an operational data store, data mart, or data warehouse

A/B testing

method of comparing two versions of a webpage or app against each other to determine which one performs better

Two types of data

transactional and analytical

Cofounding/Spurious Correlation

two variables are correlated, but one does not cause the other

transactional

• Captures data describing and event • An exchange between actors • Real-time

Analytical (Data Warehouse)

• Captures data to support analysis and reporting • An aggregated view of the business • Historical

Simulated data

• Data based on assumptions and simulations • Used a lot in scheduling, routing, queuing

Heuristics

• Experience • Intuition • Rules of thumb • Think fast

Secondary data

• Firm's proprietary databases • Internet data (crawlers) -- https://scrapy.org/ -- Beautifulsoup • Stock/capital markets data (Compustat, CRSP, etc.) • Accounting disclosure data (Form 10K, 10Q)

Survey Biases

• Online Survey Selection Biases • Question order biases • Imbalance sampling • Prevent using leading and biased questions, but use neutral wording. Do not include your opinion in the question. • Don't overloaded your question. Always break questions into singles so your survey is measurable. • Don't ask the wrong questions • Don't survey the wrong people. Note - It is always a good practice to revise survey questions and to target a population that fits the survey's goal by clearly defining the respondent requirements so we can prevent survey biases from happening

Survey -- The Code of Ethics

• Respondents should be given the content, sponsorship (who is conducting the survey), and purpose of the survey so that they may make an informed judgement about whether they wish to participate. • Avoid use of methods that deliberately introduce bias into the results. Note - We should include a paragraph to explain who is conducting the survey, the content, and purpose of the survey

Primary data

• Surveys • Interviews (marketing firm's telephone interviews) • Used a lot in marketing research

4Vs of Big Data

• Volume • Velocity • Variety - different forms of data • Veracity - uncertainty of data

SQL SELECT statement

SELECT column_name(s) FROM table_name WHERE condition* GROUP BY column_name ORDER BY column_name DESC/ASC - *condition: column_name =, <>, >, >=, <, <= 'a value' or a number (need ' ' for test value, not numbers like 123) - Understand all the Keywords: DISTINCT, COUNT, MAX, MIN, AVG, SUM

ORDER BY

Sorts the result according to specified criteria

SQL

Structured Query Language

The Information Architecture of an Organization

Data entry → Transactional Database (stores real time transactional data) → Data extraction → Analytical Data Store (stores historical transactional and summary data) → Data Analysis

VLOOKUP

(lookup_value, table_array, col_index_num, range_lookup(optional)) - To find the salary for Elva, move the mouse to D2, type = VLOOKUP(C3,'Salary chart'!$A$2:$B$8,2) - Since I didn't enter the range_lookup value, Excel used the default value. - That worked because we just need an approximate match. - If we need an exact match, then we need to enter 0 as the range_lookup value.

The Relational Database

- A series of tables with logical associations between them - The associations (relationships) allow the data to be combined - Entity - Primary Key - Foreign Key - What does a row in a table mean? - Why does primary key need to be unique?

A Scientific Perspective to Problem Solving

- Heuristics vs Analytics

Analytics Process

- Identify a problem - Find dataset (Data collection) - Prepare data (ETL) - Analyze data with a data model (Descriptive, Explanatory, Predictive, and Prescriptive) - Interpret Results

The Transactional Database

- In business, a transaction is the exchange of information, goods, or services - For databases, a transaction is an action performed in a database management system - Operational databases deal with both: they store information about business transactions using database transactions Ex of Transactions - purchase of product - enroll in a course - hire an employee Data is in REAL TIME - reflects current state - how things are NOW

Transactional Database

- Supports management of an organization's data - For everyday transactions This is what is commonly thought of as "database management"

Analytical Data Warehouse

- Supports managerial decision-making - For periodic analysis This is the foundation for business intelligence

Decision Making Biases

- Understand situations where biases arise - Based on a situation, you should be able to identify what kind of cognitive bias that is - Differentiate the major kinds of biases discussed

identifier

- attribute that uniquely identifies one entity instance from other instances - FlightNo

attributes

- describe characteristics of an entity - FlightNo, From, To, Departs, Arrives

Descriptive Data Model

- insight into the past • What do we have? What happened? When? • Find Patterns; basic plots of variables; Correlation • Descriptive statistics

Predictive Data Model

- predict the future • What is likely to happen next? • Methods: Regression analysis; Data mining (Classification, decision trees, Association Rules (market-basket analysis), Text mining

Prescriptive Data Model

- provide advice • Recommended actions & timing; Possible automation of actions • Methods and Applications

entities

- something users want to track - flights, traveler, reservation

Explanatory Data Model

- understand a (causal) relationship • Why did it happen? How did it happen? • Does smoking cause cancer? • Methods and Applications

ETL (extraction, transformation, and loading)

A process that extracts information from internal and external databases, transforms the information using a common set of enterprise definitions, and loads the information into a data warehouse time consuming and sometimes done in parallel. Several computation tools to develop robust systems

experiment

A research method in which an investigator manipulates one or more factors to observe the effect on some behavior or mental process

double-blind study

An experiment in which neither the participant nor the researcher knows whether the participant has received the treatment or the placebo

Entity data relationship model

Entities → Attributes → Identifier

Survey Examples

Examples of quantitative research that use survey methodology to answer questions about a population - Polls about public opinion - Public-health surveys - Market-research surveys - Government surveys and censuses Although censuses do not include a "sample", they do include other aspects of survey methodology, like questionnaires and interviewers.

Text functions

LEFT, RIGHT, MID, CONCATENATRE (can also be done with a "&"), Trim, LEN, Find, and TEXT (can be used to format a number and convert it to text)

Statistic Math Functions

MIN, MAX, AVERAGE, MODE, MEDIAN, COUNT, SUMIF, COUNTIF, SUM, SUMPRODUCT, RANDBETWEEN, and how to calculate the range (that will be calculated with MAX - MIN)

OLTP

Online Transaction Processing

PaaS (Platform as a Service)

PaaS includes infrastructure but also middleware, development tools, business intelligence (BI) services, database management systems, and more. PaaS is designed to support the complete web application lifecycle: building, testing, deploying, managing, and updating.

SaaS (Software as a Service)

Pay for software as you use it. Not installed locally, instead it is 'Hosted software e.g. Google Docs

Serverless computing

The cloud provider handles the setup, capacity planning, and server management for you.

OLAP

online analytical processing

IaaS (Infrastructure as a Service)

rent IT infrastructure—servers and virtual machines (VMs), storage, networks, operating systems—from a cloud provider on a pay as you-go basis

SELECT DISTINCT

specifies that the statement is going to be a query that returns unique values in the specified column(s)

randomized controlled experiment

study in which people are allocated at random (by chance alone) to receive one of several interventions one of these interventions is the standard control, and the other are treatments

placebo effect

the phenomenon in which the expectations of the participants in a study can influence their behavior

Analytics

the process of developing actionable decisions or recommendations for actions based on insights generated from historical data • Observe what has worked • Experimentation • Collect data and analyze data • Find patterns • Boundary and Generalizability • Think slow


Conjuntos de estudio relacionados

Sociology 1301: Quiz 1/2 Questions

View Set

Chapter 12: Psychological Disorders

View Set

Chapter 38: Caring for Clients with Cerebrovascular Disorders -

View Set

Function of the Larynx and the Thyroarytenoid

View Set

medications and IV lines Nclex review

View Set

Experiment 9: Dehydrating Cyclohexanol

View Set