stat business quiz 1

instance

A single occurrence of an entity is called an

Database Management System (DBMS)

A software application for defining, manipulating, and managing data in databases

volume

An immense amount of data is compiled from a single source or a wide range of sources, including business transactions, household and personal devices, manufacturing equipment, social media, and other online portals.

Predictive Analytics

Analytical models that help identify associations between variables, and these associations are used to estimate the likelihood of a specific outcome

variety

Data also come in all types, forms, and granularity, both structured and unstructured. These data may include numbers, text, and figures as well as audio, video, e-mails, and other multimedia elements.

star schema

Data in a data mart are organized using a multidimensional data model called a _______________, which includes dimension and fact tables.

data transformation

Data transformation is the process of converting data from one format or structure to another.

Predictive Analytics

Examples of ____________ include identifying customers who are most likely to respond to specific marketing campaigns, admitted students who are likely to enroll, credit card transactions that are likely to be fraudulent, or the incidence of crime at certain regions and times.

velocity

In addition to volume, data from a variety of sources get generated at a rapid speed. Managing these data streams can become a critical issue for many organizations.

Data Privacy / Information privacy

Its concerns revolve around (a) how data are legally collected and stored; (b) if and how data are shared with third parties; and (c) how data collection, usage, and transmission meet all regulatory obligations.

interval

Observations can be categorized and ranked, and differences between observations are meaningful. The main drawback of the ________________ is that the value of zero is arbitrarily chosen.

ordinal

Observations can be categorized and ranked; however, differences between the ranked observations are meaningless

nominal

Observations differ merely by name or label.

ratio

Observations have all the characteristics of an interval-scaled variable as well as a true zero point; thus, meaningful ratios can be calculated.

Database Management System (DBMS)

Popular ___________ packages include Oracle, IBM DB2, SQL Server, MySQL, and Microsoft Access.

imputation

The ___________ strategy replaces missing values with some reasonable imputed values
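
As a minimal sketch of one common imputation choice, the snippet below fills missing numeric values with the mean of the observed values. The function name `impute_mean` and the sample `ages` list are hypothetical, and mean imputation is only one of several reasonable options (the median is another).

```python
from statistics import mean

def impute_mean(values):
    """Replace None entries with the mean of the observed (non-missing) values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed)
    return [fill if v is None else v for v in values]

# Hypothetical sample: two ages are missing.
ages = [20, None, 40, 30, None]
imputed_ages = impute_mean(ages)
print(imputed_ages)
```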

Structured Query Language (SQL)

The most popular query language used today is

subsetting

The process of extracting portions of a data set that are relevant to the analysis is called

data wrangling

__________ helps in data quality, reducing the time and effort required to perform analytics, and helping reveal the true intelligence in the data.

Prescriptive analytics

Examples of _____________ include providing advice on scheduling employees' work hours and adjusting supply levels to meet customer demand, selecting a mix of products to manufacture, choosing an investment portfolio to meet a financial goal, or targeting marketing campaigns to specific customer groups on a limited budget.

data ethics

a branch of ethics that studies and evaluates moral problems related to data

data warehouse

a central repository of data from multiple functional areas within an organization

database

a collection of data logically organized to enable easy retrieval, management, and distribution of data.

Delimited format

a file format in which a delimiter separates the values in each row. When the delimiter is a comma, the file is called a comma-delimited or comma-separated values (CSV) file.
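
A short sketch of reading comma-delimited text with Python's standard `csv` module. The column names and rows are made up for illustration, and the text is held in a string rather than a file so the example is self-contained.

```python
import csv
import io

# Hypothetical comma-delimited content; the first line holds the column names.
raw = "name,city,sales\nAna,Boston,120\nRaj,Austin,95\n"

# DictReader splits each line on the delimiter (a comma by default).
rows = list(csv.DictReader(io.StringIO(raw)))

for row in rows:
    print(row["name"], row["city"], row["sales"])
```

Note that every value is read as text; numeric columns such as `sales` would need to be converted before analysis.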

XML

a simple text-based markup language for representing structured data. It uses user-defined markup tags to specify the structure of data.

omission strategy

also called complete-case analysis, recommends that observations with missing values be excluded from the analysis.
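
The omission strategy can be sketched as a simple filter that keeps only observations with no missing values. The records and the helper name `complete_cases` below are hypothetical.

```python
# Hypothetical observations; None marks a missing value.
rows = [
    {"id": 1, "income": 52000, "age": 34},
    {"id": 2, "income": None, "age": 41},
    {"id": 3, "income": 61000, "age": None},
    {"id": 4, "income": 48000, "age": 29},
]

def complete_cases(records):
    """Keep only the records in which every field has a value."""
    return [r for r in records if all(v is not None for v in r.values())]

kept = complete_cases(rows)
print([r["id"] for r in kept])
```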

dummy variable

also referred to as an indicator or a binary variable, takes on values of 1 or 0 to describe two categories of a categorical variable.
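
As an illustration, a two-category variable can be turned into a dummy variable by mapping one category to 1 and the other to 0. The function `to_dummy` and the `owns_home` data are assumptions made for this sketch.

```python
def to_dummy(values, category):
    """Return 1 where the value equals the chosen category, otherwise 0."""
    return [1 if v == category else 0 for v in values]

# Hypothetical categorical variable with two categories.
owns_home = ["yes", "no", "no", "yes"]
dummy = to_dummy(owns_home, "yes")
print(dummy)
```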

data

are compilations of facts, figures, or other contents, both numerical and nonnumerical

discrete variable

assumes a countable number of values

Data Privacy / Information privacy

branch of data security related to the proper collection, usage, and transmission of data.

continuous variable

characterized by uncountable values within an interval. Weight, height, time, and investment return are examples.

Business analytics

combines qualitative reasoning with quantitative tools to identify key business problems and translate data analysis into decisions that improve business performance.

Data Privacy / Information privacy

confidentiality, transparency, and accountability are the three key principles of _______________

population

consists of all items of interest in a statistical problem.

relational database

consists of one or more logically related data tables, where each data table is a two-dimensional grid that consists of rows and columns.

Cross-sectional data

data collected by recording a characteristic of many subjects at the same point in time, or without regard to differences in time.

Time series data

data collected over several time periods focusing on certain groups of people, specific events, or objects.

unstructured data

do not conform to a predefined, row-column format. They tend to be textual (e.g., written reports, e-mail messages, doctor's notes, or open-ended survey responses) or have multimedia contents (e.g., photographs, videos, and audio data).

fixed-width format

each column starts and ends at the same place in every row. The actual data are stored as plain text characters.
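
A minimal sketch of parsing fixed-width text by slicing each line at known character positions. The field layout in `FIELDS` and the two sample lines are hypothetical; a real file would document its own column positions.

```python
# Hypothetical layout: (field name, start index, end index) for each column.
FIELDS = [("name", 0, 6), ("city", 6, 14), ("sales", 14, 18)]

# Each line is exactly 18 characters wide; columns are padded with spaces.
lines = [
    "Ana   Boston   120",
    "Raj   Austin    95",
]

def parse_fixed_width(line):
    """Slice one line at the fixed positions and strip the padding."""
    return {name: line[start:end].strip() for name, start, end in FIELDS}

records = [parse_fixed_width(line) for line in lines]
print(records)
```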

Prescriptive analytics

explores several possible actions and suggests a course of action.

Descriptive Analytics

financial reports, public health statistics, enrollment at universities, student report cards, and crime rates across regions and time are examples of

Business analytics

is a broad topic, encompassing statistics, computer science, and information systems with a wide variety of applications in marketing, human resource management, economics, finance, health, sports, politics, etc.

entity

is a generalized category to represent persons, places, things, or events about which we want to store data in a database table

entity-relationship diagram (ERD)

is a graphical representation used to model the structure of the data

"Not Only SQL" database

is a non-relational database that supports the storage of a wide range of data types including structured, semistructured, and unstructured data

composite primary key

is a primary key that contains more than one attribute.

information

is a set of data that are organized and processed in a meaningful and purposeful way

HTML

is a simple text-based markup language for displaying content in web browsers.

data mart

is a small-scale data warehouse or a subset of the enterprise data warehouse that focuses on one particular subject or decision area

JSON

is a standard for transmitting human-readable data in compact files.

Sample

is a subset of the population.

primary key (PK)

is an attribute that uniquely identifies each instance of the entity

foreign key (FK)

is defined as a primary key of a related entity
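
The relationship between a primary key and a foreign key can be sketched with Python's built-in `sqlite3` module. The `customer` and `orders` tables are hypothetical. SQLite only enforces foreign keys when the `foreign_keys` pragma is switched on for the connection, which the sketch does first.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves enforcement off by default

# customer_id is the primary key of customer.
conn.execute("CREATE TABLE customer (customer_id INTEGER PRIMARY KEY, name TEXT)")

# orders.customer_id is a foreign key: it refers to the primary key of customer.
conn.execute(
    """CREATE TABLE orders (
           order_id INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customer(customer_id),
           amount REAL
       )"""
)

conn.execute("INSERT INTO customer VALUES (1, 'Ana')")
conn.execute("INSERT INTO orders VALUES (100, 1, 25.0)")  # customer 1 exists

try:
    conn.execute("INSERT INTO orders VALUES (101, 99, 10.0)")  # no customer 99
    fk_rejected = False
except sqlite3.IntegrityError:
    fk_rejected = True

print(fk_rejected)
```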

knowledge

is derived from a blend of data, contextual information, experience, and intuition.

data modeling

is the process of defining the structure of a database

binning

is the process of transforming numerical variables into categorical variables by grouping the numerical values into a small number of groups or bins.
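
Binning can be sketched with the standard `bisect` module, which finds the bin each value falls into given a sorted list of cut points. The cut points, labels, and the helper `bin_values` are assumptions chosen for illustration; here a value equal to a cut point goes into the higher bin.

```python
from bisect import bisect_right

def bin_values(values, edges, labels):
    """Map each number to a label; edges are sorted cut points between bins.

    There must be one more label than there are edges.
    """
    return [labels[bisect_right(edges, v)] for v in values]

# Hypothetical bins: below 18, 18 up to (not including) 65, and 65 or above.
ages = [15, 18, 34, 65, 70]
age_groups = bin_values(ages, edges=[18, 65], labels=["minor", "adult", "senior"])
print(age_groups)
```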

big data

massive volume of both structured and unstructured data that are extremely difficult to manage, process, and analyze using traditional data-processing tools

Descriptive analytics

refers to gathering, organizing, tabulating, and visualizing data to summarize "what has happened?"

veracity

refers to the credibility and quality of data

structured data

reside in a predefined, row-column format

Data Wrangling

the process of retrieving, cleansing, integrating, transforming, and enriching data to support analytics.

data management

the process that an organization uses to acquire, organize, store, manipulate, and distribute data.

Predictive analytics

using historical data to predict "what could happen in the future?"

Prescriptive analytics

using optimization and simulation algorithms to provide advice on "what should we do?"

Three Vs of Big Data

velocity, volume, variety

