Abcc Ch. 17

Load

Once data is transformed and normalized, it is ready to be transferred into the data warehouse or datamart. Loading may happen weekly, daily, or even hourly. The more often it is done, the more up-to-date and timely the analytic reports can be.

Transform

Once you've extracted data, it needs to be normalized. Data is no good to you unless it's organized. Normalizing data typically means organizing it into the fields and records of a relational database. Normalizing provides the standard data format required to analyze data.

Information

Processed data that means something

Volume

Refers to the amount of data collected by an organization. How much data does your business need, and further, where do you keep it once you've collected it?

Structured Data Categories (3)

Structured data, Unstructured data, Semi-structured data

5. What decade did businesses start DSS?

- 1980s - [1950s] - 1960s - 1970s 1950s

9. How much data is unstructured?

- 70% - [80%] - 90% - 100% 80%

19. What uses a cluster system that allows data to be stored on multiple servers?

- Business Intelligence - Constellation Server Technologies (Oracle Group LLC) - Early Bot Technologies (Lycos Engine) - [Hadoop] HADOOP

14. In BI, what does the acronym CRM mean?

- Certified Reference Material - [Customer Relationship Management] - Computer Resource Management - Crew Resource Management CUSTOMER RELATIONSHIP MANAGEMENT

24. What software helps BI become more usable through visualization?

- Console - Boundary - GUI - [Dashboards] DASHBOARDS

22. What is a graphical interface that characterizes specific data analysis through visualization?

- Console - [Dashboard] - GUI - Boundary DASHBOARD

16. What is a smaller Data Warehouse?

- Data Depository - Data Silo - Data Store - [Datamart] DATAMART

20. What is another term for Data Mining?

- Data Recognition - Data Detection - Early Data Encounter (EDE, at the server) - [Data Discovery] DATA DISCOVERY

15. What consolidates disparate data?

- Data Silo(s) (usually multiple repositories) - Data Depository - Data Store - [Data Warehouse] DATA WAREHOUSE

18. What is Doug Cutting's son's toy elephant's name?

- Edward - Edvald - [Hadoop] - Eugene HADOOP

23. What attempts to reveal future patterns in the marketplace?

- Extrapolative Analytics - Projecting Analytics - [Predictive Analytics] - Prognostic Analytics PREDICTIVE ANALYTICS

21. What is another term for Text Analytics?

- Text Extraction - Text Withdrawal - [Text-mining] - Text Abstraction TEXT MINING

17. What provides a standard format to organize data?

- Uniform Data (Stored) - Standardized Data - [Normalization] - Prevailing Data NORMALIZATION

7. What kind of data resides in fixed formats?

- Unstructured - Semi-Structured - Structured and Semi-Structured - [Structured] STRUCTURED

13. What refers to quality of data?

- Variety - [Veracity] - Velocity - Volume VERACITY

12. What refers to different kinds of data?

- Velocity - Veracity - [Variety] - Volume VARIETY

11. What refers to how fast data is collected?

- Volume - Veracity - Variety - [Velocity] VELOCITY

4. Large collected datasets are called what?

- Yottabyte Sets - Large Data - [Big Data] - Yottabyte Data Sets BIG DATA

1. What refers to an assortment of software applications to analyze an organization's raw data?

- [BI] - UML - UML and SDLC - SDLC BI

3. In relation to BI, what does the acronym DSS mean?

- [Decision Support System] - Department of Social Security - Digital Signature Standard - Digital Spread Spectrum DECISION SUPPORT SYSTEM

6. What held up the emergence of the Cloud?

- [Internet speeds] - Security - Hard Drive Space - Hard Drive Scarcity INTERNET SPEEDS

2. Why is keeping enormous amounts of inventory a bad thing?

- [It's too expensive] - Your customers always find what they need - It doesn't take up too much space - It's inexpensive IT'S TOO EXPENSIVE

In BI, which of the following is not part of ETL?

- [Transmission] - Transform - Load - Extract TRANSMISSION

8. What data is disorganized and not easily read?

- [Unstructured] - Unstructured and Structured - Structured - Semi-Structured UNSTRUCTURED

10. What refers to the amount of data?

- [Volume] - Variety - Veracity - Velocity VOLUME

Process of collecting internal data

1) Take an inventory of the data your organization makes, and figure out who or what makes it.
2) Find out what your customers are thinking.
3) Decide what kind of data you require. Think about what questions you need to answer: coming up with questions goes a long way toward the answers you need and toward creating a baseline for what data you need to collect.
4) Find a place to keep the data you want to retrieve.

When did organizations start using and processing data & information?

In the 1950s, organizations started using and processing data and information to support the tactical and strategic decisions they made or were going to make, which influenced the emergence of decision support systems (DSS).

Unstructured data

80% of all data is unstructured, disorganized data that cannot be easily read or processed by a computer because it is not stored in rows and columns like traditional data tables.

Big Data

Data sets collected from smartphone metadata, Internet usage records, social media activity, computer usage records, and countless other data sources, then sifted for patterns and trends.

Concepts of Data analytics (4)

Data mining, Topic analytics, Text analytics, Business analytics

Forms of business analytics (3)

Descriptive analytics, Predictive analytics, Decision analytics

Velocity

How fast can you collect data, and more importantly, how quickly can you analyze it?

Veracity

Is the data your organization collected any good? Just because some other business amasses data you may want doesn't necessarily make it trustworthy or valuable. Lots of data sources have data that is not "clean," meaning it may be too fragmented to be valuable or usable, or that it was simply collected poorly in the first place.

Data analysis

Makes sense of an organization's collected data and turns it into useful information to validate its future decisions. Basically, it applies statistical and logical techniques to define, illustrate, and evaluate data.

Four V's of Big data

Volume, Velocity, Variety, Veracity

Variety

You may have identified what data you wish to collect, but is it Structured, Semi-structured, or Unstructured? Very likely it is a combination of all three, which could potentially throw a wrench into your data collection gears.

Extract

after you've determined where your data resides, you can begin extracting it, often from Customer Relationship Management (CRM) or Enterprise Resource Planning (ERP) software. One of an ERP's main functions is to centralize an organization's data so that it ends up being a wealth of data with value across the organization. The extraction step sometimes converts unstructured data, like text notes, into semi-structured or structured data by tagging it with metadata.

Dashboards

are easy-to-use graphical interfaces that characterize specific data analysis through visualization, and make it a lot easier to make sense of the data and see the resulting information.

Business analytics

attempts to make connections between data so organizations can try to predict future trends that may give them a competitive advantage. Business analytics can also uncover computer system inadequacies within an organization.

Predictive analytics

attempts to reveal future patterns in a marketplace, essentially trying to predict the future by looking for data correlations between one thing and any other things that pertain to it.

Map Reduce is not _________________

bringing vast amounts of data back to centralized servers, as Data Warehouse and Datamart techniques do, so it saves immense amounts of network bandwidth and resources.

Decision analytics

builds on Predictive Analytics to make decisions about future industries and marketplaces. Decision Analytics looks at an organization's internal data, analyzes external conditions like supply abundance, and then endorses a best course of action.

SAP BusinessObjects Dashboards

data visualization software that allows you to create and export interactive dashboards. These dashboards contain various components, such as charts, graphs, and buttons, that are bound to data

ETL can ___________ and requires __________

eat bandwidth and storage at an alarming rate, and requires massive amounts of time, money, and storage space.

Businesses throughout history have tried to _________________

recognize trends to best serve their customers and, in turn, to become more profitable. Essentially, organizations have always tried to predict the future.

Apache Hadoop

is an open-source software framework written in Java for distributed storage and distributed processing of very large data sets on computer clusters built from commodity hardware.

Decision support systems (DSS)

evolved from academic research and are computer-based systems that support an organization's decision-making activities.

With data, businesses consider

how much data they are going to collect, how fast they can analyze it, what type of data is collected, and whether the data is reliable.

Hadoop

is named after a toy elephant owned by Doug Cutting's young son. It evolved as an infrastructure for storing and processing large sets of data across multiple servers. Instead of centralizing files in one place, like a Data Warehouse or Datamart, Hadoop uses a cluster system that allows files to be stored on multiple servers.

Semi structured data

lands somewhere in-between Structured and Unstructured data and can possibly be converted into structured data, but not without a lot of work.

Data

raw, unorganized facts

Business intelligence (BI)

refers to an assortment of software applications used to analyze an organization's raw data. These applications change data into significant, meaningful information that helps organizations make better decisions. BI is often described as "the set of techniques and tools for the transformation of raw data into meaningful and useful information for business analysis purposes".

Structured data

resides in fixed formats. It is typically well labeled, often with the traditional fields and records of common data tables. It does not have to be "table like," but it needs to have recognizable patterns that allow it to be more easily queried and searched for in a standard format.

Datamart

a smaller, more focused data warehouse that limits the complexity of databases. A datamart cannot answer as many questions as a data warehouse, but it is cheaper to implement.

Data mining

sometimes called Data Discovery, is the examination of huge sets of data to find patterns and connections, and to identify outliers.
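One small piece of data mining, identifying outliers, can be sketched as follows. This is only an illustration under stated assumptions: the function name `find_outliers`, the z-score approach, and the threshold of 2.0 are all choices made for this example and do not come from the chapter.

```python
import statistics

def find_outliers(values, threshold=2.0):
    """Return the values that sit more than `threshold` standard
    deviations away from the mean (a simple z-score rule)."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    if stdev == 0:
        return []  # every value is identical, so nothing stands out
    return [v for v in values if abs(v - mean) / stdev > threshold]

# Hypothetical daily order counts with one unusual day.
daily_orders = [10, 11, 9, 10, 12, 10, 11, 50]
print(find_outliers(daily_orders))
```

Real data mining tools look for many kinds of patterns and connections; this only shows the outlier idea on a tiny list.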

Text analytics

sometimes called Text-mining, hunts through unstructured text data to look for useful patterns, like whether an organization's customers on Facebook.com or Instagram.com are unsatisfied with its products or service.

Hadoop is flexible enough _____________

that it allows one query to be issued that searches through multiple servers on the fly (an ad hoc query). However, it is very difficult to implement and requires highly qualified data scientists to run it.

Descriptive analytics

the baseline on which other types of analytics are built. Descriptive Analytics describes past data you already have, grouping it into significant pieces like a department's sales results, and starts to reveal trends.
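Grouping past data into pieces such as a department's sales results can be sketched in a few lines. The records, department names, and function names below are hypothetical, and the trend calculation assumes the records are listed in chronological order.

```python
from collections import defaultdict

# Hypothetical past sales records: (department, quarter, amount).
SALES = [
    ("Toys", "Q1", 100.0), ("Toys", "Q2", 150.0),
    ("Garden", "Q1", 200.0), ("Garden", "Q2", 180.0),
]

def totals_by_department(sales):
    """Group past sales into one total per department."""
    totals = defaultdict(float)
    for dept, _quarter, amount in sales:
        totals[dept] += amount
    return dict(totals)

def trend(sales, dept):
    """Change from the first to the last record for one department.
    Assumes the records are in chronological order."""
    amounts = [amount for d, _q, amount in sales if d == dept]
    return amounts[-1] - amounts[0]

print(totals_by_department(SALES))
print(trend(SALES, "Toys"), trend(SALES, "Garden"))
```

The totals describe what already happened; the positive or negative change is the start of a trend that predictive analytics could build on.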

Data visualization

the graphic display of the results of data mining, analytics, and BI in general, typically in real time. Many times, data and information are just too massive and confusing to understand from numbers alone, so products like PowerPoint and Dashboards have become invaluable tools. Data Visualization software helps BI program results become more understandable and, therefore, more meaningful in decision making.

Map Reduce

the processing arm, or engine, of Hadoop. It allows data to be queried and processed directly on the server where it lives, instead of moving the data across the network to be analyzed on another computer. Only the query is transported through the network.
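The idea of processing data where it lives and sending back only small results can be sketched with a toy word count. This is a single-process illustration, not Hadoop's actual API: the `SERVERS` dictionary, node names, and function names are all hypothetical stand-ins.

```python
from collections import Counter

# Hypothetical cluster: each "server" holds its own lines of text.
SERVERS = {
    "node1": ["big data", "data warehouse"],
    "node2": ["big cluster", "data mart"],
}

def map_on_server(lines):
    """Map step: runs where the data lives and returns only
    a small set of partial counts, not the raw lines."""
    counts = Counter()
    for line in lines:
        counts.update(line.split())
    return counts

def reduce_counts(partials):
    """Reduce step: merge the partial counts from every server."""
    total = Counter()
    for partial in partials:
        total.update(partial)
    return total

def word_count(servers):
    partials = [map_on_server(lines) for lines in servers.values()]
    return reduce_counts(partials)

print(word_count(SERVERS)["data"])
```

In a real cluster the map step would run on each server in parallel, and only the partial counts would travel across the network.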

Extract transform and load (ETL)

tools used to standardize data across systems, allowing it to be queried. The steps must happen in order: 1) extracting the data, 2) transforming the data so it fits into your data warehouse or datamart, 3) loading the data into the data warehouse or datamart.
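The three ETL steps, run in order, can be sketched as below. This is a minimal sketch under stated assumptions: the raw records, the `customers` table, and the function names are hypothetical, and an in-memory SQLite database stands in for a data warehouse or datamart.

```python
import sqlite3

# Hypothetical raw records, as they might come out of a CRM export.
RAW_ROWS = [
    {"Name": " Ada Lovelace ", "Spend": "120.50"},
    {"Name": "alan turing", "Spend": "80"},
]

def extract():
    """Step 1: pull the raw records from the source system."""
    return list(RAW_ROWS)

def transform(rows):
    """Step 2: normalize into fixed fields
    (trimmed, title-cased name and numeric spend)."""
    return [(r["Name"].strip().title(), float(r["Spend"])) for r in rows]

def load(records, conn):
    """Step 3: write the normalized records into the warehouse table."""
    conn.execute("CREATE TABLE IF NOT EXISTS customers (name TEXT, spend REAL)")
    conn.executemany("INSERT INTO customers VALUES (?, ?)", records)
    conn.commit()

def run_etl():
    conn = sqlite3.connect(":memory:")  # stand-in for a warehouse or datamart
    load(transform(extract()), conn)
    return conn

conn = run_etl()
print(conn.execute("SELECT name, spend FROM customers ORDER BY name").fetchall())
```

Once loaded, the data sits in a standard format and can be queried like any other relational table.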

Topic analytics

tries to catalog phrases of an organization's customer feedback into relevant topics. For example, if a customer said, "the barista was friendly", that would be categorized under the topic "Employee Friendliness."
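Cataloging feedback phrases into topics can be sketched with simple keyword matching. The keyword lists and the `categorize` function are hypothetical, and a real topic analytics product would use far richer language processing than this.

```python
# Hypothetical keyword lists per topic.
TOPIC_KEYWORDS = {
    "Employee Friendliness": ["friendly", "rude", "helpful"],
    "Wait Time": ["slow", "fast", "wait"],
}

def categorize(feedback):
    """Return every topic whose keywords appear in the feedback phrase."""
    words = feedback.lower().replace(",", " ").replace(".", " ").split()
    return [
        topic
        for topic, keys in TOPIC_KEYWORDS.items()
        if any(k in words for k in keys)
    ]

print(categorize("the barista was friendly"))
```

Feedback that matches no keyword returns an empty list, which a real system might route to a person for review.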

Data warehouses

used to consolidate disparate data in a central location, holding yottabytes of data.

