BLDR Data Analytics (ch 1-4)

Suppose there are two tables: A and B. What would the following code output? SELECT * FROM A INNER JOIN B ON A.orderid = B.orderid; (ch 4)

Only the records from tables A and B that share the same orderid. An inner join returns the middle overlap, where both tables have matching data, and only that data. A left join returns the whole left table plus the overlap, and a right join returns the whole right table plus the overlap.
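
A minimal sketch of this inner join, run through Python's built-in sqlite3 module. The table contents and the customer and amount columns are made-up sample data, not from the book.

```python
import sqlite3

# Hypothetical sample data: order 3 exists only in A, order 4 only in B.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (orderid INTEGER, customer TEXT);
    CREATE TABLE B (orderid INTEGER, amount REAL);
    INSERT INTO A VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO B VALUES (1, 10.0), (2, 25.5), (4, 99.0);
""")

# INNER JOIN keeps only the orderids present in both tables.
inner_rows = conn.execute(
    "SELECT * FROM A INNER JOIN B ON A.orderid = B.orderid ORDER BY A.orderid"
).fetchall()
conn.close()

print(inner_rows)  # [(1, 'Ann', 1, 10.0), (2, 'Bob', 2, 25.5)]
```

Orders 3 and 4 drop out because each appears in only one of the two tables.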

The Hadoop ecosystem encompasses Hadoop's various tools and projects, which were designed to promote... (ch 3) Options: Optimal efficiency; Optimal usability; Optimal reduction; Optimal processing power

Optimal usability is correct because the various tools and projects make Hadoop easier to use and keep running

Which of the following is not a type of join? (ch 4) Options: Left join; Full outer join; Partial inner join; Inner join

Partial inner join is correct because it is not a join type; the four types covered are left join, right join, inner join, and full join

Which is most valuable and most difficult? (ch 1) Options: Predictive analytics; Descriptive analytics; Prescriptive analytics; Diagnostic analytics

Prescriptive analytics is correct because, on the graph of the five stages of business, it sits farthest up in both difficulty and value of all the options

Which is least likely to be an objective for analytics? (ch 1) Options: Segmentation; Demand estimation; Market effectiveness; Qualitative reasoning

Qualitative reasoning is correct because analytics focuses on quantitative data. Qualitative data uses words, photos, or graphs; quantitative data uses numbers. Qualitative data isn't the focus in analytics because we're looking at the numbers.

Which of the following is not a programming language important for business analytics? (ch 2) Options: Python; R; SQL; RStudio

RStudio is correct because R and Python are programming languages, and the book also treats SQL as a programming language. RStudio is a programming tool

Which is least likely to be an example of unstructured data? (ch 1) Options: Email; Chat; Spreadsheet; Tweet

Spreadsheet is correct because its data sits in rows and columns that can be filtered and sorted. Emails, chats, and social media posts are the opposite.

Big data are useless to a company unless... (ch 3) Options: The data are raw and unclean; The company finds something meaningful in data quickly; The company has a place to analyze and easily access the information; The data supports assumptions the company has made before analyzing them

The company has a place to analyze and easily access the information

Suppose there are two tables: A and B. What would the following code output? SELECT * FROM A LEFT JOIN B ON A.orderid = B.orderid; (ch 4)

The complete set of records from table A, along with the matching records from table B (where available) that share the same orderid. A left join keeps the whole left table, while a right join keeps the whole right table
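
A rough illustration of the left join, again using Python's sqlite3 module with invented sample rows (the customer and amount columns are assumptions). Rows from A with no match in B still appear, with None (SQL NULL) in B's columns.

```python
import sqlite3

# Hypothetical sample data: order 3 has no match in B.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (orderid INTEGER, customer TEXT);
    CREATE TABLE B (orderid INTEGER, amount REAL);
    INSERT INTO A VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO B VALUES (1, 10.0), (2, 25.5), (4, 99.0);
""")

# LEFT JOIN keeps every row of A; unmatched rows get NULL for B's columns.
left_rows = conn.execute(
    "SELECT A.orderid, A.customer, B.amount "
    "FROM A LEFT JOIN B ON A.orderid = B.orderid ORDER BY A.orderid"
).fetchall()
conn.close()

print(left_rows)  # [(1, 'Ann', 10.0), (2, 'Bob', 25.5), (3, 'Cy', None)]
```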

Suppose there are two tables: A and B. What would the following code output? SELECT * FROM A RIGHT JOIN B ON A.orderid = B.orderid; (ch 4)

The complete set of records from table B, along with the matching records from table A (where available) that share the same orderid. A right join keeps the whole right table, while a left join keeps the whole left table
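
A sketch of the same result in Python's sqlite3 module, with invented sample rows. Because SQLite builds older than 3.39 do not accept RIGHT JOIN, this example swaps the tables and uses B LEFT JOIN A, which returns the same set of rows as A RIGHT JOIN B.

```python
import sqlite3

# Hypothetical sample data: order 4 exists only in B.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE A (orderid INTEGER, customer TEXT);
    CREATE TABLE B (orderid INTEGER, amount REAL);
    INSERT INTO A VALUES (1, 'Ann'), (2, 'Bob'), (3, 'Cy');
    INSERT INTO B VALUES (1, 10.0), (2, 25.5), (4, 99.0);
""")

# Same rows as "A RIGHT JOIN B": every row of B, plus A's match when one exists.
right_rows = conn.execute(
    "SELECT B.orderid, A.customer, B.amount "
    "FROM B LEFT JOIN A ON A.orderid = B.orderid ORDER BY B.orderid"
).fetchall()
conn.close()

print(right_rows)  # [(1, 'Ann', 10.0), (2, 'Bob', 25.5), (4, None, 99.0)]
```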

Business analytics is... (ch 1)

The use of data to gain insights that maximize business outcomes. Simply doing research or a study does not count as analytics

What is the function COUNT() used for? (ch 4)

To return the number of rows that match a specified criterion
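
As a small illustration, assuming a made-up orders table, COUNT(*) counts rows, a WHERE clause supplies the criterion, and COUNT(column) skips NULL values in that column:

```python
import sqlite3

# Hypothetical table; order 4 has no region recorded.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (orderid INTEGER, region TEXT);
    INSERT INTO orders VALUES (1, 'West'), (2, 'East'), (3, 'West'), (4, NULL);
""")

total_rows = conn.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
west_rows = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE region = 'West'"
).fetchone()[0]
known_regions = conn.execute("SELECT COUNT(region) FROM orders").fetchone()[0]
conn.close()

print(total_rows, west_rows, known_regions)  # 4 2 3
```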

What is the role of a relational database? (ch 3) Options: To store semi-structured data so that a business analytics professional can retrieve the data; To manage and store big data; To store large amounts of data and create easy access to important information

To store large amounts of data and create easy access to important information is correct because a relational database helps us see, understand, and improve a company's metrics

Which of the following is used to restrict rows in SQL? (ch 4) Options: FROM; WHERE; SELECT; GROUP BY

WHERE is correct because SELECT chooses which columns to return from a table, while WHERE restricts which rows come back
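
A quick sketch of the difference, using an invented orders table: the same SELECT returns every row without WHERE and only the qualifying rows with it.

```python
import sqlite3

# Hypothetical sample data.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE orders (orderid INTEGER, region TEXT, amount REAL);
    INSERT INTO orders VALUES
        (1, 'West', 10.0), (2, 'East', 25.5), (3, 'West', 40.0);
""")

# SELECT picks the columns; with no WHERE clause every row is returned.
all_rows = conn.execute(
    "SELECT orderid, amount FROM orders ORDER BY orderid"
).fetchall()

# WHERE restricts the rows to those meeting the condition.
big_orders = conn.execute(
    "SELECT orderid, amount FROM orders WHERE amount > 20 ORDER BY orderid"
).fetchall()
conn.close()

print(len(all_rows))  # 3
print(big_orders)     # [(2, 25.5), (3, 40.0)]
```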

What is a primary key in a relational database? (ch 3) Options: A column that uniquely identifies a row in the table; A row that uniquely identifies a column in the table; Either a column or a row that uniquely identifies a cell in the table; Both a column and a row that uniquely identify a cell in the table

A column that uniquely identifies a row in the table
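
One way to see the uniqueness in practice, sketched with Python's sqlite3 module and a hypothetical customers table: the database rejects a second row that reuses an existing primary key value.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
# customer_id is the primary key, so each value may appear only once.
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute("INSERT INTO customers VALUES (1, 'Ann')")

try:
    # Reusing key 1 violates the primary key constraint.
    conn.execute("INSERT INTO customers VALUES (1, 'Bob')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM customers").fetchone()[0]
conn.close()

print(duplicate_rejected, row_count)  # True 1
```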

What benefit(s) do non-relational databases have to a business analytics professional? (ch 3) Options: All of these; They assist in machine-to-machine data retrieval; They allow professionals to extract data from emails, documents, XML files, etc.; They are more affordable since many are open source

All of these is correct because relational databases are not as affordable, don't extract information from unstructured data, and are not compatible with machine-to-machine data retrieval

What does a query do? (ch 3) Options: Calculates the solution to a complex combination of statistical functions; All of these; Allows you to ask a question and answer the said question through information pulled from the database; Stores information in a virtual "deep pit" for the purpose of future extraction

Allows you to ask a question and answer the said question through information pulled from the database is correct because to access certain information in a database, a user must perform a query

What are the three steps of getting data ready? (ch 1) Options: Data, analytics, and visualization; Address outliers, complete missing data, and make the data consistent; Extract, transform, and load; Clean, structure, and integrate

Clean, structure, and integrate is correct. Cleaning gets rid of the outliers. Structuring organizes the data so it's easier to read and analyze. Integration connects two or more pieces of data to offer more insight and help find unique insights
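
The three steps can be sketched in plain Python. Everything here is illustrative: the sample records, the customers lookup, and the MAX_PLAUSIBLE_AMOUNT cutoff are assumptions, not the book's method.

```python
# Hypothetical raw data with a missing value, an outlier, and inconsistent labels.
raw_sales = [
    {"orderid": 4, "region": "east", "amount": 1_000_000.0},  # outlier
    {"orderid": 1, "region": "west ", "amount": 120.0},
    {"orderid": 2, "region": "WEST", "amount": None},         # incomplete
    {"orderid": 3, "region": "East", "amount": 95.0},
]
customers = {1: "Ann", 3: "Cy", 4: "Dee"}  # second data source to integrate

MAX_PLAUSIBLE_AMOUNT = 10_000.0  # assumed outlier cutoff for this sketch

# Clean: drop incomplete rows and outliers, make region labels consistent.
cleaned = [
    {**row, "region": row["region"].strip().title()}
    for row in raw_sales
    if row["amount"] is not None and row["amount"] <= MAX_PLAUSIBLE_AMOUNT
]

# Structure: put the rows in a predictable order.
structured = sorted(cleaned, key=lambda row: row["orderid"])

# Integrate: attach the customer name from the second source.
integrated = [
    {**row, "customer": customers.get(row["orderid"])} for row in structured
]

for row in integrated:
    print(row)
```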

What are the three legs of business analytics? (ch 1)

Data, analytics, and visualization

Which is an advantage of R? (ch 2) Options: Designed for data analytics; Prepares people for programming tasks besides data analysis; Integrates with website and mobile apps or a production database; Designed for general programming

Designed for data analytics is correct because R was built specifically to analyze data, while Python is a general-purpose language

What does open-source mean? (ch 2)

Developed by and for the community and available for free

Which is most incorrect about unstructured data? (ch 1) Options: Fits a relational data model; Created by machines; Generated by humans; May have an internal structure

Fits a relational data model is correct because only structured data fits a model like a spreadsheet. Unstructured data, such as emails and vlogs, will not fit in a spreadsheet

Which type of non-relational database is best for the data structure of social media websites? (ch 3) Options: Graph database; Document database; Object database; Columnar database

Graph database
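
To see why a graph suits social media data, here is a toy sketch in plain Python (not a real graph database). The users and the friends_of_friends helper are hypothetical; the point is that relationship questions become short walks along connections.

```python
# Hypothetical "follows" relationships stored as an adjacency mapping.
follows = {
    "ann": {"bob", "cy"},
    "bob": {"cy"},
    "cy": {"dee"},
    "dee": set(),
}

def friends_of_friends(graph, user):
    """Return accounts two hops away that the user does not already follow."""
    direct = graph.get(user, set())
    second_hop = set()
    for friend in direct:
        second_hop |= graph.get(friend, set())
    return second_hop - direct - {user}

suggestions = friends_of_friends(follows, "ann")
print(suggestions)  # {'dee'}
```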

Which of the following is a top business priority that business analytics supports? (ch 1) Options: Highlight technology opportunities for growth; Enrich the arch of augmented reality; Increase digital cycles; Decrease creative destruction

Highlight technology opportunities for growth is the right answer because businesses use the data to gain insight into how to improve the sales of their products

Which of the following does a relational database not contain? (ch 4) Options: Indices; ID graphs; Tables; Queries

ID graphs is correct because relational databases have tables and queries, while ID graphs apply only to non-relational databases

Cleaning data includes dealing with data that are... (ch 1)

Incomplete, outliers, or inconsistent is correct because we haven't talked about data being invaluable

Which join uses records that have matching values in both tables? (ch 4) Options: Left join; Right join; Full join; Inner join

Inner join is correct because the key word is "matching": an inner join returns only the middle piece, where both tables match

Which is an advantage of Python? (ch 2) Options: Supported by well-established programming tool, RStudio; Designed for data exploration; Integrates well with programming activities outside data analytics; Has over 10,000 existing packages

Integrates well with programming activities outside data analytics is correct because all the other options refer to R, not Python.

Which of the following is an example of a non-programming tool for data analysis? (ch 2) Options: Python; JMP; Rodeo; RStudio

JMP is correct because Rodeo and RStudio are programming tools, and Python is a programming language.

What is one function that a relational database can do that a non-relational database cannot? (ch 3) Options: Join queries; Delete information; Retrieve information; Update information

Join queries is correct because queries aren't as easy to use in non-relational databases.

Which type of non-relational database is best for the simplest data modeling? (ch 3) Options: Graph database; Columnar database; Object database; Key-value stores; Document database

Key-value stores
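
A Python dictionary gives a rough feel for how simple the key-value model is: each record is just a unique key mapped to a value, with no tables or joins. The keys and values below are invented, and a real key-value database adds persistence and scaling that a dict does not have.

```python
# An in-memory stand-in for a key-value store.
store = {}

# Write: each value is filed under one unique key.
store["session:42"] = "cart=3 items"
store["user:7:name"] = "Ann"

# Read: look a value up directly by its key.
session = store.get("session:42")
missing = store.get("session:99")  # keys that were never written return None

# Overwrite and delete are equally direct.
store["user:7:name"] = "Ann Lee"
del store["session:42"]

print(session, missing, store)
```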

Which of the following includes matched records from the right table and all records from the left table? (ch 4) Options: Left join; Full join; Right join; Inner join

Left join is correct because a left join gives everything on the left, while a right join gives everything on the right

Which is not an Excel function important to business analytics? (ch 2) Options: Quartile; Vlookup; Mean; Match

Mean is correct because Excel has no function called MEAN; the function that computes a mean is called AVERAGE

Which of the following is a/are non-commercial open-source database management system(s)? (ch 4) Options: MongoDB only; None of these; MS SQL Server and MongoDB; MongoDB and Oracle; MS SQL Server and Oracle; MS SQL Server; Oracle only

MongoDB only is correct because the two non-commercial open-source database management systems discussed are MongoDB and MySQL, and MySQL isn't an option

In which of the following competing implementations of SQL can data be stored before it is structured? (ch 1) Options: Oracle; Microsoft SQL Server; MySQL; All of these; None of these

None of these is the correct answer because all of them require the data to be structured before it is stored

What is a programming language? (ch 2)

A formal set of instructions that can be used to produce various kinds of output

Which of the following is RStudio? (ch 2) Options: A programming code; A non-programming tool for data analytics; A programming language; A programming tool

A programming tool is correct because it's a software package that allows the execution of programming code

What is a programming tool? (ch 2)

A software package that allows the execution of programming code

What is an integer? (ch 2)

A variable that contains numbers without decimal points

Hadoop is an open-source software framework that stores and processes large amounts of data using... (ch 3) Options: Synchronous map-reduced algorithms; Commodity cluster software; Asynchronous map-reduced algorithms; Commodity cluster hardware

Commodity cluster hardware
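
The options above mention map-reduce algorithms. As a rough single-machine sketch of that idea (this is plain Python, not Hadoop's API, and the function names are made up), a word count splits into a map step, a grouping step, and a reduce step. On a Hadoop cluster, the map and reduce work would be spread across many inexpensive machines.

```python
from collections import defaultdict

def map_phase(line):
    """Emit a (word, 1) pair for every word in one line of input."""
    return [(word.lower(), 1) for word in line.split()]

def shuffle(pairs):
    """Group all emitted values under their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(grouped):
    """Collapse each key's list of values into a single total."""
    return {key: sum(values) for key, values in grouped.items()}

# Hypothetical input; each line could be handled by a different machine.
lines = ["big data big insight", "data beats opinion"]
pairs = [pair for line in lines for pair in map_phase(line)]
word_counts = reduce_phase(shuffle(pairs))

print(word_counts)
```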

