Analytics

Which of the following are differences between a spreadsheet (e.g., Excel) and a table in a relational database? Select all that apply.

- Each column in a relational database must contain the same data type.
- A table in a relational database must contain a unique identifier.

Which of the following joins do NOT require a UNION tool in Alteryx (select all that apply)?

- Left outer join
- Right outer join
- Inner join

Which of the following are examples of structured data? Select all that apply. Options: Text extracted from PDF documents; Relational database tables; .CSV files; Text contained in emails; Excel files with defined fields (i.e., column names) and data types.

- Relational database tables
- Excel files with defined fields

Which of the following are benefits of relational databases? Select all that apply. Options: Increases redundancy; Saves space and processing power; Reduces risk of data entry errors; Easier for visual consumption (relative to flat tables).

- Saves space and processing power
- Reduces risk of data entry errors

We can access the Metadata tab (within the Results window) by selecting the "Metadata" option. Which of the following information is NOT provided in the "Metadata" tab? Select all that apply.

- The number of records in our data
- The number of null/missing values in our data
- The number of unique values for each field

For which of the following dates would you need more information to determine whether the date is formatted in US date format (MM/DD/YYYY) or European format (DD/MM/YYYY)? Select all that apply. Options: 07/13/2018; 04/07/2020; 05/25/2017; 12/18/2019; 12/01/2021; 15/09/2016.

- 04/07/2020
- 12/01/2021
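The reasoning behind this answer can be sketched in Python: a date is ambiguous only when both of its first two fields could be a month (12 or less) and they differ. The function name `is_ambiguous` is a hypothetical helper, not something from the course.

```python
def is_ambiguous(date_text: str) -> bool:
    """Return True when a slash-separated date could be MM/DD or DD/MM."""
    first, second, _year = (int(part) for part in date_text.split("/"))
    # Both leading fields must be valid month numbers, and they must differ;
    # if they were equal (e.g., 05/05), both readings give the same date.
    return first <= 12 and second <= 12 and first != second


dates = ["07/13/2018", "04/07/2020", "05/25/2017",
         "12/18/2019", "12/01/2021", "15/09/2016"]
ambiguous = [d for d in dates if is_ambiguous(d)]
print(ambiguous)
```

Any date with a field above 12 reveals its own format, because that field can only be the day.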

Consider the following regular expression: \d Which of the following string records would be completely identified by this regular expression? Options: 1; 12; 123; 1234.

1

Records:
- AUS751-2MO
- Canada342,9W
- GER265:3M
- US219 8W
- MEX193|9MO
- Japan572-5M
Options:
1. (\w+)(\d\d\d).(\w)(\u+)
2. (\w*)(\d+).(\d+)(\w*)
3. (\w*)(\d{3}).*(\w+)(\w*)
4. (\w+)(\d{3}).(\d*)(\w\w)

1. (\w+)(\d\d\d).(\w)(\u+)
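As a rough check, the selected expression can be run against the six records in Python. Python's `re` module has no `\u` class, so this sketch assumes `[A-Z]` is an acceptable stand-in for "any uppercase character" on these ASCII records; that substitution is an assumption, not part of the original card.

```python
import re

# Assumption: [A-Z] replaces the \u (uppercase) class used in the card.
PATTERN = re.compile(r"(\w+)(\d\d\d).(\w)([A-Z]+)")

records = ["AUS751-2MO", "Canada342,9W", "GER265:3M",
           "US219 8W", "MEX193|9MO", "Japan572-5M"]

# fullmatch requires the whole record to be consumed by the pattern.
parsed = {record: PATTERN.fullmatch(record).groups() for record in records}
for record, groups in parsed.items():
    print(record, "->", groups)
```

The greedy `\w+` first swallows the digits as well, then backtracks until exactly three digits remain for the second group, which is why the country code and the three-digit number land in separate groups.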

Assume we estimate the following model: Wages = α + β1*Exper + β2*BEMS + β3*Exper*BEMS + ε where Wages = annual wages (in dollars), Exper = years of work experience, and BEMS = 1 for people who majored in business, engineering, math, or science and zero for other majors. Assume the estimated model is below: Wages = 30,000 + 1,800*Exper + 20,000*BEMS + 4,200*Exper*BEMS According to our model output, it would take _____ years for a person who does not have a BEMS degree to surpass the starting salary for a person who does have a BEMS degree.

12 years
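The answer can be checked with a short loop over the estimated model. This sketch assumes experience is counted in whole years and that "starting salary" means Exper = 0; the function name `predicted_wage` is illustrative.

```python
def predicted_wage(exper: int, bems: int) -> int:
    """Estimated model: 30,000 + 1,800*Exper + 20,000*BEMS + 4,200*Exper*BEMS."""
    return 30_000 + 1_800 * exper + 20_000 * bems + 4_200 * exper * bems


# Starting salary for a BEMS major (zero years of experience).
bems_starting_salary = predicted_wage(exper=0, bems=1)

# Count whole years until a non-BEMS major's wage exceeds that figure.
years = 0
while predicted_wage(years, bems=0) <= bems_starting_salary:
    years += 1

print(bems_starting_salary, years)
```

At 11 years the non-BEMS wage is 49,800, still below 50,000; at 12 years it is 51,600.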

Assume you want to estimate the following model: collGPA = a + B1*hsGPA + B2*ACT + e where collGPA = college GPA, hsGPA = high-school GPA, and ACT = ACT score. Assume the estimated model is: collGPA = 1.29 + 0.453*hsGPA + 0.0094*ACT According to our model, what is the expected college GPA of a student with a high-school GPA of 3.49 and an ACT score of 21? Round your answer to 2 decimal places (e.g., 3.13).

3.07
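Plugging the values into the estimated equation reproduces the answer. A minimal sketch, with `predicted_coll_gpa` as an illustrative name:

```python
def predicted_coll_gpa(hs_gpa: float, act: float) -> float:
    """Estimated model: collGPA = 1.29 + 0.453*hsGPA + 0.0094*ACT."""
    return 1.29 + 0.453 * hs_gpa + 0.0094 * act


gpa = round(predicted_coll_gpa(hs_gpa=3.49, act=21), 2)
print(gpa)
```

The unrounded value is about 3.068 (1.29 + 1.58097 + 0.1974).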

Assume we estimate the following model: Wages = α + β1*Exper + β2*BEMS + β3*Exper*BEMS + ε where Wages = annual wages (in dollars), Exper = years of work experience, and BEMS = 1 for people who majored in business, engineering, math, or science and zero for other majors. Assume the estimated model is below: Wages = 30,000 + 1,800*Exper + 20,000*BEMS + 4,200*Exper*BEMS According to our model output, the estimated annual wages for a person with 5 years of work experience is _____ for non-BEMS majors and _____ for BEMS majors.

39000 and 80000
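Substituting Exper = 5 into the estimated model, once with BEMS = 0 and once with BEMS = 1, gives both figures:

```latex
\begin{aligned}
\text{non-BEMS:}\quad \widehat{\text{Wages}} &= 30{,}000 + 1{,}800(5) + 20{,}000(0) + 4{,}200(5)(0) = 39{,}000 \\
\text{BEMS:}\quad \widehat{\text{Wages}} &= 30{,}000 + 1{,}800(5) + 20{,}000(1) + 4{,}200(5)(1) = 80{,}000
\end{aligned}
```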

In the "Writing Conditional Statements" lesson, we learned that basic conditional statements in Alteryx are composed of how many parts?

4

Assume we estimate the following model: Wages = α + β1*Exper + β2*BEMS + β3*Exper*BEMS + ε where Wages = annual wages (in dollars), Exper = years of work experience, and BEMS = 1 for people who majored in business, engineering, math, or science and zero for other majors. Assume the estimated model is below: Wages = 30,000 + 1,800*Exper + 20,000*BEMS + 4,200*Exper*BEMS According to our model output, the expected annual wage increase for a person with a BEMS degree is

6000
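One way to read this answer, assuming the question asks for the wage increase per additional year of experience, is as the slope of the estimated model with respect to Exper:

```latex
\frac{\partial\, \widehat{\text{Wages}}}{\partial\, \text{Exper}}
  = \beta_1 + \beta_3 \cdot \text{BEMS}
  = 1{,}800 + 4{,}200 \cdot \text{BEMS}
  = \begin{cases}
      6{,}000 & \text{if BEMS} = 1 \\
      1{,}800 & \text{if BEMS} = 0
    \end{cases}
```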

In the "Viewing Data" lesson, we learned that a ___________ tool allows us to see the entire contents of a dataset and allows us to assess the quality, distribution, and attributes of the data.

Browse

Consider the following regular expression: \w+ Which of the following string records would NOT be completely identified by this regular expression? Options: Goodbye; 1234; TIGERS; 1a2B3c; Clemson University; hello.

Clemson University
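A quick Python check illustrates why: `\w` covers letters, digits, and the underscore, but not the space. The sketch uses `re.fullmatch` to stand in for "completely identified", which is an interpretation of the card's wording.

```python
import re

candidates = ["Goodbye", "1234", "TIGERS", "1a2B3c", "Clemson University", "hello"]

# A record is "completely identified" here when the whole string matches \w+.
not_fully_matched = [s for s in candidates if not re.fullmatch(r"\w+", s)]
print(not_fully_matched)
```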

According to Viz lecture #1, what has research found to be the least accurate way to compare magnitudes in a visualization?

Color

Stacked bar charts, line charts, and combo charts are all examples of what type of visualization?

Comparison

A _____ is a summary table that allows us to visualize the performance of a predictive model, including the prevalence of type 1 and type 2 errors.

Confusion matrix
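A minimal sketch of how such a table is tallied, using made-up labels purely for illustration (1 = event occurred or was predicted, 0 = it did not). The function name and the sample lists are hypothetical.

```python
from collections import Counter


def confusion_matrix(actual, predicted):
    """Count each (actual, predicted) combination for a binary classifier."""
    counts = Counter(zip(actual, predicted))
    return {
        "true_positive": counts[(1, 1)],
        "false_positive": counts[(0, 1)],  # type 1 error
        "false_negative": counts[(1, 0)],  # type 2 error
        "true_negative": counts[(0, 0)],
    }


# Illustrative data only; not taken from any real model.
actual = [1, 0, 1, 0, 0, 1, 0, 1]
predicted = [1, 1, 0, 0, 0, 1, 0, 1]
matrix = confusion_matrix(actual, predicted)
print(matrix)
```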

In a _____ analysis, we are assessing "why did it happen?"

Diagnostic

Independent of the transactional system (SAP, Oracle, SalesForce, etc.), the data always contains three pieces of information, (i) information about the process steps and activities that have been conducted, (ii) information about the point in time in which the activities were carried out, and (iii) information about the object or ID for which the activities have been executed. The combination of these three pieces of information is called a:

Digital footprint

A _____ feature takes the user to another report that is relevant to the data being analyzed. This type of analysis lets you see the same data in different reports, analyze it with different features, and even display it through different visualization methods.

Drill-through

In the "Connecting to Multiple Sheets at Once" lesson, we learned that an INPUT DATA tool and a _____________ tool can be used together to input data from multiple sheets within the same Excel (.xlsx) file.

Dynamic input

In the "Diving into Expressions" lesson, we learned that Alteryx Expressions can be used for a number of data preparation and analytic tasks, such as a) cleansing string data, b) applying conditional logic (i.e., if/then), or c) mathematically transforming numeric values. In the Alteryx FORMULA tool, where do we enter expressions?

Expression editor

A piece of XBRL data is referred to as a:

Fact

In an iXBRL document, anything with orange lines above and below it represents a piece of XBRL data. These pieces of data are referred to as _____.

Facts

T/F: When interpreting the output of a regression analysis, if a coefficient estimate has a p-value that is less than .10, then we can reject the null hypothesis that the x variable is correlated with the y variable.

False

T/F: XBRL data includes all amounts reported on a company's financial statements filed with the SEC, but does not extend to data outside of the financial statements (e.g., textual data contained on the face of the report and data from the notes to the financial statements).

False

True or False: Benford's law applies to all datasets, as long as there are a sufficient number of observations in the data.

False
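For datasets where Benford's law does apply, the expected share of each leading digit d is log10(1 + 1/d). The sketch below only computes that reference distribution; deciding whether a given dataset should follow it is a separate judgment.

```python
import math

# Expected first-digit frequencies under Benford's law.
expected = {digit: math.log10(1 + 1 / digit) for digit in range(1, 10)}

for digit, share in expected.items():
    print(digit, f"{share:.1%}")
```

The shares fall steadily from about 30.1% for a leading 1 to about 4.6% for a leading 9.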

True or False: Pre-attentive attributes require conscious thought in order to be detected by your brain.

False

True or False: Each table in a relational database can have multiple foreign keys and the foreign keys cannot contain duplicate values.

False. Each table can have multiple foreign keys, and the foreign keys can contain duplicate values.
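This can be illustrated with an in-memory SQLite database. The table and column names (`customers`, `orders`, `customer_id`) are hypothetical; the point is that a foreign key value may repeat while a primary key value may not.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")
conn.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT)")
conn.execute(
    """CREATE TABLE orders (
           order_id INTEGER PRIMARY KEY,
           customer_id INTEGER REFERENCES customers(customer_id),
           amount REAL)"""
)
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")

# The same foreign key value (customer 1) appears on two different orders.
conn.executemany("INSERT INTO orders VALUES (?, ?, ?)",
                 [(101, 1, 25.0), (102, 1, 40.0)])
orders_for_ada = conn.execute(
    "SELECT COUNT(*) FROM orders WHERE customer_id = 1").fetchone()[0]

# Reusing a primary key value, by contrast, is rejected.
try:
    conn.execute("INSERT INTO customers VALUES (1, 'Bob')")
    duplicate_pk_allowed = True
except sqlite3.IntegrityError:
    duplicate_pk_allowed = False

print(orders_for_ada, duplicate_pk_allowed)
```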

True or False: Both the FORMULA tool and the MULTI-ROW FORMULA tool a) can create new columns or modify existing columns, b) use an expression editor to input functions, and c) can apply multiple expressions per tool.

False. The MULTI-ROW FORMULA tool allows only one expression per tool.

What Alteryx tool can accomplish similar tasks as a VLOOKUP formula in Excel?

Find Replace

If you would like to compare one quantitative value across two categorical values the visualization that would best allow you to do this is a

Grouped bar chart

Which Alteryx workflow would produce a "Full Join"?

Two inputs feed a JOIN tool, and the JOIN tool's L, J, and R outputs all connect to a UNION tool.
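The idea can be sketched in plain Python: compute the three pieces a join produces (left-only, matched, right-only) and stack them. The sample data and variable names are hypothetical, and this mimics the logic only, not the Alteryx tools themselves.

```python
# Hypothetical inputs keyed on a shared ID.
left = {1: "Ann", 2: "Bob", 3: "Cy"}          # id -> employee name
right = {2: "Sales", 3: "Audit", 4: "Tax"}    # id -> department

# L output: left records with no match on the right.
l_out = [(key, left[key], None) for key in left if key not in right]
# J output: records that matched on both sides (the inner join).
j_out = [(key, left[key], right[key]) for key in left if key in right]
# R output: right records with no match on the left.
r_out = [(key, None, right[key]) for key in right if key not in left]

# Stacking all three pieces is what yields the full join.
full_join = sorted(l_out + j_out + r_out)
print(full_join)
```

Stacking only L and J would give a left outer join, and only J and R a right outer join.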

Research on visualizations has shown that the best way to compare magnitude across multiple groups is through

Length (e.g. different length bars)

Which type of visualization is NOT a distribution visualization?

Line chart

Consider the following statements: (1) XBRL is used to deliver human-readable financial statements in a machine-readable, structured format. (2) The US GAAP Financial Reporting Taxonomy includes standard and custom XBRL tags used to represent financial statements filed with the SEC.

Only 1 is true

The Alteryx TEXT TO COLUMNS tool belongs to this category of tools in the tool palette:

Parse

In order to parse data with RegEx, we must identify _____________ in the data.

Patterns

Which of the following delimiters is recommended by the AICPA in its Audit Data Standards as a preferred delimiter for files provided to auditors?

Pipe

In a delimited text file (e.g., .csv file), we can tell data software to ignore delimiters that are contained within certain characters. These characters are called text __________.

Qualifiers
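Python's `csv` module shows the same idea: the `quotechar` argument plays the role of the text qualifier, so a delimiter inside the quotes is treated as data. The sample record is made up for illustration, and the pipe delimiter is used only as an example.

```python
import csv
import io

# The second field contains a pipe, so it is wrapped in a text qualifier (").
raw = 'id|name|city\n1|"Smith|Jones LLP"|Clemson\n'

rows = list(csv.reader(io.StringIO(raw), delimiter="|", quotechar='"'))
print(rows)
```

Without the qualifier, the second row would split into four fields instead of three.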

What visualization would be the best choice if you would like to show the relationship between two quantitative variables?

Scatterplot

In the "String Functions" lesson, we learned that we can use ___________ functions to remove unwanted characters, including whitespace, from string data.

Trim

T/F: Assume you want to estimate the following model: collGPA = a + B1*hsGPA + B2*ACT + e where collGPA = college GPA, hsGPA = high-school GPA, and ACT = ACT score. Assume the estimated model is: collGPA = 1.29 + 0.453*hsGPA + 0.0094*ACT Our model has an R2 of 0.13. True or false: In this setting, an R2 of 0.13 indicates that the sample students' high school GPAs and ACT scores explain 13% of the variance in college GPAs.

True

T/F: Once digital footprints are gathered, process mining technology uses these digital footprints to visualize and reconstruct the actual process flow, so that the user gets a transparent and objective view of how a process is 'actually' run (relative to how we 'think' it is run).

True

T/F: Assume you want to estimate the following model: collGPA = a + B1*hsGPA + B2*ACT + e where collGPA = college GPA, hsGPA = high-school GPA, and ACT = ACT score. Assume the estimated model is: collGPA = 1.29 + 0.453*hsGPA + 0.0094*ACT True or false: Assuming two students have the same ACT score, if student 1 has a high school GPA of 3.0 and student 2 has a high school GPA of 4.0, then this model would predict that student 1's college GPA is 0.453 points lower than student 2's college GPA.

True

T/F: Any visualization can be exploratory or explanatory. The distinction relates to how/why they are being used and how they are presented (if presented at all).

True

T/F: Exploratory visualizations are used as part of the data analysis phase (not communication phase), and are used to develop a question/problem that has not been clearly defined and help assess a question without a clear answer.

True

T/F: Removing redundant labels from visualizations helps increase the "data-to-ink ratio."

True

T/F: The FASB is responsible for keeping the GAAP Taxonomy current and aligned with FASB Codification.

True

True or False: The OUTPUT tool allows us to output data to one single file or can group data and output this grouped data into separate files.

True

True or False: When transposing data using the TRANSPOSE tool, the tool will output at least two columns of data with standardized names (called NAME and VALUE).

True
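A rough Python analogue of that wide-to-long reshaping, assuming one key field is held fixed and every other column is pivoted into Name/Value pairs. The sample rows and the `Store` key are hypothetical.

```python
# Hypothetical wide table: one row per store, one column per month.
rows = [
    {"Store": "A", "Jan": 10, "Feb": 12},
    {"Store": "B", "Jan": 7, "Feb": 9},
]
key_field = "Store"

# Each non-key column becomes its own row with a Name and a Value.
transposed = [
    {key_field: row[key_field], "Name": column, "Value": value}
    for row in rows
    for column, value in row.items()
    if column != key_field
]
for record in transposed:
    print(record)
```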

True or False: A common audit area in which analytics can streamline the audit is performing tests on massive journal-entry populations to identify risks and items of audit interest.

True

True or False: The DATA CLEANSING tool can be used to: a) Remove null rows/columns, b) Remove unwanted characters, c) Capitalize every letter in a string, and d) Capitalize the first letter of each word in a string.

True

True or False: Two or more spaces in a row can be used as a delimiter in a data file.

True
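As an illustration, a line whose fields themselves contain single spaces can still be split cleanly when runs of two or more spaces act as the delimiter. The sample line is invented for the example.

```python
import re

line = "New York  Clemson University   South Carolina"

# Split only on runs of two or more spaces; single spaces stay inside fields.
fields = re.split(r" {2,}", line)
print(fields)
```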

T/F: Relational databases are the optimal way to store data, while flat files are the optimal way for humans to visually consume data and are also optimal for analyzing data.

True

Assume the Altman Z-score bankruptcy model predicted that company A would go bankrupt next year, but company A did not go bankrupt. This is an example of a

Type 1 error

You are working with data where the [Region] field is messy and contains both abbreviations (i.e., N, S, E, W) and the full word (i.e., North, South, East, and West). Which Alteryx conditional statement would clean this data so that the [Region] field only includes the full word (i.e., North, South, East, and West)?
Option C: IF [Region] = "N" THEN "North" ELSEIF [Region] = "S" THEN "South" ELSEIF [Region] = "E" THEN "East" ELSEIF [Region] = "W" THEN "West" ELSE [Region] ENDIF
Option D: IF [Region] = "N" THEN "North" ELSE [Region] = "S" THEN "South" ELSE [Region] = "E" THEN "East" ELSE [Region] = "W" THEN "West" ELSEIF [Region] ENDIF

Option C
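The same clean-up logic, sketched in Python for comparison: map each abbreviation to its full word and fall through to the original value otherwise, which mirrors the final ELSE [Region] branch. The names `FULL_NAMES` and `clean_region` are illustrative.

```python
FULL_NAMES = {"N": "North", "S": "South", "E": "East", "W": "West"}


def clean_region(region: str) -> str:
    """Expand an abbreviation; leave any other value unchanged."""
    return FULL_NAMES.get(region, region)


messy = ["N", "South", "E", "West", "S", "North"]
cleaned = [clean_region(value) for value in messy]
print(cleaned)
```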

According to Tufte's visualization principles, we should seek to increase the _____-to-_____ ratio in the visualizations that we create.

Data-to-ink

