A355 Final Exam


Which of the following best addresses outliers? Please select the best answer. a. univariate regression b. multivariate regression c. logistic regression d. sample selection e. adjusted r^2

Answer: D

what do accountants provide that machines do not? Please select the best answer a. alternative methods of analysis b. accurate computation c. fast computation d. interpretation and expertise related to understanding and communicating the computation

Answer: D

Which of the following is a balanced panel?
Example A (Company / Year): A / 1, A / 2, B / 1, B / 2, C / 1
Example B (Company / Year): A / 1, A / 2, A / 3, B / 1, B / 2
Example C (Company / Year): A / 1, B / 1, C / 1, A / 2, B / 2, D / 2
Example D (Company / Year): A / 1, A / 2, A / 3, B / 1, B / 2, B / 3

Answer: Example D
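
The check behind this answer can be sketched in a few lines of Python. This is only an illustration: the function name `is_balanced` and the list-of-tuples layout are assumptions, not something from the course materials.

```python
def is_balanced(panel):
    """Return True if every company appears in every year found in the panel."""
    companies = {company for company, _ in panel}
    years = {year for _, year in panel}
    observed = set(panel)
    # Balanced means no (company, year) combination is missing.
    return all((c, y) in observed for c in companies for y in years)


# The four examples from the question, as (company, year) pairs.
example_a = [("A", 1), ("A", 2), ("B", 1), ("B", 2), ("C", 1)]
example_b = [("A", 1), ("A", 2), ("A", 3), ("B", 1), ("B", 2)]
example_c = [("A", 1), ("B", 1), ("C", 1), ("A", 2), ("B", 2), ("D", 2)]
example_d = [("A", 1), ("A", 2), ("A", 3), ("B", 1), ("B", 2), ("B", 3)]

balanced = {
    "A": is_balanced(example_a),
    "B": is_balanced(example_b),
    "C": is_balanced(example_c),
    "D": is_balanced(example_d),
}
```

Only Example D has every company observed in every year, so it is the only entry that comes back `True`.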

as accounting analytics, which item of the framework do we view differently than data scientists? Please select the best answer a. ask the question b. master the data c. perform the analysis d. share the story

Answer: a

ledger

a list of events or transactions, establishes consensus about facts

excel

best at data analysis and exploring data

robotic process automation

can automate simple tasks, process information very quickly (EX: using a function in excel and dragging it into another cell)

DeFi

decentralized finance; covers a broad array of topics: -smart contracts -stablecoins -decentralized exchanges (DEX) -NFTs

NFTs

essentially a way to digitally store and secure property rights, and can be used in any fashion; non-fungible: each token is unique and cannot be mistaken for another; OpenSea is the biggest NFT marketplace (currently most used for art)

volume

how much data

outer join

maximum information, keep all information

p-value

probability of observing a t-statistic at least as extreme as the one shown if the null hypothesis is true (small p-values indicate that we can reject the null hypothesis of no relation)

triple entry accounting

public receipt allows for perfect verifiability of every transaction, instantly in real time; no more need to sample transactions because you have all of them instantly; becomes more difficult to commit fraud because all books are linked and the transaction is public

winsorize the outlier

pulls the outlier back into the distribution, retains the outlier observation but changes its value; sample size remains the same, but value of certain observations changes to be closer to the 'center' of the distribution
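
A minimal sketch of the difference between trimming and winsorizing. The cutoffs are passed in by hand here as an assumption; in practice they are often set at chosen percentiles of the data.

```python
def trim(values, low, high):
    """Drop observations outside [low, high]; the sample size shrinks."""
    return [v for v in values if low <= v <= high]


def winsorize(values, low, high):
    """Keep every observation, but pull values outside [low, high] back to the cutoff."""
    return [min(max(v, low), high) for v in values]


data = [2, 3, 4, 5, 100]               # 100 is the outlier
trimmed = trim(data, 0, 10)             # outlier removed
winsorized = winsorize(data, 0, 10)     # outlier replaced by the upper cutoff
```

Trimming leaves four observations, while winsorizing keeps all five and replaces 100 with 10.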

variety

unstructured, semi-structured, structured

descriptive analytics

what happened? what is happening?

we consider accounting and non-accounting data sources available for data analysis including:

-financial statements -macroeconomic statistics -supply chain data -financial analyst reports

AMPS model

1. ask the question 2. master the data 3. perform the analysis 4. share the story

Bloom's Taxonomy

1. create 2. evaluate 3. analyze 4. apply 5. understand 6. remember (machines do 4-6, accountants do 1-3)

two broad types of diagnostic analytics

1. identifying anomalies/outliers 2. finding previously unknown linkages, patterns, or relationships between and among variables

two common ways of dealing with outliers

1. trim the outlier 2. winsorize the outlier

unbalanced panel

A panel data set in which some data are missing.

Total Value Locked, or TVL, is an example of a new accounting metric that is made possible through DLT technology. In the scenario below, what is the TVL for A355 Protocol? Abby deposits $1,000 into A355 protocol to validate transactions (staking) Lydia lends $1,000 in digital assets into A355 protocol for lending. A355 protocol itself lends out $800 of Lydia's original investment. a. $2,000 b. $2,800 c. $1,200 d. $3,600

Answer: A
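
The arithmetic behind this answer, as a small sketch. It follows the simplified reading the answer key implies: TVL counts what users deposited, and the $800 the protocol lends out is part of Lydia's deposit being redeployed, not a new deposit. The variable names are illustrative only.

```python
# Deposits made into the protocol by users (in dollars).
deposits = {
    "Abby (staking)": 1_000,
    "Lydia (lending)": 1_000,
}

# The protocol re-lends part of Lydia's deposit; this is not a new deposit,
# so under this reading it does not change TVL.
lent_out_by_protocol = 800

tvl = sum(deposits.values())
```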

You are worried that you have a correlated omitted variables problem. Which would best help you address this problem? a. ensuring your model has a high r^2 b. Ensuring your Beta coefficient is statistically different from zero at a tail end of the probability distribution (i.e. 1% or lower) c. Including the correlated omitted variable you are concerned about as an additional explanatory variable in your regression. d. ensuring the intercept of your model goes through the origin

Answer: C

You estimate the following trend line analysis (linear regression) in Tableau, and get the following output. What is B1 in this regression? Awesomeness = 0.620356*You + 0.0370451 a. awesomeness b. 0.0370451 c. 0.620356 d. 0.0001 e. You.

Answer: C

what type of task would Robotic Process Automation, a currently commonly used 4th industrial revolution technology, be well suited for? Please select the best answer. a. providing the firm guidance b. creating balance sheets given unstructured information c. classifying lease agreements based on shared keywords d. identifying the most profitable future business opportunities, given accounting data

Answer: C

which of the following does Tableau not do? Please select the best answer a. visualize data b. compute measures c. create data d. organize data e. tableau does all of the other listed answers

Answer: C

What aspect of distributed ledger technology allowed for the development of airdrops, a brand new capital allocation mechanism? Please select the best answer. a. Because consensus for the ledger is generated via mining, the winning accountant can share their reward for entering transactions into the ledger. b. The decentralized portion of the ledger allows for investors to search out early stage ventures. c. Triple entry accounting allows for startups to interact with established corporations d. Because the ledgers are publicly available, start-up companies can send assets to entities based on their transaction history with other companies.

Answer: D

You wish to investigate how well the independent variables in your regression explain the variance in Y. Which statistic should you use? a. beta coefficients b. alpha coefficients c. p values d. R^2

Answer: D

You have two datasets that you wish to combine. If you did a left-join, which firms WOULD be in the final dataset? Please select ALL firms that would be in the final dataset.
Dataset #1 (Firm / Accruals): A 0.50, B 2.70, C -0.55
Dataset #2 (Firm / Cash Flow): A -1.0, C 5.0, D 7.0
a. Firm A b. Firm B c. Firm C d. Firm D

Answer: Firm A, Firm B, Firm C
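
The same two datasets can be joined by hand in plain Python to see which firms survive each kind of join. This is a sketch with dictionaries keyed by firm; the helper name `left_join` is an assumption, not a library function.

```python
accruals = {"A": 0.50, "B": 2.70, "C": -0.55}   # Dataset #1
cash_flow = {"A": -1.0, "C": 5.0, "D": 7.0}     # Dataset #2


def left_join(left, right):
    """Keep every firm in `left`; attach the value from `right`, or None if there is no match."""
    return {firm: (value, right.get(firm)) for firm, value in left.items()}


left_result = left_join(accruals, cash_flow)             # firms from Dataset #1
right_result = left_join(cash_flow, accruals)            # right join = left join with roles swapped
inner_firms = sorted(accruals.keys() & cash_flow.keys())  # firms present in both datasets
outer_firms = sorted(accruals.keys() | cash_flow.keys())  # firms present in either dataset
```

The left join keeps Firms A, B, and C; Firm B has no cash flow match, so its cash flow value is `None`, and Firm D is dropped because it appears only in Dataset #2.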

blockchain trilemma

a blockchain has tradeoffs, and it is very difficult to simultaneously achieve 1. decentralization 2. security 3. scalability (Ex: Ethereum aims for decentralization and security, but as a consequence it is relatively slow and expensive)

random forest

a grouping of decision trees, injecting randomness at each stage of the process

management by exception

a managerial style that allows management to spend its time addressing issues/problems

regression

a predictive analytics technique that allows the accountant to estimate a specific dependent variable outcome value based on independent inputs (shows the on-average associations between variables of interest; standard regressions assume linear relationships; Altman's Z score is derived from the coefficients estimated using a linear regression)

classification

a predictive analytics technique used to separate or classify a sample (or population) into two or more groups or classes--fraud/no fraud, extend loan/do not extend loan, etc. (predicting a probabilistic outcome, or what might happen based on our forecasts)

smart contract

a self-executing agreement between two or more parties where the terms are written directly into code--they have dominion over the assets in the contract (Ethereum)

time series analysis

a tool/technique used to predict future values based on past values of the same variable

machine learning

a type of artificial intelligence; the ability of a computer to learn automatically on its own without being explicitly programmed to do so

smart contracts - wallet

a wallet is a way to access your funds on the blockchain; secured with a private key; MetaMask is the most common wallet

stablecoins

attempt to stabilize value at $1 (or some other peg); imagine a crypto asset that is always (hopefully) worth a dollar; can be used to send money very quickly (biggest ones currently are USDT and USDC); not to be confused with CBDCs (Central Bank Digital Currencies), which are issued by central banks rather than private issuers

decentralized exchanges (DEX)

automated, permissionless, marketplaces that allow instant real-time trading (or swapping) of digital assets

4th Industrial Revolution

based on the use of cyber physical systems

tableau

best for data visualization; does not allow for original data entry

double entry accounting

both parties record the transaction, but there is no way to know if either recorded it right

base rates

defined as probability of an event occurring based on a related historical average

blockchain

distributed, permissionless ledger (anyone can see the ledger, ledger is updated in real time), essentially a distributed database

in class example of hash

hash = nonce + a + b + c - (value of last 2 digits of previous hash)
a = value of the first letter of the transaction
b = value of the first letter of the "from" party
c = value of the first letter of the "to" party
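
A toy version of this in-class hash, written out as a sketch. Two things are assumptions here and not stated on the card: letters are valued A=1 through Z=26, and the final term is subtracted, as the formula reads. The function names are illustrative.

```python
def letter_value(text):
    """Assumed letter scoring: A=1, B=2, ..., Z=26, based on the first letter of `text`."""
    return ord(text[0].upper()) - ord("A") + 1


def toy_hash(nonce, transaction, from_party, to_party, previous_hash):
    """hash = nonce + a + b + c - (last two digits of the previous hash)."""
    a = letter_value(transaction)
    b = letter_value(from_party)
    c = letter_value(to_party)
    last_two_digits = previous_hash % 100
    return nonce + a + b + c - last_two_digits


# Hypothetical transaction: "Payment" from Abby to Lydia, previous hash 1234, nonce 50.
example_hash = toy_hash(50, "Payment", "Abby", "Lydia", 1234)
```

With these assumed values, P=16, A=1, L=12, and the previous hash contributes 34, so the result is 50 + 16 + 1 + 12 - 34 = 45. Changing the nonce changes the hash, which is why the nonce helps in searching for a hash that meets a target.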

veracity

how trustworthy

Altman's Z score decision rules

if z < 1.80: classify as significant risk of bankruptcy, or in the "distress zone"
if z >= 1.80 and z < 3: classify as at risk of bankruptcy, or "gray zone"
if z >= 3: classify as not currently at risk of bankruptcy, or "safe zone"
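
These decision rules translate directly into an if/elif chain. A minimal Python sketch (the function name `classify_z` is an assumption):

```python
def classify_z(z):
    """Map an Altman's Z score to the zone named in the decision rules."""
    if z < 1.8:
        return "Distress Zone"
    elif z < 3:
        # Reaching this branch already implies z >= 1.8.
        return "Gray Zone"
    else:
        return "Safe Zone"


zones = [classify_z(z) for z in (1.2, 1.8, 2.99, 3.0)]
```

Note the boundaries: exactly 1.8 falls in the gray zone, and exactly 3 falls in the safe zone.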

inner join

keep only the rows that match in both datasets

hash functions

mathematical functions that transform a given set of data into a string of fixed size; it is deterministic: same input will always produce the same output; one way encoding - can't get original message from encoded
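
These properties can be seen with a real hash function. The sketch below uses SHA-256 from Python's standard `hashlib` module; the card does not name a specific hash function, so that choice and the sample messages are assumptions for illustration.

```python
import hashlib


def sha256_hex(message: str) -> str:
    """Return the SHA-256 digest of `message` as a hexadecimal string."""
    return hashlib.sha256(message.encode("utf-8")).hexdigest()


first = sha256_hex("Abby pays Lydia $10")
second = sha256_hex("Abby pays Lydia $10")    # same input again
changed = sha256_hex("Abby pays Lydia $11")   # one character different
short = sha256_hex("a")                       # much shorter input
```

The same input gives the same digest, a one-character change gives a different digest, and every digest is 64 hexadecimal characters long regardless of the input's length.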

proof of stake

miners put up collateral (stake). they stake their collateral on not submitting false entries. if found to commit false entries, they lose their collateral entirely (miners putting up stake are rewarded with more tokens, don't have to be winning miner to get tokens)

databases

most secure method of storing data

nonce

number only used once, helps calculate the hash function

base rate fallacy

occurs when the prediction places too little weight on the base rates of the past and instead uses different or new information

to be useful to decision makers, accounting needs to be both _______ and ____________ the substance of what actually occurred

relevant; faithfully represent

r^2

represents the proportion of variance of the y variable that is explained by the x variables
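
As a rough sketch, R^2 can be computed from actual values and a model's predictions as one minus the ratio of unexplained variation to total variation. The function name and sample numbers are illustrative assumptions.

```python
def r_squared(y, y_hat):
    """R^2 = 1 - (sum of squared residuals) / (total sum of squares)."""
    mean_y = sum(y) / len(y)
    ss_total = sum((actual - mean_y) ** 2 for actual in y)
    ss_residual = sum((actual - predicted) ** 2 for actual, predicted in zip(y, y_hat))
    return 1 - ss_residual / ss_total


y = [1.0, 2.0, 3.0]
perfect_fit = r_squared(y, [1.0, 2.0, 3.0])   # predictions match exactly
mean_only = r_squared(y, [2.0, 2.0, 2.0])     # always predicting the mean of y
```

A model that predicts every value exactly explains all the variance (R^2 of 1), while a model that only ever predicts the mean explains none of it (R^2 of 0).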

velocity

speed of generation or rate of analysis

avoid overfitting by

splitting into training and test samples (can be sequential or random)
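
Both kinds of split can be sketched with the standard library. The 80/20 fraction, the seed, and the function names are assumptions chosen for the example.

```python
import random


def sequential_split(rows, train_fraction=0.8):
    """Use the earliest rows for training and the remaining rows for testing."""
    cut = int(len(rows) * train_fraction)
    return rows[:cut], rows[cut:]


def random_split(rows, train_fraction=0.8, seed=0):
    """Shuffle a copy of the rows, then split; the seed makes the split repeatable."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    return sequential_split(shuffled, train_fraction)


observations = list(range(10))
seq_train, seq_test = sequential_split(observations)
rnd_train, rnd_test = random_split(observations)
```

The model is fit on the training sample only and then evaluated on the test sample it has never seen. A sequential split is the natural choice when the rows are ordered in time.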

left join

start with the first dataset and keep all of its rows; add the columns from the second dataset only for rows that match a row in the first dataset

right join

start with the second dataset and keep all of its rows; add the columns from the first dataset only for rows that match a row in the second dataset

total value locked (TVL)

sum of all assets deposited in crypto protocols earning rewards, interest, new coins and tokens, and fixed income

ordinary least squares (standard regression)

the regression line is fit so that the sum of the squared errors from the line is minimized
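
For a single independent variable, the line that minimizes the sum of squared errors has a well-known closed form. The sketch below uses it in plain Python; the function names and data points are made up for illustration.

```python
def ols_fit(x, y):
    """Return (intercept, slope) of the least-squares line for one x variable."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    s_xy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    s_xx = sum((a - mean_x) ** 2 for a in x)
    slope = s_xy / s_xx
    intercept = mean_y - slope * mean_x
    return intercept, slope


def sum_squared_errors(x, y, intercept, slope):
    """Sum of squared vertical distances between the points and a given line."""
    return sum((b - (intercept + slope * a)) ** 2 for a, b in zip(x, y))


x = [0.0, 1.0, 2.0, 3.0]
y = [1.2, 2.9, 5.1, 6.8]
intercept, slope = ols_fit(x, y)
best_sse = sum_squared_errors(x, y, intercept, slope)
other_sse = sum_squared_errors(x, y, intercept + 0.5, slope)  # a shifted line
```

Any other line, such as the shifted one above, has a larger sum of squared errors than the fitted line.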

balanced panel

the variables are observed for each entity and each time period

box charts can be used to...

visualize the distribution of data points, identify outliers

value transfer in a distributed ledger network

we call the computers in this network "miners" but really they are accountants (they all vote on the correct hash--decentralized because each are solving and voting separately); they are verifying authentication and authorization, just like banks do

supervised learning

we tell the algorithm what the input/output data is

hypothesis testing

we test the hypothesis and then see if our hypothesized relation is significantly different from zero

prescriptive analytics

what should we do, based on what we expect will happen? how do we optimize our performance based on potential constraints?

diagnostic analytics

why did it happen? what are the root causes of past results?

predictive analytics

will it happen in the future? what is the probability something will happen? is it forecastable?

Altman's Z 5 factors that predict bankruptcy

x1: working capital / total assets
x2: retained earnings / total assets
x3: earnings before interest and taxes / total assets
x4: market value of stockholders' equity / book value of total debt owed
x5: sales / total assets
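
A sketch of how the five ratios combine into a single score. The 1.2 weight on x1 appears elsewhere in this set; the other weights (1.4, 3.3, 0.6, 1.0) are the commonly cited coefficients of Altman's original model and are an addition here, not something stated on these cards. The input ratios are hypothetical.

```python
# Weights: 1.2 for x1 is mentioned in this set; the others are the commonly
# cited original Altman coefficients (an assumption relative to these cards).
WEIGHTS = (1.2, 1.4, 3.3, 0.6, 1.0)


def altman_z(x1, x2, x3, x4, x5):
    """Weighted sum of the five ratios, each entered as a decimal (e.g. 0.2 for 20%)."""
    ratios = (x1, x2, x3, x4, x5)
    return sum(weight * ratio for weight, ratio in zip(WEIGHTS, ratios))


# Hypothetical firm: the ratios below are made-up inputs for illustration.
z_score = altman_z(x1=0.2, x2=0.3, x3=0.1, x4=1.5, x5=1.0)
```

For this hypothetical firm the terms are 0.24 + 0.42 + 0.33 + 0.90 + 1.00, giving a score of about 2.89. Every weight is positive, so a higher ratio always raises the score and moves the firm away from the distress zone.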

type 2 error

your pregnancy test says you're not pregnant but you are (false negative)

type 1 error

your pregnancy test says you're pregnant but you're not (false positive)
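
Counting the two error types from a list of actual outcomes and test results can be sketched as follows. The function name and the sample data are made up for illustration.

```python
def error_counts(actual, predicted):
    """Return (type_1, type_2): false positives and false negatives."""
    type_1 = sum(1 for a, p in zip(actual, predicted) if p and not a)  # test says yes, truth is no
    type_2 = sum(1 for a, p in zip(actual, predicted) if a and not p)  # test says no, truth is yes
    return type_1, type_2


# Hypothetical results for four people: True = pregnant / test positive.
actual = [True, True, False, False]
predicted = [True, False, True, False]
type_1, type_2 = error_counts(actual, predicted)
```

Here the third person is a false positive (type 1) and the second person is a false negative (type 2).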

What formula in Tableau allows us to classify firms based on their Altman's Z scores? Assume that we have previously coded the Altman's Z formula in Tableau (as in our lab), and that we named this encoded variable "Altman's Z score". Please read carefully. (Hint, compare the formulas and you should be able to identify the errors in the incorrect ones).
a. IF [Altman's Z Score]< 1.8 THEN "Distress Zone" ELSEIF [Altman's Z Score] >=1.8 AND [Altman's Z Score] < 3 THEN "Grey Zone" ELSE "Safe Zone" END
b. IF [Altman's Z Score]< 1.8 THEN "Distress Zone" IF [Altman's Z Score] >=1.8 AND [Altman's Z Score] < 3 THEN "Grey Zone" ELSE "Safe Zone" END
c. [Altman's Z Score]< 1.8 THEN "Distress Zone" [Altman's Z Score] >=1.8 AND [Altman's Z Score] < 3 THEN "Grey Zone" ELSE "Safe Zone" END
d. IF [Altman's Z Score]< 1.8 THEN "Distress Zone" ELSEIF [Altman's Z Score] >=1.8 AND [Altman's Z Score] < 3 THEN "Grey Zone" ELSE "Safe Zone"

Answer: A

What is one problem of double entry accounting that distributed ledgers can solve? a. Distributed ledgers allow for triple entry accounting, in which each party must record the same value for the transaction, where in double entry accounting each party could record a different entry. b. Distributed Ledgers use less energy to maintain compared to double entry ledgers. c. Distributed ledgers allow for each party to keep their transactions completely private and secret from competitors, whereas for double entry accounting these transactions are open for anyone to see. d. Double entry accounting requires a lot of complex understanding of accounting concepts, whereas triple entry accounting requires less understanding of accounting concepts.

Answer: A

Assume we have a regression of the form: Consulting Fees = α + β1(Kelley Alum) + β2(High School GPA) + ε assume that: Consulting Fees is measured in Millions Kelley Alum is an indicator variable equal to one if the consultant is a Kelley Alum, and zero if they are not a Kelley Alum. High School GPA is a continuous variable which is equivalent to the consultant's high school GPA. Assume that β1 is equal to 3, α is equal to 1, and β2 is equal to 2. Assume all variables are statistically significant at the <1% level. Which of the following are correct interpretations of this result? MORE THAN ONE CAN BE CORRECT, SO PLEASE SELECT ALL CORRECT INTERPRETATIONS. a. On average Kelley alums are associated with higher consulting fees, even after controlling for the association between high school GPA. b. On average Kelley alums are associated with higher consulting fees, but this effect is conditional upon having a high school GPA greater than or equal to 2. c. On average a higher high school GPA is associated with lower consulting fees. d. As high school GPA increases by one unit, consulting fees are on average 2 million dollars higher.

Answer: A and D
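
Plugging the stated coefficients into the regression equation shows why A and D are correct. A small sketch (the function name is an assumption; the error term is ignored because these are on-average predictions):

```python
def predicted_fees(kelley_alum, hs_gpa, alpha=1.0, beta1=3.0, beta2=2.0):
    """Predicted consulting fees in millions: alpha + beta1*KelleyAlum + beta2*GPA."""
    return alpha + beta1 * kelley_alum + beta2 * hs_gpa


# Same GPA, alum vs. non-alum: the gap is beta1.
alum_gap = predicted_fees(1, 3.0) - predicted_fees(0, 3.0)

# Same alum status, GPA one unit higher: the gap is beta2.
gpa_gap = predicted_fees(0, 4.0) - predicted_fees(0, 3.0)
```

Holding GPA fixed, Kelley alums are predicted to earn $3 million more (answer A), and the gap does not depend on the GPA level, which rules out answer B. Each additional GPA point adds $2 million (answer D).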

Assume we have a regression of the form: Consulting Fees = α + β1(Kelley Alum) + ε and that Consulting Fees is measured in Millions, while Kelley Alum is an indicator variable equal to one if the consultant is a Kelley Alum, and zero if they are not a Kelley Alum. Assume that β1 is equal to 3, while α is equal to 1. What is the correct interpretation of this result? a. On average Kelley Alums are associated with an increase of $4 Million in consulting fees. When the individual is not a Kelley Alum, consulting fees are expected to be $1 million. b. On average Kelley Alums are associated with an increase of $3 Million in consulting fees. When the individual is not a Kelley Alum, consulting fees are expected to be $1 million. c. On average Kelley Alums are associated with an increase of $1 Million in consulting fees. When the individual is not a Kelley Alum, consulting fees are expected to be $3 million. d. On average Kelley Alums are associated with an increase of $4 Million in consulting fees. When the individual is not a Kelley Alum, consulting fees are expected to be $1 million.

Answer: B

Which way of dealing with an outlier retains the observation that contains an outlier, but changes the value of the variable in question? a. trim b. winsorize c. cluster d. trim

Answer: B

why are balanced panels helpful in accounting analytics? a. they allow us to more easily compute variable in Tableau b. they allow us to ensure that changes over time are not driven by changes in sample composition c. they allow us to best generalize to the newest real-world data d. they allow us to make sure our economic magnitudes are balanced e. they allow for more accurate tests of statistical significance

Answer: B

Based on firm characteristics, you predict that Company A is likely to commit fraud, while Company B is not likely to commit fraud. What type of analysis is this? a. Regression Analysis b. Deterministic Analysis c. Probabilistic Analysis d. Prescriptive Analysis e. none of the other answers are correct

Answer: C

If we wanted to treat each observation from a dataset as an individual unit in Tableau, so that, for example, we could do a trend line analysis... What would we classify the data type as? Hint - think about the steps we took to do trend line analysis in either in-class diagnostic exercise #1, or in-class diagnostic exercise #2. a. attribute b. measure c. dimension d. count

Answer: C

In accounting analytics we use different tools. When comparing Python versus Tableau, which is TRUE? a. python is easier to work with in creating visualizations b. tableau is cheaper than python c. Python allows for multiple independent variables to be used in a regression, while Tableau only allows for one independent variable. d. tableau allows for more advanced machine learning algorithms

Answer: C

Of the choices below, which type of analysis would you choose if you wish to easily convey a correlation or association between two variables? Choose the best answer a. pie chart b. histogram c. trend line d. bar chart

Answer: C

Which of the following best describes t-statistics? a. they tell you the economic significance of the variable in question b. if a t-statistic is below 1.65, you shouldn't care about what the analysis says c. they are measures of statistical significance d. they tell you whether the relationship examined is causal

Answer: C

Which of the following is NOT true? a. In Altman's Z classification, we can take the ratio of Working Capital/Total Assets, and multiply this variable by 1.2 to help in calculating the Z score. b. The Z scores were originally derived from regressions, and we use the estimated coefficients from that regression to now classify firms based on bankruptcy risk. c. Altman's Z score predicts that as the ratio of Retained Earnings/Total Assets increases, the company has a higher bankruptcy risk. d. An Altman's Z score of greater than 3 indicates the company is classified as not currently at risk of bankruptcy, or is in the "safe zone" e. all of the answers are true

Answer: C

