Accounting Data Analytics
Foreign Key
A primary key of one table that appears as an attribute in another table and acts to provide a logical relationship between the two tables (filed that relates to primary key in another table)
SQL Aggregate Functions
COUNT, SUM, AVG, MIN, MAX
What is a data dictionary?
Defines what data is/is acceptable
Financial Performance KPI's : EBITDA
Earnings before interest, taxes, depreciation, and amortization - This measures core business
Requesting data is an iterative practice involving
Extract, Transform, Load
4 areas of balance scorecard
Financial - how the company generates value Customer - how the company interacts with customers Internal process - how efficiently the company is operating Organizational capacity - how the company is training employees
SQL Inner Join
INNER JOIN requires each record in the two tables to have matching records. Creates a new query table by combining column values of the two tables based upon the join-predicate. No NULL values. Returns records that have matching values in both tables.
Benford's Law
Predicts that the first digit of random number sets will begin with a 1 more often than a 2, a 2 more often than a 3, and so on - Looks for anomalies in a large set of data
Prescriptive Analytics:
Procedures that model data to enable recommendations for what SHOULD be done in the future (giving information to tell you what to do) - (decision tree/ support and AI)
Predictive Analytics
Procedures used to generate a model that can be used to determine what is likely to happen in the future (regressions, model of employee data to see if someone will make their payments on time)
SQL Left Outer Join
Returns all records from the left table, and the matched records from the right table
SQL right outer join
Returns all records from the right table, and the matched records from the left table
SQL Full outer join
Returns all records when there is a match in either left or right table
SQL string functions
SQL keywords and clauses are not case-sensitive. But, string comparison are The following three SQL statements will generate three different results: SELECT * FROM Product WHERE Description = 'Denim Jeans' SELECT * FROM Product WHERE Description = 'denim jeans' SELECT * FROM Product WHERE Description = 'Denim jeans' You can use the Lower() or Upper() string function before performing the comparison SELECT * FROM Product WHERE Lower(Description) = 'denim jeans' SELECT * FROM Product WHERE Upper(Description) = 'DENIM JEANS'
SQL
Structured query language. Used by SQL-based databases, such as Microsoft's SQL Server. Websites integrated with a SQL database are subject to SQL injection attacks. Input validation with forms and stored procedures help prevent SQL injection attacks. Microsoft's SQL Server uses port 1433 by default.
Pivot Table
The name of the tool used by most spreadsheet programs to create a summary table
P Value
The probability of observing a test statistic as extreme as, or more extreme than, the statistic obtained from a sample, under the assumption that the null hypothesis is true.
Composite Key
Two foreign keys used for line items (a data modeling technique used in database design to create a unique identifier for a table by combining multiple columns together)
SQL SELECT
What fields do you want to display? It indicates which attributes you wish to view. It pulls what you want from the table.
SQL WHERE
What search criteria are you going to use to choose records? Search criteria is a boolean statement and is phrased 'where this is true'. When changing records, this is used to set the record which would be updated. Tells SQL to filter the data.
SQL Order By
Which fields will you use to sort your records into an order? More than one field can be put here, it 'prioritises' the first field. A sort feature (A-Z) or (Z to A)
SQL FROM
Which table(s) would you like to use to display data from? (from any table or tables)
Primary Key
a field that uniquely identifies a record in a table (Can not be repeated)
Pivot Chart
a graphical representation of the data in a pivot table
KPI (Key Performance Indicator)
a measure of achievement that can be attributed to an individual, team, or department (Specific type of performance metrics used to measure performance at a company)
R Square
a measure of goodness of fit (how well the regression line "fits" the data) - how tight is the fit around the line - high r square means the data would be good to use for a predictive model - if the data is scattered the r square would be lower
T Test
a statistical test used to evaluate the size and significance of the difference between two means
Balance Scorecard
a system of performance measures, including nonfinancial measures, used to assess company and division manager performance. Four perspectives: 1. Customer: what do customers think of us? 2. Internal processes: which of our ops are critical to meeting customer needs? 3. Innovation and learning: how can we improve? 4. Financial: what do our owners think of us?
What is XBRL?
eXtensible Business Reporting Language (XBRL) - Built on XML(eXtensive markup language) - computer readable format for financial statements - used for organizing and defining financial elements
Regression
helps predict expected outcomes - a measure of the relation between the mean value of one variable and corresponding values of other variables
Financial Performance KPI's : Net promoter score
how likely are you to recommend us to your family or friends
Diagnostics Analytics
procedures that explore current data to determine why something happened, looking more at relationships
Descriptive analytics
procedures that summarize and describe the data (mean, etc)
SQL Group By
states how to group/aggregate the stated field
SQL INT(LEFT)
taking the left value
Z-Scores
tells you what the outliers are in a dataset: how many standard deviations something is away from the mean: Z= 3
Regression Line
the line that best fits data
What does it mean for a database to be normalized?
- Remove redundancies - No repeating groups - Watch for transitive dependencies (fields dependent on a field other than primary key) - All tables must have a primary key - Each row is unique - Order columns or rows is not significant
Fuzzy Matching
- locates approximate matches - useful for identifying relationships in imperfect data