ACCT 3303 Chapter 4, UTA Sargent
Load (3 of 3, ETL)
Load is the third part of ETL and involves importing the transformed data into the target system. The target system is the final data store and is where the actual data analysis occurs. The initial load will be a full load, bringing all of the data into the target data store. The target system may be a data warehouse, a data mart, or a database.
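A minimal load sketch in Python, assuming a SQLite database stands in for the target data store; the table, columns, and rows are hypothetical:

```python
import sqlite3

# Hypothetical rows that have already been extracted and transformed.
transformed_rows = [
    ("1001", "Male", "(205) 555-1234"),
    ("1002", "Female", "(205) 555-9876"),
]

# The target system: a SQLite database standing in for a data
# warehouse, data mart, or ordinary database.
target = sqlite3.connect("target_store.db")
target.execute(
    "CREATE TABLE IF NOT EXISTS customers (id TEXT, gender TEXT, phone TEXT)"
)

# A full load: bring all of the transformed data into the target store.
target.executemany("INSERT INTO customers VALUES (?, ?, ?)", transformed_rows)
target.commit()
target.close()
```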
Managerial Accounting
Management accountants have focused their efforts in three areas: cost management and reporting, performance measurement and analysis, and supporting management's planning and decision-making efforts.
Audit Data Analytics
Audit data analytics is the art and science of identifying and analyzing patterns, trends, and anomalies through an examination of audit-related data. The examination entails modeling, visualization, and analysis, and it informs the planning and execution of assurance engagements.
Velocity (2 of 4)
Velocity is the speed at which data are created. Data originate everywhere: the global population of Internet-connected devices now exceeds the global population of people. Organizations find dealing with the speed at which data are created overwhelming; a common analogy is "drinking water through a fire hose."
Tax Accounting
The tax department is one of the largest consumers of data: tax accountants must pore over and consume volumes of financial and transactional data, and taxes play a critical role in maximizing shareholder value within an organization. Tax accountants must extract data from these systems and manually review, reconcile, and manipulate the data so that they are in a format easily used for their purposes.
Hadoop
Hadoop is an open-source infrastructure for storing and processing large sets of unstructured, semistructured, and structured data. Two features of Hadoop that lend themselves to big data environments are the manner in which it stores data and the manner in which it processes data.
Data Provisioning
Specific data relevant to the decision at hand must be extracted from their respective data stores, modified and cleansed, and loaded into a final target data store for analysis.
Extract (1 of 3, ETL)
Extract is the first part of ETL and involves identifying and copying relevant data from a source data store to make them accessible for further processing. The most critical step is identifying which data to extract from the source system. Extracting large amounts of data can also overload the network or the source data store.
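A minimal extract sketch, assuming a SQLite source database; the table, columns, and date filter are hypothetical:

```python
import sqlite3

# Source data store (hypothetical SQLite database of sales transactions).
source = sqlite3.connect("source_system.db")

# Identify and copy only the relevant data: here, current-year sales.
# Selecting a narrow slice avoids overloading the network or source store.
relevant_rows = source.execute(
    "SELECT id, customer, amount FROM sales WHERE sale_date >= '2024-01-01'"
).fetchall()
source.close()
```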
Social Sources (3 of 3)
Social sources include vast quantities of unstructured data from social media sites such as Facebook, Twitter, Pinterest, LinkedIn, YouTube, and Instagram. Social sources create voluminous quantities of data at a staggering rate. By monitoring social media, agile organizations are better positioned to modify their product and service offerings to respond to changing customer demands.
Volume (1 of 4)
Volume refers to the sheer quantity and scale of data. Data are created at an astonishing rate: 90% of today's data are less than two years old, and it would take 10 million Blu-ray discs to record the new data (approximately 2.5 quintillion bytes) created in a given day. Facebook has more users than China has people and had stored 250 billion photographs as of 2015. In essence, as people and processes become more connected, the amount of data grows exponentially. The volume of data creates problems for accountants and the tools they use to analyze it, because accountants must perform increasingly complex analyses.
Cloud Databases
The shift toward cloud computing has major implications for modern database systems. Database-as-a-Service (DaaS) allows firms to outsource their databases to cloud service providers, and the use of DaaS has grown rapidly over the past several years. Many major players in the database, software, and online industries (Amazon, Oracle, Microsoft, SAP) are making large investments in cloud databases.
Data Procurement
Although data are being generated at a staggering rate, not all data are usable or useful for decision making in an accounting context. Data relevant for analysis from a budgeting, control, and accounting perspective originate from three broad categories of sources: operational, mechanical, and social.
Transform (2 of 3, ETL)
Transform is the second part of ETL and involves cleaning and converting the data from source systems so they can be loaded into the target system. Because data may come from several data sources, field formats may differ from source to source. For example, gender may be coded M/F in one data source and Male/Female in another, and a phone number may be stored as (205) 555-1234 in one data source and 205.555.1234 in another.
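A minimal transform sketch in Python, normalizing the gender codes and phone formats from the example above; the field names and chosen target formats are assumptions:

```python
import re

# Map the differing gender codes onto one standard format.
GENDER_MAP = {"M": "Male", "F": "Female", "Male": "Male", "Female": "Female"}

def transform(record):
    """Clean one extracted record so it matches the target system's format."""
    gender = GENDER_MAP[record["gender"].strip()]
    # Keep only the digits, then rewrite the phone number one standard way.
    digits = re.sub(r"\D", "", record["phone"])
    phone = f"({digits[:3]}) {digits[3:6]}-{digits[6:]}"
    return {"id": record["id"], "gender": gender, "phone": phone}

# Both source formats converge on the same target format.
print(transform({"id": "1001", "gender": "M", "phone": "205.555.1234"}))
print(transform({"id": "1002", "gender": "Female", "phone": "(205) 555-1234"}))
```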
Variety (3 of 4)
Variety, the most challenging aspect, deals with the diversity of data that organizations create or collect. Traditionally, data resulted from processing transactions and adhered to a given structure that was easily stored in relational databases or flat files. Today, organizations also capture and collect unstructured data such as media files (sound, video, images), Twitter feeds, social media posts, scanned documents, web pages, blogs, and emails. This variety challenges traditional database management systems and is incompatible with earlier data analysis tools.
Veracity (4 of 4)
Veracity is the extent to which the data can be trusted for insights. For data to have predictive and/or feedback value, they need to be objective and representative. For example, when we ask for advice from people who share our worldview, we run the risk of hearing only what we want to hear; this is known as confirmation bias. When we use inherently biased data in an analysis, we run the risk of making decisions that omit relevant facts.
Hadoop Distributed File System (HDFS)
HDFS allows Hadoop to store files that are very large, for example several terabytes in size; a 2-terabyte Microsoft Word document would be roughly 86 million pages long, and a single file of that size might exceed the storage capacity of a hard drive. How does HDFS store files this large? By utilizing a name node and a cluster of data nodes. Hadoop also uses a program called MapReduce that is optimized for processing large data sets in a clustered, distributed environment.
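A minimal, single-machine sketch of the MapReduce idea (the classic word count), assuming Python; a real MapReduce job runs the map and reduce steps in parallel across the cluster's data nodes:

```python
from collections import defaultdict

documents = ["big data big results", "data nodes store data"]

# Map: emit a (key, value) pair for every word in every document.
pairs = [(word, 1) for doc in documents for word in doc.split()]

# Shuffle: group the emitted values by key.
groups = defaultdict(list)
for word, count in pairs:
    groups[word].append(count)

# Reduce: combine each group into a single result per key.
word_counts = {word: sum(counts) for word, counts in groups.items()}
print(word_counts)  # {'big': 2, 'data': 3, 'results': 1, ...}
```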
Data mart
Data marts are smaller than data warehouses and typically focus on one application area, for example marketing data. In most ways they are similar to data warehouses, but a data mart acts as an interface between a data warehouse and its users and is usually designed for a particular subset of users in the organization.
Descriptive Analytics (1 of 3)
Descriptive analytics are useful in telling us what happened and are historical in nature; they represent more than 80% of business analytics. Examples include descriptive statistics (min, mean, mode, max), which provide insight into how specific regions are doing and support comparisons that inform resource allocations or identify potential issues. To facilitate monitoring and internal control, continuous monitoring systems utilize descriptive analytics to evaluate transactions against established norms and ranges.
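A minimal sketch of the descriptive statistics named above, using Python's standard library; the regional sales figures are made up for illustration:

```python
import statistics

# Hypothetical monthly sales for one region.
sales = [41200, 38750, 45100, 38750, 52300, 47800]

print("min: ", min(sales))
print("mean:", statistics.mean(sales))
print("mode:", statistics.mode(sales))
print("max: ", max(sales))
```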
Data analysis
Decision makers are looking to accountants to answer three questions: What happened? What might happen? What should we do?
Assurance and Compliance
Assurance and compliance work focuses on obtaining reasonable assurance that management's assertions are complete, accurate, and truthful, and that the organization is complying with applicable laws, rules, and regulations.
Presentation
For an analysis to be effective, the results must be communicated in a meaningful format. The most common ways of communicating results are reports or some sort of visualization. Visualizations (dataviz) use graphical representations of data and the results of analyses, and can take the form of charts, graphs, geomaps, or other graphics.
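A minimal dataviz sketch, assuming Python with the matplotlib library; the regional sales data are made up for illustration:

```python
import matplotlib.pyplot as plt

# Hypothetical sales by region, the kind of result a report might present.
regions = ["North", "South", "East", "West"]
sales = [41200, 38750, 52300, 47800]

plt.bar(regions, sales)
plt.title("Sales by Region")
plt.ylabel("Sales ($)")
plt.show()
```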
Operational Sources (1 of 3)
Operational sources include information systems that collect data regarding the business events of an organization and support its day-to-day business data requirements. Transaction processing systems handle the collection, modification, and retrieval of transaction data (making a sale, billing a customer, depositing cash). Enterprise resource planning (ERP) systems, found in larger organizations, are comprehensive information systems that integrate front- and back-office business processing functionality and are supported by a single centralized database; because an ERP system utilizes and stores all data relating to the organization's business events and transactions, it is used as a primary source for data analytics. Legacy systems is a catch-all phrase that refers to aging applications and/or hardware that are either outdated or in need of replacement.
Mechanical Sources (2 of 3)
Mechanical sources include sensors that may be worn, embedded in devices, or ingested. These devices gather different types and quantities of data, which can be stored internally or uploaded via wireless connections to a server (for example, sensors can monitor health and location). Sensor data can also be valuable input for better estimating warranty costs and failure rates on equipment, as well as for providing usage statistics on equipment for chargeback and automated billing systems.
Tax Data Hub
A tax data hub is a specialized data mart designed to provide a single version of the truth for tax-related data. This centralized data store automatically extracts data from an organization's source systems and loads them in a standard format that lends itself to the needs of the tax department.
Data Access
Online analytical processing (OLAP) involves using ETL tools to obtain complex information that describes what happened and why it happened. OLAP is used extensively in firms' business intelligence groups (e.g., Oracle and SAS tools).
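A minimal OLAP-style summarization sketch, assuming Python with the pandas library; the sales records and the two dimensions (region, quarter) are made up for illustration:

```python
import pandas as pd

# Hypothetical transaction data with two dimensions and one measure.
sales = pd.DataFrame({
    "region":  ["North", "North", "South", "South"],
    "quarter": ["Q1", "Q2", "Q1", "Q2"],
    "amount":  [41200, 38750, 52300, 47800],
})

# Summarize the measure across both dimensions, OLAP-cube style.
cube = sales.pivot_table(index="region", columns="quarter",
                         values="amount", aggfunc="sum", margins=True)
print(cube)
```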
ETL (Extract-Transform-Load)
Whether the final target data store is a data mart or a data warehouse, the process of extracting data from similar or disparate data sources, transforming the data into a common format, and loading the harmonized, cleansed data into that final target data store is known as ETL (Extract-Transform-Load).
Data warehouses
Data warehouses pool the data from separate applications into a large, common body of information. The data are rarely current; they are typically older information. For a data warehouse to be useful, its data must (1) be error free, (2) be defined uniformly, (3) span a longer time horizon than the company's transaction systems, and (4) allow users to answer complex questions, for example queries requiring information from diverse sources.
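A minimal sketch of point (4), assuming Python with the pandas library; the two pooled sources and the question asked of them are made up for illustration:

```python
import pandas as pd

# Two sources pooled into the warehouse: transactions and customer master data.
sales = pd.DataFrame({"cust_id": [1, 2, 1], "amount": [500, 300, 200],
                      "year": [2023, 2024, 2024]})
customers = pd.DataFrame({"cust_id": [1, 2], "region": ["North", "South"]})

# A question spanning both sources: total sales by region per year.
combined = sales.merge(customers, on="cust_id")
print(combined.groupby(["region", "year"])["amount"].sum())
```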
Analyzing Data
The proliferation of data has the potential to fundamentally change the practice of accounting. Traditionally, accountants interfaced with accounting information systems (AIS) in three ways: (1) as users entering data into, or extracting information from, these systems; (2) as auditors assessing the integrity of data and information from these systems; or (3) as investigators assessing the nature and implications of fraud or other malfeasance perpetrated through AIS. Accountants are now expected to challenge these perceptions.
Big Data
The exponential growth in processing capabilities, storage capacity, and network throughput enables organizations to create, capture, and analyze data at an alarming rate with far-reaching results. Accounting professionals are beginning to see how big data has the potential to fundamentally change their decision making and how organizations view participants in the decision-making process. Big data is a catch-all term for vast quantities of structured, semistructured, and unstructured data that are created in real or near-real time. Big data is characterized by four attributes known as the 4 V's: Volume, Velocity, Variety, and Veracity.
Analysis and Data Visualization Tools
These tools serve three core functions: enabling data collection, processing data, and communicating insights.
Predictive Analytics (2 of 3)
Predictive analytics are useful in telling us what might happen in the future and are forward looking. Predictive analytics describes a set of tools that are good for analyzing historical data to predict events that may occur in the future, and for identifying relationships and patterns. For example, predictive analytics can be used to find cellular customers who are at risk of not renewing their contracts, or to analyze past payment patterns to estimate a customer's creditworthiness. Regression models, decision trees, and machine learning are the three most commonly used predictive analytics techniques.
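A minimal regression sketch using Python's standard library (statistics.linear_regression, Python 3.10+); the payment-history figures, and the idea of projecting them forward, are assumptions for illustration:

```python
import statistics

# Hypothetical history: days a customer paid late in each of six months.
months = [1, 2, 3, 4, 5, 6]
days_late = [2, 3, 5, 4, 7, 9]

# Fit a line to the historical pattern...
slope, intercept = statistics.linear_regression(months, days_late)

# ...and use it to predict next month (forward looking).
predicted = slope * 7 + intercept
print(f"Predicted days late in month 7: {predicted:.1f}")
```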
Prescriptive Analytics (3 of 3)
Prescriptive analytics are useful in telling us what should be done by recommending a course of action based on a set of scenarios or inputs. Prescriptive analytics aim to offer one or more suggested solutions that take unknowns into account.
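A minimal prescriptive sketch in Python: recommending the action with the best expected payoff across uncertain demand scenarios. The actions, payoffs, and probabilities are made up for illustration:

```python
# Hypothetical payoffs (in $ thousands) for each action under three
# demand scenarios.
payoffs = {
    "expand plant": {"low": -50, "steady": 40, "high": 120},
    "outsource":    {"low":  10, "steady": 30, "high":  60},
    "do nothing":   {"low":   0, "steady": 10, "high":  20},
}
# Assumed probabilities for the unknown demand level.
probs = {"low": 0.2, "steady": 0.5, "high": 0.3}

def expected(action):
    """Expected payoff of one course of action across all scenarios."""
    return sum(probs[s] * payoffs[action][s] for s in probs)

# Recommend the course of action with the highest expected payoff.
best = max(payoffs, key=expected)
print(f"Recommended action: {best} (expected payoff {expected(best):.1f})")
```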