Acct 4342 Ch. 8 (Data Analytics & AdHoc Reporting)
Data Warehouse
A centralized relational database, separate from the organization's operational database, designed to meet the needs of analysis. Contains all the operational data (sales, purchases, etc.) and also adds additional data.
Managing big data projects
Approach and understanding = Is top management informed of and engaged in the big data initiative? Does it understand the scope and risks? Data quality = Does the data collection and storage process result in accurate, reliable, complete, timely data? Data confidentiality and privacy = Does the organization comply with external laws and regulations, and its own internal standards for data confidentiality and privacy? Availability = Are disaster recovery and event response plans in place and reliable?
Conclusion
Big data sets are here and growing in frequency and importance. Varied uses and applications. Must consider ethical and legal risks. Must consider governance and controls over big data projects.
Value of Big Data
Helps monitor and evaluate performance in operations and finance. Improves risk and compliance management. Assists with product and service innovation. Improves customer experiences and loyalty. Data monetization (data sales) = everyone else wants the data you have.
Governance of big data
Must establish a clear governance structure for big data projects: define the responsibilities, scope, and limits of the project; require a clear purpose, scope, and plan; consider the qualitative characteristics of information in formulating big data plans. New, emerging (largely unexplored) possibilities exist for monitoring accounting and internal control systems.
Important because
Possible new roles for CPAs = data scientists of financial information. Applications of big data in auditing, tax work, and risk analysis = big data analysis of the risk of material misstatement for an audit client; big data analysis of identified risk areas; "continuous audit" of big data streams.
Risks of big data
Privacy and data security = firewalls, access controls, password controls, audit trails. Legal issues and prohibited uses (e.g., of medical data under HIPAA). Technology and structure = where and how will we store and protect the data?
Data Mining:
Process of selecting, exploring, and modeling data to uncover relationships and global patterns. Two methods: (1) Verification = drill-down; (2) Discovery model = looking for information about the data (what does it say?).
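A minimal sketch of the two methods in Python, using made-up sales records (every name and figure here is hypothetical, not from the course material). Verification starts from a question and drills down from a total into finer detail. Discovery has no starting hypothesis; here it simply ranks every region/product pair to see what stands out.

```python
from collections import defaultdict

# Hypothetical sales records: (region, product, amount)
sales = [
    ("East", "Widgets", 120.0),
    ("East", "Gadgets", 80.0),
    ("West", "Widgets", 50.0),
    ("West", "Gadgets", 200.0),
    ("West", "Widgets", 30.0),
]

def total_by(records, key_index):
    """Sum the amount (last field) grouped by the field at key_index."""
    totals = defaultdict(float)
    for record in records:
        totals[record[key_index]] += record[-1]
    return dict(totals)

# Verification (drill-down): check a hypothesis ("West outsells East")
# at the region level, then drill into West by product.
by_region = total_by(sales, 0)
west_only = [r for r in sales if r[0] == "West"]
west_by_product = total_by(west_only, 1)

# Discovery: no hypothesis; rank every region/product pair by total sales.
pair_totals = defaultdict(float)
for region, product, amount in sales:
    pair_totals[(region, product)] += amount
ranked_pairs = sorted(pair_totals.items(), key=lambda item: item[1], reverse=True)

print(by_region)
print(west_by_product)
print(ranked_pairs[0])
```

Real data mining tools use far richer models than a ranking, but the contrast is the same: verification tests a question the analyst already has, discovery lets the data suggest the question.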
Velocity:
The speed at which data needs to be analyzed. This leads to machine learning and AI: because of the volume of data and the need to analyze it quickly, big data analytics techniques are required.
Volume
Terabytes, petabytes, exabytes
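To give a sense of scale, the short Python sketch below converts between these units. It assumes the decimal (SI) definitions, where each step up is a factor of 1,000; the helper name `convert` is made up for illustration.

```python
# Decimal (SI) storage units, expressed in bytes.
UNITS = {"TB": 10**12, "PB": 10**15, "EB": 10**18}

def convert(amount, from_unit, to_unit):
    """Convert a storage amount between the units defined above."""
    return amount * UNITS[from_unit] / UNITS[to_unit]

pb_in_tb = convert(1, "PB", "TB")  # terabytes in one petabyte
eb_in_tb = convert(1, "EB", "TB")  # terabytes in one exabyte
print(pb_in_tb, eb_in_tb)
```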
Big Data
The creation, analysis, storage and dissemination of extremely large data sets. Feasible due to advances in storage technologies (the cloud), advanced data analytics, and massive computing power
Big Data COSO Principle #11
The organization selects and develops general control activities over technology to support the achievement of objectives
Implications
Will result in expanding existing data warehouses. Big data analytics and smart data = emerging focus of big data: data mining (discovering data trends) and expanded OLAP (online analytical processing). Another name for big data is "smart data," which generally refers to both big data and the use of advanced analytic methods on the data. Example = big data-based audits.
Ubiquitous computing
A concept in software engineering and computer science where computing is made to appear anytime and everywhere. In contrast to desktop computing, ubiquitous computing can occur using any device, in any location, and in any format.
Variety:
Many different types of data. What types of data are collected about you all day?
Dark data
Data collected from business activities that currently goes unused but may be reused in analytics, business relationships, or directly monetized (sold). Part of the reason "dark data" may be unused is that it lacks "metadata," i.e., "data about data" that explains what the "dark data" is. For example, many companies generate lots of (unused) data about their networks (e.g., who is using the networks, for how long, and for what reason), but many fail to use this data to understand how to better serve their customers and employees. When one identifies the data as "data about customer needs and uses" instead of "automated data generated by our network systems," this "dark" data is more likely to be reused.
Sources of big data:
Data sources = ubiquitous computing (e.g., smartphones and wearables such as the Fitbit), the Internet of Things, and biometrics (e.g., automated human recognition). Three main sources = operational data, social media data, and dark data.
Gartner definition:
High-volume, high-velocity, and/or high-variety information assets that demand new, innovative forms of processing for enhanced decision making, business insights, or process optimization.
Descriptive
Mathematical process that describes real-world events and the relationships between the factors responsible for them. No interpretation and no problem solution. Example: sales by customer.
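A "sales by customer" report can be sketched in a few lines of Python. The invoice data below is hypothetical; the point is that the output only describes what happened and offers no explanation or recommendation.

```python
from collections import defaultdict

# Hypothetical invoices: (customer, amount)
invoices = [
    ("Acme", 500.0),
    ("Baker", 150.0),
    ("Acme", 250.0),
    ("Cole", 100.0),
]

# Total the invoice amounts for each customer.
sales_by_customer = defaultdict(float)
for customer, amount in invoices:
    sales_by_customer[customer] += amount

# Report totals, largest first.
report = sorted(sales_by_customer.items(), key=lambda item: item[1], reverse=True)
for customer, total in report:
    print(f"{customer}: {total:.2f}")
```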
Prescriptive
Tells the user what actions should be taken in response to specific questions. Example: how long should a promotion run? Draws on structured data.
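As a rough illustration of the promotion question, the Python sketch below picks the promotion length with the highest estimated profit. The profit estimates are invented, and a real prescriptive model would derive them from data and weigh constraints; this only shows the "recommend an action" step.

```python
# Hypothetical structured data: estimated extra profit
# for each candidate promotion length, in weeks.
estimated_profit = {1: 4000.0, 2: 7000.0, 3: 8500.0, 4: 8000.0}

def recommend_length(profit_by_weeks):
    """Return the promotion length with the highest estimated profit."""
    return max(profit_by_weeks, key=profit_by_weeks.get)

best_weeks = recommend_length(estimated_profit)
print(f"Recommended promotion length: {best_weeks} weeks")
```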
Predictive:
Uses a variety of statistical techniques that draw upon current and past data to calculate the statistical likelihood of future scenarios occurring. Examples: when would a patient develop a heart condition? Netflix's Cinematch predicts how much you will like a movie based on what you have watched.
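The simplest version of "likelihood from past data" is an observed frequency. The Python sketch below estimates how often past patients with and without a risk factor developed a condition. The records are fabricated for illustration, and this is not how Cinematch or any clinical model actually works; real predictive models use many more variables and proper statistical techniques.

```python
# Hypothetical history: (has_risk_factor, developed_condition)
history = [
    (True, True), (True, False), (True, True), (True, True),
    (False, False), (False, False), (False, True), (False, False),
]

def likelihood(records, has_risk_factor):
    """Share of past patients in the given group who developed the condition."""
    outcomes = [developed for risk, developed in records if risk == has_risk_factor]
    return sum(outcomes) / len(outcomes)

p_with = likelihood(history, True)      # patients with the risk factor
p_without = likelihood(history, False)  # patients without it
print(p_with, p_without)
```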
Diagnostic
Views past performance to determine why something happened the way it did. Example: why are sales declining?
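One way to approach the "why are sales declining?" question is to break the total change down by segment and see which one accounts for most of it. The Python sketch below does this with hypothetical quarterly figures by product line.

```python
# Hypothetical sales by product line for two consecutive quarters.
last_quarter = {"Widgets": 900.0, "Gadgets": 600.0, "Gizmos": 300.0}
this_quarter = {"Widgets": 880.0, "Gadgets": 350.0, "Gizmos": 310.0}

# Change in sales for each product line, and overall.
changes = {line: this_quarter[line] - last_quarter[line] for line in last_quarter}
total_change = sum(changes.values())

# The product line with the most negative change drives the decline.
biggest_drop = min(changes, key=changes.get)

print(f"Total change: {total_change:.2f}")
print(f"Largest decline: {biggest_drop} ({changes[biggest_drop]:.2f})")
```

In this made-up data, nearly all of the decline comes from one product line, which tells the analyst where to investigate next.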