ISDS Final Ch. 6
3 Big Data technologies:
- MapReduce - Hadoop - NoSQL (Not only SQL)
MapReduce + Hadoop =
Big Data core technology
A person who uses a combination of business and technical skills to investigate Big Data; also called a Big Data guru
data scientist
because with the backing of one or more executive sponsors, future business graduates from LSU E.J. Ourso College of Business like yourself can get the ball rolling and instill a virtuous cycle: The more departments in your organization realize actionable benefits, the more pervasive analytics becomes across your organization. Fast, easy-to-use visual analytics is the key that opens the door to organization-wide analytics adoption and collaboration.
evangelize
Facts about Big Data
- Big Data by itself, regardless of the size, type, or speed, is worthless unless business users do something with it that delivers value to the organization. - Big Data plus "big" analytics yields value. - The traditional means for capturing, storing, and analyzing data cannot deal with Big Data effectively and efficiently, so a new breed of technologies is needed to take on Big Data (developed, purchased, hired, or outsourced). - Traditional data warehouses have not been able to keep up with the variety and complexity of data.
facts about data scientists:
- Data scientists use a combination of their business, communication, and technical skills to investigate Big Data looking for ways to improve current business analytics practices (from descriptive to predictive and prescriptive) and hence to improve decisions for new business opportunities. - A data scientist is considered a Big Data guru. - Data scientist positions are in high demand and offered with very high salaries and very high expectations.
Challenges of Big Data Analytics:
- Data volume - Data integration - Processing capabilities - Data governance (security, privacy, access) - *Skill availability* (data scientists are in short supply) - Solution cost (ROI)
Big Data is characterized by these 6 traits:
- Volume - Variety - Velocity - Veracity - Variability - Value proposition
Using data to understand customers/clients and business operations to sustain and foster growth and profitability is:
- an increasingly challenging task for today's enterprises - not a new technological fad but a business priority
3 Critical success factors for Big Data analytics:
1) A clear business need 2) Strong, committed sponsorship 3) A fact-based decision-making culture
2 challenges brought about by Big Data:
1) Effectively and efficiently capturing, storing, and analyzing Big Data 2) New breed of technologies needed
7 ways to succeed with Big Data:
1) Simplify 2) Coexist 3) Visualize 4) Empower 5) Integrate 6) Govern 7) Evangelize
Business investments ought to be made for the good of the business, not for the sake of mere technology advancements. Therefore, the main driver for Big Data analytics should be alignment with the vision and the strategy at any level: strategic, tactical, and operational. Which of the critical success factors for Big Data analytics is being described?
A clear business need
massive volumes of data; also depends on the size of the using organization and means different things to different people; is a misnomer and includes both structured and unstructured data
Big data
an open source framework for storing and analyzing massive amounts of distributed, unstructured data; Originally created by Doug Cutting at Yahoo!
Hadoop
Where do Data Scientists come from?
LSU ISDS Masters of Science in Analytics
a technique popularized by Google that distributes the processing of very large multi-structured data files across a large cluster of ordinary machines/computer processors.
MapReduce
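A minimal single-machine sketch of the MapReduce idea, using word counting as the example task. The function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical and chosen for illustration; a real cluster framework such as Hadoop would run the map and reduce steps in parallel on many machines, which this sketch does not attempt.

```python
from collections import defaultdict

def map_phase(document):
    """Map step: emit a (word, 1) pair for every word in one document."""
    return [(word.lower(), 1) for word in document.split()]

def shuffle(pairs):
    """Shuffle step: group all emitted values by their key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped

def reduce_phase(key, values):
    """Reduce step: combine all values for one key into a single result."""
    return key, sum(values)

def word_count(documents):
    # On a real cluster, each document (or file split) would be mapped
    # on a different machine; here everything runs in one process.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())

counts = word_count(["big data big analytics", "big value"])
print(counts)  # {'big': 3, 'data': 1, 'analytics': 1, 'value': 1}
```

The point of the split is that each map call only sees its own document and each reduce call only sees one key, so both steps can be spread across ordinary machines.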
Where does the Big Data come from?
Most Big Data is generated by *machines*.
A new style of database that stores and processes large volumes of unstructured, semi-structured, and multi-structured data, and can handle Big Data better than traditional relational database technology
NoSQL (Not only SQL)
This type of analytics evaluates every incoming observation against all prior observations when analyzing Big Data in the context of intelligent systems. Recognizing how the new observation relates to all prior observations enables the discovery of real-time insights.
Perpetual analytics
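One way to picture "every incoming observation against all prior observations" is a running statistic that never forgets old data. The sketch below is an illustration only, with a hypothetical class name (`PerpetualMean`) and a deliberately simple measure (distance from the mean of everything seen so far); real perpetual analytics systems relate observations in richer ways.

```python
class PerpetualMean:
    """Keeps a running mean over ALL observations seen so far (no window)."""

    def __init__(self):
        self.count = 0
        self.mean = 0.0

    def observe(self, value):
        """Return how far `value` is from the mean of all prior
        observations (None for the very first one), then absorb it."""
        deviation = None if self.count == 0 else value - self.mean
        self.count += 1
        # Incremental mean update: no need to store the full history.
        self.mean += (value - self.mean) / self.count
        return deviation

tracker = PerpetualMean()
for reading in [10, 20, 30]:
    print(reading, tracker.observe(reading))
```

The incremental update is the design choice that matters here: the full history influences every comparison, but only two numbers are kept in memory.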
Analytic process of extracting actionable information from continuously flowing/streaming data; also called Data-in-motion analytics and real-time data analytics
Stream Analytics
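A small sketch of data-in-motion processing, assuming a sliding window average as the analytic (the source does not name a specific technique). The function name `windowed_averages` and the `window_size` parameter are hypothetical. Each result is produced as soon as a reading arrives, rather than after the whole data set has been stored.

```python
from collections import deque

def windowed_averages(stream, window_size=3):
    """Yield the average of the most recent `window_size` readings
    each time a new reading arrives from the stream."""
    window = deque(maxlen=window_size)  # old readings drop off automatically
    for reading in stream:
        window.append(reading)
        yield sum(window) / len(window)

# Any iterable can stand in for the stream; a live feed would be a generator.
for average in windowed_averages([10, 20, 30, 40], window_size=3):
    print(average)
```

With the four sample readings above, the loop prints 10.0, 15.0, 20.0, and 30.0.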
This characteristic of Big Data is its potential to contain more useful patterns and interesting anomalies than "small" data; *it is the most important V.*
Value Proposition
Data flows can be very consistent or highly inconsistent, with periodic peaks that make data loads hard to manage.
Variability
*Data today comes in all types of formats*, ranging from traditional databases to hierarchical data stores created by end users and OLAP systems, to text documents, e-mail, XML, meter-collected and sensor-captured data, to video, audio, and stock ticker data. By some estimates, 80 to 85 percent of all organizations' data is in some sort of unstructured or semi-structured format.
Variety
*refers to both how fast data is being produced and how fast the data must be processed to meet the need or demand.* RFID tags, automated sensors, GPS devices, and smart meters are driving an increasing need to deal with torrents of data in near-real time.
Velocity
refers to the conformity to facts and the accuracy, quality, truthfulness, or trustworthiness of Big Data
Veracity
*the most common trait of Big Data.* Many factors contributed to the exponential increase in data, such as transaction-based data stored through the years, text data constantly streaming in from social media, increasing amounts of sensor data being collected, automatically generated RFID and GPS data, and so forth. Large quantity of technology.
Volume
In ____?______, the numbers rather than intuition, gut feeling, or supposition drive decision making. There is also a culture of experimentation to see what works and doesn't. To create ____?____, senior management needs to do the following: recognize that some people can't or won't adjust; be a vocal supporter; stress that outdated methods must be discontinued; ask to see what analytics went into decisions; link incentives and compensation to desired behaviors.
a fact-based decision-making culture
It is a well-known fact that if you don't have committed executive backing, it is difficult (if not impossible) to succeed. If the scope is a single or a few analytical applications, the support can be at the departmental level. However, if the target is enterprise-wide organizational transformation, which is often the case for Big Data initiatives, _____________________ needs to be at the highest levels and organization-wide. Which critical success factor for Big Data analytics best fills the blank in the previous sentence?
a strong, committed sponsorship
because using the strengths of each database platform and enabling them to collaborate in your organization's data architecture are essential. There is ample literature that talks about the necessity of maintaining and nurturing *synchronicity* of traditional data warehouses with the capabilities of new platforms
coexist
is a method of capturing, tracking, and analyzing streams of data to detect events of certain types that are worthy of the effort.
critical event processing
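A minimal sketch of the capture, track, and detect idea, assuming the simplest possible rule: an event is "worthy of the effort" when a reading exceeds a threshold. The function name `detect_critical_events`, the threshold rule, and the sample readings are all made up for illustration; real systems typically match more complex patterns across several streams.

```python
def detect_critical_events(stream, threshold):
    """Scan (timestamp, value) pairs as they arrive and yield only
    the readings that cross the threshold."""
    for timestamp, value in stream:
        if value > threshold:
            yield {"time": timestamp, "value": value}

# Invented sample readings standing in for a live sensor feed.
readings = [("09:00", 71), ("09:01", 118), ("09:02", 74)]
for event in detect_critical_events(readings, threshold=100):
    print(event)  # {'time': '09:01', 'value': 118}
```

Normal readings pass through without being stored or reported, which is what keeps the effort focused on the out-of-normal events.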
because Big Data and self-service business intelligence go hand in hand. Organizations with Big Data are over 70 percent more likely than other organizations to have BI/BA projects that are driven primarily by the business community, not by the IT group. Across a range of uses - from tackling new business problems, developing entirely new products and services, finding actionable intelligence in less than an hour, and blending data from disparate sources - Big Data has fired the imagination of what is possible through the application of analytics.
empower
has always been a challenging issue in IT, and it is getting even more puzzling with the advent of Big Data. More than 80 countries have *data privacy laws*. The European Union (EU) defines seven "safe harbor privacy principles" for the protection of their citizens' private data. In the US, Sarbanes-Oxley affects all publicly listed companies.
govern
a stream analytics application area in which the biggest potential source of Big Data is patient monitoring
health services
because *blending data* from disparate sources for your organization is an essential part of Big Data analytics. Organizations that can blend different relational, semi-structured, and raw data sources in real time, without expensive up-front costs, will be the ones that get the best value from Big Data.
integrate
because it is hard to keep track of all of the new database vendors, open source projects, and Big Data service providers. It will be even more crowded and complicated in the years ahead. Procedures need to be implemented to make it simpler.
simplify
The one main challenge of Big Data analytics is:
skill availability
A use case in the energy industry for stream analytics is a classic ____________ application for the electric power supply chain
smart grid
because according to leading analytics research companies like Forrester and Gartner, enterprises find advanced *data visualization platforms to be essential tools* that enable them to monitor business, find patterns, and take action to avoid threats and snatch opportunities.
visualize