Big Data
MapReduce
Map job: takes a set of data and converts it into another set of data, where individual elements are broken down into key-value pairs. Reduce job: takes the output from a map as input and combines those key-value pairs into a smaller set of key-value pairs. Parallel processing: running many map and reduce tasks at once improves speed and reliability.
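The Map and Reduce jobs described above can be illustrated with a word count. This is a minimal single-machine sketch in plain Python, not Hadoop's API: the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are hypothetical, and the map calls run one after another here, whereas a real cluster would run them in parallel on different machines.

```python
from collections import defaultdict


def map_phase(document):
    """Map: break one document into (word, 1) key-value pairs."""
    return [(word.lower(), 1) for word in document.split()]


def shuffle(pairs):
    """Group values by key so each reduce call sees all values for one key."""
    grouped = defaultdict(list)
    for key, value in pairs:
        grouped[key].append(value)
    return grouped


def reduce_phase(key, values):
    """Reduce: combine all values for one key into a single key-value pair."""
    return key, sum(values)


def word_count(documents):
    """Run map, shuffle, and reduce over a list of documents."""
    # Each document is mapped independently, which is what makes the
    # map step easy to spread across many machines.
    mapped = [pair for doc in documents for pair in map_phase(doc)]
    grouped = shuffle(mapped)
    return dict(reduce_phase(key, values) for key, values in grouped.items())


if __name__ == "__main__":
    print(word_count(["big data", "big clusters"]))
```

The reduce output is smaller than the map output because every repeated key collapses into one pair.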
6 Key dimensions
Marketing insights, re-imagined retail experience, non-linear shopping, partner relationships, supply chain, occasion-centric relationships
Customer-centric model
Personalization & speed
Partner relationships, occasion-centric relationships
are not limited to buying and selling but also include data and media partnerships, which make occasion-centric relationships a reality
Supply chain
better order fulfillment and faster delivery
Big Data
generates value from the storage and processing of very large quantities of digital information that cannot be analyzed with traditional computing techniques.
First stage of analytics
advanced analytics algorithms, using search and index technologies, begin sifting through all of the different pieces of information for gems of intelligence.
4 C's of Digital
Creating, curating, connecting, culture
Data exhaust
a large proportion of the data that companies gather is never processed, so a significant quantity of potentially useful information passes through unused. For example, loyalty card data goes unprocessed; videos of surgeries are deleted within weeks.
Variety
data becomes increasingly diverse and dense: photos, videos, 3D models, and location data in addition to traditional documents, financial transactions, stock records, and personnel data. Many of these Big Data sources are unstructured and therefore difficult to categorize.
What is Hadoop?
an open-source project that develops a software library for reliable, scalable, distributed computing, capable of handling the Big Data deluge.
What does Hadoop do?
distributes the storage and processing of large data sets across groups, or "clusters", of server computers using a simple programming model. It also detects and compensates for hardware problems or other system failures at the application level.
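The idea of spreading data across a cluster and compensating for failures can be sketched as follows. This is a toy illustration only, not Hadoop's actual placement policy or API: the function names, the round-robin placement, and the default of two copies per block are all assumptions made for the example. It also assumes `replicas` is no larger than the number of nodes.

```python
def place_blocks(blocks, nodes, replicas=2):
    """Assign each block to `replicas` distinct nodes, round-robin."""
    placement = {}
    for i, block_id in enumerate(blocks):
        placement[block_id] = [
            nodes[(i + r) % len(nodes)] for r in range(replicas)
        ]
    return placement


def readable_blocks(placement, failed_nodes):
    """For each block, pick the first surviving node that holds a copy.

    A value of None means every copy of that block was on a failed node.
    """
    failed = set(failed_nodes)
    result = {}
    for block_id, holders in placement.items():
        alive = [node for node in holders if node not in failed]
        result[block_id] = alive[0] if alive else None
    return result


if __name__ == "__main__":
    placement = place_blocks(["b0", "b1", "b2"], ["n1", "n2", "n3"])
    # With one node down, every block is still readable from its other copy.
    print(readable_blocks(placement, failed_nodes=["n2"]))
```

Because each block lives on more than one node, losing a single machine does not lose data; the software simply reads from a surviving copy.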
Causes of Big data
e-commerce, loyalty card schemes, retailers, logistics, financial services, healthcare, etc.
Marketing insights
enable businesses to identify and present the next best content assortment and offers
Non-linear shopping
enable syncing across devices to help customers share ideas and collaborate around shared objectives
Volume
is Big Data's greatest challenge as well as its greatest opportunity. The challenge: storing, interlinking, and processing vast quantities of digital information. The opportunity: predicting customer behavior, diagnosing disease, planning healthcare services, modelling climate.
Hadoop Distributed File System (HDFS)
is designed to store very large data sets reliably, and to stream those data sets at high bandwidth to user applications.
Hadoop
a new technology solution for dealing with the 3 V's (volume, velocity, variety). An open-source project from Apache.
Re-imagined retail experience
no longer just about selling products; involves redefining channels and creating an integrated and personalized experience.
Analytics
organizations that adopt a full range of analytics capabilities can discover what is happening, determine why it is happening, predict what is likely to happen, and prescribe the best action to take.
Second stage of analytics
the information is correlated and analyzed for patterns and trends. This happens more than 200 times a second, faster than a hummingbird can flap its wings.
Velocity
the rate at which data is flowing into most organizations is increasing beyond the capacity of their IT systems to store and process. In addition, users want streaming data to be delivered to them in real time, and often on mobile devices.
Third stage of analytics
this advanced analysis is quickly turned into insight that is used to determine which actions drive optimal results. Recommended actions, along with supporting information, are delivered to the systems or people that can effectively implement them, rather than relying on gut decisions and hoping for the best.