Analytics MIDTERM
Surrender response to the big data identity threat
"You have zero privacy anyway. Get over it." (Scott McNealy); give in and accept that your data will be taken from you
What are 2 conditions necessary for data mining?
1. clean and consistent data 2. events in the data must reflect current and future trends
Two main problems of the privacy vs convenience dilemma
1. digital slavery 2. identity prison
The lifecycle of predictive models: 1. 2. 3.
1. Historical data: data from the past is collected and prepped for use 2. Predictive algorithms: applied on top of the acquired data 3. Model: a model is built and, when possible, embedded into a system to optimize performance
Machine learning is a subset of ....
Artificial intelligence; they are not synonymous
AI Subsets: Artificial Intelligence Machine Learning Deep Learning
Artificial intelligence: software that automates and mimics or improves upon tasks that would otherwise require human intelligence. Machine learning: subset of AI in which results improve without explicit programming. Deep learning: type of ML that includes several layers of analysis between input data and output results.
Data collection organizations
Census, BLS, NOAA
Describe the chain that leads from data to a decision: Data, Information, Knowledge, Wisdom, Decision
Data: raw numbers. Information: contextualized raw numbers. Knowledge: patterns, trends, insights. Wisdom: breaking of a knowledge shield. Decision: leads to some action by managers.
What form of analytics does hindsight match with? what are some examples of hindsight?
Descriptive analytics-dashboards, scorecards, data warehouses
Data in a data warehouse has already undergone the _____ process
ETL
ETL aka
Extraction: taking only important/relevant data. Transformation: cleaning the data of missing, repeated, and incomplete records; takes 70% of the time. Loading: once cleaned, the data is ready to load into the data warehouse.
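The three ETL steps can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the rows, the `orders` table, and the function names are all invented, and an in-memory SQLite database stands in for the data warehouse.

```python
import sqlite3

# Hypothetical raw export from a transactional system (values invented).
raw_rows = [
    {"order_id": "1", "customer": " Ana ", "amount": "19.99", "note": "gift"},
    {"order_id": "2", "customer": "Ben", "amount": "", "note": ""},            # missing amount
    {"order_id": "1", "customer": " Ana ", "amount": "19.99", "note": "gift"}, # repeated row
    {"order_id": "3", "customer": "Cy", "amount": "5.00", "note": ""},
]

def extract(rows):
    """Extraction: keep only the fields relevant to the analysis."""
    return [(r["order_id"], r["customer"], r["amount"]) for r in rows]

def transform(records):
    """Transformation: drop incomplete and repeated records, fix types and whitespace."""
    seen, clean = set(), []
    for order_id, customer, amount in records:
        if not amount or order_id in seen:
            continue
        seen.add(order_id)
        clean.append((int(order_id), customer.strip(), float(amount)))
    return clean

def load(records, conn):
    """Loading: write the cleaned records into a warehouse-style table."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS orders "
        "(order_id INTEGER PRIMARY KEY, customer TEXT, amount REAL)"
    )
    conn.executemany("INSERT INTO orders VALUES (?, ?, ?)", records)
    conn.commit()

conn = sqlite3.connect(":memory:")  # stand-in for the data warehouse
load(transform(extract(raw_rows)), conn)
loaded = conn.execute(
    "SELECT order_id, customer, amount FROM orders ORDER BY order_id"
).fetchall()
print(loaded)
```

Only the two complete, non-repeated orders reach the table; the transformation step is where most of the effort goes.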
What is the evolution of business analytics?
From descriptive analytics, to predictive, and to prescriptive analytics
Types of AI: ___________ Learning on the outside ring and _________ Learning on the inside ring
Machine; Deep
What is a common database for data/public data?
NoSQL, used to store and query data; it does not have a rigid structure like relational databases and can take in a wide variety and massive amounts of less structured data, ex: documents
What are concerns with public data
Privacy concerns and accuracy concerns
Reclusive response to the big data identity threat
Protect privacy and control over your identity by avoiding the conveniences of the data gatherers, meaning you don't buy from Amazon, ride with Uber, check WebMD, use Airbnb, Canvas, etc.; this is impossible to do in today's world
Great leaders ask great _________
Questions
Data is a true strategic asset when it is...
Rare, valuable, imperfectly imitable, and lacking substitutes
T or F: data by itself can be a source of competitive advantage if used properly
True
T or F: technology by itself is not a source of competitive advantage
True
Describe the 4 V's of Big data
Volume: a huge amount of data that requires a better level of IT infrastructure to store, far more than a laptop could hold. Variety: many different types of data are collected in many different forms, for example image data, video data, Canvas data, biometric data. Velocity: data is constantly coming in at such high volumes and in such specific directions that it is hard to keep up with. Veracity: the degree of uncertainty in the data used to make a decision; we must make sense of the data despite that uncertainty.
report
a collection of visualizations that contains much more detailed information than dashboards; created with descriptive analytics
Data Analytics
a comprehensive process to analyze data and produce outputs that can inform decision-making
Dashboards
a visual representation of company performance; uses Key Performance Indicators (KPIs); created with descriptive analytics
You can create new categories from public data, like what?
ability to afford products, for example: thrifty elders, new age/organic lifestyle adherents, people who do a lot of medical googling
Data mining has roots in ___________ __________
artificial intelligence
91% of Fortune 1000 senior executives surveyed said ______ _________ initiatives were planned and underway, however many organizations lack skills required to exploit big data, there is a talent shortfall of data analysts/scientists
big data
Privacy Convenience Dilemma
big data is a threat to our privacy because companies' collection of data leads to problems like digital slavery and an identity prison
Data Warehouses
central repository of info that can be analyzed to make more informed decisions; data inflows into a warehouse from transactional systems, regional databases, etc. on a regular cadence
Data aggregators
combine data from various sources and package it for resale, ex: Carnival Cruise Lines integrated their customer data with SES data to target limited marketing dollars at past customers who are likely able to afford a cruise again
Descriptive analytics: use, infrastructure needed, tools used, disadvantage
commonly known as business intelligence; used to create dashboards/reports that show past and current events, which requires data warehouses. Tools: SQL, PowerBI, Tableau, and Qlik. Its disadvantage is that it does not explore the root causes behind observed trends and cannot predict future outcomes; it relies on historical data analysis
Big data helps track and predict this form of privacy concern
consumer targeting
Privacy is the critical ability to evolve, to adopt a new identity, and _________ robs us of that opportunity
convenience
Dynamic response to the big data identity threat
create a rhizome identity: accept the big data convenience but rethink how we generate our identity so that we retain control over our information and autonomy, even while submitting to the invasion of the data platforms. Do not give in to the convenient "buy now" or "recommended" tabs; instead search for things on your own, and continuously evolve your identity as you should as a person
rhizome identities conceive privacy as ________, not preserved
created
enterprise software
customer relationship management systems, supply chain management systems, enterprise resource planning systems
Capitalizing on _____ helps firms dominate their markets
data
Business Analytics
data analytics applied in the context of business
data mart
a subset of data warehouse information, organized into a specific, predetermined, and accessible structure; easier for end users who expect regular access to data in a specific format for reporting and standard analysis. You take a few pieces out of the data warehouse, as if onto a hard drive you carry to your job or office, to make the data smaller and more portable
data warehouse
data is organized into a specific, predetermined, and accessible structure, easier for end users who expect regular access to data in a specific format for reporting and standard analysis; it has a strong folder structure and is a central hub of all information, organized logically and cleaned up, much as you would sort music, documents, pictures, etc.
Predictive analytics are performed by...
data scientists or simpler models by data analysts
surveys and focus groups
data that can't be captured during transactions
Data Scientists toolkit
Data visualization, ex: ggplot; programming, ex: Python/R; statistics; data mining: machine learning, deep learning, neural networks; non-technical skills (business acumen, communication skills, data intuition)
What are hard skills required for a data analyst?
data visualization, Excel, databases and SQL, foundations of machine learning, knowledge of analytical and statistical techniques
Business analysts, data engineers, data scientists, and decision makers access ____________ with business intelligence tools
data warehouses
what kind of infrastructure does predictive analytics use? tools?
data warehouses and machine learning; Hadoop, R, Python
Predictive analytics applications
demand forecasting, workforce planning, churn analysis, fleet or equipment maintenance, modeling credit/financial risk
Business Intelligence (BI) is an alternative name for...
descriptive analytics
forms of response to data threats that require labor/effort
dynamic (adopt rhizome) and reclusive
predictive analytics application: churn analysis
evaluation of a company's customer loss rate in order to reduce it
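The loss rate behind churn analysis is simple arithmetic. As a minimal sketch, assuming the common definition (customers lost during a period divided by customers at the start of that period) and invented numbers:

```python
def churn_rate(customers_at_start, customers_lost):
    """Share of the starting customers lost during the period."""
    if customers_at_start <= 0:
        raise ValueError("customers_at_start must be positive")
    return customers_lost / customers_at_start

# Hypothetical quarter: 200 customers at the start, 10 of them left.
rate = churn_rate(customers_at_start=200, customers_lost=10)
print(f"{rate:.1%}")
```

A company would track this rate over time and by customer segment to see where retention efforts are needed.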
Big Data tracks you __________
everywhere: when you use a website, Excel, a phone, car, laptop, antivirus software, SymptomChecker, Tinder, Netflix, watches, or visit a doctor
Convenience gives us pleasure, but ________ convenience is corruptive, for example:
extreme; a highly automated car robs you of your driving experience; you lose experience of shopping, learning, growing, exploring
Big Data
extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions; has 4 components: velocity, volume, veracity, variety
Computer vision
field of study that seeks to develop techniques to help computers "see" and understand the content of digital images, ex: pictures and videos. It can classify, verify, identify, detect, landmark-detect, and recognize objects
rhizome identities locate identity in the ______, not the past, and begin with _______, not nouns
future; verbs
Examples of AI in action
image recognition, computer vision, robo journalists, language translations
Why using data as a strategic asset is difficult:
inconsistent, incomplete, or too little data; not enough people with the right skill set; change management and resistance to the use of data; illegal/unethical issues from machine learning
Transparent data pools
pool of data derived from explicit exchanges of personal info for services; explicit in the sense that you knowingly hand over personal details and get a technological convenience in return; usually in the form of informed consent/terms of agreement
What are the benefits of using data warehouses?
informed decision making; consolidated data; historical data analysis; high-quality, consistent, and accurate data; and separation of analytics processing from transactional databases, which improves the performance of both systems
When is data an asset?
when the firm can capture, store, and analyze data while managing change and resistance, digitize and automate its business processes, and complement the other existing processes and tech that can't be digitized
Challenges with Databases
just because it's collected doesn't mean it can be used as info; legacy systems (outdated and incompatible); most transactional databases are not set up for large amounts of data; need to get data into systems that support analytics
Data lake
large amounts of raw, unstructured and structured data; a pool of data for free-form exploration that often requires more specialized skills; people often request that data from lakes be extracted into more structured formats; some organizations don't have data lakes but use warehouses or marts directly
Challenges of predictive analytics
need for large and comprehensive datasets, adaptability of old models to new problems, data organization and hygiene, data privacy and security
data swamp
like a data lake but with unorganized files all over the place that leads you to dig through them and waste time
Artificial intelligence is created by computer programmers and software developers who apply what tools?
machine learning, deep learning, neural networks, computer vision, natural language processing
Examples of the application of analytics in business
marketing analytics, supply chain analytics, HR analytics, healthcare analytics, financial analytics, sports analytics
predictive analytics
more complex and trendy than descriptive analytics, many companies invest heavily in this technology.
object classification- object identification- object verification- object detection- object landmark detection- object recognition-
object classification: what category of object is this, dog or cat? object identification: what type of a given object is in this picture, ex: what kind of dog is this? object verification: is the object in this picture a cat? ex: CAPTCHA or face verification. object detection: what are the objects in this picture? ex: cat, dog, human. object landmark detection: what are the key points of the object in the picture? ex: animated effects in pictures or video calls. object recognition: what objects are in this picture and where are they? ex: this is a Coke can
Identity Prison
one isolated mistake in one's past was, at one time, something you could outgrow or move away from, but now a single episode can become inescapable and you will be trapped with this mistake/identity forever
Transaction processing systems
point of sale systems, mobile apps, ecommerce
What form of analytics does insight match with? what are some examples of insight?
predictive analytics-data mining, regression analysis, time series, hazard, discriminant
Examples of Data Mining roots in AI
predictive text: predicts the next word in a sentence. FaceID: projects 30,000 invisible infrared dots onto your face to identify you, takes an image of your face, and saves it
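Next-word prediction can be illustrated with a simple bigram count: for each word, remember which word most often follows it. This is a toy sketch with an invented corpus, not how any particular keyboard or phone implements it.

```python
from collections import Counter, defaultdict

# Tiny invented corpus; a real model would learn from far more text.
corpus = "i love data . i love pizza . i love data science . you love data".split()

# Count how often each word follows each other word.
following = defaultdict(Counter)
for current_word, next_word in zip(corpus, corpus[1:]):
    following[current_word][next_word] += 1

def predict_next(word):
    """Return the most frequent follower of `word`, or None if the word is unseen."""
    counts = following.get(word)
    return counts.most_common(1)[0][0] if counts else None

suggestion = predict_next("love")
print(suggestion)
```

In this corpus "love" is followed by "data" three times and "pizza" once, so "data" is suggested.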
What form of analytics does foresight match with? what are some examples of foresight?
prescriptive analytics-optimization, simulation, decision modeling
Dilemma: we want both HIGH _________ and HIGH ___________
privacy; convenience
Data Mining
process of using computers to identify hidden patterns in, and to build models from, large data sets; applied in: customer segmentation (healthy eaters vs junk food eaters); market basket analysis (those who purchase product x also purchase product y); fraud detection (uncovering patterns consistent with criminal activity); hiring and promotion (identifying characteristics consistent with employee success in the firm's various roles)
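Market basket analysis, one of the applications listed above, can be sketched by counting which pairs of products appear together. The baskets are invented, and real analyses use more refined measures than this raw pair count and support.

```python
from collections import Counter
from itertools import combinations

# Hypothetical shopping baskets (one set of products per transaction).
baskets = [
    {"bread", "butter", "milk"},
    {"bread", "butter"},
    {"milk", "cereal"},
    {"bread", "butter", "cereal"},
]

# Count every pair of products bought together in the same basket.
pair_counts = Counter()
for basket in baskets:
    for pair in combinations(sorted(basket), 2):
        pair_counts[pair] += 1

top_pair, top_count = pair_counts.most_common(1)[0]
support = top_count / len(baskets)  # share of baskets containing the pair
print(top_pair, top_count, support)
```

Here bread and butter appear together in three of the four baskets, the "those who purchase product x also purchase product y" pattern.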
predictive modeling
process of using known results to create, process, and validate a model that can be used to forecast future outcomes; it analyzes fixed historical data to estimate the probability of a forecast event happening
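The create-and-validate cycle can be sketched with a one-variable linear regression: fit on part of the historical data, check the error on the rest, then forecast. The data are invented and deliberately perfectly linear so the arithmetic is easy to follow; the function names are illustrative only.

```python
# Hypothetical history: (advertising spend, units sold); numbers are invented.
history = [(1, 12), (2, 14), (3, 16), (4, 18), (5, 20), (6, 22)]
train, holdout = history[:4], history[4:]

def fit_line(points):
    """Ordinary least squares for y = intercept + slope * x."""
    n = len(points)
    mean_x = sum(x for x, _ in points) / n
    mean_y = sum(y for _, y in points) / n
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in points)
        / sum((x - mean_x) ** 2 for x, _ in points)
    )
    return mean_y - slope * mean_x, slope

# Create the model from known results.
intercept, slope = fit_line(train)

def predict(x):
    return intercept + slope * x

# Validate on data the model has not seen, then forecast a new case.
mean_abs_error = sum(abs(predict(x) - y) for x, y in holdout) / len(holdout)
forecast = predict(7)
print(slope, intercept, mean_abs_error, forecast)
```

With real, noisy data the holdout error would not be zero, and its size is what tells you whether the model is good enough to use.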
Data analyst's toolkit (business intelligence)
Query tools, ex: Excel and SQL; programming, ex: Python/R; statistics; data visualization tools: Tableau, PowerBI, Excel, Pivot Tables in Excel; planned and ad hoc reporting tools; dashboards
Natural Language Processing
read, decipher, understand, and make sense of human languages in a manner that is valuable; applied in personal digital assistants like Alexa, Cortana, and Siri, word processors (Grammarly), translation devices/apps, ChatGPT, interactive voice response, DALL-E 2
An example of predictive analytics
Recommendation systems: Netflix telling you what to watch! 75% of what consumers watch comes from recommender systems. Preventative maintenance: predict when machines will need to be fixed/updated so you can account for the loss of machinery, ex: Finnish railway operator
Artificial Intelligence
refers to the general ability of computers to emulate human thought and perform tasks in real-world environments; used in prescriptive analytics
Predictive analytics practices
regression, neural networks, random forests
Rhizome-Identity response to big data identity threat
renders gathered data inapplicable by disassociating from the past; recognize that your identity stems not only from what you've done in the past but also from what you WILL do in the future; we must continuously reform our tastes, aspirations, and behaviors deep within us, so that data platforms won't have enough info to predict us and the data does not describe the person we are. EFFORT AND LABOR ARE NEEDED
effort and labor are needed to adopt the ____________ identity
rhizome
Rhizome personality/identity
rhizome: a continuously growing horizontal underground stem which puts out lateral shoots and adventitious roots at intervals. You want to be continuously growing and evolving; when you do this, it is impossible for AI to predict where you're going or what you want, and big data cannot keep hold of you; you want to adopt this identity when it comes to privacy versus convenience; this would be called being a dynamic person
Digital slavery
robs you of the freedom to create your own identity; it predicts what you want before you know you want it; who you are and what you want is determined by an app, so would we rather save time and enjoy the convenience, or expend effort by browsing and not clicking suggestions?
There are 3 responses to the big data identity threat:
surrender, reclusive, dynamic
Computer vision applications
tesla autopilot, healthcare, industry 4.0, plant care apps, retail
Privacy
the ability to control access to ourselves; ability to decide what remains secret and from whom
Prescriptive analytics
the most advanced and sophisticated form of data analytics; makes use of artificial intelligence (AI) and machine learning (ML) to tell you what you should do
Machine Learning
the technologies and algorithms that enable systems to identify patterns, make decisions, and improve themselves through experience and data; used in predictive and prescriptive analytics; subset of AI
How does data help gain competitive advantage?
there's no monopoly on math; advantage is based on formulas, algorithms, and data
How does Amazon use descriptive analytics?
they anticipate shopper purchase and cut down on shipping time by starting the process of shipping products to users before they even make a purchase, leveraging past data, behavioral information, and finding patterns.
How do data warehouses work?
they contain multiple databases; within each, data is organized in tables and columns, and for each column you define a description of the data, like integer, data field, or string. Tables can be organized inside schemas, which you can think of as folders. When data is ingested, it is stored in various tables described by the schema, and query tools use the schema to determine which tables to access and analyze.
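The schema, table, and typed-column layout can be illustrated with Python's built-in `sqlite3` module. This is a sketch under stated assumptions: SQLite is not a data warehouse and has no `CREATE SCHEMA`, so an attached in-memory database named `sales` stands in for a schema ("folder"), and the table and values are invented.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Assumption: an attached database named "sales" plays the role of a schema.
conn.execute("ATTACH DATABASE ':memory:' AS sales")

# Each column is given a description of its data (integer, text, real).
conn.execute("CREATE TABLE sales.orders (order_id INTEGER, region TEXT, amount REAL)")

# Ingested data is stored in the table described by the schema.
conn.executemany(
    "INSERT INTO sales.orders VALUES (?, ?, ?)",
    [(1, "East", 20.0), (2, "West", 35.0), (3, "East", 15.0)],
)

# A query tool can read the column definitions...
columns = [(row[1], row[2]) for row in conn.execute("PRAGMA sales.table_info(orders)")]

# ...and use the schema-qualified table name to analyze the data.
totals = conn.execute(
    "SELECT region, SUM(amount) FROM sales.orders GROUP BY region ORDER BY region"
).fetchall()
print(columns)
print(totals)
```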
Dark Data Pools
third-party vendors accumulate and manipulate information without consent from the original providers; for example, data traded from Tinder (romance), Enolytics (wine), and SymptomChecker (health) may be purchased by a data broker such as Acxiom, then combined with other purchased info before being resold to strategic merchants. From your activity they can follow the life course of finding someone, marrying them, and having a baby, target you with ads the entire way, and build an entire life profile
Data is created by...
transaction processing systems, enterprise software that capture operational data, and surveys and focus groups
What kind of data uses informed consent/terms of agreement, and what are the issues with these forms?
transparent data pools; people don't fully understand the extent of their exposure, and they don't fully read the agreement
What are soft skills required for a data analyst?
understanding goals and problem solving, analytical and critical thinking, presentation and communication skills, foundational knowledge in some particular business field
How does Big Data enable the dilemma of privacy versus convenience?
unites surveillance and artificial intelligence; amplified by technology and AI machines; one specific example is with CONSUMER TARGETING
just like Nozick's experience machine in the opium den
we are addicted to pleasure and convenience in data, just like drug addicts are, but you begin to lose your identity when those are provided for you; you don't get to adapt or grow your personal identity yourself, a machine does this for you instead
Public data
wealth, employment stats, gas prices, household income, name, address, SSN
how is data created?
well reputed organizations (ex: Census, BLS, NOAA) and public data
What does descriptive analytics tell us?
what happened and what is happening
Linear programming
what is the shortest route to deliver a package to each house using the least fuel? There are many variables to take into consideration, like speed limits, traffic, and time of day
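For a handful of houses, the delivery question can be answered by simply trying every visiting order. Note that this is brute-force enumeration, not a linear programming solver; it only illustrates the "pick the cheapest plan under given costs" idea. The stops and fuel costs are invented.

```python
from itertools import permutations

# Hypothetical symmetric fuel costs between a depot and three houses.
cost = {
    ("depot", "A"): 4, ("depot", "B"): 6, ("depot", "C"): 5,
    ("A", "B"): 3, ("A", "C"): 7, ("B", "C"): 2,
}

def leg_cost(a, b):
    """Cost of one leg, looked up in either direction."""
    return cost[(a, b)] if (a, b) in cost else cost[(b, a)]

def route_cost(order):
    """Total cost of leaving the depot, visiting houses in `order`, and returning."""
    stops = ("depot", *order, "depot")
    return sum(leg_cost(a, b) for a, b in zip(stops, stops[1:]))

# Try every possible visiting order and keep the cheapest one.
best_order = min(permutations(["A", "B", "C"]), key=route_cost)
best_cost = route_cost(best_order)
print(best_order, best_cost)
```

The number of orders grows factorially with the number of houses, which is why real routing problems rely on optimization techniques rather than enumeration.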
What does predictive analytics tell us?
what will happen and why it will happen
Descriptive analytics gives insights on... W____, W_______, W__, and H__
what, when, who, and how
What does prescriptive analytics tell us?
what we should do and why we should do it
Biometric Data
your fingerprints, skin cells, hair, and saliva