Concepts of Data Analytics
data
individual facts, figures, signals, measurements; raw, unanalyzed, unorganized material
knowledge
information + meaning; idea, learning, notion, concept, thought out
wisdom
knowledge + insight; understanding, integration, principles
supervised learning
learning a function that maps an input to an output based on example input/output pairs; Y = f(x); ex. logistic regression, decision tree
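As an illustration of learning Y = f(x) from example pairs, here is a minimal sketch of a decision stump (a decision tree with a single split). The toy data, the function names `fit_stump` and `predict`, and the choice of midpoints as candidate thresholds are assumptions made for this example, not something the notes prescribe.

```python
def fit_stump(xs, ys):
    """Learn a threshold t so that f(x) = 1 if x > t else 0 makes the fewest training errors."""
    best_t, best_errors = None, len(xs) + 1
    distinct = sorted(set(xs))
    # Candidate thresholds: midpoints between consecutive distinct inputs.
    candidates = [(a + b) / 2 for a, b in zip(distinct, distinct[1:])]
    for t in candidates:
        errors = sum((x > t) != bool(y) for x, y in zip(xs, ys))
        if errors < best_errors:
            best_t, best_errors = t, errors
    return best_t


def predict(t, x):
    """Apply the learned function f to a new input."""
    return 1 if x > t else 0


# Hypothetical labeled examples: input x, output y (0 or 1).
xs = [1.0, 2.0, 3.0, 6.0, 7.0, 8.0]
ys = [0, 0, 0, 1, 1, 1]

t = fit_stump(xs, ys)
print(t, predict(t, 5.0), predict(t, 2.5))
```

The labels `ys` are what make this supervised: the threshold is chosen only by comparing predictions against known outputs.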
test set
measures performance of the selected model on unseen data; gives an estimate of how the model will perform in practice; not always used
analytics are often used for
new customer acquisition, cross-sell and up-sell, pricing tolerance, supply optimization, staffing optimization, financial forecasting, product placement, churn, insurance rate setting, fraud detection
importance of business analytics
profitability, revenue, shareholder return; understanding of data is vital for businesses to remain competitive; enables creation of informative reports
interval/ratio data
scale data; numerical format; measured on a continuous scale; can be placed in rank order; interval scales have no true zero; ratio scales do have a true zero; ratio data has exact values and an absolute zero
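The practical consequence of a true zero can be sketched with temperature, which is not an example from the notes: Celsius is an interval scale (0 °C is arbitrary), while Kelvin is a ratio scale (0 K is an absolute zero). The variable names below are illustrative.

```python
def celsius_to_kelvin(c):
    """Convert a Celsius reading (interval scale) to Kelvin (ratio scale)."""
    return c + 273.15


c_low, c_high = 10.0, 20.0
k_low, k_high = celsius_to_kelvin(c_low), celsius_to_kelvin(c_high)

# Differences are meaningful on both scales and agree with each other.
diff_c = c_high - c_low
diff_k = k_high - k_low

# Ratios are only meaningful when the scale has a true zero.
ratio_c = c_high / c_low   # 2.0, but 20 C is not "twice as hot" as 10 C
ratio_k = k_high / k_low   # about 1.035, the physically meaningful ratio

print(diff_c, round(diff_k, 6), ratio_c, round(ratio_k, 3))
```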
validation set
used to select the best model during training; helps avoid overfitting
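A minimal sketch of validation-based model selection, assuming two hypothetical candidate models (a constant mean predictor and a least-squares line) and invented toy data. Candidates are fit on the training set, the one with the lowest validation error is selected, and only then is it scored on the test set.

```python
def fit_mean(xs, ys):
    """Candidate 1: always predict the mean of the training outputs."""
    m = sum(ys) / len(ys)
    return lambda x: m


def fit_line(xs, ys):
    """Candidate 2: ordinary least-squares line through the training data."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    slope = sum((x - mx) * (y - my) for x, y in zip(xs, ys)) / sum((x - mx) ** 2 for x in xs)
    intercept = my - slope * mx
    return lambda x: slope * x + intercept


def mse(model, xs, ys):
    """Mean squared error of a model on a data set."""
    return sum((model(x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)


# Hypothetical data, already split into three sets.
train_x, train_y = [1, 2, 3, 4], [2.1, 3.9, 6.2, 7.8]
val_x, val_y = [5, 6], [10.1, 11.9]
test_x, test_y = [7, 8], [14.2, 15.8]

candidates = {"mean": fit_mean(train_x, train_y), "line": fit_line(train_x, train_y)}
val_errors = {name: mse(model, val_x, val_y) for name, model in candidates.items()}
best_name = min(val_errors, key=val_errors.get)

# The test set is touched once, after selection, to estimate performance on unseen data.
test_error = mse(candidates[best_name], test_x, test_y)
print(best_name, val_errors, test_error)
```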
important trends
storage capacity continues to rise rapidly; cost of storage continues to drop
business analytics
subset of data analytics
unsupervised learning
a type of ML algorithm used to draw inferences from datasets consisting of input data without labeled responses
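One common unsupervised technique is k-means clustering; the notes do not name it here, so it is used purely as an illustration. The sketch below runs it on one-dimensional toy data with hypothetical starting centers. No labels are supplied: the groups emerge from the inputs alone.

```python
def kmeans_1d(points, centers, iterations=10):
    """Basic k-means on 1-D data: assign each point to its nearest center, then recompute centers."""
    clusters = [[] for _ in centers]
    for _ in range(iterations):
        clusters = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            clusters[nearest].append(p)
        # Keep the old center if a cluster ends up empty.
        centers = [sum(c) / len(c) if c else old for c, old in zip(clusters, centers)]
    return centers, clusters


# Hypothetical unlabeled inputs with two visible groups.
points = [1.0, 1.5, 2.0, 9.0, 9.5, 10.0]
centers, clusters = kmeans_1d(points, centers=[0.0, 5.0])
print(centers, clusters)
```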
variable
unit of data collection whose value can vary
challenges in decision making
we do not know everything; huge amount of data; a mistake can cause disaster; uncertainty
analytical landscape
analytical modelers -> proliferation of models -> operations -> target
machine learning
a subset of artificial intelligence; the scientific study of algorithms and statistical models that computer systems use to perform a specific task without explicit instructions, relying instead on patterns and inferences learned from examples
information
data + context; organized, structured, categorized
analytics
the use of data, information technology, statistical analysis, quantitative methods, and mathematical/computer methods
utilizing business data
determine business needs; capture and store data; ensure quality; access and format; analyze and summarize; gain insight and produce action
training set
To find patterns and create an initial set of candidate models
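A minimal sketch of carving one data set into training, validation, and test sets. The 60/20/20 proportions, the fixed seed, and the function name `split_dataset` are assumptions chosen for illustration; the notes do not specify split sizes.

```python
import random


def split_dataset(rows, train_frac=0.6, val_frac=0.2, seed=0):
    """Shuffle the rows, then slice them into training, validation, and test sets."""
    rows = list(rows)
    random.Random(seed).shuffle(rows)  # seeded so the split is reproducible
    n_train = round(len(rows) * train_frac)
    n_val = round(len(rows) * val_frac)
    train = rows[:n_train]
    val = rows[n_train:n_train + n_val]
    test = rows[n_train + n_val:]  # whatever remains
    return train, val, test


train, val, test = split_dataset(range(10))
print(len(train), len(val), len(test))
```

Shuffling before slicing keeps any ordering in the original data (for example, by date) from leaking into one set only.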
business intelligence
a broad category of applications, technologies, and processes for gathering, storing, accessing, and analyzing data to help business users make better decisions
ordinal data
can be rank ordered; no fixed units of measurement; can make statistical judgements; ex. socioeconomic status or military ranks
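A short sketch of what rank ordering permits, using a simplified, hypothetical rank ladder. Sorting and taking a median are valid for ordinal data, but the index numbers carry no units, so the gap between two ranks has no fixed size.

```python
import statistics

# Hypothetical, simplified ordering from lowest to highest rank.
RANK_ORDER = ["private", "corporal", "sergeant", "lieutenant", "captain"]
rank_index = {name: i for i, name in enumerate(RANK_ORDER)}

observed = ["sergeant", "private", "captain", "corporal", "sergeant"]

# Rank ordering is valid for ordinal data.
ordered = sorted(observed, key=rank_index.get)

# So is a median, which depends only on order, not on distances between ranks.
median_rank = RANK_ORDER[statistics.median_low(rank_index[r] for r in observed)]
print(ordered, median_rank)
```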
four types of data
categorical, ordinal, interval, ratio
decision making
choice about a course of action
categorical/nominal data
comprised of categories that cannot be rank ordered; each category is just different; no quantitative relationship; no mathematical operations; mutually exclusive; exhaustive; ex. customer location or different colleges at a university
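Because nominal categories have no order or arithmetic, the usual summaries are counts and the most frequent category. A minimal sketch with invented customer locations:

```python
from collections import Counter

# Hypothetical customer locations; each customer falls into exactly one category.
locations = ["north", "south", "north", "east", "north", "south"]

counts = Counter(locations)
mode, mode_count = counts.most_common(1)[0]
print(dict(counts), mode, mode_count)
```

A mean or a sort order of these labels would be meaningless, which is why only frequencies are computed.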
data mining
computational process of discovering patterns in large data sets, involving methods at the intersection of artificial intelligence, machine learning, statistics, and database systems; ex. customer segmentation (clustering), predictive modeling, association rule mining
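Association rule mining is commonly described in terms of support (how often an itemset appears) and confidence (how often the consequent appears when the antecedent does). The sketch below computes both for one rule over a tiny, invented set of transactions; the function names are illustrative.

```python
# Hypothetical market-basket transactions.
transactions = [
    {"bread", "milk"},
    {"bread", "butter"},
    {"bread", "milk", "butter"},
    {"milk"},
]


def support(itemset, transactions):
    """Fraction of transactions that contain every item in the itemset."""
    return sum(itemset <= t for t in transactions) / len(transactions)


def confidence(antecedent, consequent, transactions):
    """Support of the combined itemset divided by support of the antecedent."""
    return support(antecedent | consequent, transactions) / support(antecedent, transactions)


s = support({"bread", "milk"}, transactions)
c = confidence({"bread"}, {"milk"}, transactions)
print(s, round(c, 3))
```

Here the rule "bread implies milk" holds in two of the three baskets that contain bread.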