Intro to Data and Data Science
Imagine your boss wants to know what to do with the customer data to increase profits. What do you do?
1. Analysis 2. Forecast
Imagine your boss wants to know the outgoing costs of the firm in the next year. What do you do?
1. Gather relevant data 2. Prepare for analysis
Advanced Analytics Overview
1. all areas intertwine 2. not a strict representation of commonly accepted definitions
machine learning target
1. data 2. model (like the bow an archer uses) 3. objective function (calculates how far the output is from the target) 4. optimization algorithm (adjusts the model to improve performance)
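The four ingredients can be sketched in a few lines of plain Python. This is a toy illustration only: the data points, the learning rate, and the function names (`predict`, `mse`, `train`) are assumptions, not part of the original notes.

```python
# Toy illustration of the four ingredients (all values are made up).

# 1. Data: inputs xs and targets ys that follow y = 2x exactly.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [2.0, 4.0, 6.0, 8.0]

# 2. Model: a line through the origin with one adjustable weight w.
def predict(w, x):
    return w * x

# 3. Objective function: mean squared error between predictions and targets.
def mse(w):
    return sum((predict(w, x) - y) ** 2 for x, y in zip(xs, ys)) / len(xs)

# 4. Optimization algorithm: gradient descent on w.
def train(w=0.0, learning_rate=0.05, steps=200):
    for _ in range(steps):
        # Derivative of the mean squared error with respect to w.
        grad = sum(2 * (predict(w, x) - y) * x for x, y in zip(xs, ys)) / len(xs)
        w -= learning_rate * grad
    return w

w = train()
print(round(w, 3), round(mse(w), 6))
```

Here the optimizer lowers the objective function step by step until the weight settles close to 2, the value that generated the data.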
balancing
making sure the data is not skewed toward one class or category
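One simple way to balance a dataset is random undersampling of the larger classes. The sketch below assumes rows are tuples with the label in the second position; the helper name `undersample` and the sample rows are hypothetical.

```python
import random
from collections import Counter

def undersample(rows, label_of, seed=0):
    """Randomly drop rows from larger classes until all classes are equal in size."""
    rng = random.Random(seed)
    by_class = {}
    for row in rows:
        by_class.setdefault(label_of(row), []).append(row)
    smallest = min(len(group) for group in by_class.values())
    balanced = []
    for group in by_class.values():
        # Keep only as many rows as the smallest class has.
        balanced.extend(rng.sample(group, smallest))
    return balanced

# Hypothetical data: 8 rows labeled "cat", 2 rows labeled "dog".
rows = [("a", "cat")] * 8 + [("b", "dog")] * 2
balanced = undersample(rows, label_of=lambda r: r[1])
print(Counter(label for _, label in balanced))
```

Undersampling discards data, so oversampling the smaller class is a common alternative when rows are scarce.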
Analytics
the science of fact-based decision making
time series analysis
A forecasting method that uses historical sales data to discover patterns in the firm's sales over time and generally involves trend, cycle, seasonal, and random factor analyses
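A moving average is one basic way to expose the trend component of a sales series. The sketch below is illustrative only; the sales figures and the helper `moving_average` are assumptions.

```python
def moving_average(series, window):
    """Return the mean of each full trailing window in the series."""
    if window <= 0 or window > len(series):
        raise ValueError("window must be between 1 and len(series)")
    return [
        sum(series[i - window + 1 : i + 1]) / window
        for i in range(window - 1, len(series))
    ]

# Hypothetical quarterly sales figures.
sales = [10, 12, 11, 15, 14, 18, 17, 21]
trend = moving_average(sales, window=4)
print(trend)
```

Averaging over a full year of quarters smooths out seasonal swings, so what remains mostly reflects the trend.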
regression analysis
A method of predicting sales based on finding a relationship between past sales and one or more independent variables, such as population or income
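With a single independent variable, the relationship can be estimated by ordinary least squares. A minimal sketch, assuming made-up population and sales figures and a hypothetical helper `fit_line`:

```python
def fit_line(xs, ys):
    """Ordinary least squares for y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept

# Hypothetical data: population (thousands) and past sales (units).
population = [10, 20, 30, 40]
sales = [25, 45, 65, 85]

slope, intercept = fit_line(population, sales)
forecast = slope * 50 + intercept  # predicted sales for a population of 50
print(slope, intercept, forecast)
```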
Unsupervised Learning
A type of model creation, derived from the field of machine learning, that does not have a defined target variable.
Supervised Learning
Category of data-mining techniques in which an algorithm learns how to predict or classify an outcome variable of interest.
Data
Facts and statistics collected together for reference or analysis
Deep Learning
A subfield of machine learning that uses neural networks with many layers to learn patterns and representations from data.
Knowing which programming language is a huge advantage if you are supposed to be working with big data and/or machine learning?
Java
Data Science
Managing and analyzing massive sets of data for purposes such as target marketing, trend analysis, and the creation of individually tailored products and services.
Choose the best answer. Which software tool(s) are constantly being used across all five categories from our infographic?
Python and R
relational schema
The organization of a relational database as described by the database administrator.
Data Cleansing
The process of detecting and then correcting or removing incorrect, incomplete, or inconsistent data
artificial intelligence (AI)
The science of designing and programming computer systems to do intelligent things and to simulate human thought processes, such as intuitive reasoning, learning, and understanding language.
Data Analytics
The science of examining raw data with the purpose of drawing conclusions about that information
From a data scientist's perspective, the solution of every task begins
with a proper dataset
KPI (Key Performance Indicator)
a measure of achievement that can be attributed to an individual, team, or department
business analysis
a review of the sales, costs, and profit projections for a new product to find out whether these factors satisfy the company's objectives
factor analysis
a statistical procedure that identifies clusters of related items (called factors) on a test; used to identify different dimensions of performance that underlie a person's total score.
cluster analysis
a technique used to divide an information set into mutually exclusive groups such that the members of each group are as close together as possible to one another and the different groups are as far apart as possible
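k-means is one common clustering technique that fits this definition. The sketch below runs it on plain numbers; the survey scores, the starting centers, and the helper `kmeans_1d` are assumptions for illustration.

```python
def kmeans_1d(points, centers, iterations=10):
    """k-means on numbers: assign each point to its nearest center,
    then move each center to the mean of its group."""
    centers = list(centers)
    for _ in range(iterations):
        groups = [[] for _ in centers]
        for p in points:
            nearest = min(range(len(centers)), key=lambda i: abs(p - centers[i]))
            groups[nearest].append(p)
        # Keep the old center if a group ends up empty.
        centers = [sum(g) / len(g) if g else c for g, c in zip(groups, centers)]
    return centers, groups

# Hypothetical survey scores with two visible groups.
scores = [1, 2, 3, 10, 11, 12]
centers, groups = kmeans_1d(scores, centers=[0, 5])
print(centers, groups)
```

Each point belongs to exactly one group, which matches the "mutually exclusive groups" part of the definition.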
Business Intelligence (BI)
analyzing, understanding, and reporting past data
ml engineer
applies state-of-the-art computational models
text data
combination of letters, numbers and some symbols which form words and sentences
Which of the following columns from our infographic contain activities that are said to belong to the field of 'predictive analytics' and do not aim at explaining past behaviour? a. traditional data b. big data c. business intelligence d. traditional methods
d
Data Architect
designs the way data will be retrieved, processed, and consumed
analysis
detailed examination of the elements or structure of something, typically as a basis for discussion or interpretation.
bi consultant
an external BI analyst whose work is more varied because it spans different projects
data scientist
extracts knowledge from data by performing statistical analysis, data mining, and advanced analytics on big data to identify trends, market changes, and other relevant information
Big Data
extremely large data sets that may be analyzed computationally to reveal patterns, trends, and associations, especially relating to human behavior and interactions.
advanced analytics
focuses on forecasting future trends and producing insights using sophisticated quantitative methods, including statistics, descriptive and predictive data mining, simulation, and optimization
training your machine learning model
specify the final goal (the objective) that the model should achieve
database administrator
handles the control of data, usually traditional data
reinforcement learning
learning association between stimuli and reward receipt
In reinforcement learning, a reward system is being used to improve the machine learning model at hand. The idea of using this reward system is to:
maximize the objective function
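The reward idea can be sketched with an epsilon-greedy agent that chooses among a few actions and learns which one pays best. In this toy setting the objective being maximized is the total reward collected. The fixed reward values, `epsilon`, and the function name `epsilon_greedy` are all assumptions.

```python
import random

def epsilon_greedy(rewards, steps=2000, epsilon=0.1, seed=0):
    """Learn the value of each action from the rewards it returns."""
    rng = random.Random(seed)
    estimates = [0.0] * len(rewards)  # current value estimate per action
    counts = [0] * len(rewards)       # how often each action was tried
    total = 0.0
    for _ in range(steps):
        if rng.random() < epsilon:
            # Explore: try a random action.
            action = rng.randrange(len(rewards))
        else:
            # Exploit: take the action with the best estimate so far.
            action = max(range(len(rewards)), key=lambda a: estimates[a])
        reward = rewards[action]
        counts[action] += 1
        # Running average of the rewards seen for this action.
        estimates[action] += (reward - estimates[action]) / counts[action]
        total += reward
    return estimates, total

# Hypothetical fixed reward per action; action 1 is the best choice.
estimates, total = epsilon_greedy([0.2, 1.0, 0.5])
best_action = estimates.index(max(estimates))
print(best_action, total)
```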
class labeling
labeling each data point with its data type, for example numerical or categorical
traditional methods
well suited to forecasting future performance accurately
bi analyst
perform analyses and reporting of past data
data analyst
prepares more advanced types of analyses
Data Engineer
processes the obtained data so that it is ready for analysis
Quantification
process of representing observations as numbers
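For example, ordered survey answers can be quantified by mapping each answer to a number. The scale and the helper `encode_ordinal` below are hypothetical.

```python
def encode_ordinal(answers, scale):
    """Replace each answer with its number on the given scale."""
    return [scale[answer] for answer in answers]

# Hypothetical three-point satisfaction scale.
scale = {"low": 1, "medium": 2, "high": 3}
encoded = encode_ordinal(["high", "low", "medium"], scale)
print(encoded)
```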
bi developer
uses Python and SQL to perform analyses designed specifically for the company
data shuffling
randomly mixing up data frame rows - prevents unwanted patterns - improves predictive performance
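A minimal sketch of shuffling with the standard library, assuming the rows are held in a plain list rather than a data frame. The helper name `shuffle_rows` is hypothetical.

```python
import random

def shuffle_rows(rows, seed=None):
    """Return a shuffled copy of rows, leaving the original order untouched."""
    shuffled = list(rows)
    random.Random(seed).shuffle(shuffled)
    return shuffled

# Hypothetical rows sorted by date, an order a model could wrongly pick up on.
rows = [("2023-01", 10), ("2023-02", 12), ("2023-03", 11), ("2023-04", 15)]
mixed = shuffle_rows(rows, seed=42)
print(mixed)
```

Passing a seed makes the shuffle repeatable, which helps when results need to be reproduced.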
Dealing with missing data
resolve this beforehand, in the preprocessing stage
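One common preprocessing choice is to fill gaps with the mean of the known values. This is only one option among several (dropping rows is another), and the helper `fill_missing_with_mean` and the sample ages are assumptions.

```python
def fill_missing_with_mean(values):
    """Replace None entries with the mean of the values that are present."""
    present = [v for v in values if v is not None]
    if not present:
        raise ValueError("cannot impute when every value is missing")
    mean = sum(present) / len(present)
    return [mean if v is None else v for v in values]

# Hypothetical customer ages with two gaps.
ages = [20, None, 40, None]
filled = fill_missing_with_mean(ages)
print(filled)
```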
traditional data
structured data that can be managed from a single computer
traditional method example: UX
survey --> cluster analysis --> customer satisfaction, revenue
Machine Learning (ML)
the extraction of knowledge from data based on algorithms created from training data