Analytics

Taxonomy of DSS?

- Communication-driven DSS
- Data-driven DSS
- Document-driven DSS
- Knowledge-driven DSS
- Model-driven DSS

What are two common methods to evaluate a classifier?

- Confusion matrices (also known as contingency tables)
- Receiver Operating Characteristic (ROC) curves
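
Both evaluation methods can be sketched in a few lines. The example below is a minimal illustration, assuming binary labels (1 = positive, 0 = negative) and a classifier that outputs a score per item; the function names `confusion_counts` and `roc_point` and the sample data are made up for this sketch.

```python
def confusion_counts(y_true, y_pred):
    """Count the four cells of a binary confusion matrix."""
    tp = fp = fn = tn = 0
    for truth, pred in zip(y_true, y_pred):
        if pred == 1 and truth == 1:
            tp += 1  # true positive
        elif pred == 1 and truth == 0:
            fp += 1  # false positive (Type I error)
        elif pred == 0 and truth == 1:
            fn += 1  # false negative (Type II error)
        else:
            tn += 1  # true negative
    return {"tp": tp, "fp": fp, "fn": fn, "tn": tn}


def roc_point(scores, y_true, threshold):
    """Return (false positive rate, true positive rate) at one threshold.

    Assumes y_true contains at least one positive and one negative item.
    """
    y_pred = [1 if s >= threshold else 0 for s in scores]
    c = confusion_counts(y_true, y_pred)
    tpr = c["tp"] / (c["tp"] + c["fn"])
    fpr = c["fp"] / (c["fp"] + c["tn"])
    return fpr, tpr


# Hypothetical true labels and classifier scores.
y_true = [1, 1, 1, 0, 0, 0]
scores = [0.9, 0.8, 0.4, 0.6, 0.3, 0.1]

# Sweeping the threshold traces out the points of the ROC curve.
for threshold in (0.0, 0.5, 1.0):
    print(threshold, roc_point(scores, y_true, threshold))
```

Each threshold gives one confusion matrix and therefore one point on the ROC curve; plotting the points for all thresholds gives the full curve.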

Potential Benefits of DSS?

- Decision quality
- Improved communication
- Cost reduction
- Increased productivity
- Time savings
- Improved customer and employee satisfaction

What are some practical ML issues or concerns?

- Getting good data
- Choosing appropriate features
- Deciding which algorithm to use (it is not possible to know beforehand)
- Deciding how to test it
- Deciding how to set user-defined parameters

What are some disadvantages of using ML?

- Needs lots of data
- Gold-standard data can be expensive to acquire
- Good data is hard to find
- Error prone: it is usually impossible to get perfect accuracy

What are some advantages of machine learning?

- Often more accurate than human-made rules
- No need for human presence
- Humans often cannot express their knowledge even when they perform a task well
- The solution can be adapted to particular cases
- Handles problem sizes too big for humans to reason about

What are the components of a DSS?

- Specialized databases
- Analytical models, decision-maker insights and judgments
- Interactive graphical user interface (GUI)

What are the 7 varieties of Data?

- Structured: pre-set format (e.g., a banking transaction)
- Unstructured: no pre-set format (e.g., web pages, social media); currently most data is unstructured
- Semi-structured: unstructured data that can be put into a structure using format descriptions
- Batch: big chunks of data, separated in time
- Streaming: chunks of data arriving as a continuous feed
- Real time: analysis must be done immediately
- Metadata: definitions and mappings, i.e., data about data; this helps with pre-processing
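
As a rough illustration of the structured / semi-structured distinction, the sketch below takes a JSON record (semi-structured) and projects it onto a fixed set of columns (structured). The `SCHEMA` tuple plays the role of metadata; its field names, the `to_row` helper, and the sample record are all hypothetical.

```python
import json

# Metadata: a description of the fields a structured row must contain.
SCHEMA = ("account", "amount", "currency")


def to_row(record_json, schema=SCHEMA):
    """Turn a semi-structured JSON record into a fixed-format row.

    Fields missing from the record become None; fields not named in
    the schema are dropped.
    """
    record = json.loads(record_json)
    return tuple(record.get(field) for field in schema)


# A semi-structured record: self-describing, but with no guaranteed fields.
raw = '{"account": "A-17", "amount": 25.5, "note": "lunch"}'
row = to_row(raw)
print(row)
```

Here the schema is what lets the pre-processing step decide which fields to keep and which gaps to flag.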

With regard to ML, what are the 4 standard learning scenarios?

- Unsupervised learning: only unlabeled data (grouping or clustering data based only on features)
- Supervised learning: only labeled data (predict the class/label of an unseen item based on its features)
- Semi-supervised learning: both labeled and unlabeled data (improve prediction by also using information from unlabeled data)
- Reinforcement learning: agent interaction with an environment (learns the best behavior from the consequences of its actions)
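
The first two scenarios can be contrasted with a toy sketch on one-dimensional data. It assumes a 1-nearest-neighbor rule for the supervised case and k-means with k = 2 for the unsupervised case; these are just two convenient examples of each scenario, and the function names and data are invented for illustration.

```python
def nearest_neighbor_label(train, query):
    """Supervised: predict the label of the closest labeled training point.

    `train` is a list of (feature, label) pairs.
    """
    closest = min(train, key=lambda item: abs(item[0] - query))
    return closest[1]


def two_means(values, iterations=10):
    """Unsupervised: split unlabeled 1-D values into two clusters (k-means, k=2)."""
    low, high = min(values), max(values)  # initial cluster centers
    group_low, group_high = list(values), []
    for _ in range(iterations):
        # Assign every value to its nearest center.
        group_low = [v for v in values if abs(v - low) <= abs(v - high)]
        group_high = [v for v in values if abs(v - low) > abs(v - high)]
        if not group_high:  # all values identical: nothing to split
            break
        # Move each center to the mean of its group.
        low = sum(group_low) / len(group_low)
        high = sum(group_high) / len(group_high)
    return sorted(group_low), sorted(group_high)


# Supervised: features come with labels.
labeled = [(1.0, "small"), (1.2, "small"), (5.0, "large"), (5.3, "large")]
print(nearest_neighbor_label(labeled, 4.0))

# Unsupervised: features only; the groups are discovered, not named.
print(two_means([1.0, 1.2, 0.8, 5.0, 5.3, 4.9]))
```

Note that the clustering output carries no class names: without labels the algorithm can only say which items belong together.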

What is data mining?

- An attempt to discover patterns, trends, and correlations hidden in data that can give a strategic business advantage
- Includes the analysis of huge databases/warehouses
- Can highlight buying patterns, reveal customer tendencies, cut redundant costs, or uncover unseen profitable relationships and opportunities

What is Type II error?

- False negative (a positive case is classified as negative)

What is Type I error?

- False positive (a negative case is classified as positive)

What is the main purpose of data mining?

- Knowledge discovery (a component of some DSS)

What are the 4 characteristics of big data?

- Volume (data at rest): terabytes to exabytes of existing data to process
- Velocity (data in motion): streaming data, milliseconds to seconds to respond
- Variety (data in many forms): structured, unstructured, text, multimedia
- Veracity (data in doubt): uncertainty due to data inconsistency and incompleteness, ambiguities, latency, deception, model approximations
(Big data typically has 2 of the above characteristics.)

What do we mean by making a machine learn?

- A process of acquiring knowledge from observations/data and/or interactions/feedback from an environment
- So we need algorithms that instruct machines how to acquire knowledge from data, not what the knowledge is

What business outcomes can benefit from Big Data?

1. Acquire, grow, and retain customers
2. Optimize operations and reduce fraud
3. Maximize insights and improve economics
4. Transform business performance
5. Create new business models

What are things big data cannot help with?

1. Chance correlation
2. Meaning (it can find relationships in data, but more is needed to determine meaning or cause)
3. Action (more data doesn't imply more knowledge)
4. Easily fooled (many data tools can be purposely fooled)
5. Data drift (incoming data can cause unintentional signals)
6. Feedback (reinforcing data)
7. Critical thought (scientific-sounding answers to vague questions)
8. New data (how to handle previously unseen data)
9. Realistic (big data is not a silver bullet)

What are the challenges with big data?

1. Data lacks integrity
2. Data lacks metadata
3. Back end is cheap
4. Front end is confusing
5. Analysts don't understand your question
6. Analysis is incomplete
7. Lacks a means to interpret the analyses
8. Not acting on the analyses

What is the Imitation game?

A test of whether something is artificially intelligent, also known as the Turing test. If you can interact with something without being able to tell whether you are interacting with a human or a computer, then it passes the test.

Compare ML vs AI vs Data Mining vs Statistics

- AI: Computers that behave and reason intelligently
- ML: Automatically learn models of data for prediction (a subset of AI)
- DM: Human-guided discovery of hidden patterns in a particular dataset
- Stats: Quantify and summarize data

What are some common learning tasks?

- Classification: assign a category to each item
- Regression: predict a real value for each item
- Ranking: order items according to some criterion
- Clustering: partition data into homogeneous regions
- Dimensionality reduction: find a lower-dimensional feature space that preserves the most important properties of the data
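
To make the regression task concrete, here is a minimal sketch that fits a straight line by ordinary least squares and uses it to predict a real value for a new item. Ordinary least squares is only one possible regression method, chosen here for brevity; the helper `fit_line` and the data points are illustrative.

```python
def fit_line(xs, ys):
    """Fit y = slope * x + intercept by ordinary least squares."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance of x and y, and variance of x (both unnormalized).
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical training items: one feature and one real-valued target each.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

slope, intercept = fit_line(xs, ys)
prediction = slope * 5.0 + intercept  # predicted value for an unseen item
print(slope, intercept, prediction)
```

A classifier would instead map the same feature to a category, for example by comparing the predicted value against a cutoff.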

What are some applications of ML?

Image recognition, speech recognition, chess, cancer diagnosis, computer security, medical diagnosis

Why can Big Data happen now (8 reasons)?

Low cost, CPU power, fast access, cloud computing, distributed computing, government investment, open source software, machine learning

What does AI include?

- Natural languages
- Industrial robots
- Expert systems
- Intelligent agents

What is Machine Learning?

A computational method that uses experience to improve an algorithm's performance for the purpose of prediction

With regard to machine learning, what is experience?

Experience comes from data: learning is a data-driven task, and thus statistics, probability, and optimization play a significant role

What is a Decision Support System (DSS)?

Interactive software used to aid decision making by suggesting solutions to the problem at hand

What is the goal of AI?

To develop computers that can think, as well as see, hear, walk, talk, and feel

