Decision Support Systems

What are the three knowledge acquisition methods?

1) Manual: based on interviews (structured/unstructured); tracking the reasoning process (think-aloud protocols); observation of the expert 2) Semi-automatic: build the knowledge base with minimal help from the knowledge engineer; allows execution of routine tasks with minimal expert input 3) Automatic: minimal input from both expert and knowledge engineer; discover knowledge from existing data (= data mining)

What is supervised Machine learning

Uses labeled data (self-coded or taken from online sources) - predictive approach

ML process: supervised learning - how important is the data set

Very Important. - Sufficiently large training data set - Representative sample - Well-balanced labels - Accurately labelled

Big data specifications

Volume : size of data Velocity: how fast is the data moving Variety: what form is the data in? Veracity: is the data accurate (huge challenge for businesses)

structure of a KBS

Basic architecture: knowledge engineer's interface --> knowledge base <--> inference engine <--> user interface. Advanced architecture: see slide 35, lecture 4

what is knowledge engineering?

is the process of acquiring knowledge from experts and building a knowledge base

Programming concepts: Data Frames

"Tabular" data: a data structure representing cases (rows), each of which consists of a number of measurements (columns).

definition of predictive analytics and examples

use data to predict future behavior based on past performance ex: Naive bayes, decision tree model

definition of descriptive analytics and examples

use data to understand past & present ex: business dashboards

how to compare the performance of using different # of periods to compute forecasting?

Compare forecasting accuracy by comparing the forecasted values with the actual values. Using more data points in the model leads to a more stable forecast. Compute the forecast error (MSE, MAD) for each model (each corresponding to a different n) and select the model with the lowest error (highest accuracy).
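
The comparison described above can be sketched in a few lines of Python. This is a minimal illustration, not the course's own method: the `demand` series and the helper names are hypothetical, and each n is scored on the periods it can actually forecast, so the models are evaluated on slightly different windows.

```python
def moving_average_forecasts(series, n):
    """Return (actual, forecast) pairs; each forecast averages the previous n values."""
    pairs = []
    for t in range(n, len(series)):
        forecast = sum(series[t - n:t]) / n
        pairs.append((series[t], forecast))
    return pairs

def mad(pairs):
    """Mean absolute deviation between actual and forecasted values."""
    return sum(abs(actual - forecast) for actual, forecast in pairs) / len(pairs)

def mse(pairs):
    """Mean squared error between actual and forecasted values."""
    return sum((actual - forecast) ** 2 for actual, forecast in pairs) / len(pairs)

# Hypothetical demand history, one value per period.
demand = [10, 12, 11, 13, 12, 14, 13, 15]

results = {}
for n in (2, 3, 4):
    pairs = moving_average_forecasts(demand, n)
    results[n] = {"MAD": mad(pairs), "MSE": mse(pairs)}

# Select the number of periods with the lowest error (here judged by MSE).
best_n = min(results, key=lambda n: results[n]["MSE"])
print(results, best_n)
```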

DW development approach: Data mart approach (Kimball)

(bottom-up) DW = collection of data marts

DW development approach: Enterprise data warehouse approach (Inmon)

(top-down) DW= one integrated database

Programming concepts: Loops

- A loop statement allows us to execute a statement or group of statements multiple times - A while loop is the simplest loop; it executes a block of code as long as an expression is True - A 'break' command can be used to exit any loop prematurely
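
A small Python sketch of these points; the variable names and the threshold of 20 are arbitrary choices for illustration.

```python
count = 0
total = 0

# The loop body repeats as long as the condition is True.
while count < 10:
    count += 1
    total += count
    # 'break' leaves the loop early, before the condition becomes False.
    if total > 20:
        break

print(count, total)
```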

Type of programming language: architectural language

- Best used to build frameworks that support (make easy) application building - Not as fast (at run-time) as system level languages ex: Java, C#

Programming concepts: Functions

- Block of organized, reusable code that is used to perform a single, related action -Provide better modularity for your application and a high degree of code reusing - Python gives you many built-in functions like print(), len() etc..
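
As a short illustration, a user-defined function can sit alongside the built-ins mentioned above. The `average` function and the `scores` list are hypothetical examples.

```python
def average(values):
    """Return the arithmetic mean of a non-empty list of numbers."""
    return sum(values) / len(values)

scores = [70, 85, 90]

# Built-in functions and our own function are called the same way.
print(len(scores))
print(average(scores))
```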

Machine Learning process: How does it work and function used

- Class of algorithms that are data-driven: unlike "normal" algorithms, it is the data that "tells" what the "good answer" is - No requirement of a hardcoded definition of good and bad sentiment - It can figure out what a good and a bad sentiment is by learning from examples - Function: $y = \text{weights} \cdot x + \text{biases}$

Application Languages

- Used to build the applications like web pages and user interfaces - These languages all allow for extremely easy development ex: Python, HTML

what is the role of the domain expert?

- knowledgeable and skilled person capable of solving problems in a specific area or domain -the person's expertise is to be captured in the expert system - could be more than one expert that contribute to an expert system - the expert must be able to communicate his or her knowledge, be willing to participate in the expert system development and commit a substantial amount of time to the project - is the most important person in the expert system development team

what is the role of the project manager ?

- leader of the expert system development team, responsible for keeping the project on track - makes sure that all deliverables and milestones are met, interacts with the expert, knowledge engineer, programmer and end-user

what is the role of the programmer?

- person responsible for the actual programming, describing the domain knowledge in terms that a computer can understand. - needs to have the skills in symbolic programming in such AI language such as Prolog. - should also know conventional programming language like C, Pascal, FORTRAN and Basic

what is the role of the knowledge engineer

- someone who is capable of designing, building and testing an expert system - interviews the domain expert to find out how a particular problem is solved -establishes what reasoning methods the expert uses to handle facts and rules and decides how to represent them in the expert system - choose some development software or an expert systems shell, or look at programming languages for encoding the knowledge -responsible for testing, revising and integrating the expert system into the workplace

common business case uses for predictive analytics

-Customer Analytics -manufacturing and operations -Financial modeling

ML process: Classifier Model

-Find the best fit model (association) that represents the influence of the input features from the sentence on the outcome variable. ex: Naive Bayes, Decision Tree

Supervised ML model: Gradient Descent

- Incrementally change the model weights and biases so that the model provides the best prediction (reduces the mean squared error) - Objective: predict $y$ from its observed characteristics $x$ (features) based on a sample $(y_i, x_i)$
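
A minimal sketch of gradient descent for a one-feature linear model y = w * x + b, minimizing mean squared error. The data, learning rate and step count are assumptions chosen for illustration, not values from the lecture.

```python
def fit_line(xs, ys, learning_rate=0.05, steps=2000):
    """Fit y = w * x + b by gradient descent on the mean squared error."""
    w, b = 0.0, 0.0
    n = len(xs)
    for _ in range(steps):
        # Prediction error for every sample under the current weight and bias.
        errors = [(w * x + b) - y for x, y in zip(xs, ys)]
        # Gradients of the mean squared error with respect to w and b.
        grad_w = 2 * sum(e * x for e, x in zip(errors, xs)) / n
        grad_b = 2 * sum(errors) / n
        # Move a small step against the gradient.
        w -= learning_rate * grad_w
        b -= learning_rate * grad_b
    return w, b

# Hypothetical sample generated from y = 2x + 1.
xs = [0.0, 1.0, 2.0, 3.0]
ys = [1.0, 3.0, 5.0, 7.0]

w, b = fit_line(xs, ys)
print(round(w, 3), round(b, 3))
```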

type of programming language: system language

-It is used to build operating systems, hardware drivers etc. - Gives you low level access to the computer and its memory ex: C, C++

Performance Dashboards 3 layers of information

1) Monitoring: Graphical, abstracted data to monitor KPI's 2) Analysis: Summarized dimensional data to analyze the root cause business problems 3) Management: Data that identifies what actions to take to resolve a problem

multi-dimensional data can be classified as

1) measures: summable information concerning a business process ex: profit, cost, sales figures 2)Dimensions: Represent different perspectives on viewing measures, organized hierarchically ex: time(month), time (quarter), time (year)

process of BI

1) production 2) Assembly, logistics and storage 3) Processing, Analysis and consumption

Definition of the ETL staging area

1)Extract data from operational systems 2) Transform data (ex: cleaning) 3) Load data into data warehouse
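
The three steps can be sketched with Python's standard `sqlite3` module standing in for the data warehouse. The source rows, the `sales` table and the cleaning rules are hypothetical.

```python
import sqlite3

def extract():
    # Hypothetical rows standing in for an export from an operational system.
    return [
        {"customer": " Alice ", "amount": "120.50"},
        {"customer": "BOB", "amount": "80"},
        {"customer": "", "amount": "15"},  # missing customer name
    ]

def transform(rows):
    # Cleaning: trim and normalize names, convert amounts, drop incomplete rows.
    cleaned = []
    for row in rows:
        name = row["customer"].strip().title()
        if not name:
            continue
        cleaned.append((name, float(row["amount"])))
    return cleaned

def load(rows, connection):
    # Load the cleaned rows into the (in-memory) warehouse table.
    connection.execute("CREATE TABLE IF NOT EXISTS sales (customer TEXT, amount REAL)")
    connection.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    connection.commit()

warehouse = sqlite3.connect(":memory:")
load(transform(extract()), warehouse)
loaded = warehouse.execute(
    "SELECT customer, amount FROM sales ORDER BY customer"
).fetchall()
print(loaded)
```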

structure of OLAP cube

3D data cube that can store precise business data along three axes: product, time, location

ML Process: Decision Tree algorithm

A Decision Tree is a hierarchically organized tree structure, with each node splitting the data space into pieces based on value of a feature

what is a primary Key (PK)

A primary key is the column or columns that contain values that uniquely identify each row in a table. A database table must have a primary key for Optim to insert, update, restore, or delete data from a database table.

What is in un-supervised machine learning

Algorithmic approach in which the sentences are not labeled. ML objective: to find patterns in the data

What transformation process does data undergo within a DSS?

Data as raw symbols are collected and converted to information in the form of formatted data, then used to create knowledge in the form of data relationships

Prescriptive analytics process (steps)

Define objective Data prep Modeling and analytics Validation and deployment

What is a business report

Document that contains information regarding business issues - Purpose: support and improve managerial decisions - Source: data warehouse/mart - Format: text + tables + graphs/charts - Distribution: in print, email, portal/intranet

situation awareness relating to performance dashboards is defined on three levels

L1: Perception of the elements in the environment L2: Comprehension of the current situation L3: Projection of future sales

Analysis metrics for decision tree algorithm: F1 Score

Harmonic mean of precision and recall: $F_1 = \frac{2 \cdot \text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}} = \frac{2 \cdot (0.6 \cdot 0.75)}{0.6 + 0.75} \approx 67\%$

Programming concepts: Data Types

Integer, float, string and boolean

is it possible to overfit the model in a ML process?

Overfitting is possible: the model relies on idiosyncrasies in the training data - Difficult to generalize: the trained model follows a particular data set too closely and fails at predicting future observations - Problem of small (non-representative) training data - Note: the prediction accuracy may be misleadingly high

what is open source software

Open-source software (OSS) is a type of computer software whose source code is open to study, modification, and redistribution ex: Python

What are the BI front-end applications?

- Query & reporting - Data mining - Data visualization

OLAP operators

Roll up: aggregating measures to a higher dimensional level ex: from quarters to years. Drill down: reverse of roll up ex: drill down on time from quarters to months. Slice & dice: selecting a subset of cells that satisfy a certain selection condition
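
These operators can be illustrated on a tiny cube held in a Python dictionary. The cube contents, the dimension order and the helper names are hypothetical; drill down is simply the move back from the rolled-up totals to the more detailed quarter-level cells.

```python
from collections import defaultdict

# Cells keyed by (product, location, year, quarter); the value is the sales measure.
cube = {
    ("laptop", "Paris", 2023, "Q1"): 100,
    ("laptop", "Paris", 2023, "Q2"): 120,
    ("laptop", "Berlin", 2023, "Q1"): 80,
    ("phone", "Paris", 2023, "Q1"): 60,
    ("phone", "Paris", 2024, "Q1"): 90,
}

DIMENSIONS = ("product", "location", "year", "quarter")

def roll_up_to_year(cells):
    """Roll up: aggregate the measure from quarter level to year level."""
    totals = defaultdict(int)
    for (product, location, year, _quarter), sales in cells.items():
        totals[(product, location, year)] += sales
    return dict(totals)

def slice_cells(cells, **conditions):
    """Slice & dice: keep only the cells that satisfy every selection condition."""
    return {
        key: value
        for key, value in cells.items()
        if all(key[DIMENSIONS.index(dim)] == wanted for dim, wanted in conditions.items())
    }

yearly = roll_up_to_year(cube)
paris_q1 = slice_cells(cube, location="Paris", quarter="Q1")
print(yearly)
print(paris_q1)
```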

supervised ML process for text: Structure

See outline: Predictive analytics lecture 2: Slide 16

How to correct for the issues of the decision tree algorithm (Overfitting, Instability)

Solution: ensemble techniques - The goal of ensemble methods is to combine the predictions of several base estimators in order to improve generalizability / robustness over a single estimator. - Averaging methods (bagging): build several estimators independently and then average their predictions e.g. Random forest

How to create sentiment analysis on python: Steps to do so

Step 1: Feature extraction: define the features of the word you want to use in order to classify the data set - Tokenization: chop up the sentences into individual words - Feature generation: selecting the right features and determining how to encode them - Create feature set: we will use a corpus of sentiments from the Word_sentiment.csv file to create a feature data set, which we will use to train and test the ML model. Step 2: Train the model on the training data set - Split the sample into training and testing data sets: the training set is used to train the ML model, and the testing set is used to check how good the model is. Typically about 20% of the data goes into the testing set and the other 80% into the training set. - Use an ML method (Naive Bayes) to create the classifier model from our training set, based on our selected features. Step 3: Use the classifier object created to predict the sentiment of a given word. Step 4: Evaluate the model - Find how good the model is at identifying the labels. Ensure that the test set is distinct from the training corpus: if we simply re-used the training set as the testing set, a model that simply memorized its input, without learning how to generalize to new examples, would receive misleadingly high scores.
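
The four steps can be sketched with a small hand-written Naive Bayes classifier. This is an illustrative stand-in, not the course's code: the labeled sentences are invented (the Word_sentiment.csv file is not reproduced here), the features are plain word counts, and the test set is far too small for a meaningful accuracy estimate.

```python
import math
from collections import Counter, defaultdict

def tokenize(sentence):
    """Step 1 (tokenization): chop the sentence into lowercase words."""
    return sentence.lower().split()

# Hypothetical labeled corpus.
labeled = [
    ("great product love it", "pos"),
    ("love this great phone", "pos"),
    ("really good value", "pos"),
    ("good service great staff", "pos"),
    ("terrible product hate it", "neg"),
    ("hate this awful phone", "neg"),
    ("really bad value", "neg"),
    ("bad service awful staff", "neg"),
    ("great value love it", "pos"),
    ("awful value hate it", "neg"),
]

# Step 2: 80% training data, 20% testing data, kept strictly separate.
split = int(len(labeled) * 0.8)
train_set, test_set = labeled[:split], labeled[split:]

def train(examples):
    """Count words per label; these counts are the Naive Bayes model."""
    word_counts = defaultdict(Counter)
    label_counts = Counter()
    for sentence, label in examples:
        label_counts[label] += 1
        word_counts[label].update(tokenize(sentence))
    vocabulary = {word for counts in word_counts.values() for word in counts}
    return word_counts, label_counts, vocabulary

def classify(model, sentence):
    """Step 3: return the label with the highest log-probability (add-one smoothing)."""
    word_counts, label_counts, vocabulary = model
    total_examples = sum(label_counts.values())
    best_label, best_score = None, -math.inf
    for label in label_counts:
        score = math.log(label_counts[label] / total_examples)
        denominator = sum(word_counts[label].values()) + len(vocabulary)
        for word in tokenize(sentence):
            if word in vocabulary:  # words never seen in training are ignored
                score += math.log((word_counts[label][word] + 1) / denominator)
        if score > best_score:
            best_label, best_score = label, score
    return best_label

model = train(train_set)

# Step 4: evaluate only on sentences the model has not seen during training.
correct = sum(classify(model, sentence) == label for sentence, label in test_set)
accuracy = correct / len(test_set)
print(accuracy)
```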

Definition of Data marts

Subsets of a data warehouse (fewer dimensions, less detail, less history); application oriented

ML process: Feature Extraction

Tokenization feature generation: selecting the right features and figuring out how to encode them. "kitchen sink approach"

what is a foreign key (FK)

a column or combination of columns that is used to establish and enforce a link between the data in two tables to control the data that can be stored in the foreign key table
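
A short sketch of a primary key and a foreign key using Python's standard `sqlite3` module. The `customers` and `orders` tables are hypothetical; note that SQLite only enforces foreign keys once `PRAGMA foreign_keys = ON` has been issued on the connection.

```python
import sqlite3

db = sqlite3.connect(":memory:")
db.execute("PRAGMA foreign_keys = ON")  # SQLite enforces foreign keys only when enabled

# customer_id uniquely identifies each row: it is the primary key.
db.execute("CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)")

# orders.customer_id is a foreign key: it must match an existing customer.
db.execute("""
    CREATE TABLE orders (
        order_id INTEGER PRIMARY KEY,
        customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
        amount REAL
    )
""")

db.execute("INSERT INTO customers VALUES (1, 'Alice')")
db.execute("INSERT INTO orders VALUES (100, 1, 25.0)")  # valid: customer 1 exists

try:
    db.execute("INSERT INTO orders VALUES (101, 99, 10.0)")  # customer 99 does not exist
    rejected = False
except sqlite3.IntegrityError:
    rejected = True

order_count = db.execute("SELECT COUNT(*) FROM orders").fetchone()[0]
print(rejected, order_count)
```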

what is a knowledge based expert system (KBS)

a computer program that can perform like an expert in a relatively narrow domain and has additional properties a) knowledge stored explicitly b) inference engine & knowledge base separated c) explaining reasoning steps

what are frames/objects

a data structure that includes all knowledge on a particular object knowledge in frame is partitioned into slots

definition of a data warehouse

a database that is maintained separately from the organization's operational databases for the purpose of managerial decision making (high-quality, cleaned data; star/snowflake model) 1) Subject-oriented (focusing on analysis of data for decision makers) 2) Integrated (integrating multiple heterogeneous data sources) 3) Time-variant (the time frame within the data warehouse is longer than that in an operational database; every structure in the data warehouse contains a time element) 4) Nonvolatile (operational updates of data do NOT occur in the data warehouse environment)

what is a star schema

a fact table in the middle connected to a set of dimension tables; for example, a sales fact table in the middle (holding the keys of all its branches), with dimension tables such as time, location and branch on the branches

What is the backward chaining method within the inference engine

backward chaining is goal-driven reasoning: the inference engine starts with a list of goals (hypothetical solutions) and works backward from the conclusion to the conditions to see if there is data available to support any of the conclusions. Uses an AND - OR structure for arguments
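
A minimal sketch of goal-driven reasoning over an AND - OR structure. The rule base and fact names are hypothetical, and the sketch assumes the rules contain no cycles.

```python
# Each goal maps to alternative condition lists (OR); every condition
# inside one list must hold (AND). Hypothetical rule base.
RULES = {
    "recommend_umbrella": [["is_raining"], ["forecast_rain", "going_outside"]],
    "is_raining": [["wet_ground", "dark_clouds"]],
}

def prove(goal, facts, rules):
    """Work backward from the goal to the known facts (assumes acyclic rules)."""
    if goal in facts:
        return True
    for conditions in rules.get(goal, []):  # OR: any one alternative is enough
        if all(prove(condition, facts, rules) for condition in conditions):  # AND
            return True
    return False

print(prove("recommend_umbrella", {"wet_ground", "dark_clouds"}, RULES))
print(prove("recommend_umbrella", {"forecast_rain"}, RULES))
```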

moving average model (performance dashboards)

builds a forecast by averaging observations in the most recent n periods

Definition of Meta Data

data about data: origin, location, meaning -also data models

definition of a data lake

data base that holds raw data in its native format until it is needed

what is Business intelligence comprised of

data warehousing + descriptive analytics

what is explicit knowledge?

deals with objective, rational, and technical material (data, policies, procedures, software, documents, etc.) easy to teach/learn

what is the forward chaining method within the inference engine

forward chaining is data-driven reasoning that starts from known data and proceeds forward to see if any conclusions can be drawn. Uses an IF - THEN structure for arguments
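
A minimal sketch of data-driven reasoning with IF - THEN rules. The rules and fact names are hypothetical examples.

```python
def forward_chain(facts, rules):
    """Repeatedly fire IF - THEN rules until no new conclusion can be drawn."""
    known = set(facts)
    changed = True
    while changed:
        changed = False
        for conditions, conclusion in rules:
            # IF all conditions are known THEN add the conclusion.
            if conclusion not in known and all(c in known for c in conditions):
                known.add(conclusion)
                changed = True
    return known

# Hypothetical rules: (IF conditions, THEN conclusion).
rules = [
    (["is_raining"], "recommend_umbrella"),
    (["wet_ground", "dark_clouds"], "is_raining"),
]

derived = forward_chain({"wet_ground", "dark_clouds"}, rules)
print(derived)
```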

what are semantic networks

graphical depictions of knowledge consisting of nodes & links in a spiderweb format hierarchical relationships between objects

what is tacit knowledge?

is usually in the domain of subjective, cognitive, and experiential learning (human experience, knowhow, insights, etc.) It is highly personal and hard to formalize

Programming concepts: Data Structures

lists, dictionaries and tuples data structures use data types to create more complex things
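
A brief Python illustration of how the three structures combine the basic data types; the `student` and `point` values are made up.

```python
# A dictionary maps keys to values of different data types.
student = {"name": "Ana", "grades": [14, 16, 12], "enrolled": True}

# A list is ordered and can be changed after creation.
student["grades"].append(18)

# A tuple is ordered but cannot be changed after creation.
point = (48.85, 2.35)
latitude, longitude = point

print(student["grades"], latitude, longitude)
```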

definition of prescriptive analytics and examples

make decisions or recommendations to achieve the best performance ex: Intro to AI, Knowledge based systems

what is the role of the end-user?

often called the user - is a person who uses the expert system when it is developed - must not only be confident in the expert system performance but also feel comfortable using it.

what is Business Analytics comprised of

predictive + prescriptive analytics

ML process: Development test set

smaller set within the test set, where the user can make further adjustments to the model by fine tuning features. Keep in mind that actual model accuracy is tested using the test set which is kept away from training data.

what is a data base management system (DBMS)

software that controls the data Ex: MS access

Analysis metrics for decision tree algorithm: Accuracy

Total correct predictions / total predictions (test sample): $\frac{TP + TN}{TP + TN + FP + FN} = \frac{3 + 2}{8} = 62.5\%$

Analysis metrics for decision tree algorithm: Recall

Total correct positive predictions / total positive labels: $\frac{TP}{TP + FN} = \frac{3}{4} = 75\%$

Analysis metrics for decision tree algorithm: Precision

Total correct positive predictions / total positive predictions: $\frac{TP}{TP + FP} = \frac{3}{5} = 60\%$
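
The four metrics (accuracy, recall, precision, F1 score) can be checked together in a few lines of Python. The counts TP = 3, TN = 2, FP = 2, FN = 1 are inferred from the worked values on the cards (3/4 recall, 3/5 precision, 5/8 accuracy); the function name is a hypothetical helper.

```python
def classification_metrics(tp, tn, fp, fn):
    """Compute accuracy, precision, recall and F1 from confusion-matrix counts."""
    accuracy = (tp + tn) / (tp + tn + fp + fn)
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return {"accuracy": accuracy, "precision": precision, "recall": recall, "f1": f1}

# Counts inferred from the worked examples on the cards.
metrics = classification_metrics(tp=3, tn=2, fp=2, fn=1)
for name, value in metrics.items():
    print(f"{name}: {value:.1%}")
```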

