ISM Exam 2
T/F? We reviewed a Business Case discussing the infusion of project management and data science. CRISP-DM was used to determine the phases of the "Executing" process of the Project Management Life Cycle. (NOTE: You may have to quickly research the Project Management Body of Knowledge to understand the 5 processes at a macro level: INITIATING - PLANNING - EXECUTING - MONITORING AND CONTROLLING - CLOSING)
True
When telling stories, select the major factors that one should consider when building a storyboard: a: Context in time is important b: Confirming the business question that needs to be answered c: Refining the answer in iterations as the team builds the analytics solutions d: Collecting data that mainly supports the agenda of the business owner
a,b,c
The CRISP-DM cycle resembles a(n) ______________.
an iterative process focused on exploration
A major underlying framework for a proposal is the _____ and _______.
approach and methodology
1. _____________: This is answering a question, singular. It has a layer of complexity due to the number of people involved in declaring the need. You have to adapt and may waste time on what you think is right.
business need
The CRISP-DM phase that is a major component of the storytelling framework is ______________
business understanding
The first process is _________________________, which is similar to the first step in most problem-solving frameworks: identifying the problem. We are allowed to be creative in this first phase.
business understanding
4. ____________: "Estimating approach based on the assumption that future value of a variable is a mathematical function of the values of the other variable(s). Used where sufficient historical data is available, and the relationship (correlation) between the dependent variable to be forecasted and associated independent variable(s) is well known." It basically looks at multiple distinct events and identifies a pattern in which the data suggests that one event tends to be followed by the next. It gives you a probability of event 1 causing event 2.
causal modeling
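The idea that a future value is a mathematical function of another variable can be sketched with a simple least-squares line. This is only an illustration: the variable names (`ad_spend`, `sales`) and the numbers are hypothetical, and a fitted line shows association in historical data, not proof that one event causes the other.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Covariance-style and variance-style sums around the means.
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    slope = sxy / sxx
    intercept = mean_y - slope * mean_x
    return slope, intercept


# Hypothetical historical data: independent variable, then dependent variable.
ad_spend = [1, 2, 3, 4, 5]
sales = [3, 5, 7, 9, 11]

slope, intercept = fit_line(ad_spend, sales)
# Forecast the dependent variable for a new value of the independent variable.
forecast = slope * 6 + intercept
print(slope, intercept, forecast)
```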
Which approach to Business Analytics attempts to assign each unit in a population into a small set of classes where the unit belongs?
classification
As a consultant, when responding to a proposal the first step is to _______ the questions that are asked. The second step is to determine the _____ to answering the question.
confirm approach
2. _______: compensation and benefits are the main attributes managed in this category
costs
4. __________: In the service industry, employers are interested in gathering information on their employees' performance when interacting with external customers.
customer satisfaction
2. _____________: Data scientists are conducting research to identify data that already exists and are determining other data gathering methods like a survey
data needs
3. ____________: Here the data is acquired and staged. Transferring the data through a secure method is discussed because the integrity of the data is important. Another consideration is the cleanliness of the data, since there could be human error.
data preparation
_____________: or data mart, is a focused data structure for an OLAP architecture. Many times, the OLTP system that is designed to focus on transactions will pipe data to a different platform for OLAP processing. The term Big Data was born out of the data warehouse because a data warehouse needs the capacity to process large amounts of data.
data warehouse
The early roots of analytics come from a classic Business Intelligence technology in the 1950s called ______________.
decision support
_________: All the activities and resources necessary to launch and support the operations of a solution.
deployment
The ability to be objective rather than subjective is closely tied to a person's _______ intelligence factor, EQ. This EQ is cultivated through __________ and _________.
emotional experience knowledge
_________: The phase where the solution is fully tested and accepted. Once sign-off criteria have been met then the solution moves to the implementation or deployment stage.
evaluation
______________: The phase of a project, life cycle, or framework where the solution is fully tested and accepted. Once sign-off criteria have been met then the solution moves to the implementation or deployment stage.
evaluation
Methodology for storytelling 1. _______ data (sample size) 2. _______ data 3. _______ data
gather scrub prepare
Most problem solving methods start with ______________________.
identifying the problem
Thoughtful data scientists recognize the most important fundamental principle of data science is to think carefully about ______________.
identifying the problem to be solved
During the pursuit process, teams are put together to develop proposals, and every action is focused on ________ win probability.
increasing
4. ____________: Data science is a platform for innovation. Setting up processes to study a customer's buying behaviors is insightful when you start gathering data on the following questions: How do customers shop? What do customers buy? How might the items purchased relate to each other? What is a customer's brand loyalty?
innovative indicators
The Vision of ISM3541 is based on developing deep thinkers focused on storytelling with data through ________.
learning, unlearning, and relearning
Which approach to Business Analytics attempts to predict a relationship between two data items?
link prediction
1. ____________: An attempt to predict a relationship between 2 data items; this may be a subjective activity. This is a qualitative approach. We see this on social media.
link prediction
3. ___________: is the gathering of data to reveal trends and patterns. The patterns are candidates for automation through AI, or machine learning. An entire industry of companies focuses on providing technology and services to monitor business processes, with the objective of identifying opportunities for efficiencies and advising companies on realizing them.
machine learning
The 9 core data mining tasks are primarily performed in which stage of data science?
modeling stage
Quadrant 1 is _________ actions. It's the "low-hanging fruit."
near term
5. _____________: don't miss the opportunity to control your career and where you are going. Analytics is a great way to measure the goals you've set for yourself even if you are constantly making tweaks to your career.
personal vision
Client builds software by __________ the work, ________ the work, and _________ the work.
planning, building, running
1. ____________: this can be measured through utilization, business development, professional development, community investment (e.g., volunteering your time).
productivity
A major platform for winning work is the _______ space. When companies _____________ to procure solutions and services, consultants have an opportunity to tell a story and put together a unique bid that will be selected.
proposal go to the market
________ data is critical.
repeating
______________: Various methods exist to capture data on the state of a requirement as it moves from inception all the way to production (e.g., installing the latest version of iOS for an iPhone). Test cases, or use cases, are an effective method to illustrate the requirement and demonstrate its evolution, using data to track the requirement (e.g., pass/fail).
requirements traceability
3. _________: companies focus on retention rates and will try to offer as many services as possible to employees if they are trying to reduce costs by retaining talent. Conversely, some organizations are set up to constantly train people because turnover is high and this type of turnover is commonplace in call centers.
retention
Understanding ___________ is important. You need to use confidence ___________, confidence _______ and margin of _______ to calculate the ideal size.
sample size confidence interval confidence level margin of error
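One common way to turn a confidence level and a margin of error into a sample size is Cochran's formula for a proportion, optionally with a finite population correction. The card above does not name a formula, so treat this as an assumed illustration: the function name `sample_size`, the default proportion of 0.5 (the most conservative choice), and the example population are all hypothetical.

```python
import math
from statistics import NormalDist


def sample_size(confidence_level, margin_of_error, proportion=0.5, population=None):
    """Estimate the sample size needed to estimate a proportion.

    Assumes Cochran's formula: n0 = z^2 * p * (1 - p) / e^2,
    with an optional finite population correction.
    """
    # Two-sided z-score for the requested confidence level (e.g. 0.95 -> ~1.96).
    z = NormalDist().inv_cdf(1 - (1 - confidence_level) / 2)
    n0 = (z ** 2) * proportion * (1 - proportion) / margin_of_error ** 2
    if population is not None:
        # Shrink the estimate when the population itself is small.
        n0 = n0 / (1 + (n0 - 1) / population)
    return math.ceil(n0)


print(sample_size(0.95, 0.05))                    # large population
print(sample_size(0.95, 0.05, population=10000))  # hypothetical population of 10,000
```

A tighter margin of error or a higher confidence level both push the required sample size up.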
Which of these is NOT included in the five steps of the ETL process? a: determine the purpose and scope the data request b: scrub the data c: obtain the data d: validate the data for completeness and integrity
scrub the data
2. ____________: A method of identifying similar individuals based on data known about them. Used in fraud detection systems: you look at trends in behavior, and any outlying behavior may be suspect for fraud.
similarity matching
Which approach to Business Analytics attempts to identify similar individuals based on data known about them?
similarity matching
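Similarity matching can be sketched as a nearest-neighbor lookup: describe each individual as a list of numbers and pick the one closest to a target. The customer names, the two features (purchases per month, average spend), and the use of plain Euclidean distance are assumptions made for illustration only.

```python
import math

# Hypothetical profiles: (purchases per month, average spend in dollars).
profiles = {
    "cust_a": (2, 40.0),
    "cust_b": (3, 45.0),
    "cust_c": (25, 900.0),
}


def most_similar(name, profiles):
    """Return the other individual whose profile is closest to `name`."""
    target = profiles[name]
    others = {k: v for k, v in profiles.items() if k != name}
    # Euclidean distance on raw features; real systems usually scale features first.
    return min(others, key=lambda k: math.dist(target, others[k]))


match = most_similar("cust_a", profiles)
print(match)
```

In a fraud setting, an individual whose nearest neighbor is still far away (like `cust_c` here) would be the outlier worth a closer look.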
Preparing the data is an important process in the CRISP-DM framework. A supervised method relies on a ______________ variable/attribute that has a range of values and gives better context to a row in the data set and how that row of data (e.g. think about the customer in the "write-off" or "non-write-off" outcome) might behave in the future.
target
The biggest metric we focus on in consulting is _____________. No matter your level or position in an organization, you want to be in a position of ____ utilization.
utilization high
Vision and Mission statements: Building a business case for an analytics program is critical to long-term sustainability. The __________ will give insights into an organization's desire to get better, and the _________ tells how the organization is behaving right now, today. A culture fostering continuous improvement and constant learning is primed for an authentic analytics program.
vision mission statement
T/F? The definition of analytics that will be used in this course is the systematic computational analysis of data or statistics
True
The actions requiring more time and risk management are in quadrant _____, which has the highest level of uncertainty.
4
____________: a popular concept born out of the data warehouse era of the 1990s and early 2000s. ETL is a critical function of the Data Preparation phase in the CRISP-DM framework.
ETL: Extract, Transform, and Load
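A minimal sketch of the three ETL steps, using only the Python standard library. The raw CSV text, the `sales` table, and the rule of dropping rows with a missing amount are hypothetical choices made for illustration, not a prescribed pipeline.

```python
import csv
import io
import sqlite3

# Hypothetical raw export with a stray space and one missing amount.
RAW = "customer_id,amount\n1, 19.99\n2,\n3,5.00\n"


def extract(text):
    """Extract: read the raw rows from the source."""
    return list(csv.DictReader(io.StringIO(text)))


def transform(rows):
    """Transform: trim whitespace, drop incomplete rows, convert types."""
    clean = []
    for row in rows:
        amount = row["amount"].strip()
        if not amount:
            continue  # skip rows that fail the completeness check
        clean.append((int(row["customer_id"]), float(amount)))
    return clean


def load(rows, conn):
    """Load: write the cleaned rows into the target table."""
    conn.execute("CREATE TABLE sales (customer_id INTEGER, amount REAL)")
    conn.executemany("INSERT INTO sales VALUES (?, ?)", rows)
    conn.commit()


conn = sqlite3.connect(":memory:")
load(transform(extract(RAW)), conn)
row_count, total = conn.execute("SELECT COUNT(*), SUM(amount) FROM sales").fetchone()
print(row_count, total)
```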
__________: the underlying data structure for an analytics program. The strategy is to have repeating data where you are trying to understand patterns and trends.
OLAP
_________: is designed to eliminate repeating data where possible by leveraging relationships between entities.
OLTP
____________: the underlying data structure for an analytics program. The strategy is to have repeating data where you are trying to understand patterns and trends.
Online Analytical Processing: OLAP
____________: designed to eliminate repeating data where possible by leveraging relationships between entities.
Online Transaction Processing: OLTP
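The contrast between the two designs can be sketched with an in-memory SQLite database: normalized tables linked by a key (OLTP style) and a flattened table where customer details repeat on every row (OLAP style). The table names, columns, and data are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript(
    """
    -- OLTP style: each customer is stored once and referenced by key.
    CREATE TABLE customer (id INTEGER PRIMARY KEY, name TEXT, region TEXT);
    CREATE TABLE orders (
        id INTEGER PRIMARY KEY,
        customer_id INTEGER REFERENCES customer(id),
        amount REAL
    );
    INSERT INTO customer VALUES (1, 'Ada', 'East'), (2, 'Bo', 'West');
    INSERT INTO orders VALUES (10, 1, 20.0), (11, 1, 30.0), (12, 2, 15.0);

    -- OLAP style: one wide table where name and region repeat per order.
    CREATE TABLE sales_flat AS
        SELECT o.id AS order_id, c.name, c.region, o.amount
        FROM orders o JOIN customer c ON c.id = o.customer_id;
    """
)

# Analytical question answered directly from the repeating data.
by_region = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales_flat GROUP BY region")
)
ada_rows = conn.execute(
    "SELECT COUNT(*) FROM sales_flat WHERE name = 'Ada'"
).fetchone()[0]
print(by_region, ada_rows)
```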
________________: refers to the iterative stages or phases that start with Business Understanding moving to Data Understanding followed by Data Preparation then Modeling then Evaluation and finally Deployment.
CRISP-DM
_____________________ is a good generic model for learning an approach and methodology to building a data science program.
CRISP-DM: Cross-Industry Standard Process for Data Mining
3. _____________: This method is an attempt to discover associations between individuals based on transactions involving them. We see vendors such as Walmart, Target, Amazon, and others gather as much data as possible about consumers' buying habits. This data is then analyzed to see how products might be grouped together, creating marketing strategies to package goods together and increase the sales of grouped items.
Co-occurrence Grouping
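Co-occurrence grouping can be sketched by counting how often each pair of products appears in the same transaction. The baskets below are hypothetical, and raw pair counts are only the simplest starting point; real market-basket analysis typically adds measures such as support and confidence.

```python
from collections import Counter
from itertools import combinations


def pair_counts(baskets):
    """Count how many baskets contain each unordered pair of items."""
    counts = Counter()
    for basket in baskets:
        # sorted(set(...)) removes duplicates and gives each pair one canonical order.
        for pair in combinations(sorted(set(basket)), 2):
            counts[pair] += 1
    return counts


# Hypothetical transactions.
baskets = [
    ["bread", "milk", "eggs"],
    ["bread", "milk"],
    ["milk", "eggs"],
    ["bread", "butter"],
    ["bread", "milk", "butter"],
]

counts = pair_counts(baskets)
top_pair, top_count = counts.most_common(1)[0]
print(top_pair, top_count)
```

Pairs that show up together most often are the candidates for bundling or joint promotion.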
___________: refers to the iterative stages, or phases, that start with Business Understanding, moving to Data Understanding, followed by Data Preparation, Modeling, Evaluation, and finally Deployment.
Cross-Industry Standard Process for Data Mining: CRISP-DM
4. __________: You model by looking at 3 questions: 1. Where have you been and what data do you have to support that view? This describes the path one has taken, with the data clearly telling the story: ________ analytics. 2. What could happen to me based on what I know and what data do you have to support that view? This predicts the future based on data and experience: _________ analytics. 3. What should you do based on all of your options and what data do you have to support that view? This is a space where we use risk management, probability, and consensus to launch valuable data models for the unknown: ______________ analytics.
Data Modeling descriptive predictive prescriptive
______________: is a focused data structure for an OLAP architecture. Many times the OLTP system that is designed to focus on transactions will pipe data to a different platform for OLAP processing. The term Big Data was born out of the data warehouse because a data warehouse needs the capacity to process large amounts of data.
Data Warehouse
___________: the ability to recognize biases within oneself and to put oneself in the shoes of someone else. Metrics have been gathered through the years on EQ and organizations will invest time to assess their talent using the Myers-Briggs assessment or the DISC assessment (among others).
EQ: Emotional Quotient, Emotional Intelligence
_________: is a critical function of the Data Preparation phase in the CRISP-DM framework.
ETL
6. _________: This phase is the release, which means capturing all the activities necessary to introduce something new into the ecosystem. We value a controlled environment that reduces human error.
Deploy
5. ___________: This phase is refining and testing: outcomes are analyzed, and the logic and implementation of the data model are tested. Many issues may come up, and through progressive elaboration we are able to adjust for them. The goal is to have a model that we can use in real-life situations.
Evaluation
______________: is a technique that has evolved over the past two to three decades that captures the scope of an application/system by documenting the number of functions. The number of functions is then used as an input into an estimation module that will produce an output illustrating the level of effort required to implement the functions. A predictive model can be applied using the FPA data to forecast the progression of the features throughout the development life cycle.
Function Point Analysis (FPA)
1. _______________: surfaced as a statistic over the past couple decades. Once we set one, we need to focus on gathering data so that we can analyze patterns associated with the process. Adjustments can be made to show process performance based on the current state and also be used to forecast expected performance based on future state.
Key Performance Indicator, KPI
Running tests to find the root cause of the symptoms is called _________________ in the systems world.
Problem Resolution Management
_____________: Companies have a responsibility to manage data properly. Many would argue that data is the most important component of a systems architecture when it comes to protecting it and making it available through the applications. The RPO simply states how much data an organization is willing to lose in the unexpected event of a major outage that takes time to restore.
Recovery Point Objective (RPO)
2. _____________: Businesses manage the risk of a natural, or unexpected, event by trying to mitigate the impact through Disaster Recovery Planning (DRP) or Business Impact Assessment (BIA). The RTO is the amount of time that a service or process can be completely down before a major impact to the business is realized. RTOs are a good way to identify the critical business processes of any organization in any industry that need the most safeguards and risk mitigation.
Recovery Time Objective (RTO)
5. ______________: provide insight into the high-impact processes that have clear expectations and consequences for falling short of expectations. SLAs are popular in third-party agreements, which are commonplace in large business ecosystems. Hotel chains commonly use SLAs to cover the data center, facilities management at the sites, support of reservation systems, loyalty systems, inventory management, etc.
Service Level Agreement (SLA)
_______ help control costs and ensure a level of certainty.
Standards
During the module on "Communication - Storytelling" we explored a placemat that provided the framework for storytelling with data. The step that explores the data mining techniques that we've covered in the class is:
Step 3: Analyze Data -Phase 1
Which of the placemats, or constructs, that have been provided through Chapter 4 provides the best view of selecting the proper data mining method?
Storytelling Placemat
T/F? The Cross Industry Standard Process for Data Mining is a framework for designing data science solutions.
T
The ____ team is responsible for the operations and infrastructure of the environment. They are implementing a new people model called the _____________________, _ _ _. It is the aggregator of people, process, and technology for the organization.
Technical Account Manager, TAM