Data_Analytics_CS
CAPTURE
Collect or bring in data from a variety of different sources.
How can businesses use their controlled data?
Improve processes, identify opportunities and trends, launch new products, serve customers, make thoughtful decisions
ARCHIVE
Keep relevant data stored for long-term and future reference.
What is Data Analytics?
Puts the data to work through Analysis
Data-inspired decision-making
Exploring different data sources to find out what they have in common.
One goal of structured thinking is organizing the available information to reveal _____ and opportunities.
gaps
What's a report?
A static collection of data given to stakeholders periodically.
In order to get at the root cause of a problem, a data analyst would ask ______ .
the five whys
Questions that make assumptions often involve concepts that are formed without evidence. An example of this is an idea that is accepted as true without proof.
True
In the data life cycle, which phase involves gathering data from various sources and bringing it into the organization?
Capture
MANAGE
Care for and maintain the data. This includes determining how and where it is stored and the tools used to do so.
How many searches does Google process every second? How about per day and per year?
40k. 3.5 billion per day. 1.2 trillion searches every year.
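As a quick sanity check, the per-day and per-year figures follow from the per-second figure by simple multiplication. The sketch below assumes a 365-day year and treats 40,000 per second as a constant rate, which is a simplification.

```python
# Rough consistency check of the figures on this card.
SEARCHES_PER_SECOND = 40_000        # figure quoted on the card
SECONDS_PER_DAY = 24 * 60 * 60      # 86,400 seconds
DAYS_PER_YEAR = 365                 # assumption: non-leap year

searches_per_day = SEARCHES_PER_SECOND * SECONDS_PER_DAY
searches_per_year = searches_per_day * DAYS_PER_YEAR

print(f"{searches_per_day:,} per day")    # 3,456,000,000 (about 3.5 billion)
print(f"{searches_per_year:,} per year")  # 1,261,440,000,000 (about 1.2 trillion)
```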
A data analyst identifies and classifies keywords from customer reviews to improve customer satisfaction. This is an example of which problem type?
Categorizing things
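A minimal sketch of what categorizing review keywords could look like. The keyword mapping and the reviews are hypothetical and invented for illustration; a real analysis would build the mapping from the business's own vocabulary.

```python
from collections import Counter

# Hypothetical keyword-to-category mapping (illustration only).
KEYWORD_CATEGORIES = {
    "late": "delivery",
    "shipping": "delivery",
    "broken": "quality",
    "sturdy": "quality",
    "rude": "service",
    "helpful": "service",
}

def categorize(review: str) -> set[str]:
    """Return the categories whose keywords appear in the review."""
    words = {w.strip(".,!?").lower() for w in review.split()}
    return {cat for kw, cat in KEYWORD_CATEGORIES.items() if kw in words}

# Hypothetical customer reviews.
reviews = [
    "Shipping was late and the box arrived broken.",
    "Very helpful staff!",
    "Sturdy product, but the support agent was rude.",
]

# Count how many reviews touch each category.
category_counts = Counter(cat for r in reviews for cat in categorize(r))
print(category_counts)
```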
What is Data?
A collection of facts. (Numbers, pictures, videos, words, measurements, observation, etc.)
Decision Intelligence
A combination of applied data science and the social and managerial sciences
When would a pie chart be an effective visualization?
A pie chart shows how a whole is broken down into parts and is an effective visualization for a class broken down by age.
Based on what you have learned in this course, which of the following do spreadsheets NOT enable data analysts to do?
Choose a topic for data analysis. Spreadsheets enable data analysts to store, organize, sort, and filter data. This helps them see patterns, group information, and easily find the information they need.
What should you do in each step of Data Analysis Process?
ASK: Ask questions and define the problem. PREPARE: Prepare data by collecting and storing the information. PROCESS: Process data by cleaning and checking the information. ANALYZE: Analyze data to find patterns, relationships, and trends. SHARE: Share data with your audience. ACT: Act on the data and use the analysis results.
Machine Learning
Automates many decisions under uncertainty. The extraction of knowledge from data based on algorithms created from training data. A type of artificial intelligence that leverages massive amounts of data so that computers can improve the accuracy of actions and predictions on their own without additional programming. A type of artificial intelligence that enables computers to both understand concepts in the environment, and also to learn.
A data analyst shares insights from their analysis during a formal presentation to stakeholders. In a slideshow, they make a data-driven recommendation for how to solve a business problem. What phase of the data analysis process would come next?
Act
ACTION-ORIENTED QUESTIONS
Action-oriented questions encourage change. You might remember that problem solving is about seeing the current state and figuring out how to transform it into the ideal future state. Action-oriented questions help you get there. So rather than asking, "How can we get customers to recycle our product packaging?" you could ask, "What design features will make our packaging easier to recycle?" This brings you answers you can act on.
How did the SMART framework help you arrive at your conclusions? The SMART framework played a pivotal role in guiding the conversation and structuring the questions to ensure they were Specific, Measurable, Action-oriented, Relevant, and Time-bound. Here's how each element contributed: Specific: By asking specific questions, I was able to narrow down the focus to a particular area of concern - inventory management. This specificity helped the sweets store owner and me to delve deeper into a targeted aspect of the business rather than discussing broad and vague topics. Measurable: The framework encouraged the identification of specific metrics, such as daily sales velocity, that could be measured over time. This measurability provided a tangible way to assess the effectiveness of the proposed data-driven changes.
Action-oriented: The questions were designed to elicit information about the actions currently taken based on data, as well as potential actions that could be implemented in the future. This ensured that the data collected would directly inform actionable decisions, aligning with the owner's goals. Relevant: Each question was aimed at uncovering information directly relevant to the sweets store owner's business objectives. This relevance ensured that the data collected would have a meaningful impact on improving inventory management and, consequently, the overall success of the business. Time-bound: The inclusion of a specific timeframe - a three-month trial period - provided a clear deadline for implementing changes and evaluating their impact. This time-bound approach adds urgency and structure to the data-driven initiatives, fostering a more efficient and focused implementation process. In summary, the SMART framework ensured that the conversation was goal-oriented, data-driven, and strategically aligned with the sweets store owner's specific needs and aspirations. It created a roadmap for actionable insights that could be derived from the data collected, enhancing the effectiveness of the proposed changes.
After opening the ice cream shop on her farm, the same dairy farmer then surveys the local community about people's favorite flavors. She uses the data she collected to determine that the top five flavors are strawberry, vanilla, chocolate, mint chip, and peanut butter. She feels confident in her decision to sell these flavors. This is part of which phase of the data life cycle?
Analyze
During which phase of data analysis would a data analyst use spreadsheets or query languages to transform data in order to draw conclusions?
Analyze
A data analyst focuses much of their work effort for a business on what?
Business tasks
Coke launch failure In 1985, New Coke was launched, replacing the classic Coke formula. The company had done taste tests with 200,000 people and found that test subjects preferred the taste of New Coke over Pepsi, which had become a tough competitor. Based on this data alone, classic Coke was taken off the market and replaced with New Coke. This was seen as the solution to take back the market share that had been lost to Pepsi.
But as it turns out, New Coke was a massive flop and the company ended up losing tens of millions of dollars. How could this have happened with data that seemed correct? It is because the data wasn't complete, which made it inaccurate. The data didn't consider how customers would feel about New Coke replacing classic Coke. The company's decision to retire classic Coke was a data-driven decision based on incomplete data.
Fill in the blank: For a data analyst, curiosity is the analytical skill of seeking out new ______ and experiences in order to gain knowledge.
CHALLENGES
Fill in the blank: Gathering additional information about data to understand the broader picture is an example of understanding _____.
CONTEXT
Data Analyst / Data Engineer / Data Scientist
A data analyst is someone who works with SQL, spreadsheets, and databases, and might work on a business intelligence team creating dashboards. Where does all that data come from? From data engineers, who generally work alongside analysts to turn raw data into actionable pipelines. Once the data engineers have built these pipelines and the data analysts have provided clean, actionable data, the data scientists turn it into machine learning models or statistical inferences that go well beyond what you might have imagined.
What are some reasons why a data analyst might use data visualizations?
Data analysts use data visualizations to explain complex data quickly, reinforce data analysis, and create interesting graphs and charts.
PLAN
Decide what kind of data is needed, how it will be managed, and who will be responsible for it.
Data Analyst (DA): Good morning! I've been looking into ways to enhance the efficiency of businesses, and I'm curious about how data plays a role in your sweets store. Mind if I ask you a few SMART questions?
Sweets Store Owner (SSO): Good morning! Absolutely, fire away.
DA: Great! First, can you specify a particular aspect of your business where you believe data could make a measurable impact?
SSO: Definitely. I've been thinking about our inventory management. Sometimes we run out of popular sweets, and other times we have too much of something less popular.
DA: That's a great start. How do you currently measure the success of your inventory management, and what actions do you take based on those measurements?
SSO: Right now, it's a bit manual. We check what sells well each day and adjust orders accordingly. It's time-consuming.
DA: Understood. To make it more action-oriented, could you envision a specific metric or key performance indicator (KPI) that, if tracked, would help you make quicker and more accurate decisions regarding your inventory?
SSO: Maybe we could track the daily sales velocity of each product. That way, we'd know which sweets are consistently popular.
DA: Excellent! Moving on, how relevant do you think this data would be to your overall business goals?
SSO: Extremely relevant. Efficient inventory management means we can meet customer demand, reduce waste, and ultimately improve our bottom line.
DA: Last question: for a more time-bound approach, can you think of a specific timeframe, like a trial period, where we could implement changes and then analyze the impact of these data-driven adjustments?
SSO: Let's aim for a three-month trial. We usually see some seasonal variations, so that should give us a good sense.
DA: Perfect. Thanks for sharing! I'll start working on a data framework that aligns with these SMART goals. I believe with the right data, we can sweeten up your inventory management.
SSO: I'm excited! Let's make it happen.
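The "daily sales velocity" metric mentioned in the conversation could be tracked along these lines. This is only a sketch: the conversation does not define the metric, so it is assumed here to mean average units sold per day on which the product sold, and the sales log is hypothetical.

```python
from collections import defaultdict

# Hypothetical daily sales log: (day, product, units sold).
sales_log = [
    ("2024-01-01", "fudge", 12),
    ("2024-01-01", "toffee", 3),
    ("2024-01-02", "fudge", 15),
    ("2024-01-02", "toffee", 5),
    ("2024-01-03", "fudge", 9),
    ("2024-01-03", "toffee", 4),
]

def daily_sales_velocity(log):
    """Average units sold per recorded day for each product (assumed definition)."""
    totals = defaultdict(int)   # total units per product
    days = defaultdict(set)     # distinct days each product appears in the log
    for day, product, units in log:
        totals[product] += units
        days[product].add(day)
    return {product: totals[product] / len(days[product]) for product in totals}

velocity = daily_sales_velocity(sales_log)
print(velocity)  # {'fudge': 12.0, 'toffee': 4.0}
```

Comparing these averages over the three-month trial would show which sweets are consistently popular.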
1 - Plan for the conversation First, decide who you will speak with and how they might use data. Your goal is to plan for a successful conversation. Think about how much time you need and how you will use it. For this step, review the following advice: Prioritize your questions: Prepare to ask the most important and interesting questions first. Make your time count: Stay on subject during the conversation. Clarify your understanding: To avoid confusion, build in some time to summarize answers to make sure you understood them correctly. This will go a long way in helping you avoid mistakes. For example, in a conversation with a teacher, you might check your understanding with a statement like, "Just to double check that I understand what you're saying correctly, you currently use test scores in the following ways..."
Depending on the field they are in, the person you chat with may not be comfortable sharing detailed data with you. That's okay! Be sure to respect what they are willing to share during your conversation.
Some common topics for questions include
Objectives, Audience, Time, Resources, Security
The steps of the data life cycle are:
Plan: What plans and decisions do you need to make? What data do you need to answer your question? Capture: Where does your data come from? How will you get it? Manage: How will you store your data? What should it be used for? How do you keep this data secure and protected? Analyze: How will the company analyze the data? What tools should they use? Archive: What should they do with their data when it gets old? How do they know when it's time? Destroy: Should they ever dispose of any data? If so, when and how?
Grouping data based on common features.
Problem type Categorizing things
How did you use a spreadsheet to help prepare your data? How did you format your chart to help you analyze your data?
A spreadsheet helps you structure data in rows and columns, prepare data for analysis, and create custom data visualizations. To better analyze your data, you clean up your chart to make it more visually appealing and to clarify what data means by making your chart more descriptive. To do that, it's important to add chart titles and axis titles. Ultimately, this is an essential skill to master because clear, descriptive data visualizations help data analysts be great storytellers.
A data analyst works for an appliance manufacturer. Last year, the company's profits were down. Lower profits can be a result of fewer people buying appliances, higher costs to make appliances, or a combination of both. The analyst recognizes that those are big issues to solve, so they break down the problems into smaller pieces to analyze them in an orderly way. Which analytical skill is the data analyst using?
A technical mindset
A data analyst uses a spreadsheet function to aggregate data. Then, they add a pivot table to show totals from least to greatest. This would happen during which phase of the data life cycle?
Analyze
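Outside a spreadsheet, the same aggregate-then-sort step could be sketched as follows. The rows are hypothetical, and plain Python stands in for the spreadsheet function and pivot table named on the card.

```python
from collections import defaultdict

# Hypothetical sales rows: (region, amount).
rows = [("North", 120), ("South", 80), ("North", 60), ("East", 200), ("South", 40)]

# Aggregate: sum the amounts per region (what a SUM-style function or pivot table does).
totals = defaultdict(int)
for region, amount in rows:
    totals[region] += amount

# Show totals from least to greatest.
pivot = sorted(totals.items(), key=lambda item: item[1])
print(pivot)  # [('South', 120), ('North', 180), ('East', 200)]
```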
What are the different processes for Data Analysis?
Ask, Prepare, Process, Analyze, Share and Act
A company takes the insights provided by its data analytics team, validates them, and finalizes a strategy. They then implement a plan to solve the original business problem. This describes the share step of the data analysis process.
False
A data analyst finishes using a dataset, so they erase or shred the files in order to protect private information. This is called archiving.
False. Erasing or shredding files describes the destroy phase of the data life cycle. Archiving involves storing files in a place where they're still available.
A data analyst at an online retailer works with historical sales data. The analyst identifies repeating trends in the sales data. This is an example of which problem type?
Finding patterns
A set of instructions used to perform a specified calculation is known as what?
Formula
Fill in the blank: A query is used to _____ information from a database. Select all that apply.
RETRIEVE (GET), UPDATE and REQUEST
RELEVANT QUESTIONS
Relevant questions matter, are important and have significance to the problem you're trying to solve. Let's say you're working on a problem related to a threatened species of frog. And you asked, why does it matter that Pine Barrens tree frogs started disappearing? This is an irrelevant question because the answer won't help us find a way to prevent these frogs from going extinct. A more relevant question would be, what environmental factors changed in Durham, North Carolina between 1983 and 2004 that could cause Pine Barrens tree frogs to disappear from the Sandhills Regions? This question would give us answers we can use to help solve our problem.
DESTROY
Remove data from storage and delete any shared copies of the data.
2 - Create questions Now, come up with questions to help you understand their business goals, the type of data they interact with, and any limitations of the data. Use the SMART question framework to make sure each question you ask makes sense based on their field. Each question should meet as many of the SMART criteria as possible. As a reminder, SMART questions are
Specific: Questions are simple, significant, and focused on a single topic or a few closely related ideas. Measurable: Questions can be quantified and assessed. Action-oriented: Questions encourage change. Relevant: Questions matter, are important, and have significance to the problem you're trying to solve. Time-bound: Questions specify the time to be studied.
How does data evolve over time? Give 2 examples.
The analysis can give us new information throughout data's entire life cycle. Ex1: Read reviews of a product before you buy it. Ex2: Wear a fitness tracker to count your steps to stay active.
Understanding Context
The analytical skill that has to do with how you group things into categories
A technical mindset
The analytical skill that involves breaking processes down into smaller steps and working with them in an orderly, logical way
TIME-BOUND QUESTIONS
Time-bound questions specify the time to be studied. The time period we want to study is 1983 to 2004. This limits the range of possibilities and enables the data analyst to focus on relevant data.
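In practice, a time-bound question translates into a filter on the data. The sketch below uses hypothetical yearly observation records (the counts are invented for illustration) and keeps only the 1983 to 2004 window.

```python
# Hypothetical yearly observations; the counts are made up for illustration.
observations = [
    {"year": 1979, "frogs_counted": 140},
    {"year": 1983, "frogs_counted": 132},
    {"year": 1995, "frogs_counted": 77},
    {"year": 2004, "frogs_counted": 21},
    {"year": 2010, "frogs_counted": 18},
]

START_YEAR, END_YEAR = 1983, 2004  # the period named in the question

# Keep only records inside the time period being studied (inclusive on both ends).
in_scope = [o for o in observations if START_YEAR <= o["year"] <= END_YEAR]
print([o["year"] for o in in_scope])  # [1983, 1995, 2004]
```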
In a spreadsheet, what is text wrapping used for?
To allow all of the text to fit inside a cell
ANALYZE
Use the data to solve problems, make decisions, and support business goals.
As a recently promoted data scientist, one of your responsibilities is the implementation of data strategy. What would this responsibility include?
Managing the people, processes, and tools involved
MEASURABLE QUESTIONS
Measurable questions can be quantified and assessed. An example of an unmeasurable question would be, "Why did a recent video go viral?" Instead, you could ask, "How many times was our video shared on social channels the first week it was posted?" That question is measurable because it lets us count the shares and arrive at a concrete number.
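Counting first-week shares is a small computation once share dates are available. A minimal sketch, assuming a hypothetical post date and share log, and taking "first week" to mean the seven days starting on the posting day:

```python
from datetime import date, timedelta

posted_on = date(2024, 3, 1)  # hypothetical posting date

# Hypothetical log with one entry per share.
share_dates = [
    date(2024, 3, 1),
    date(2024, 3, 2),
    date(2024, 3, 2),
    date(2024, 3, 7),
    date(2024, 3, 8),
    date(2024, 3, 15),
]

# First week = posting day plus the next six days (end bound is exclusive).
first_week_end = posted_on + timedelta(days=7)
first_week_shares = sum(posted_on <= d < first_week_end for d in share_dates)
print(first_week_shares)  # 4
```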
Fill in the blank: In a data table, a row is called an observation. An observation includes all of the _____ for what is contained in the row.
Attributes
You have been learning why data is such a powerful business tool and how data analysts help their companies make data-driven decisions for great results. As a quick reminder, the goal of all data analysts is to use data to draw accurate conclusions and make good recommendations. That all starts with having complete, correct, and relevant data.
But keep in mind, it is possible to have solid data and still make the wrong choices. It is up to data analysts to interpret the data accurately. When data is interpreted incorrectly, it can lead to huge losses.
Identifying a relationship between two or more pieces of data is known as what?
Correlation
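One common way to quantify such a relationship is the Pearson correlation coefficient. The card does not name a specific measure, so this choice is an assumption, and the data below is hypothetical. Note that correlation alone does not show that one variable causes the other.

```python
from math import sqrt

def pearson(xs, ys):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / sqrt(var_x * var_y)

# Hypothetical data: daily temperature and ice cream cones sold.
temperature_c = [18, 21, 24, 27, 30]
cones_sold = [20, 26, 32, 38, 44]  # rises in lockstep with temperature

r = pearson(temperature_c, cones_sold)
print(round(r, 3))  # 1.0, a perfect positive linear relationship
```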
The data analysis process phases are ask, prepare, process, analyze, share, and act. What do data analysts do during the ask phase?
Define the problem to be solved
Things to avoid when asking questions Leading questions: questions that only have a particular response Example: This product is too expensive, isn't it? This is a leading question because it suggests an answer as part of the question. A better question might be, "What is your opinion of this product?" There are tons of answers to that question, and they could include information about usability, features, accessories, color, reliability, and popularity, on top of price. Now, if your problem is actually focused on pricing, you could ask a question like "What price (or price range) would make you consider purchasing this product?" This question would provide a lot of different measurable responses. Closed-ended questions: questions that ask for a one-word or brief response only
Example: Were you satisfied with the customer trial? This is a closed-ended question because it doesn't encourage people to expand on their answer. It is really easy for them to give one-word responses that aren't very informative. A better question might be, "What did you learn about customer experience from the trial?" This encourages people to provide more detail besides "It went well." Vague questions: questions that aren't specific or don't provide context Example: Does the tool work for you? This question is too vague because there is no context. Is it about comparing the new tool to the one it replaces? You just don't know. A better inquiry might be, "When it comes to data entry, is the new tool faster, slower, or about the same as the old tool? If faster, how much time is saved? If slower, how much time is lost?" These questions give context (data entry) and help frame responses that are measurable (time).
An airport wants to make its luggage-handling process faster and simpler for travelers. A data analyst examines and evaluates how the process works currently in order to achieve the goal of a more efficient process. What methodology do they use?
Gap analysis
Which of the following examples are leading questions?
How has our product helped make your life easier? How much would you pay for this convenience?
A dairy farmer decides to open an ice cream shop on her farm. After surveying the local community about people's favorite flavors, she takes the data they provided and stores it in a secure hard drive so it can be maintained safely on her computer. This is part of which phase of the data life cycle?
Manage
If you are having a conversation with a small business owner of an ice cream shop, you could ask:
Specific: What data do you use to help with purchasing and inventory? Measurable: Can you order (rank) these factors from most to least influential on sales: price, flavor, and time of year (season)? Action-oriented: Is there a single factor you need more data on so you can potentially increase sales? Relevant: How do you advertise to or communicate with customers? Time-bound: What does your year-over-year sales growth look like for the last three years?
If you are having a conversation with a teacher, you might ask different questions, such as:
Specific: What kind of data do you use to build your lessons? Measurable: How well do student benchmark test scores correlate with their grades? Action-oriented: Do you share your data with other teachers to improve lessons? Relevant: Have you shared grading data with an entire class? If so, do students seem to be more or less motivated, or about the same? Time-bound: In the last five years, how many times did you review data from previous academic years?
Fill in the blank: During the _____ phase of the data life cycle, a business decides what kind of data it needs, how it will be managed, who will be responsible for it, and the optimal outcomes.
Planning
A data analyst is trying to understand what data to use to help solve a business problem. They're asking questions such as, "What internal data is available in the database?" and "What outside facts do I need to research?" The data analyst is in which phase of the data analysis process?
Prepare
Identifying similar challenges across different entities—and using data and insights to find common solutions.
Problem type Discovering connections
Using historical data about what happened in the past to understand how likely it is to happen again.
Problem type Finding patterns
Recognizing broader concepts and trends from categorized data.
Problem type Identifying themes
Using data to make informed decisions about how things may be in the future.
Problem type Making predictions
Identifying data that is different from the norm.
Problem type Spotting something unusual
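A simple way to flag data that differs from the norm is a z-score check: measure how many standard deviations each value sits from the mean. This is one possible technique, not the only one; the threshold of 2 and the readings below are assumptions for illustration.

```python
from statistics import mean, stdev

# Hypothetical daily readings with one surprising value.
daily_yield = [50, 52, 49, 51, 50, 48, 95, 51]

THRESHOLD = 2.0  # assumed cutoff: flag values more than 2 standard deviations away

mu = mean(daily_yield)
sigma = stdev(daily_yield)  # sample standard deviation

# Keep the values whose distance from the mean exceeds the threshold.
unusual = [v for v in daily_yield if abs(v - mu) / sigma > THRESHOLD]
print(unusual)  # [95]
```

On small samples a single extreme value inflates the standard deviation, so the threshold usually needs tuning to the dataset.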
In which step of the data analysis process would an analyst ask questions such as, "What data errors might get in the way of my analysis?" or "How can I clean my data so the information I have is consistent?"
Process
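The kind of cleaning done in the process phase can be sketched as below. The survey rows and the specific rules (trim whitespace, normalize case, drop rows with missing counts, drop duplicates) are assumptions chosen for illustration.

```python
# Hypothetical raw survey rows with inconsistent formatting.
raw_rows = [
    {"flavor": " Vanilla ", "votes": "12"},
    {"flavor": "vanilla", "votes": "12"},    # duplicate once normalized
    {"flavor": "Mint Chip", "votes": ""},    # incomplete: missing vote count
    {"flavor": "CHOCOLATE", "votes": "9"},
]

def clean(rows):
    """Normalize text, drop incomplete rows, and remove duplicates."""
    seen = set()
    cleaned = []
    for row in rows:
        flavor = row["flavor"].strip().lower()  # consistent spacing and case
        votes = row["votes"].strip()
        if not votes.isdigit():                 # skip rows without a valid count
            continue
        record = (flavor, int(votes))
        if record in seen:                      # skip exact duplicates
            continue
        seen.add(record)
        cleaned.append({"flavor": flavor, "votes": int(votes)})
    return cleaned

cleaned_rows = clean(raw_rows)
print(cleaned_rows)
```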
Data Design
The analytical skill that involves how you organize information
Data Strategy
The analytical skill that involves managing the people, processes and tools used in data analysis
Examples of SMART questions Here's an example that breaks down the thought process of turning a problem question into one or more SMART questions using the SMART method: What features do people look for when buying a new car? Specific: Does the question focus on a particular car feature? Measurable: Does the question include a feature rating system? Action-oriented: Does the question influence creation of different or new feature packages? Relevant: Does the question identify which features make or break a potential car purchase? Time-bound: Does the question validate data on the most popular features from the last three years?
Questions should be open-ended. This is the best way to get responses that will help you accurately qualify or disqualify potential solutions to your specific problem. So, based on the thought process, possible SMART questions might be: On a scale of 1-10 (with 10 being the most important) how important is your car having four-wheel drive? What are the top five features you would like to see in a car package? What features, if included with four-wheel drive, would make you more inclined to buy the car? How much more would you pay for a car with four-wheel drive? Has four-wheel drive become more or less popular in the last three years?
In which data analysis phase would a data analyst use visuals such as charts or graphs to simplify complex data for better understanding?
Share
What is a Data Analyst?
Someone who collects, transforms, and organizes data in order to help make informed decisions
SPECIFIC QUESTIONS
Specific questions are simple, significant, and focused on a single topic or a few closely related ideas. This helps us collect information that's relevant to what we're investigating. If a question is too general, try to narrow it down by focusing on just one element. For example, instead of asking a closed-ended question like, "Are kids getting enough physical activity these days?" ask, "What percentage of kids achieve the recommended 60 minutes of physical activity at least five days a week?" That question is much more specific and can give you more useful information.
For instance, if you have a conversation with someone who works in retail, you might lead with questions like:
Specific: Do you currently use data to drive decisions in your business? If so, what kind(s) of data do you collect, and how do you use it? Measurable: Do you know what percentage of sales is from your top-selling products? Action-oriented: Are there business decisions or changes that you would make if you had the right information? For example, if you had information about how umbrella sales change with the weather, how would you use it? Relevant: How often do you review data from your business? Time-bound: Can you describe how data helped you make good decisions for your store(s) this past year?
What is Data Analysis?
The collection, transformation, and organization of Data in order to draw conclusions, make predictions, and drive informed decision-making. TURNING DATA INTO INSIGHTS.
Data Science
The discipline that makes Data useful
The data life cycle deals with the stages that data goes through during its useful life; data analysis involves following a process to analyze data.
True
Which problem type is involved when a data analyst working for an agricultural company examines why a dataset has a surprising and rare data point?
Spotting something unusual
A good reflection on this topic would describe how you applied SMART questions to the scenario. Here are a few questions you might want to ask: When is the project due? Are there any specific challenges to keep in mind? Who are the major stakeholders for this project, and what do they expect this project to do for them? Who am I presenting the results to? Here are some examples of questions you might ask based on the suggested topics: Objectives: What are the goals of the deep dive? What, if any, questions are expected to be answered by this deep dive? Audience: Who are the stakeholders? Who is interested or concerned about the results of this deep dive? Who is the audience for the presentation? Time: What is the time frame for completion? By what date does this need to be done? Resources: What resources are available to accomplish the deep dive's goals? Security: Who should have access to the information?
These questions can help you focus on techniques and analyses that produce results of interest to stakeholders. They also clarify the deliverable's due date, which is important to know so you can manage your time effectively. When you start work on a project, you need to ask questions that align with the plan and the goals and help you explore the data. The more questions you ask, the more you learn about your data, and the more powerful your insights will be. Asking thorough and specific questions means clarifying details until you get to concrete requirements. With clear requirements and goals, it's much easier to plan and execute a successful data analysis project and avoid time-consuming problems down the road.
During the process phase of data analysis, a data analyst cleans data to ensure it's complete and correct.
True
A data analytics team works to recognize the current problem. Then, they organize available information to reveal gaps and opportunities. Finally, they identify the available options. These steps are part of what process?
Using structured thinking
Rephrase the sentence to be a good question. "How can we get more customers to recycle our product packaging?"
What design features will make our packaging easier to recycle?
A data analyst has entered the analyze step of the data analysis process. Identify the questions they might ask during this phase. Select all that apply.
What story is my data telling me? How will my data help me solve this problem?
Fill in the blank: The question, "How could we improve our website to simplify the returns process for our online customers?" is _____-oriented.
action
3 - Take good notes It is important to take good notes during your conversation. Your notes should be comprehensive and useful. To help you capture meaningful notes, you should stick to a process of asking a question, clarifying your understanding of their response, and then briefly recording it in your notes. Remember: If a question is worth asking, then the answer is worth recording. Commit yourself to taking great notes during your conversation. Helpful aspects of your conversation to note include: Facts: Write down any concrete piece of information, such as dates, times, names, and other specifics. Context: Facts without context are useless. Note any relevant details that are needed in order to understand the information you gather. Unknowns: Sometimes you may miss an important question during a conversation. Make a note when this happens so you can figure out the answer later.
For example, if the previous SMART questions led the ice cream shop owner to propose a project to analyze customer flavor preferences, your notes might appear something like this: Project: Collect customer flavor preference data. Overall business goal: Use data to offer or create more popular flavors. Two data sources: Cash register receipts and completed customer surveys (email). Target completion date: Q2. To do: Call back later and speak with the manager about the location of survey data. The notes you will take will differ greatly based on the data conversation you have. The important thing is that your notes are clear, organized, and concise.