Intro to Data Analytics
Why is proficiency in Statistics an important skill for a Data Analyst?
For identifying patterns and correlations in data
What Data Analysis roles may be best suited for people with little or no technical training?
Functional analyst
In the data analyst's ecosystem, languages are classified by type. What are shell and scripting languages most commonly used for?
Automating repetitive tasks
When you're combining rows of data from multiple source tables into a single table, what kind of data transformation are you performing?
Union
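As a minimal sketch of a union, the snippet below stacks the rows of two tables that share the same columns. The table names and values (`sales_q1`, `sales_q2`) and the helper `union_rows` are made up for illustration; duplicates are kept, as in SQL's UNION ALL.

```python
# Two hypothetical source tables with identical columns.
sales_q1 = [{"region": "East", "units": 120}, {"region": "West", "units": 95}]
sales_q2 = [{"region": "East", "units": 130}, {"region": "North", "units": 60}]


def union_rows(*tables):
    """Append the rows of every table into one combined table."""
    combined = []
    for table in tables:
        combined.extend(table)
    return combined


all_sales = union_rows(sales_q1, sales_q2)
print(len(all_sales))  # 4 rows: 2 from each source table
```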
When you're calculating the middle value of a data field in a data set, what are you really calculating?
median
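A quick way to check this is Python's standard `statistics` module. The sample values below are invented for illustration.

```python
import statistics

# Hypothetical values of one data field.
ages_odd = [23, 35, 31, 52, 28]
ages_even = [23, 35, 31, 52]

# Odd count: the middle value after sorting (23, 28, 31, 35, 52).
median_odd = statistics.median(ages_odd)
# Even count: the mean of the two middle values (31 and 35).
median_even = statistics.median(ages_even)

print(median_odd)   # 31
print(median_even)  # 33.0
```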
What type of data mining operations was R specifically built to handle?
classification of data
What is one of the most significant advantages of an RDBMS?
Is ACID-compliant
Data Marts and Data Warehouses have typically been relational, but the emergence of what technology has helped to let these be used for non-relational data?
NoSQL
According to the video "Languages for Data Professionals," which of the programming languages supports multiple programming paradigms, such as object-oriented, imperative, functional, and procedural, making it suitable for a wide variety of use cases?
Python
What are some of the steps in the process of "Identifying Data"? (Select all that apply)
Determine the information you want to collect. Define a plan for collecting data.
Which one of the NoSQL database types uses a graphical model to represent and store data, and is particularly useful for visualizing, analyzing, and finding connections between different pieces of data?
graph-based
What is one of the steps in a typical data cleaning workflow?
Inspecting data to detect issues and errors
Which of the data roles is responsible for extracting, integrating, and organizing data into data repositories?
Data engineer
What is the discipline of communicating information through the use of visual elements?
data visualization
Matplotlib is a widely used Python data visualization library.
true
Which of the following is an example of unstructured data?
Video and audio files
Which emerging technology has made it possible for every enterprise to have access to limitless storage and high-performance computing?
Cloud computing
A modern data ecosystem includes a network of continually evolving entities. It includes:
Data sources, enterprise data, business stakeholders, etc.
"A presentation is not a data dump". What is the one thing you would do to ensure your presentation is not a data dump?
Include only what is needed to address the business problem
Which of these is essential for getting started and growing as a Data Analyst?
Love for numbers, a curious mind, openness to learn
Which one of these file formats is independent of software, hardware, and operating systems, and can be viewed the same way on any device?
PDF
Which of the provided options offers simple commands to specify what is to be retrieved from a relational database?
SQL
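For illustration, here is a small sketch of a SQL query that specifies what to retrieve from a relational database, run against an in-memory SQLite database from Python's standard library. The `customers` table and its rows are hypothetical.

```python
import sqlite3

# In-memory database with a hypothetical customers table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.executemany(
    "INSERT INTO customers VALUES (?, ?)",
    [("Ana", "Lima"), ("Ben", "Oslo"), ("Caro", "Lima")],
)

# SELECT states which columns and which rows should come back.
rows = conn.execute(
    "SELECT name FROM customers WHERE city = ? ORDER BY name", ("Lima",)
).fetchall()
conn.close()

print(rows)  # [('Ana',), ('Caro',)]
```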
What can you do to help your audience trust you?
Share your data sources, hypotheses, and validations
OpenRefine is an open-source tool that allows you to:
Transform data into a variety of formats such as TSV, CSV, etc.
Which of these is one of the soft skills required to be a successful Data Analyst?
Work collaboratively with cross-functional teams
What does the attribute "Veracity" imply in the context of Big Data?
accuracy and conformity of data to facts
In "A day in the life of a Data Analyst", what according to Sivaram Jaladi forms a large part of a Data Analyst's job?
cleaning and preparing data
A Principal Data Analyst is responsible for:
Having expertise in tools and tech used in data analytics
Web scraping is used to extract what type of data?
Data from websites, such as text, images, and videos
Which of the following statements describes Data Analyst Specialist Roles?
Analysts who advance technical, statistical and analytical skills over time to expert levels
Which of the data repositories serves as a pool of raw data and stores large amounts of structured, semi-structured, and unstructured data in their native formats?
Data lakes
Data Mining is defined as the process of:
Extracting knowledge from data
Apache Spark is a general-purpose data processing engine designed to extract and process Big Data for a wide range of applications. What is one of its key use cases?
Perform complex analytics in real-time
The first step in the data analysis process is to gain an in-depth understanding of the problem and the desired outcome. What are you seeking answers to at this stage of the data analysis process?
Where you are and where you need to be.
When you detect a value in your data set that is vastly different from other observations in the same data set, what would you report that as?
outlier
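One common way to flag such values is the 1.5 × IQR rule, sketched below. This rule is one of several possible conventions and is an assumption here, not something the course notes prescribe. The readings are made up.

```python
import statistics

# Hypothetical readings; 95 is far from the rest.
readings = [10, 12, 11, 13, 12, 11, 95]

# Quartiles split the sorted data into four parts.
q1, _, q3 = statistics.quantiles(readings, n=4)
iqr = q3 - q1

# Anything beyond 1.5 * IQR from the quartiles is flagged.
lower = q1 - 1.5 * iqr
upper = q3 + 1.5 * iqr
outliers = [x for x in readings if x < lower or x > upper]

print(outliers)  # [95]
```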
From the provided list, select the three emerging technologies that are shaping today's data ecosystem.
cloud computing, machine learning, big data
What is a branch of mathematics dealing with the collection, analysis, interpretation, and presentation of numerical or quantitative data?
Statistics
Which data source can return data in plain text, XML, HTML, or JSON among others?
API
Which of the data analyst functional skills helps research and interpret data, theorize, and make forecasts?
analytical skills
When we analyze data in order to understand why an event took place, which of the four types of data analytics are we performing?
diagnostic analysis
What is the goal of Data Visualization?
make information easy to comprehend, interpret, and retain
Data Analysts work within the data ecosystem to:
Gather, clean, mine and analyze data for deriving insights
When you analyze historical data to predict future outcomes what type of Data Analytics are you performing?
predictive analytics
Data obtained from an organization's internal CRM, HR, and workflow applications is classified as:
primary data
What type of data refers to information obtained directly from the source?
primary data
Job roles such as Project Managers, Marketing Managers, and HR Managers, can achieve greater efficiency and effectiveness in their current roles by acquiring data analysis skills, and are therefore known as analytics-enabled job roles.
True
What does a typical data wrangling workflow include?
Validating the quality of the transformed data.
In "A day in the life of a Data Analyst", what are some of the data points that were useful in analyzing the use case? (Select all that apply)
Serial number of the meters. Average billing amount of complainants.
What is the general tendency of a set of data to change over time called?
trend
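A rough sketch of measuring a trend: fit a straight line to values ordered in time and look at the sign of the slope. The monthly figures below are invented, and a least-squares line is only one simple way to summarize a trend.

```python
# Hypothetical monthly values, oldest first.
monthly = [100, 104, 109, 115, 118, 125]
periods = list(range(len(monthly)))

mean_x = sum(periods) / len(periods)
mean_y = sum(monthly) / len(monthly)

# Least-squares slope: average change in value per period.
sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(periods, monthly))
sxx = sum((x - mean_x) ** 2 for x in periods)
slope = sxy / sxx

direction = "upward" if slope > 0 else "downward" if slope < 0 else "flat"
print(direction)  # upward
```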