Udemy Exam Prep Course


A Zoology professor named Christle is conducting a poll of her students' favorite animal. Zebras were selected as the class favorite because of their unique pattern. The pattern of a zebra would be considered what kind of data? A.) Quantitative Data B.) Qualitative Data C.) Ordinal Data D.) Nominal Data

A Zoology professor named Christle is conducting a poll of her students' favorite animal. Zebras were selected as the class favorite because of their unique pattern. The pattern of a zebra would be considered what kind of data? B.) Qualitative Data -------------------------------------------------------------- *This is an example of qualitative data because it does not have a numerical value associated with it. *Nominal and ordinal data are both subcategories of qualitative data, but this question does not specifically ask about ordering animals or their patterns.

A bank, brokerage company, or investment firm would be most concerned with which of the following? ("specifically speaking") A.) Data Breach B.) Intellectual Property (IP) C.) Personally Identifiable Financial Information (PIFI) D.) Protected Health Information (PHI)

A bank, brokerage company, or investment firm would be most concerned with which of the following? C.) Personally Identifiable Financial Information (PIFI) -------------------------------------------------------------- *Personally identifiable financial information (PIFI) is information about a consumer provided to a financial institution. PIFI includes information such as account number, credit/debit card number, personal information (such as name and contact information), and social security number. *Protected health information (PHI) is health-related data that can be used to identify an individual. PHI includes information about a person's past, present, or future health, and payments and data used in the operation of a healthcare business. *Intellectual property (IP) is an intangible product of human thought and creativity. Intellectual property is legally protected by copyrights, patents, trademarks, and trade secrets. *A data breach is the loss or disclosure of private or sensitive information. This occurs when data is read, modified, or deleted without authorization.

A data analyst may need to convert field entries. Which of the following field types is the most flexible? A.) Text B.) Alphanumeric C.) Date D.) Numeric

A data analyst may need to convert field entries. Which of the following field types is the most flexible? B.) Alphanumeric -------------------------------------------------------------- *Alphanumeric fields are the most flexible with data entry. You may also see this referred to as "char," "text," or "string" fields. The database's design and its usefulness hinge on the database designer being intentional with the choice of each field data type.

A data analyst named Dante works for a toiletry distribution company. He submits a proposal to the IT director to hire consultants to create an application. This application will use a set of protocols within a computer system so that two unrelated analysis platforms can communicate with each other. This application is an example of what? A.) A web service B.) Machine data C.) Application Programming Interface (API) D.) Web scraping

A data analyst named Dante works for a toiletry distribution company. He submits a proposal to the IT director to hire consultants to create an application. This application will use a set of protocols within a computer system so that two unrelated analysis platforms can communicate with each other. This application is an example of what? C.) Application Programming Interface (API) -------------------------------------------------------------- *An API is a set of protocols within a computer system that allows two unrelated systems to communicate. *A web service is a type of API that allows a hosted computer on a network to share data back and forth with a computer in the same hosted environment. *Machine data is produced by a machine rather than a human. For example, time stamps for computer logins are automatically generated. *Web scraping is the act of pulling information from a website and can be done with automation or by hand.

A data analyst named Jamario is working on a database for a gas and electricity company. He uses a customer ID number to query a table for a consumer's gas consumption along with a separate table that holds an electric bill. What type of data relationship is his query demonstrating? A.) A Many-to-Many Relationship B.) A One-to-One Relationship C.) A Many-to-One Relationship D.) A One-to-Many Relationship

A data analyst named Jamario is working on a database for a gas and electricity company. He uses a customer ID number to query a table for a consumer's gas consumption along with a separate table that holds an electric bill. What type of data relationship is his query demonstrating? D.) A One-to-Many Relationship -------------------------------------------------------------- *The customer ID is associated with multiple records in other tables (in which the key is foreign). *A one-to-one relationship means that one record in a table will be associated with only one record in the other table. This does not apply here because Jamario is pulling both the gas and electric tables with his query. *A many-to-many relationship has many records associated with many other records.

A social media start-up company is deciding how to best store large quantities of unstructured data. This data includes videos, GIFs, and images. What technology will the company most likely use to accomplish this? A.) Data Lake B.) Data Warehouse C.) Data Mart D.) Data Lakehouse

A social media start-up company is deciding how to best store large quantities of unstructured data. This data includes videos, GIFs, and images. What technology will the company most likely use to accomplish this? A.) Data Lake -------------------------------------------------------------- *Data lakes hold both structured and unstructured data, in its original format, before it is ready for cleaning or analysis. *A data mart is a data warehouse that is dedicated to a specific department or group. *Data warehouses are dedicated to the storage of company data from a wide range of sources and typically require more structure than data lakes. *A data lakehouse combines the flexibility and cost effectiveness of a data lake with the structure of a data warehouse.

Angel is a data analyst for an investment firm. She must deliver a presentation on the size of investments the firm has made in different geographic areas. Which of the following would best suit her needs? A.) Dot Map B.) Waterfall Chart C.) Filled Map D.) Layered Map

Angel is a data analyst for an investment firm. She must deliver a presentation on the size of investments the firm has made in different geographic areas. Which of the following would best suit her needs? A.) Dot Map -------------------------------------------------------------- *There are two styles of maps that are commonly used to display geographic data, the dot map and the filled map. *The dot map uses markers to note specific spots on the map and the filled map fills in the borders of a location. *A layered map blends the attributes of both a dot map and a filled map and is overlaid to display more information. *A waterfall chart is used to show performance over time and visualizes how money flows from a starting balance to an ending balance. For example, this can be used for tracking operating expenses, cash flow, or growth in customers' investments.

Angel is a data analyst for an investment firm. She must deliver a presentation on which states the firm has invested in while also displaying the amount of money invested. Which of the following would best suit her needs? A.) Dot Map B.) Filled Map C.) Waterfall Chart D.) Layered Map

Angel is a data analyst for an investment firm. She must deliver a presentation on which states the firm has invested in while also displaying the amount of money invested. Which of the following would best suit her needs? D.) Layered Map -------------------------------------------------------------- *A layered map blends the attributes of both a dot map and a filled map and is overlaid to display more information. A layered map could show both where the firm is invested, with a filled map, and how much money is invested, with a dot map. *There are two styles of maps that are commonly used to display geographic data, the dot map and the filled map. The dot map uses markers to note specific spots on the map and the filled map fills in the borders of a location. *A waterfall chart is used to show performance over time and visualizes how money flows from a starting balance to an ending balance. For example, this can be used for tracking operating expenses, cash flow, or growth in customers' investments.

Angel is the IT director for a nation-wide pet clothing consignment store. She has directed her data architects to create separate databases for clothing sales designed for each kind of common household pet to keep company systems organized. The cat, dog, bird, and rodent departments must all have separate databases to accomplish this. Which technology would best support Angel's requirement? A.) Data Mart B.) Data Warehouse C.) Data Lake D.) Data Lakehouse

Angel is the IT director for a nation-wide pet clothing consignment store. She has directed her data architects to create separate databases for clothing sales designed for each kind of common household pet to keep company systems organized. The cat, dog, bird, and rodent departments must all have separate databases to accomplish this. Which technology would best support Angel's requirement? A.) Data Mart -------------------------------------------------------------- *A data mart is a data storage technology used for specific departments or needs within a company. *Data warehouses are dedicated to the storage of company data from a wide range of sources and typically require more structure than data lakes. *Data lakes hold both structured and unstructured data in its original format while it is not yet ready for cleaning or analysis. *Data lakehouses provide flexibility, like a data lake, and yet are often more cost effective than a data warehouse.

Arina will take her Data+ exam next week. Please assist her studies by identifying the type of reporting that would be used to fulfill a one-time request that is time sensitive. A.) Research-Driven Report B.) Compliance Report C.) Ad hoc Report D.) Operational Report

Arina will take her Data+ exam next week. Please assist her studies by identifying the type of reporting that would be used to fulfill a one-time request that is time sensitive. C.) Ad hoc Report -------------------------------------------------------------- *Ad hoc reports are generated to fulfill one-time requests and are typically time sensitive. *A compliance report is a report that must be run for compliance or regulatory reasons and includes safety reports, financial reports, and health reports. *A research-driven report relies on research to inform and change business practices, usually to support reaching organizational goals. *An operational report is used to inform on the status of a project, product, or organization. These reports can be used to measure key performance indicators (KPIs) for the organization.

Bertha is creating a paginated report for her supervisor at Dion Extreme Sports Supply Corp. She needs to add page numbers to the report to keep her analysis organized. Where would she least likely want to add this information to her report? A.) Page Header B.) Report Header C.) Report Footer D.) Page Footer

Bertha is creating a paginated report for her supervisor at Dion Extreme Sports Supply Corp. She needs to add page numbers to the report to keep her analysis organized. Where would she least likely want to add this information to her report? C.) Report Footer -------------------------------------------------------------- *The report footer appears at the end of the reported data. This would not be an appropriate area to indicate page numbers because it only appears once when the report concludes. *The page footer appears at the bottom of each page of a report. The page footer is a common location for references, page numbers, and version numbers. *A report header appears at the top of the first page of a report. The report header can be used to title the report and the version number can be placed on the top right of the page. *The page header is located at the top of each page of a report and is a good place to include field headings and information that needs to be on every page, like a page number.

Bhavna is a data analyst working at Dion automotive. She has been tasked with identifying if the sales team met the dealership goal of selling one hundred vehicles in the second quarter. This goal represents which of the following? A.) Scope Creep B.) Simple Linear Regression C.) Key Performance Indicators (KPIs) D.) Regression Analysis

Bhavna is a data analyst working at Dion automotive. She has been tasked with identifying if the sales team met the dealership goal of selling one hundred vehicles in the second quarter. This goal represents which of the following? C.) Key Performance Indicators (KPIs) -------------------------------------------------------------- *KPIs are measurements or goals used to identify if a business is achieving its objectives. KPIs can be used to monitor the status of products, processes, or sales goals for example. *Scope creep is the adjustment of the outline of a project and its measurable tasks that are needed to meet a desired end state. These adjustments can cause issues in meeting deadlines or cause a gap in reaching the desired state on time and within budget. *Simple linear regression is used to study the relationship between one dependent variable and one predictor, or independent variable. This analysis informs an analyst on which predictor may have the largest impact. *Regression analysis is a statistical method used to estimate relationships between a dependent variable and one or more independent variables.

Bianca is working with Personal Information and needs a way to search for records while maintaining the confidentiality of their values. Which of the following would best meet her goal with the data? A.) An index field B.) Masking C.) Transposing D.) Appending

Bianca is working with Personal Information and needs a way to search for records while maintaining the confidentiality of their values. Which of the following would best meet her goal with the data? A.) An index field -------------------------------------------------------------- *An index field creates a unique ID for a record to not disclose its value. *Hiding the value of a field to protect sensitive data is known as masking. *Transposing data reverses its direction, so columns become rows and rows become columns. *Appending combines data from separate data sets.
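
The difference between an index field and masking can be sketched in a few lines of Python. This is only an illustration: the records, the starting ID of 1001, and the `mask_ssn` helper are all invented for the example and do not come from the exam material.

```python
# Hypothetical records, invented for illustration.
records = [
    {"name": "Ana Ruiz", "ssn": "123-45-6789"},
    {"name": "Raj Patel", "ssn": "987-65-4321"},
]


def mask_ssn(ssn):
    """Masking: hide every digit except the last four."""
    hidden = "".join("*" if ch.isdigit() else ch for ch in ssn[:-4])
    return hidden + ssn[-4:]


# Index field: a generated ID that identifies a record without exposing its values.
indexed = {record_id: record for record_id, record in enumerate(records, start=1001)}

# Masked view: safe to display, while searches go through the generated ID.
masked_view = {record_id: mask_ssn(record["ssn"]) for record_id, record in indexed.items()}

print(masked_view[1001])
```

Searching by the generated ID returns the right record without anyone typing or seeing the sensitive value, which is why the index field fits Bianca's goal better than masking alone.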

Carl is preparing for his Data+ exam that he'll take in three weeks at a local testing center. He's currently studying domain 4, visualization, and forgot which report is run directly by the consumer. Please assist him and select the correct answer below. A.) Static Report B.) Self-Service Report C.) Dynamic Report D.) Point-in-Time Report

Carl is preparing for his Data+ exam that he'll take in three weeks at a local testing center. He's currently studying domain 4, visualization, and forgot which report is run directly by the consumer. Please assist him and select the correct answer below. B.) Self-Service Report -------------------------------------------------------------- *A self-service report, also known as an on-demand report, is one that is run directly by the consumer. When consumers can leverage dashboards or run their own reports from the systems the organization has purchased, they are doing self-service. *A static report is a report that does not update automatically. *A dynamic report, also known as a real-time report, is connected to the data and can be refreshed on demand or regularly updated automatically. *A point-in-time report reflects on a specific point in time and can cover a day, week, month, year, or more. The timing is flexible and based on the needs of the analysis.

Carl is preparing for his Data+ exam that he'll take in three weeks at a local testing center. He's currently studying domain 4, visualization, and forgot which report needs to be updated manually for current data. Please assist him and select the correct answer below. A.) Static Report B.) Dynamic Report C.) Point-in-Time Report D.) Self-Service Report

Carl is preparing for his Data+ exam that he'll take in three weeks at a local testing center. He's currently studying domain 4, visualization, and forgot which report needs to be updated manually for current data. Please assist him and select the correct answer below. A.) Static Report -------------------------------------------------------------- *A static report is a report that does not update automatically. *A dynamic report, also known as a real-time report, is connected to the data and can be refreshed on demand or regularly updated automatically. *A point-in-time report reflects on a specific point in time and can cover a day, week, month, year, or more. The timing is flexible and based on the needs of the analysis. *A self-service report, also known as an on-demand report, is one that is run directly by the consumer. When consumers can leverage dashboards, or run their own reports from the systems the organization has purchased, they are doing self-service.

Charlie is an economist for the state of Florida that specializes in housing prices. He has gathered a data set of 1,000 homes in Orlando and must calculate the middle value of real estate prices in the set. He is calculating which of the following? A.) Mean B.) Frequency C.) Mode D.) Median

Charlie is an economist for the state of Florida that specializes in housing prices. He has gathered a data set of 1,000 homes in Orlando and must calculate the middle value of real estate prices in the set. He is calculating which of the following? D.) Median -------------------------------------------------------------- *The median is the middle number within a group of sorted numbers. *Frequency is the number of times that a data point occurs in a data set. *The mean is the average of a set of numbers. *The mode is the number that shows up the highest amount of times in the data set.

Chuck is a data analyst that works for Dion Training. He has been tasked with determining the average time a student spends preparing for CompTIA exams on the Dion Training portal. Which of the following should Chuck use in the team database to derive this information? A.) Parsing B.) Indexing C.) Aggregate functions D.) Date functions

Chuck is a data analyst that works for Dion Training. He has been tasked with determining the average time a student spends preparing for CompTIA exams on the Dion Training portal. Which of the following should Chuck use in the team database to derive this information? D.) Date functions -------------------------------------------------------------- *Date functions derive attributes from date fields, like determining the day of the week, month, or year from a single date. *Aggregate functions are written for a group of records, not just for a single record, and work with a column of data. *Indexing is a field property setting that improves query speed and performance for fields that are commonly queried, sorted, or filtered. *Parsing breaks and extracts data out of a field for use.
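
As a rough illustration of what date functions do, the Python sketch below derives several attributes from a single date and a duration from two dates. The enrollment and exam dates are hypothetical values chosen for the example.

```python
from datetime import date

# Hypothetical dates, invented for illustration.
enrolled = date(2024, 1, 8)
exam_day = date(2024, 2, 19)

# Attributes derived from a single date field.
weekday_index = enrolled.weekday()  # 0 = Monday ... 6 = Sunday
month_number = enrolled.month
year_number = enrolled.year

# Duration derived from two date fields.
days_preparing = (exam_day - enrolled).days

print(weekday_index, month_number, year_number, days_preparing)
```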

Chuck is web scraping for a data science project and wants to collect text. Which programming language displays text on a web page? A.) Hypertext Markup Language (HTML) B.) Extensible Markup Language (XML) C.) JavaScript Object Notation (JSON) D.) Standard Generalized Markup Language (SGML)

Chuck is web scraping for a data science project and wants to collect text. Which programming language displays text on a web page? A.) Hypertext Markup Language (HTML) -------------------------------------------------------------- *HTML displays text to a web browser when an end user is browsing the internet. *SGML is considered the parent of all markup languages and defines the standard of all child markup languages such as HTML and XML. *JSON is a lightweight, text-based format for storing and exchanging data as key-value pairs. It should not be confused with JavaScript, the event-driven programming language that allows users to interact with websites. *XML uses custom tags and is used for data transfers.
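
To show why HTML is the layer a scraper reads text from, here is a minimal sketch using Python's standard `html.parser` module. The `TextCollector` class name and the sample page are assumptions made up for the example; a real scraping project would fetch the page over the network first.

```python
from html.parser import HTMLParser


class TextCollector(HTMLParser):
    """Collects the visible text found between HTML tags."""

    def __init__(self):
        super().__init__()
        self.chunks = []

    def handle_data(self, data):
        # Called by the parser for the text between tags.
        text = data.strip()
        if text:
            self.chunks.append(text)


# Hypothetical page content, invented for illustration.
page = "<html><body><h1>Course Catalog</h1><p>Data+ exam prep</p></body></html>"

collector = TextCollector()
collector.feed(page)
collector.close()

print(collector.chunks)
```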

Chucky needs to switch data from rows to columns and columns to rows. This is known as which of the following? A.) An index field B.) Masking C.) Transposing D.) Appending

Chucky needs to switch data from rows to columns and columns to rows. This is known as which of the following? C.) Transposing -------------------------------------------------------------- *Transposing data reverses its direction, so columns become rows and rows become columns. *Data masking hides the value of a field to protect sensitive data. *An index field creates a unique ID for a record to not disclose its value. *Appending combines data from separate data sets.
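
Transposing can be sketched in Python with `zip`, which regroups the values of each row by column position. The small table below is invented for illustration.

```python
# Hypothetical table: the first row holds headings, each later row is a record.
table = [
    ["region", "q1", "q2"],
    ["north", 10, 12],
    ["south", 7, 9],
]

# zip(*table) pairs up the first items of every row, then the second items, and so on,
# so each original column becomes a row.
transposed = [list(column) for column in zip(*table)]

for row in transposed:
    print(row)
```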

Commas and spaces are characters intended to support the conversion of data into a structured file format when exporting said data from a database. Before exporting the file out of the database, this type of file is known as what? A.) XML File B.) Delimited File C.) JSON File D.) Flat File

Commas and spaces are characters intended to support the conversion of data into a structured file format when exporting said data from a database. Before exporting the file out of the database, this type of file is known as what? B.) Delimited File -------------------------------------------------------------- *This is known as a delimited file. These files have some form of character that separates each field of data from the other data fields, usually a comma, pipe, or tab. *Flat files are delimited files which have been exported out of the database and no longer have a connection to it. *JavaScript Object Notation (JSON) is a text-based format used to store and exchange data as key-value pairs. *Extensible Markup Language (XML) is used for data transfers.
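
As a small sketch of how a delimiter separates fields, the Python example below writes the same rows with a comma and with a pipe, then reads the pipe-delimited text back. The rows and the `to_delimited` helper are hypothetical and exist only for this illustration.

```python
import csv
import io

# Hypothetical rows, invented for illustration.
rows = [["id", "name", "city"], [1, "Ana", "Orlando"], [2, "Raj", "Tampa, FL"]]


def to_delimited(records, delimiter):
    """Serialize rows into delimited text using the given separator character."""
    buffer = io.StringIO()
    writer = csv.writer(buffer, delimiter=delimiter, lineterminator="\n")
    writer.writerows(records)
    return buffer.getvalue()


comma_text = to_delimited(rows, ",")
pipe_text = to_delimited(rows, "|")

# Reading the text back splits each line on the delimiter again.
parsed = list(csv.reader(io.StringIO(pipe_text), delimiter="|"))

print(comma_text)
print(pipe_text)
```

Note that the comma version has to quote `Tampa, FL` because the value contains the delimiter itself, which is one reason pipes or tabs are sometimes preferred.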

Daria is joining tables in a data set. How can she join them so that only records that exist in both tables appear in the result? A.) Cross Join B.) Full Outer Join C.) Inner Join D.) Right/Left Join

Daria is joining tables in a data set. How can she join them so that only records that exist in both tables appear in the result? C.) Inner Join -------------------------------------------------------------- *Inner joins return only the records that have a match in both tables. *Full outer joins display all data, whether matched or unmatched, in the result. *For a cross join, the data wouldn't have a direct join on a key field. *Left outer joins display all results of the left table, while only matching records in the other (right) table appear in the result. Right outer joins display all results of the right table, while only matching records in the other (left) table appear in the result.
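
The join types above can be compared with a short sketch that uses Python's built-in `sqlite3` module. The `customers` and `orders` tables and their rows are hypothetical; the example only covers inner, left, and cross joins.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (order_id INTEGER PRIMARY KEY, customer_id INTEGER, total REAL);
    INSERT INTO customers VALUES (1, 'Ana'), (2, 'Raj'), (3, 'Mei');
    INSERT INTO orders VALUES (10, 1, 25.0), (11, 1, 40.0), (12, 2, 15.0);
""")

# Inner join: only customers that have at least one matching order.
inner_rows = conn.execute("""
    SELECT c.name, o.order_id
    FROM customers AS c
    INNER JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY o.order_id
""").fetchall()

# Left join: every customer, with NULL (None) where no order matches.
left_rows = conn.execute("""
    SELECT c.name, o.order_id
    FROM customers AS c
    LEFT JOIN orders AS o ON o.customer_id = c.customer_id
    ORDER BY c.customer_id, o.order_id
""").fetchall()

# Cross join: no key field, so every customer is paired with every order.
cross_count = conn.execute(
    "SELECT COUNT(*) FROM customers CROSS JOIN orders"
).fetchone()[0]
conn.close()

print(inner_rows)
print(left_rows)
print(cross_count)
```

Mei has no orders, so she is absent from the inner join result and appears with `None` in the left join result.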

Dekwon is a database administrator for a state government tax office. He works with large volumes of records on a daily basis. His supervisor has asked that he implement a processing system that allows faster processing for their vast data sets. Cost is not a factor in the implementation. What should he implement? A.) Batch Processing B.) Distributed Processing C.) Multiprocessing D.) Real-Time Processing

Dekwon is a database administrator for a state government tax office. He works with large volumes of records on a daily basis. His supervisor has asked that he implement a processing system that allows faster processing for their vast data sets. Cost is not a factor in the implementation. What should he implement? C.) Multiprocessing -------------------------------------------------------------- *Multiprocessing uses two or more processors to work on a single data set. This allows faster processing for exceptionally large data sets. This is more expensive than other processing options because of the machine expenses and the overall volume of memory needed for it to function properly. *Distributed processing takes large-volume data sets and distributes them across multiple servers. This is done to build redundancy into the system so that if a server fails, another server can continue its processes. *Real-time processing is used for instantaneous results that are approximate. A common example of this is GPS turn-by-turn directions. This is similar to transaction processing but does not need the same level of accuracy in its results. *Batch processing is used when processing a large amount of data when accuracy is more important than speed. Batch processing will process the data in batches, saving on the resource costs that are allocated for processing.

Derrick is looking for a new job and wants to become more marketable by learning data transformation, visualization, statistical, and reporting tools. Which of the following would allow him to create data visualizations for a potential client? A.) Rapid Miner B.) Minitab C.) ArcGIS D.) Crystal Reports

Derrick is looking for a new job and wants to become more marketable by learning data transformation, visualization, statistical, and reporting tools. Which of the following would allow him to create data visualizations for a potential client? C.) ArcGIS -------------------------------------------------------------- *Although many of these tools are multifunctional, the Data+ exam defines their roles as the following: - Rapid Miner is a data transformation tool. - ArcGIS is a visualization tool. - Minitab is a statistical analysis tool. - Crystal Reports is a paginated reporting tool.

Derrick is looking for a new job and wants to become more marketable by learning data transformation, visualization, statistical, and reporting tools. Which of the following would allow him to transform data for a potential client? A.) Crystal Reports B.) Rapid Miner C.) Minitab D.) ArcGIS

Derrick is looking for a new job and wants to become more marketable by learning data transformation, visualization, statistical, and reporting tools. Which of the following would allow him to transform data for a potential client? B.) Rapid Miner -------------------------------------------------------------- *Although many of these tools are multifunctional, the Data+ exam defines their roles as the following: - Rapid Miner is a data transformation tool. - ArcGIS is a visualization tool. - Minitab is a statistical analysis tool. - Crystal Reports is a paginated reporting tool.

Devonte is preparing for his Data+ exam in 2 weeks, but hasn't prepared for objective 3.4, identify common analytic tools, yet. Please help him prepare by selecting the tool that is used as a platform: A.) Stata B.) Microsoft Power BI C.) IBM Cognos D.) SQL Server Reporting Services (SSRS)

Devonte is preparing for his Data+ exam in 2 weeks, but hasn't prepared for objective 3.4, identify common analytic tools, yet. Please help him prepare by selecting the tool that is used as a platform: C.) IBM Cognos -------------------------------------------------------------- *Although many of these tools are multifunctional, the Data+ exam defines their roles as the following: - Microsoft Power BI is a data transformation and visualization tool, but in this example it is specifically used as a transformation tool. - IBM Cognos is a platform tool. - Stata is a statistical analysis tool. - SQL Server Reporting Services (SSRS) is a paginated reporting tool.

Elias is a data analyst working for a social media company. He has been given the following data on a few recent uploads from a celebrity and how many times each post was shared: 10, 20, 30, 30, 40, 50. What is the mean of this data set? A.) 10 B.) 20 C.) 30 D.) 40 E.) 50

Elias is a data analyst working for a social media company. He has been given the following data on a few recent uploads from a celebrity and how many times each post was shared: 10, 20, 30, 30, 40, 50. What is the mean of this data set? C.) 30 -------------------------------------------------------------- *The mean is the average of a set of numbers. *The median is the middle number within a group of sorted numbers. *The mode is the number that shows up the highest amount of times in the data set. *Frequency is the number of times that a data point occurs in a data set.
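
The four measures in this card can be checked against the share counts from the question with Python's standard library. This is a quick sketch; the variable names are chosen only for the example.

```python
from collections import Counter
from statistics import mean, median, mode

# Share counts from the question above.
shares = [10, 20, 30, 30, 40, 50]

average = mean(shares)       # sum of the values (180) divided by the count (6)
middle = median(shares)      # middle of the sorted values; here the two middle values are both 30
most_common = mode(shares)   # the value that occurs most often
frequency = Counter(shares)  # how many times each value occurs

print(average, middle, most_common, frequency[30])
```

For this data set the mean, median, and mode all equal 30, and the value 30 has a frequency of 2.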

Estaban, a senior data analyst at Dion Electronic Superstore, is mentoring a junior data analyst during her onboarding. During the training, he describes something that is useful for comparing the size difference between an expected result and actual result of a single variable. What is he describing? A.) Goodness of Fit B.) Test of Independence C.) Chi Square Statistic D.) Chi Square Test

Estaban, a senior data analyst at Dion Electronic Superstore, is mentoring a junior data analyst during her onboarding. During the training, he describes something that is useful for comparing the size difference between an expected result and actual result of a single variable. What is he describing? A.) Goodness of Fit -------------------------------------------------------------- *Both a goodness of fit and test of independence are chi square tests. *A goodness of fit tests a single variable and the test of independence is used to test multiple variables. *A chi-square test produces the chi-square statistic and is useful when analyzing data from a random sample and working with a categorical variable, like race or gender. *A chi-square statistic compares the size of the difference between an expected result and the actual result. This measures how a model compares to the actual data.
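
As an illustration of the chi-square statistic used in a goodness of fit test, the sketch below sums the squared differences between observed and expected counts, each divided by the expected count. The counts are hypothetical: 100 customers choosing among four store departments, with an even split expected.

```python
# Hypothetical counts for one categorical variable, invented for illustration.
observed = [18, 22, 20, 40]
expected = [25, 25, 25, 25]

# Chi-square statistic: sum of (observed - expected)^2 / expected over all categories.
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

print(round(chi_square, 2))
```

A larger statistic means the observed counts sit further from what the model expected; deciding whether the gap is significant requires comparing it to a chi-square distribution, which this sketch leaves out.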

Estaban, a senior data analyst at Dion Electronic Superstore, is mentoring a junior data analyst during her onboarding. During the training, he describes something that is useful for producing information to help analyze a random sample and categorical data. What is he describing? A.) Chi Square Test B.) Chi Square Statistic C.) Goodness of Fit D.) Test of Independence

Estaban, a senior data analyst at Dion Electronic Superstore, is mentoring a junior data analyst during her onboarding. During the training, he describes something that is useful for producing information to help analyze a random sample and categorical data. What is he describing? A.) Chi Square Test -------------------------------------------------------------- *A chi-square test produces the chi-square statistic and is useful when analyzing data from a random sample and working with a categorical variable, like race or gender. *Both a goodness of fit and test of independence are chi square tests. *A chi-square statistic compares the size of the difference between an expected result and the actual result. This measures how a model compares to the actual data. *A goodness of fit tests a single variable and the test of independence is used to test multiple variables.

Fernando is a data analyst reviewing a data set. Please help him identify the term for the average squared distance from the mean of the data for a single data point: A.) Variance B.) Min C.) Max D.) Range

Fernando is a data analyst reviewing a data set. Please help him identify the term for the average squared distance from the mean of the data for a single data point: A.) Variance -------------------------------------------------------------- *Variance is the average of the squared distances between each data point and the mean of the data. *The range is the difference between the highest and lowest values of the data set. *The min is the smallest number in the data set. *The max is the largest number in the data set.
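
A short Python sketch can make these four measures concrete. The data set is hypothetical, and the example assumes the population form of variance (dividing by the number of data points); the sample form divides by one less.

```python
from statistics import pvariance, variance

# Hypothetical data set, invented for illustration.
data = [10, 20, 30, 30, 40, 50]

mean_value = sum(data) / len(data)
squared_distances = [(x - mean_value) ** 2 for x in data]

# Population variance: the average of the squared distances from the mean.
manual_variance = sum(squared_distances) / len(data)

smallest = min(data)
largest = max(data)
value_range = largest - smallest

print(manual_variance, pvariance(data), variance(data))
print(smallest, largest, value_range)
```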

Frank and Reed are financial analysts working at the accounting firm of Jason & Dion LLC. They are estimating yearly revenue for next year's tax season. The estimation of a value is known as what? A.) Recoding B.) Imputing C.) Reduction D.) Deriving

Frank and Reed are financial analysts working at the accounting firm of Jason & Dion LLC. They are estimating yearly revenue for next year's tax season. The estimation of a value is known as what? B.) Imputing -------------------------------------------------------------- *Imputing values replaces data with an estimated value. *Recoding data changes the current value of a variable to a different value. *A derived variable is a data point that is created from existing data. For example, subtracting two dates to determine how long a warehouse needed to fulfill a customer shipping order. *Data mining reduction reduces the overall volume of data.
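
A minimal sketch of imputing and deriving, assuming mean imputation as the estimation method (the card does not name one). The revenue figures, the dates, and all variable names are hypothetical.

```python
from datetime import date
from statistics import mean

# Hypothetical monthly revenue figures; None marks a missing value
monthly_revenue = [120.0, None, 135.0, 150.0, None, 141.0]

# Imputing: replace each missing value with an estimate,
# here the mean of the values that were actually observed
observed = [v for v in monthly_revenue if v is not None]
fill_value = mean(observed)
imputed = [fill_value if v is None else v for v in monthly_revenue]

# Deriving: create a new data point from existing data,
# e.g. subtracting two dates to get the fulfillment time
ordered = date(2024, 3, 1)
shipped = date(2024, 3, 5)
fulfillment_days = (shipped - ordered).days

print(imputed, fulfillment_days)
```

Mean imputation is only one option; a median or a model-based estimate may suit skewed data better.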

Frank is a senior data analyst who has recently been promoted to a new position. One of his new key responsibilities will be to ensure local, state, and federal data protection laws are followed by the non-profit he works for. The official legal power at different levels of government is an example of which of the following? A.) Regulations B.) Data Sovereignty C.) Jurisdiction D.) Data Classifications

Frank is a senior data analyst who has recently been promoted to a new position. One of his new key responsibilities will be to ensure local, state, and federal data protection laws are followed by the non-profit he works for. The official legal power at different levels of government is an example of which of the following? C.) Jurisdiction -------------------------------------------------------------- *Jurisdiction is the official power to make legal decisions and judgments. *Regulations are rules that are implemented by an authority and have the backing of law. *Data classifications are a way to categorize information by sensitivity to an organization. There are usually three levels of classification within an organization: public, sensitive, and confidential. *Data sovereignty is the concept that the country that hosts the stored data has control over that data. This is an important legal dynamic for a global economy.

Frank needs to implement a database solution that allows complex analysis to be performed on large data sets without impacting transactional systems. Which of the following is the best technology for him to implement? A.) Online Transactional Processing (OLTP) B.) Data Lakehouse C.) Online Analytical Processing (OLAP) D.) Data Warehouse

Frank needs to implement a database solution that allows complex analysis to be performed on large data sets without impacting transactional systems. Which of the following is the best technology for him to implement? C.) Online Analytical Processing (OLAP) -------------------------------------------------------------- *OLAP is a class of software that allows complex analysis to be conducted on large databases without affecting transactional systems. *Online Transactional Processing (OLTP) is a technology used for real-time data queries and record creation. *Data lakehouses are a combination of a data warehouse and data lakes. They provide a cost-effective and flexible solution for data storage needs but are not focused on transactional data. *A data mart is a data storage technology used for specific departments or needs within a company.

Frank needs to implement a database solution that allows complex analysis to be performed on large data sets without impacting transactional systems. Which of the following is the best technology for him to implement? A.) Data Mart B.) OLTP C.) Data Lakehouse D.) OLAP

Frank needs to implement a database solution that allows complex analysis to be performed on large data sets without impacting transactional systems. Which of the following is the best technology for him to implement? D.) OLAP -------------------------------------------------------------- *OLAP is a class of software that allows complex analysis to be conducted on large databases without affecting transactional systems. *Online Transactional Processing (OLTP) is a technology used for real-time data queries and record creation. *Data lakehouses are a combination of a data warehouse and data lakes. They provide a cost-effective and flexible solution for data storage needs but are not focused on transactional data. *A data mart is a data storage technology used for specific departments or needs within a company.

Hans, a data analyst, has been asked to determine if an outcome from his data set is repeatable and statistically significant. Which of the following indicates this measurement? A.) p-value B.) R-value C.) N count D.) t-test

Hans, a data analyst, has been asked to determine if an outcome from his data set is repeatable and statistically significant. Which of the following indicates this measurement? A.) p-value -------------------------------------------------------------- *A p-value is the probability of seeing a result at least as extreme as the one observed if there were no real effect. A high p-value indicates that the outcome is likely not repeatable and thus not significant, and a low p-value (commonly below 0.05) indicates that the event is likely repeatable and thus statistically significant. *A t-test is used when comparing two groups to determine if there is a significant difference between the means of both groups. The value of significance is referenced by the probability value. *An r-value is the correlation coefficient. An r-value that is close to 1 indicates that there is a strong correlation between dependent and independent variables, while an r-value of or close to 0 means there is no correlation. *N count is the amount of data being used in research. For example, a poll of 100 students would have an n count of 100.

Harshit was recently hired for the IT department at a produce distributor on the west coast of the United States. He now holds ultimate responsibility for maintaining the confidentiality, integrity, and availability of the data at the company. What is his new role? A.) Data Custodian B.) Lifecycle of Data C.) Data Steward D.) Data Owner

Harshit was recently hired for the IT department at a produce distributor on the west coast of the United States. He now holds ultimate responsibility for maintaining the confidentiality, integrity, and availability of the data at the company. What is his new role? D.) Data Owner -------------------------------------------------------------- *A data owner is a management role. The data owner holds ultimate responsibility for maintaining the confidentiality, integrity, and availability of the data. The owner also normally selects a steward and custodian, delegates their actions, sets a budget, and allocates resources for sufficient controls. *Data has a lifecycle. It's created, stored, used, archived, and deleted. Each stage in the lifecycle of data has different rules and requirements for the data an organization will work with related to the regulations and compliance requirements for the industry. *A data custodian manages the system where data assets are stored. This includes the responsibilities of enforcing access control, encryption, and backup/recovery measures. *A data steward is fundamentally responsible for data quality. A data steward ensures data is labeled, identified with appropriate metadata, and collected and stored in a format that complies with applicable laws and regulations.

Hermione keeps a database of elixirs that her classmates use for a potion course. There are a few elixirs that are used more often than others and she wants to simplify the query process for ingredients. What should she implement? A.) Parsing B.) Indexing C.) Aggregate Functions D.) Date Functions

Hermione keeps a database of elixirs that her classmates use for a potion course. There are a few elixirs that are used more often than others and she wants to simplify the query process for ingredients. What should she implement? B.) Indexing -------------------------------------------------------------- *Indexing is a field property setting that improves query speed and performance for fields that are commonly queried, sorted, or filtered. *Aggregate functions are written for a group of records, not just for a single record, and work with a column of data. *Date functions derive attributes from date fields, like determining the day of the week, month, or year from a single date. *Parsing breaks and extracts data out of a field for use.
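
A rough sketch of indexing in practice, using Python's built-in `sqlite3` module. The `elixirs` table, its columns, the sample rows, and the index name `idx_elixirs_name` are all hypothetical. The query plan is inspected only to show that the database can use the index for a lookup on the indexed field.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE elixirs (id INTEGER PRIMARY KEY, name TEXT, ingredient TEXT)"
)
conn.executemany(
    "INSERT INTO elixirs (name, ingredient) VALUES (?, ?)",
    [
        ("Sleeping Draught", "valerian root"),
        ("Sleeping Draught", "lavender"),
        ("Pepper-Up", "ginger"),
    ],
)

# Index the field that is queried most often
conn.execute("CREATE INDEX idx_elixirs_name ON elixirs (name)")

# Ask SQLite how it would run a lookup on the indexed field
plan = conn.execute(
    "EXPLAIN QUERY PLAN SELECT ingredient FROM elixirs WHERE name = ?",
    ("Sleeping Draught",),
).fetchall()
plan_text = " ".join(str(row[-1]) for row in plan)

ingredients = [
    row[0]
    for row in conn.execute(
        "SELECT ingredient FROM elixirs WHERE name = ? ORDER BY ingredient",
        ("Sleeping Draught",),
    )
]
print(plan_text)
print(ingredients)
```

On a table this small the speed difference is negligible; the benefit of an index shows up as the number of rows grows.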

Jake is learning about data science and comes across the following definition: The tendency for data to follow a bell-shaped curve with the mean being the middle and nearly all other data falling within three standard deviations to the left or right of the mean. What is he learning about? A.) The Empirical Rule B.) Non-parametric Data C.) Parametric Data D.) Normal Distribution

Jake is learning about data science and comes across the following definition: The tendency for data to follow a bell-shaped curve with the mean being the middle and nearly all other data falling within three standard deviations to the left or right of the mean. What is he learning about? D.) Normal Distribution -------------------------------------------------------------- *The empirical rule refers to the tendency of most data points to fall within three standard deviations of the mean, on either the positive or the negative side of the curve. *Parametric data exists when the data set is within the rules of normal distribution. *Non-parametric data exists when the data is not within the rules of normal distribution, with values that frequently deviate from the mean. *A normal distribution of data follows a bell-shaped curve with the mean being the middle and nearly all other data falling within three standard deviations to the left or right of the mean.
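
For a normal distribution, the empirical rule is usually stated in standard deviations: roughly 68% of the data lies within one standard deviation of the mean, 95% within two, and 99.7% within three. The sketch below checks those figures with `statistics.NormalDist` from the Python standard library; the helper name `share_within` is an assumption made for this example.

```python
from statistics import NormalDist

# Standard normal distribution: mean 0, standard deviation 1
standard_normal = NormalDist(mu=0, sigma=1)

def share_within(k):
    """Fraction of a normal distribution within k standard deviations of the mean."""
    return standard_normal.cdf(k) - standard_normal.cdf(-k)

shares = {k: share_within(k) for k in (1, 2, 3)}

for k, share in shares.items():
    print(f"within {k} standard deviation(s): {share:.4f}")
```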

Jake is unsure which term is useful for testing the difference between expected results and actual results of a single variable. Please assist him and select the best answer that matches this description. A.) Chi Squared Statistic B.) Chi Squared Test C.) Test of Independence D.) Goodness of Fit

Jake is unsure which term is useful for testing the difference between expected results and actual results of a single variable. Please assist him and select the best answer that matches this description. D.) Goodness of Fit -------------------------------------------------------------- *A goodness of fit tests a single variable and the test of independence is used to test multiple variables. *A chi-square statistic compares the size of the difference between an expected result and the actual result. This measures how a model compares to the actual data. *A chi-square test produces the chi-square statistic and is useful when analyzing data from a random sample and working with a categorical variable, like race or gender. Both a goodness of fit and test of independence are chi square tests.
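
A minimal sketch of a goodness of fit calculation for a single categorical variable. The observed counts are hypothetical (60 rolls of a die), and the expected counts assume a fair die. The chi-square statistic is the sum of (observed - expected)^2 / expected over all categories.

```python
# Hypothetical observed counts for die faces 1 through 6 (60 rolls in total)
observed = [8, 12, 9, 11, 14, 6]

# Expected counts under the assumption of a fair die
total = sum(observed)
expected = [total / len(observed)] * len(observed)

# Chi-square statistic: size of the difference between expected and actual results
chi_square = sum((o - e) ** 2 / e for o, e in zip(observed, expected))

# Degrees of freedom for goodness of fit: number of categories minus one
degrees_of_freedom = len(observed) - 1

print(chi_square, degrees_of_freedom)
```

With 5 degrees of freedom, the critical value at the 0.05 significance level is about 11.07, so a statistic of this size would not be treated as evidence against a fair die.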

Jan, a data analyst, has been asked by her supervisor Mark to compare two groups and identify if there is a significant difference between the means of the sets. What can she use to accomplish this? A.) p-value B.) N count C.) t-test D.) R-value

Jan, a data analyst, has been asked by her supervisor Mark to compare two groups and identify if there is a significant difference between the means of the sets. What can she use to accomplish this? C.) t-test -------------------------------------------------------------- *A t-test is used when comparing two groups to determine if there is a significant difference between the means of both groups. The value of significance is referenced by the probability value. This is known as the p-value. A high p-value indicates that the outcome is likely not repeatable and thus not significant, and a low p-value indicates that the event is likely repeatable and thus statistically significant. *An r-value is the correlation coefficient. An r-value that is close to 1 indicates that there is a strong correlation between dependent and independent variables, while an r-value of or close to 0 means there is no correlation. *N count is the amount of data being used in research. For example, a poll of 100 students would have an n count of 100.
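
A rough sketch of the statistic behind a two-group t-test, using only the Python standard library. The two groups are hypothetical, and the sketch assumes Welch's form of the test (unequal variances), which the card does not specify. It computes the t statistic only; turning it into a p-value requires the t distribution, which a statistics library such as SciPy (`scipy.stats.ttest_ind`) provides.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical measurements for two groups
group_a = [2, 4, 6, 8, 10]
group_b = [1, 2, 3, 4, 5]

# Difference between the two group means
mean_difference = mean(group_a) - mean(group_b)

# Standard error of that difference (Welch's form, using sample variances)
standard_error = sqrt(
    variance(group_a) / len(group_a) + variance(group_b) / len(group_b)
)

# t statistic: the difference in means measured in standard errors
t_stat = mean_difference / standard_error

print(round(t_stat, 4))
```

A larger absolute t statistic corresponds to a smaller p-value for the same sample sizes.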

Jordy is preparing a dashboard for his data science team and wants to ensure the other analysts have the most detailed understanding of each field. Which of the following should he incorporate to accomplish this? A.) Captioning B.) Legend C.) Serif Fonts D.) Style Guides

Jordy is preparing a dashboard for his data science team and wants to ensure the other analysts have the most detailed understanding of each field. Which of the following should he incorporate to accomplish this? A.) Captioning -------------------------------------------------------------- *Captioning allows an analyst to designate more meaningful names for fields in a report or dashboard. *A legend is a labeling element that lets a viewer understand which color represents which value in a visual. *Letters in serif fonts have small edges or lines that make smaller text more readable. Sans serif fonts do not have edges or lines and can be useful for stylistic purposes when text is larger. *Style guides commonly are branding guidelines for an organization. These may contain different variations of an organization's logo and guidelines for how it can be used, along with color schemes, fonts, and naming conventions.

Jordy is preparing a visual for his data science team and wants to ensure that it is easily understood when he presents it. Which of the following should he incorporate to assist his audience? A.) Legend B.) Serif Fonts C.) Captioning D.) Style Guides

Jordy is preparing a visual for his data science team and wants to ensure that it is easily understood when he presents it. Which of the following should he incorporate to assist his audience? A.) Legend -------------------------------------------------------------- *A legend is a labeling element that lets a viewer understand which color represents which value in a visual. *Letters in serif fonts have small edges or lines that make smaller text more readable. Sans serif fonts do not have edges or lines and can be useful for stylistic purposes when text is larger. *Style guides commonly are branding guidelines for an organization. These may contain different variations of an organization's logo and guidelines for how it can be used, along with color schemes, fonts, and naming conventions. *Captioning allows an analyst to designate more meaningful names for fields in a report or dashboard.

Kyrie is examining his company's sales database. He notices that increased advertising spending has an association with the amount of products sold. This is an example of what? A.) Causal Relationship B.) Correlation C.) Pearson's Correlation Coefficient D.) R-Value

Kyrie is examining his company's sales database. He notices that increased advertising spending has an association with the amount of products sold. This is an example of what? B.) Correlation -------------------------------------------------------------- *Correlation is the statistical association between two or more variables. Correlation does not tell an analyst that one variable influences another, but it does indicate that when one variable changes, the other variable tends to change as well. *A causal relationship means that one variable has a proven effect on another variable. *Pearson's correlation coefficient is a calculation used to measure a linear relationship between data points and returns an r-value between -1 and +1 that indicates the strength and direction of the relationship.
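
A minimal sketch of Pearson's correlation coefficient, computed by hand from its definition: the sum of the products of the deviations from each mean, divided by the square root of the product of the summed squared deviations. The advertising and sales figures are hypothetical, as is the function name `pearson_r`.

```python
from math import sqrt
from statistics import mean

def pearson_r(xs, ys):
    """Pearson's correlation coefficient for two equally long lists of numbers."""
    mean_x, mean_y = mean(xs), mean(ys)
    dx = [x - mean_x for x in xs]
    dy = [y - mean_y for y in ys]
    numerator = sum(a * b for a, b in zip(dx, dy))
    denominator = sqrt(sum(a * a for a in dx) * sum(b * b for b in dy))
    return numerator / denominator

# Hypothetical advertising spend and units sold for five periods
ad_spend = [1, 2, 3, 4, 5]
units_sold = [2, 4, 5, 4, 5]

r = pearson_r(ad_spend, units_sold)
print(round(r, 4))
```

An r-value near +1 or -1 signals a strong linear association. On its own it says nothing about whether the spending caused the sales.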

Marin is a recently hired data analyst at the Dion Fruit Co-op. Her supervisor, Patricia, has asked her to use both qualitative and quantitative data to measure the reception of a new product that launched 3 months ago. What has Patricia asked Marin to do? A.) Exploratory Analysis B.) Gap Analysis C.) Performance Analysis D.) Link Analysis

Marin is a recently hired data analyst at the Dion Fruit Co-op. Her supervisor, Patricia, has asked her to use both qualitative and quantitative data to measure the reception of a new product that launched 3 months ago. What has Patricia asked Marin to do? C.) Performance Analysis -------------------------------------------------------------- *Performance analysis uses both qualitative and quantitative data to measure a particular product, outcome, or scenario against a defined objective. *Gap analysis is the study of developing projects to move from a present state to a desired state. *Link analysis determines how a single data point links to other data points and focuses on relationships and connections in a database.

Microsoft Excel is a very common example of which of the following? A.) Spreadsheet B.) Paginated Report C.) Recurring Report D.) Dashboard

Microsoft Excel is a very common example of which of the following? A.) Spreadsheet -------------------------------------------------------------- *A spreadsheet is a worksheet of data in tabular form. Spreadsheets are an ideal tool for people in an organization that need to export and work with data as part of their roles. *A recurring report is set to repeatedly run on certain dates or at specific times, just like how many teams and organizations have daily, weekly, or monthly meetings. *A dashboard is an interactive, visual display of information. Dashboards can be designed for mobile devices, tablets, or monitors and should be created in a way that is easily understandable. *A paginated report is a multi-page report that is not suitable for display on a dashboard.

Pei Pei is working on a business intelligence report and has identified that increasing the advertising budget for the marketing team will NOT increase the total volume of sales for his company. Which of the following is this relationship known as? A.) Null Hypothesis B.) Research Questions C.) Alternative Hypothesis D.) Scope

Pei Pei is working on a business intelligence report and has identified that increasing the advertising budget for the marketing team will NOT increase the total volume of sales for his company. Which of the following is this relationship known as? A.) Null Hypothesis -------------------------------------------------------------- *A null hypothesis assumes that a relationship between two variables does not exist. *An alternative hypothesis assumes that a relationship between two variables does exist. *A scope includes measurable tasks that are needed to meet the desired end state of a project. *Research questions are the first step in preparing to research a topic. Research questions should be specific and answerable by a true or false statement.

Reed is preparing for his Data+ exam next week. Please assist him by selecting the term that uses algorithms to rearrange data into cyphertext. A.) Data Encryption B.) Data at Rest C.) Data in Transit D.) Data in Use

Reed is preparing for his Data+ exam next week. Please assist him by selecting the term that uses algorithms to rearrange data into cyphertext. A.) Data Encryption -------------------------------------------------------------- *Data encryption is the process of using algorithms that will rearrange data from its original plaintext into another form, known as cyphertext, so that it can't be read by someone without the encryption key. *Data that is actively being transferred is data in transit. *Data that is being stored is data at rest, and data that has been transmitted and is now present in memory or being queried is data in use.

Reed needs to build a database for a web application. He must choose the most flexible and scalable technology to satisfy requirements for the IT department. It is likely that the database will use a variety of programming languages to function. What technology would best fit Reed's requirements? A.) Non-relational database B.) Ordinal database C.) Shared drive D.) Relational database

Reed needs to build a database for a web application. He must choose the most flexible and scalable technology to satisfy requirements for the IT department. It is likely that the database will use a variety of programming languages to function. What technology would best fit Reed's requirements? A.) Non-relational database -------------------------------------------------------------- *A non-relational database will best satisfy Reed's requirements. *Ordinal refers to ordinal data and is not a database technology. *A shared drive spreadsheet could be used to share data amongst Reed's team but is not highly scalable nor a database technology. *A relational database requires more detailed planning than a non-relational database, is not as scalable, and is more dependent on standardized data entries compared to a non-relational database.

Ricardo is an IT director at the Dion Institute of Information Technology (DIIT). The institute has recently partnered with the Kane Academy of Data Science (KADS) to create courseware for CompTIA's Data+ certification. Before either organization begins sharing data, they must both agree to a document that addresses the use and exchange of information. What is this known as? A.) Data Use Agreement B.) Non-Disclosure Agreement (NDA) C.) Acceptable Use Agreement D.) Memorandum of Understanding (MOU)

Ricardo is an IT director at the Dion Institute of Information Technology (DIIT). The institute has recently partnered with the Kane Academy of Data Science (KADS) to create courseware for CompTIA's Data+ certification. Before either organization begins sharing data, they must both agree to a document that addresses the use and exchange of information. What is this known as? A.) Data Use Agreement -------------------------------------------------------------- *A data use agreement is any document that addresses the use and exchange or sharing of information. These agreements are normally legally binding and include contracts, non-disclosure agreements, memorandums of understanding, and other legal instruments. *A non-disclosure agreement (NDA) defines the conditions under which an entity (such as a person or supplier) cannot disclose information to parties outside of the agreement. An NDA includes specific descriptions of the legal ramifications for breaking the agreement to act as a deterrent to sharing said information. *An acceptable use agreement describes not only how data can be used, but also for what purpose. Acceptable use agreements also establish requirements for the removal of personal data, especially when privacy regulations like GDPR or HIPAA apply to the data. This is done to reduce the risk of the data being identified. *A memorandum of understanding (MOU) is an acceptable use agreement that establishes the rules of engagement between two parties and defines roles and expectations. MOUs are non-binding and are difficult to enforce because they are not formal contracts.

Ricardo is an IT director at the Dion Institute of Information Technology (DIIT). The institute has recently partnered with the Kane Academy of Data Science (KADS) to create courseware for CompTIA's Data+ certification. Both parties have signed a non-legally binding document that establishes a relationship between them on their roles and expectations moving forward. What is this known as? A.) Data Use Agreement B.) Memorandum of Understanding (MOU) C.) Acceptable Use Agreement D.) Non-Disclosure Agreement (NDA)

Ricardo is an IT director at the Dion Institute of Information Technology (DIIT). The institute has recently partnered with the Kane Academy of Data Science (KADS) to create courseware for CompTIA's Data+ certification. Both parties have signed a non-legally binding document that establishes a relationship between them on their roles and expectations moving forward. What is this known as? B.) Memorandum of Understanding (MOU) -------------------------------------------------------------- *A memorandum of understanding (MOU) is an acceptable use agreement that establishes the rules of engagement between two parties and defines roles and expectations. MOUs are non-binding and are difficult to enforce because they are not formal contracts. *A data use agreement is any document that addresses the use and exchange or sharing of information. These agreements are normally legally binding and include contracts, non-disclosure agreements, memorandums of understanding, and other legal instruments. *A non-disclosure agreement (NDA) defines the conditions under which an entity (such as a person or supplier) cannot disclose information to parties outside of the agreement. An NDA includes specific descriptions of the legal ramifications for breaking the agreement to act as a deterrent to sharing said information. *An acceptable use agreement describes not only how data can be used, but also for what purpose. Acceptable use agreements also establish requirements for the removal of personal data, especially when privacy regulations like GDPR or HIPAA apply to the data. This is done to reduce the risk of the data being identified.

Roger is a junior data analyst at a large IT consulting firm. He is currently working on a contract for the Dion Peanut Butter Production Company. He has been asked to identify how the increased cost of fertilizer connects to the rise in peanut price. What type of analysis is Roger being asked to do? A.) Exploratory Analysis B.) Performance Analysis C.) Gap Analysis D.) Link Analysis

Roger is a junior data analyst at a large IT consulting firm. He is currently working on a contract for the Dion Peanut Butter Production Company. He has been asked to identify how the increased cost of fertilizer connects to the rise in peanut price. What type of analysis is Roger being asked to do? D.) Link Analysis -------------------------------------------------------------- *Link analysis determines how a single data point links to other data points and focuses on relationships and connections in a database. *Exploratory analysis should be done on each data set that an analyst encounters. This analysis determines the main characteristics of a data set and identifies what data should be cleaned or transformed for use. *Performance analysis uses both qualitative and quantitative data to measure a particular product, outcome, or scenario against a defined objective.

Rupert is a data custodian at Kane and Dion regional automotive. He has worked his way up through the data analysis team and is now responsible for ensuring mission critical data can be used in real-time. What is he now responsible for? A.) Data Destruction B.) Data Retention C.) Data Transmission D.) Transaction Processing

Rupert is a data custodian at Kane and Dion regional automotive. He has worked his way up through the data analysis team and is now responsible for ensuring mission critical data can be used in real-time. What is he now responsible for? D.) Transaction Processing -------------------------------------------------------------- *Transaction processing is used for transactional data that is mission critical to an organization. These processes are captured and processed in real time. *Data transmission is the process of sending and receiving data. *Data retention defines the duration that data must be kept. This includes both the minimum and maximum times it can remain in storage before it is destroyed. *Data destruction describes the legally compliant means through which data must be removed and made inaccessible. The required level of destruction is directly related to the data's classification and sensitivity.

Rupert is a data custodian at Kane and Dion regional automotive. He has worked his way up through the data analysis team and is now responsible for ensuring that both the minimum and maximum allowable times of data storage are enforced in their customer database. He is now responsible for which of the following? A.) Data Retention B.) Data Destruction C.) Data Transmission D.) Transaction processing

Rupert is a data custodian at Kane and Dion regional automotive. He has worked his way up through the data analysis team and is now responsible for ensuring that both the minimum and maximum allowable times of data storage are enforced in their customer database. He is now responsible for which of the following? A.) Data Retention -------------------------------------------------------------- *Data retention defines the duration that data must be kept. This includes both the minimum and maximum times it can remain in storage before it is destroyed. *Data destruction describes the legally compliant means through which data must be removed and made inaccessible. The required level of destruction is directly related to the data's classification and sensitivity. *Transaction processing is used for transactional data that is mission critical to an organization. These processes are captured and processed in real time. *Data transmission is the process of sending and receiving data.

SUM, COUNT, and AVERAGE are examples of: A.) Parsing B.) Indexing C.) Aggregate functions D.) Date functions

SUM, COUNT, and AVERAGE are examples of: C.) Aggregate functions -------------------------------------------------------------- *Aggregate functions are written for a group of records, not just for a single record, and work with a column of data. *Date functions derive attributes from date fields, like determining the day of the week, month, or year from a single date. *Parsing breaks and extracts data out of a field for use. *Indexing is a field property setting that improves query speed and performance for fields that are commonly queried, sorted, or filtered.
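
A short sketch of aggregate functions working over a column of data for a group of records, using Python's built-in `sqlite3` module. The `sales` table and its rows are hypothetical. Spreadsheets name the averaging function AVERAGE; SQL, used here, names it `AVG`.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE sales (region TEXT, amount REAL)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 100.0), ("east", 150.0), ("west", 200.0)],
)

# Aggregates over the whole column: one result for all records
total, count, average = conn.execute(
    "SELECT SUM(amount), COUNT(*), AVG(amount) FROM sales"
).fetchone()

# Aggregates per group of records
by_region = dict(
    conn.execute("SELECT region, SUM(amount) FROM sales GROUP BY region").fetchall()
)

print(total, count, average, by_region)
```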

Sahra is a new data analyst who was recently hired by a textile company. Her manager has tasked her with familiarizing herself with the team database and observes that it uses a tabular schema of storage. What type of database will Sahra be working with? A.) Relational Database Management System (RDBMS) B.) Non-relational C.) Categorical D.) Relational

Sahra is a new data analyst who was recently hired by a textile company. Her manager has tasked her with familiarizing herself with the team database and observes that it uses a tabular schema of storage. What type of database will Sahra be working with? D.) Relational -------------------------------------------------------------- *A relational database is often called a tabular schema due to the rows and columns it employs. *An RDBMS, Relational Database Management System, is software that maintains a relational database, but is not a type of database itself. *Non-relational databases do not use tabular schemas and are often called NoSQL databases. *Categorical refers to a type of data which is qualitative and is not a type of database.

Sally is a data analyst that is about to use data visualization software. She will be visualizing categorical data, number-related values, and fields with a date hierarchy. These three aspects of the set fall under which term? A.) Field Attributes B.) Natural Order C.) Field Definitions D.) Data Model

Sally is a data analyst that is about to use data visualization software. She will be visualizing categorical data, number-related values, and fields with a date hierarchy. These three aspects of the set fall under which term? A.) Field Attributes -------------------------------------------------------------- *There are 3 types of data field attributes: dimensions, measures, and dates. Dimensions are attributes for categorical data that are used to label and provide meaningful insight about the data. Measures are attributes for number-related values. Date fields may have an associated date hierarchy depending on the software an analyst is working with. *Data that follows no natural order is often difficult to visualize. Frequently, an analyst will sort data in ascending or descending order, meaning A-Z or by numerical value, to make it more interpretable. *A data model organizes the data and relationships of data elements so that it is ready to use and meaningful. *Data field definitions clarify the information each field contains so that it is easily understandable.

Shawna is analyzing a random sample for her data analytics team and needs to compare the size difference between an expected result and actual result. Which of the following should Shawna use? A.) Chi Squared Test B.) Chi Squared Statistic C.) Test of Independence D.) Goodness of Fit

Shawna is analyzing a random sample for her data analytics team and needs to compare the size difference between an expected result and actual result. Which of the following should Shawna use? A.) Chi Squared Test -------------------------------------------------------------- *A chi-square test produces the chi-square statistic and is useful when analyzing data from a random sample and working with a categorical variable, like race or gender. *A chi-square statistic compares the size of the difference between an expected result and the actual result. This measures how a model compares to the actual data. *Both a goodness of fit and test of independence are chi square tests. *A goodness of fit tests a single variable and a test of independence is used to test multiple variables.
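
As a rough illustration of the chi-square statistic described above, the sketch below sums the squared differences between observed and expected counts, each divided by the expected count. The function name and the counts are hypothetical and chosen only to keep the arithmetic easy to follow.

```python
def chi_square_statistic(observed, expected):
    """Sum of (observed - expected)^2 / expected across all categories."""
    if len(observed) != len(expected):
        raise ValueError("observed and expected must have the same length")
    return sum((o - e) ** 2 / e for o, e in zip(observed, expected))


# Hypothetical counts for one categorical variable with three categories.
observed = [18, 22, 20]
expected = [20, 20, 20]

stat = chi_square_statistic(observed, expected)
print(stat)  # 4/20 + 4/20 + 0/20
```

A larger statistic means the observed counts sit further from what the model expected.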

Susan is a data analyst assigned to the Department of Defense Advanced Research Agency. She is profiling a new database and identifies that it has a relational structure. After examining a customer ID in a table, she realizes that the ID is a unique identifier for a record. This is an example of what data relationship? A.) A foreign key B.) A primary key C.) Cascade update D.) Cascade delete

Susan is a data analyst assigned to the Department of Defense Advanced Research Agency. She is profiling a new database and identifies that it has a relational structure. After examining a customer ID in a table, she realizes that the ID is a unique identifier for a record. This is an example of what data relationship? B.) A primary key -------------------------------------------------------------- *This is an example of a primary key because it serves as a unique identifier for a record. *When a primary key is used in another table to refer to your record, it's known as a foreign key.
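
The primary key and foreign key relationship can be sketched with Python's built-in sqlite3 module. The table and column names below are made up for illustration and are not taken from any real system.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("PRAGMA foreign_keys = ON")  # SQLite only enforces foreign keys when asked

# customer_id is the primary key: it uniquely identifies each customer record.
conn.execute(
    "CREATE TABLE customers (customer_id INTEGER PRIMARY KEY, name TEXT NOT NULL)"
)
# orders.customer_id is a foreign key: it refers back to customers.customer_id.
conn.execute(
    """CREATE TABLE orders (
           order_id INTEGER PRIMARY KEY,
           customer_id INTEGER NOT NULL REFERENCES customers(customer_id),
           item TEXT
       )"""
)
conn.execute("INSERT INTO customers VALUES (1, 'Ada')")
conn.execute("INSERT INTO orders VALUES (100, 1, 'fabric')")


def try_insert(sql):
    """Return True if the insert succeeds, False if a key constraint rejects it."""
    try:
        conn.execute(sql)
        return True
    except sqlite3.IntegrityError:
        return False


# A second customer with the same primary key is rejected.
duplicate_key_ok = try_insert("INSERT INTO customers VALUES (1, 'Bob')")
# An order pointing at a customer that does not exist is rejected.
orphan_order_ok = try_insert("INSERT INTO orders VALUES (101, 99, 'thread')")
print(duplicate_key_ok, orphan_order_ok)
```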

The Dion Institute for Public Wellbeing has just released a publicly available report. Before reading the report, a student glosses over a summary of the key findings from the institute. The student found this summary in which of the following? A.) References B.) Narrative C.) Watermark D.) Version Number

The Dion Institute for Public Wellbeing has just released a publicly available report. Before reading the report, a student glosses over a summary of the key findings from the institute. The student found this summary in which of the following? B.) Narrative -------------------------------------------------------------- *A narrative is a summary of the report contents and key findings. *Version numbers are used to document a series of reports so that a reader understands which iteration of a report they are reading. This can identify if the analysis is the most current version. *A watermark marks information as required for any given report. It provides a high-level reminder about the information contained in a report. *References provide information on the source of data and are normally attached to both page footers and a reference page at the end of the report.

The U.S. census is a popular example of which kind of data? A.) Publicly Available Information (PAI) B.) Aggregated data C.) Relational data D.) Public data

The U.S. census is a popular example of which kind of data? D.) Public data -------------------------------------------------------------- *Public data is information that has been made available to the public through legal requirements. *Publicly Available Information (PAI) is information available to the public even though there is no legal requirement. An example of PAI is research from nonprofits for the benefit of the public. *Aggregated data is data that has already been gathered and analyzed for analysis and reporting.

The following image emphasizes certain words because it has what removed from it? **Image of many words jumbled-up in the shape of a circle; one bold word in the center** A.) Infographic B.) Word Cloud C.) Heat Map D.) Stop Word

The following image emphasizes certain words because it has what removed from it? D.) Stop Word -------------------------------------------------------------- *A word cloud is a visual representation of the words used in a particular body of text. *Within this data, stop words appear frequently but do not need to be counted. Removing stop words leaves room for more relevant words to be counted and visualized. *An infographic is any combination of visuals, artwork, photos, and language that tells a story about data in a compelling and appealing way. *A heat map is any visual that uses color to draw attention to a spot, or a part of a visual that needs attention. A heat map can use color scales to draw attention to points on a geographic map.
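
A minimal sketch of the counting step behind a word cloud is shown below. The stop word list is a short hypothetical one, and real tools ship much longer lists.

```python
from collections import Counter

# Hypothetical stop word list, kept short for illustration.
STOP_WORDS = {"the", "a", "and", "of", "is", "to", "in"}


def word_counts(text, stop_words=STOP_WORDS):
    """Count words after lowercasing, trimming punctuation, and dropping stop words."""
    words = [w.strip(".,!?").lower() for w in text.split()]
    return Counter(w for w in words if w and w not in stop_words)


text = "The data is in the warehouse and the data is clean."
counts = word_counts(text)
print(counts)
```

Without the filter, "the" would be the most frequent word; with it, "data" is, which is the word a word cloud would emphasize.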

Timothy just passed his Data+ exam and has already been hired as a data analyst at the Dion Softdrink Co. His company onboarding included training on a tactical dashboard, root cause analysis for any issues he encounters with the database, self-service reporting, and operational reporting. Which of the following would allow a consumer to receive information on-demand? A.) Compliance Report B.) Root Cause Analysis C.) Self-Service Report D.) Tactical Dashboard

Timothy just passed his Data+ exam and has already been hired as a data analyst at the Dion Softdrink Co. His company onboarding included training on a tactical dashboard, root cause analysis for any issues he encounters with the database, self-service reporting, and operational reporting. Which of the following would allow a consumer to receive information on-demand? C.) Self-Service Report -------------------------------------------------------------- *A self-service report, also known as an on-demand report, is one that is run directly by the consumer. When consumers can leverage dashboards, or run their own reports from the systems the organization has purchased, they are doing self-service. *Root cause analysis attempts to identify the cause of a problem that has occurred and can be a useful component of research-driven reports that seek to improve business processes. *A tactical dashboard is centered on the operational details of a process or operation. These dashboards can monitor strategic business initiatives that cover long periods of time and must be monitored and measured to be effective. *A compliance report is a report that must be run for compliance or regulatory reasons and includes safety reports, financial reports, and health reports.

Tom is going to his physician and will get a measurement of height, weight, and blood pressure. These are examples of what? A.) Nominal Data B.) Ordinal Data C.) Qualitative Data D.) Quantitative Data

Tom is going to his physician and will get a measurement of height, weight, and blood pressure. These are examples of what? D.) Quantitative Data -------------------------------------------------------------- *Quantitative data is numerical and measurable; height, weight, and blood pressure are all recorded as numbers. *Nominal data follows no natural order. *Ordinal data follows a natural order, such as the progression of grades for a child's education. *Qualitative data is defined by qualities of the data.

Tony has been asked to identify measures of dispersion for his company's data set, but he's not confident about the following measurement: Which of the following is the average squared distance from the mean of a data set to a single data point? A.) Variance B.) Range C.) Z-score D.) Standard Deviation

Tony has been asked to identify measures of dispersion for his company's data set, but he's not confident about the following measurement: Which of the following is the average squared distance from the mean of a data set to a single data point? A.) Variance -------------------------------------------------------------- *Variance is the average squared distance from the mean of the data for a single data point. *The range is the difference between the highest and lowest values of the data set. *Standard deviation identifies the dispersion of data in relation to the mean of all data. *A Z-score identifies how many standard deviations a data point is from the mean.
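
The four measures above can be worked through on a small, made-up data set. This sketch assumes the population form of variance (dividing by the number of data points), which matches the "average squared distance" wording; a sample variance would divide by one less.

```python
import statistics

# Hypothetical data set chosen so the results come out to round numbers.
data = [2, 4, 4, 4, 5, 5, 7, 9]

mean = statistics.mean(data)                # 40 / 8
variance = statistics.pvariance(data)       # average squared distance from the mean
std_dev = statistics.pstdev(data)           # square root of the variance
value_range = max(data) - min(data)         # highest value minus lowest value
z_score_of_9 = (9 - mean) / std_dev         # standard deviations between 9 and the mean

print(mean, variance, std_dev, value_range, z_score_of_9)
```

Here the squared distances from the mean of 5 add up to 32, so the variance is 32 / 8 and the standard deviation is its square root.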

Veronica is a data analyst team lead for a consulting firm. One of her primary responsibilities is to train junior analysts on analysis and reporting. She recently gave a presentation on something that is centered on operational details for long-term strategic business initiatives. What did she teach? A.) Tactical Dashboard B.) Root Cause Analysis C.) Self-Service Report D.) Compliance Report

Veronica is a data analyst team lead for a consulting firm. One of her primary responsibilities is to train junior analysts on analysis and reporting. She recently gave a presentation on something that is centered on operational details for long-term strategic business initiatives. What did she teach? A.) Tactical Dashboard -------------------------------------------------------------- *A tactical dashboard is centered on the operational details of a process or operation. These dashboards can monitor strategic business initiatives that cover long periods of time and must be monitored and measured to be effective. *A compliance report is a report that must be run for compliance or regulatory reasons and includes safety reports, financial reports, and health reports. *A self-service report, also known as an on-demand report, is one that is run directly by the consumer. When consumers can leverage dashboards or run their own reports from the systems the organization has purchased, they are doing self-service. *Root cause analysis attempts to identify the cause of a problem that has occurred and can be a useful component of research-driven reports that seek to improve business processes.

Walter works as a data analyst at Jason Dion Fruit & Co. and must combine two data sets into a single data set. Which of the following actions should he do to accomplish this? A.) An intermediate append B.) Transpose data C.) An inline append D.) Reduce data

Walter works as a data analyst at Jason Dion Fruit & Co. and must combine two data sets into a single data set. Which of the following actions should he do to accomplish this? C.) An inline append -------------------------------------------------------------- *An inline append will combine all selected data sets, leaving just the combined set. *An intermediate append will retain the separate data sets and also create a new data set with all the combined data. *Transposing data reverses its direction, so columns become rows and rows become columns. *Data reduction reduces the overall volume of data in a data set.

When a data analyst begins working with data, they learn the basics about the data that they are working with and identify information from that data. This describes which of the following? A.) Data profiling B.) Data extraction C.) Data loading D.) Data transformation

When a data analyst begins working with data, they learn the basics about the data that they are working with and identify information from that data. This describes which of the following? A.) Data profiling -------------------------------------------------------------- *Data profiling is the process of learning basic information about the data set. This includes examining the source of the data, identifying keys, looking at record counts, and understanding relationships in the set.

When working with a data set, unnecessary data fields that have no value to the analysis are known as: A.) Invalid data B.) Noise C.) Non-parametric data D.) Null data

When working with a data set, unnecessary data fields that have no value to the analysis are known as: B.) Noise -------------------------------------------------------------- *Noise refers to unnecessary data fields that have no value to the analysis. *Null data has no value in the field and can be identified with NULL, N/A, or a blank field. *Invalid data is data that is incorrect. *Non-parametric data exists when the data is not within the rules of normal distribution, with values that frequently deviate from the mean.

Which of the following creates an inherent problem in a data set? A.) Invalid Data B.) Null Values C.) Duplicated Data D.) Redundant Data

Which of the following creates an inherent problem in a data set? A.) Invalid Data -------------------------------------------------------------- *Invalid data needs to be addressed by replacing it with valid data or removing it from your data set entirely. *There is no inherent problem with duplicated data, redundant data, or null values as long as the data analyst expects it in the set.

Which of the following describes a means to learn the basics about the data that you are working with and discern information from that data? A.) Data Loading B.) Data Extraction C.) Data Profiling D.) Data Transformation

Which of the following describes a means to learn the basics about the data that you are working with and discern information from that data? C.) Data Profiling -------------------------------------------------------------- *Data profiling is the process of learning basic information about the data set. This includes examining the source of the data, identifying keys, looking at record counts, and understanding relationships in the set.

Which of the following does not produce a written symbol? A.) Non-printable Characters B.) Masking C.) Trailing Spaces D.) Leading Spaces

Which of the following does not produce a written symbol? A.) Non-printable Characters -------------------------------------------------------------- *Non-printable characters are characters that do not produce a written symbol such as a tab or space. ASCII is the acronym for the American Standard Code for Information Interchange, a standard for electronic communication. *Trailing spaces are invisible characters at the end of a field of information. *Leading spaces are invisible characters at the front of a field of information. *Masking is the act of hiding the original value of data by showing something else in its place.
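
As a small sketch of how these cleanup ideas look in practice, the hypothetical helper below trims leading and trailing whitespace and then drops any remaining non-printable characters. Note that Python's `str.isprintable` counts an ordinary space as printable, so spaces inside the field are kept.

```python
def clean_field(value):
    """Strip leading/trailing whitespace, then drop remaining non-printable characters."""
    trimmed = value.strip()
    return "".join(ch for ch in trimmed if ch.isprintable())


# Hypothetical raw field: leading spaces and a tab, a NUL and a BEL control
# character inside the text, and trailing spaces.
raw = "  \tDion\x00 Data\x07  "

leading_removed = raw.lstrip()   # removes only the whitespace at the front
trailing_removed = raw.rstrip()  # removes only the whitespace at the end
cleaned = clean_field(raw)
print(repr(cleaned))
```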

Which of the following functions is most likely going to be used in the header or footer of a report? A.) System Functions B.) Logical Functions C.) Text Functions D.) Merge Fields Functions

Which of the following functions is most likely going to be used in the header or footer of a report? A.) System Functions -------------------------------------------------------------- *A system function is most likely going to be used in a header or footer of a report. System functions track report related information, which removes the need for an analyst to manually add information on page numbers, refresh dates, report names, etc. *Merge fields functions combine multiple fields into a single field. *Text functions manipulate data in text-based fields and can remove non-printable characters. *Logical functions will check if a condition is met and return an answer based on the result.

Which of the following is an example of data that has already been gathered and analyzed for the purposes of analysis and reporting? A.) Public Data B.) Publicly Available Information (PAI) C.) Aggregated Data D.) Relational Data

Which of the following is an example of data that has already been gathered and analyzed for the purposes of analysis and reporting? C.) Aggregated Data -------------------------------------------------------------- *Aggregated data is data that has already been gathered and analyzed for analysis and reporting. An example of this could be a nonprofit polling citizens in its country on tobacco use and then making the data available to the public after it is analyzed. *The U.S. census is a popular example of public data. Public data is information that has been made available to the public through legal requirements. *Publicly Available Information (PAI) is information available to the public even though there is no legal requirement. An example of PAI is research from nonprofits for the benefit of the public. *Relational data is data which has been formatted for use in a relational database.

Which of the following is automatically generated by a computer process, application, or other mechanism without the intervention of a human? A.) Machine Data B.) API C.) Web Scraping D.) A Web Service

Which of the following is automatically generated by a computer process, application, or other mechanism without the intervention of a human? A.) Machine Data -------------------------------------------------------------- *Machine data is produced by a machine rather than a human. For example, time stamps for computer logins are automatically generated. *Web scraping is the act of pulling information from a website and can be done with automation or by hand. Not all websites allow web scraping. *A web service is a type of API that allows a hosted computer on a network to share data back and forth with a computer in the same hosted environment. *An API is a set of protocols within a computer system that allows two unrelated systems to communicate.

Which of the following is defined as the study of developing projects to move from a present state to a desired state? A.) Exploratory Analysis B.) Gap Analysis C.) Link Analysis D.) Performance Analysis

Which of the following is defined as the study of developing projects to move from a present state to a desired state? B.) Gap Analysis -------------------------------------------------------------- *Link analysis determines how a single data point links to other data points and focuses on relationships and connections in a database. *Exploratory analysis should be done on each data set that an analyst encounters. This analysis determines the main characteristics of a data set and identifies what data should be cleaned or transformed for use. *Performance analysis uses both qualitative and quantitative data to measure a particular product, outcome, or scenario against a defined objective.

Which of the following is described as the creation, storage, use, archiving, and deletion of data? A.) Data Steward B.) Data Owner C.) Lifecycle of Data D.) Data Custodian

Which of the following is described as the creation, storage, use, archiving, and deletion of data? C.) Lifecycle of Data -------------------------------------------------------------- *Data has a lifecycle. It's created, stored, used, archived, and deleted. Each stage in the lifecycle of data has different rules and requirements for the data an organization will work with related to the regulations and compliance requirements for the industry. *A data custodian manages the system where data assets are stored. This includes the responsibilities of enforcing access control, encryption, and backup/ recovery measures. *A data steward is fundamentally responsible for data quality. A data steward ensures data is labeled, identified with appropriate metadata, and collected and stored in a format that complies with applicable laws and regulations. *A data owner is a management role. The data owner holds ultimate responsibility for maintaining the confidentiality, integrity, and availability of the data. The owner also normally selects a steward and custodian, delegates their actions, sets a budget, and allocates resources for sufficient controls.

Which of the following is information that has been made available to the public through legal requirements? A.) Relational Data B.) Aggregated Data C.) Publicly Available Information (PAI) D.) Public Data

Which of the following is information that has been made available to the public through legal requirements? D.) Public Data -------------------------------------------------------------- *The U.S. census is a popular example of public data. Public data is information that has been made available to the public through legal requirements. *Publicly Available Information (PAI) is information available to the public even though there is no legal requirement. An example of PAI is research from nonprofits for the benefit of the public. *Aggregated data is data that has already been gathered and analyzed for analysis and reporting. An example of this could be a nonprofit polling citizens in its country on tobacco use and then making the data available to the public after it is analyzed. *Relational data is data which has been formatted for use in a relational database.

Which of the following is necessary when loading data into a data system for the first time? A.) Extract Load Transform (ELT) B.) Extract Transform Load (ETL) C.) A Full Load D.) A Delta Load

Which of the following is necessary when loading data into a data system for the first time? C.) A Full Load -------------------------------------------------------------- *A full load is used when loading data into a storage system for the first time. *A delta load is the act of loading new data into a data system and updating any existing data that has changed since the last load. *The ELT process is a more modern method of preparing data for data lakes. Data lakes are most often used for unstructured data, such as video. *The ETL process is the most common method used to prepare data for a data warehouse.

Which of the following is not legally required by a government but often provided for education or research? A.) Public Data B.) Relational Data C.) Publicly Available Information (PAI) D.) Aggregated Data

Which of the following is not legally required by a government but often provided for education or research? C.) Publicly Available Information (PAI) -------------------------------------------------------------- *Publicly Available Information (PAI) is information available to the public even though there is no legal requirement. An example of PAI is research from nonprofits for the benefit of the public. *The U.S. census is a popular example of public data. Public data is information that has been made available to the public through legal requirements. *Aggregated data is data that has already been gathered and analyzed for analysis and reporting. An example of this could be a nonprofit polling citizens in its country on tobacco use and then making the data available to the public after it is analyzed. *Relational data is data which has been formatted for use in a relational database.

Which of the following is often collected incidentally when a system is designed to capture everything? A.) Redundant Data B.) Duplicated Data C.) Invalid Data D.) Null Values

Which of the following is often collected incidentally when a system is designed to capture everything? A.) Redundant Data -------------------------------------------------------------- *Redundant data is often collected incidentally. The primary challenge an analyst will encounter when working with this is determining which record represents the absolute truth and is the most accurate.

Which of the following is the most common type of delimited file? A.) .csv B.) .txt C.) .tsv D.) .tab

Which of the following is the most common type of delimited file? A.) .csv -------------------------------------------------------------- *.TXT, .TSV, and .TAB files can all hold delimited data, typically separated by tabs, but .CSV (comma-separated values) is the most common.
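
The only real difference between comma-delimited and tab-delimited files is the separator character, as this short sketch with Python's csv module suggests. The sample records are made up, and in-memory strings stand in for actual .csv and .tsv files.

```python
import csv
import io

# Hypothetical contents of a .csv file and an equivalent .tsv file.
csv_text = "id,name\n1,Ada\n2,Grace\n"
tsv_text = "id\tname\n1\tAda\n2\tGrace\n"


def read_delimited(text, delimiter):
    """Parse delimited text into a list of dictionaries keyed by the header row."""
    return list(csv.DictReader(io.StringIO(text), delimiter=delimiter))


csv_rows = read_delimited(csv_text, ",")
tsv_rows = read_delimited(tsv_text, "\t")
print(csv_rows == tsv_rows)  # same records, different delimiter
```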

Which of the following is the most comprehensive programming language for data science? A.) Extensible Markup Language (XML) B.) JavaScript Object Notation (JSON) C.) Hypertext Markup Language (HTML) D.) Structured Query Language (SQL)

Which of the following is the most comprehensive programming language for data science? D.) Structured Query Language (SQL) -------------------------------------------------------------- *Structured Query Language is the most comprehensive programming language for data science. This language has a syntax dedicated to working with data, which makes it well suited to querying and manipulating data sets. *JSON (JavaScript Object Notation) is a lightweight, text-based format for exchanging data; it is not a programming language. *XML is used for data transfers. *HTML is a language dedicated to presenting data in a browser-based environment.

Which of the following is the process of accessing the source data from the system and then converting that data into a format that can be transformed and loaded into a data warehouse? A.) Extraction B.) Transformation C.) Loading D.) Conversion

Which of the following is the process of accessing the source data from the system and then converting that data into a format that can be transformed and loaded into a data warehouse? A.) Extraction -------------------------------------------------------------- *Transformation is the act of making the data more meaningful for the purposes of reporting and decision-making. *Loading data is the process of moving the data into the target destination, such as a data warehouse or data lake. *Conversion is not a recognized term for the Data+ exam in this context.

Which of the following is the process of accessing the source data from the system and then converting that data into a format that can be transformed and loaded into a data warehouse? A.) Loading B.) Conversion C.) Transformation D.) Extraction

Which of the following is the process of accessing the source data from the system and then converting that data into a format that can be transformed and loaded into a data warehouse? D.) Extraction -------------------------------------------------------------- *Transformation is the act of making the data more meaningful for the purposes of reporting and decision-making. *Loading data is the process of moving the data into the target destination, such as a data warehouse or data lake. *Conversion is not a recognized term for the Data+ exam in this context.

Which of the following is the process of loading new data into a data system and updating any existing data that has changed since the last load? A.) Extract Load Transform (ELT) B.) A Delta Load C.) A Full Load D.) Extract Transform Load (ETL)

Which of the following is the process of loading new data into a data system and updating any existing data that has changed since the last load? B.) A Delta Load -------------------------------------------------------------- *A delta load is the act of loading new data into a data system and updating any existing data that has changed since the last load. *A full load is used when loading data into a storage system for the first time. *The ELT process is a more modern method of preparing data for data lakes. Data lakes are most often used for unstructured data, such as video. *The ETL process is the most common method used to prepare data for a data warehouse.
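
The difference between a full load and a delta load can be sketched with a plain dictionary standing in for the target system. The function names, the `id` key, and the rows are all hypothetical.

```python
def full_load(source_rows):
    """First-time load: copy every source row into an empty target."""
    return {row["id"]: dict(row) for row in source_rows}


def delta_load(target, changed_rows):
    """Later load: add new rows and overwrite rows that changed since the last load."""
    for row in changed_rows:
        target[row["id"]] = dict(row)
    return target


# Hypothetical source data for the first load.
initial_rows = [{"id": 1, "qty": 5}, {"id": 2, "qty": 3}]
# Hypothetical changes since then: id 2 was updated, id 3 is new.
changed_rows = [{"id": 2, "qty": 4}, {"id": 3, "qty": 9}]

warehouse = full_load(initial_rows)
delta_load(warehouse, changed_rows)
print(warehouse)
```

Row 1 is untouched by the delta load because it did not change, which is what makes a delta load cheaper than repeating the full load.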

Which of the following makes data processing faster and ultimately speeds up the performance of a query? A.) Aggregate Functions B.) Parsing C.) Indexing D.) Date Functions

Which of the following makes data processing faster and ultimately speeds up the performance of a query? C.) Indexing -------------------------------------------------------------- *Indexing is a field property setting that improves query speed and performance for fields that are commonly queried, sorted, or filtered. *Aggregate functions are written for a group of records, not just for a single record, and work with a column of data. *Date functions derive attributes from date fields, like determining the day of the week, month, or year from a single date. *Parsing breaks and extracts data out of a field for use.
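
One way to see the effect of indexing is to ask the database how it plans to run a query before and after an index exists. The sketch below uses Python's built-in sqlite3 module with a made-up `sales` table; the exact wording of the plan varies between SQLite versions, so only the presence of the index name is checked.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE sales (sale_id INTEGER PRIMARY KEY, region TEXT, amount REAL)"
)
conn.executemany(
    "INSERT INTO sales (region, amount) VALUES (?, ?)",
    [("north", 10.0), ("south", 20.0), ("north", 5.0)],
)


def query_plan(sql):
    """Return SQLite's plan description for a query as a single string."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " ".join(row[-1] for row in rows)  # the last column holds the detail text


query = "SELECT * FROM sales WHERE region = 'north'"

plan_before = query_plan(query)  # no index yet, so the table is scanned
conn.execute("CREATE INDEX idx_sales_region ON sales (region)")
plan_after = query_plan(query)   # the planner can now search via the index

print(plan_before)
print(plan_after)
```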

Which of the following measures of central tendency is the average of a set of numbers? A.) Mode B.) Mean C.) Frequency D.) Median

Which of the following measures of central tendency is the average of a set of numbers? B.) Mean -------------------------------------------------------------- *The mean is the average of a set of numbers. *Frequency is the number of times that a data point occurs in a data set. *The median is the middle number within a group of sorted numbers. *The mode is the number that appears most often in the data set.
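
These measures can be checked on a small, made-up list of scores using Python's standard library.

```python
import statistics
from collections import Counter

# Hypothetical, already sorted scores.
scores = [3, 5, 5, 6, 8, 9]

mean = statistics.mean(scores)      # 36 / 6
median = statistics.median(scores)  # average of the two middle values, 5 and 6
mode = statistics.mode(scores)      # the value that appears most often
frequency = Counter(scores)         # how many times each value occurs

print(mean, median, mode, frequency[5])
```

With an even number of values there is no single middle number, so the median is taken as the average of the two middle values.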

Which of the following provides possible requirements for a query to execute? Select the best answer. A.) Estimated Execution Plan B.) Query Execution Plan C.) Subquery D.) Actual Execution Plan

Which of the following provides possible requirements for a query to execute? Select the best answer. A.) Estimated Execution Plan -------------------------------------------------------------- *The estimated execution plan is a list of possible requirements for executing a query. *A subquery, also known as a nested query, nests a query within another query to reduce data and improve processing performance. This accesses a smaller set of data rather than querying the whole table. *A query execution plan is the order of steps in which a query is processed. Although this includes both an estimated execution plan and actual execution plan within it, the best answer is estimated execution plan because it is more specific.

Which of the following uses both qualitative and quantitative data to measure a particular product, outcome, or scenario against a defined objective? A.) Exploratory Analysis B.) Link Analysis C.) Gap Analysis D.) Performance Analysis

Which of the following uses both qualitative and quantitative data to measure a particular product, outcome, or scenario against a defined objective? D.) Performance Analysis -------------------------------------------------------------- *Performance analysis uses both qualitative and quantitative data to measure a particular product, outcome, or scenario against a defined objective. *Gap analysis is the study of developing projects to move from a present state to a desired state. *Link analysis determines how a single data point links to other data points and focuses on relationships and connections in a database. *Exploratory analysis should be done on each data set that an analyst encounters. This analysis determines the main characteristics of a data set and identifies what data should be cleaned or transformed for use.

Which process is a more modern method of preparing data for data lakes? A.) Extract Transform Load (ETL) B.) Full Load C.) Delta Load D.) Extract Load Transform (ELT)

Which process is a more modern method of preparing data for data lakes? D.) Extract Load Transform (ELT) -------------------------------------------------------------- *The Extract Load Transform (ELT) process is a more modern method of preparing data for data lakes. Data lakes are most often used for unstructured data, such as video. *The Extract Transform Load (ETL) process is the most common method used to prepare data for a data warehouse. *A delta load is the act of loading new data into a data system and updating any existing data that has changed since the last load.

Which programming language is most associated with relational databases? A.) HyperText Markup Language (HTML) B.) Extensible Markup Language (XML) C.) Structured Query Language (SQL) D.) Standard Generalized Markup Language (SGML)

Which programming language is most associated with relational databases? C.) Structured Query Language (SQL) -------------------------------------------------------------- *Structured Query Language (SQL) is used to query and manage data in a relational database. *HyperText Markup Language (HTML) presents data in a browser-based environment. *Extensible Markup Language (XML) is used to transfer data, not display it. *Standard Generalized Markup Language (SGML) is a standard for defining markup languages; HTML was originally based on it.

**Image on other side** </head> <body> <div class="main"> <div class="header"> <div class="block_header"> <div> <div class="clr"></div> <div> Which programming language is the following an excerpt from? A.) Extensible Markup Language (XML) B.) JavaScript Object Notation (JSON) C.) Hypertext Markup Language (HTML) D.) Structured Query Language (SQL)

Which programming language is the following an excerpt from? C.) Hypertext Markup Language (HTML) -------------------------------------------------------------- *The syntax of HTML involves the use of tags enclosed in angle brackets to mark up a document before it is displayed in a web browser. *These tags consist of an opening tag (< >) indicating the start of an element and a closing tag (</ >) indicating the end of the element. *For example, the opening and closing <div> </div> tags in the image.

Which type of function removes non-printable characters from a field? A.) System Functions B.) Text Functions C.) Merge Fields Functions D.) Logical Functions

Which type of function removes non-printable characters from a field? B.) Text Functions -------------------------------------------------------------- *Text functions manipulate data in text-based fields and can remove non-printable characters. *System functions track report related information which removes the need for an analyst to manually add information on page numbers, refresh dates, report names etc. *Merge fields functions combine multiple fields into a single field. *Logical functions will check if a condition is met and return an answer based on the result.
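A text function of this kind could look roughly like the sketch below. The function name clean and the sample string are hypothetical; the sketch relies on Python's str.isprintable, which treats control characters (including tabs) as non-printable.

```python
def clean(text: str) -> str:
    """Drop every character that Python reports as non-printable."""
    return "".join(ch for ch in text if ch.isprintable())


# Hypothetical field value containing a NUL, a bell character, and a tab.
raw = "Pump\x00kin\x07 Farm\t"
cleaned = clean(raw)
print(repr(cleaned))
```

Ordinary spaces count as printable here, so the space between the two words survives.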

William is unsure which term is useful for testing the difference between expected results and actual results of multiple variables. Please assist him and select the best answer that matches this description. A.) Chi Square Statistic B.) Test of Independence C.) Chi Square Test D.) Goodness of Fit

William is unsure which term is useful for testing the difference between expected results and actual results of multiple variables. Please assist him and select the best answer that matches this description. B.) Test of Independence -------------------------------------------------------------- *Both the goodness of fit test and the test of independence are chi-square tests. *A goodness of fit test examines a single variable, and the test of independence is used to test multiple variables. *A chi-square statistic compares the size of the difference between an expected result and the actual result. This measures how a model compares to the actual data. *A chi-square test produces the chi-square statistic and is useful when analyzing data from a random sample and working with a categorical variable, like race or gender.
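To make the chi-square statistic concrete, here is a minimal sketch that computes it for a test of independence on a two-way table of counts. The function name and the observed counts are hypothetical; expected counts are derived in the usual way as row total times column total divided by the grand total.

```python
def chi_square_independence(observed):
    """Return (chi-square statistic, degrees of freedom) for a two-way table."""
    row_totals = [sum(row) for row in observed]
    col_totals = [sum(col) for col in zip(*observed)]
    total = sum(row_totals)
    stat = 0.0
    for i, row in enumerate(observed):
        for j, obs in enumerate(row):
            # Count expected in this cell if the two variables were independent.
            expected = row_totals[i] * col_totals[j] / total
            stat += (obs - expected) ** 2 / expected
    dof = (len(observed) - 1) * (len(observed[0]) - 1)
    return stat, dof


# Hypothetical counts: rows are one categorical variable, columns another.
observed = [[30, 10], [20, 40]]
stat, dof = chi_square_independence(observed)
print(round(stat, 3), dof)
```

A larger statistic means the observed counts sit further from what independence would predict. Turning it into a p-value requires the chi-square distribution, which this sketch leaves out.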

Ximena is learning about invalid data at Dion Data Science Co. Please help her select the answer that describes spaces at the front of a field of information: A.) Masking B.) Trailing Spaces C.) Non-printable Characters D.) Leading Spaces

Ximena is learning about invalid data at Dion Data Science Co. Please help her select the answer that describes spaces at the front of a field of information: D.) Leading Spaces -------------------------------------------------------------- *Leading spaces are spaces at the front of a field. *Data can become invalid through something as simple as a mistake in manual entry, and leading spaces are one example.
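Leading and trailing spaces can be detected and removed as in the sketch below. The field value is a made-up example; str.lstrip and str.rstrip are standard Python string methods.

```python
# Hypothetical field with three leading spaces and two trailing spaces.
field = "   Ximena  "

# Stripping changes the value only when spaces are present on that side.
has_leading = field != field.lstrip(" ")
has_trailing = field != field.rstrip(" ")

# Remove only the leading spaces, leaving the trailing ones in place.
without_leading = field.lstrip(" ")
print(has_leading, has_trailing, repr(without_leading))
```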

Yeager is five foot ten inches tall, yet his doctor's office has his height listed as five foot eight inches tall. This doctor's office database entry would be considered which of the following? A.) Invalid Data B.) Null Values C.) Duplicated Data D.) Redundant Data

Yeager is five foot ten inches tall, yet his doctor's office has his height listed as five foot eight inches tall. This doctor's office database entry would be considered which of the following? A.) Invalid Data -------------------------------------------------------------- *This data is invalid because it is inaccurate. Data can be invalid for many reasons, and there are a number of situations in which an analyst may encounter invalid data.

Zander is a data analyst at the Dion pumpkin farm. Each pumpkin is weighed and recorded in the company database before being sold to local grocery stores during Halloween. Zander has already executed a query on last year's data to provide an estimate of the average weight per pumpkin. When he executes this year's query, what will he use to confirm the computational resources that were used on the data set? A.) Query Execution Plan B.) Actual Execution Plan C.) Estimated Execution Plan D.) Subquery

Zander is a data analyst at the Dion pumpkin farm. Each pumpkin is weighed and recorded in the company database before being sold to local grocery stores during Halloween. Zander has already executed a query on last year's data to provide an estimate of the average weight per pumpkin. When he executes this year's query, what will he use to confirm the computational resources that were used on the data set? B.) Actual Execution Plan -------------------------------------------------------------- *An actual execution plan confirms the requirements used for a query. *The estimated execution plan is a list of possible requirements for executing a query. *A subquery, also known as a nested query, nests a query within another query to reduce data and improve processing performance. This accesses a smaller set of data rather than querying the whole table. *A query execution plan is the order of steps in which a query is processed.

Zander is a data analyst at the Dion pumpkin farm. Each pumpkin is weighed and recorded in the company database before being sold to local grocery stores during Halloween. Zander has already executed a query on last year's data to provide an estimate of the average weight per pumpkin. When he executes this year's query, what will he use to confirm the computational resources that were used on the data set? A.) Query Execution Plan B.) Subquery C.) Estimated Execution Plan D.) Actual Execution Plan

Zander is a data analyst at the Dion pumpkin farm. Each pumpkin is weighed and recorded in the company database before being sold to local grocery stores during Halloween. Zander has already executed a query on last year's data to provide an estimate of the average weight per pumpkin. When he executes this year's query, what will he use to confirm the computational resources that were used on the data set? D.) Actual Execution Plan -------------------------------------------------------------- *Zander will look at the actual execution plan after the query is completed this year. An actual execution plan confirms the requirements used for a query. *The estimated execution plan is a list of possible requirements for executing a query. *A subquery, also known as a nested query, nests a query within another query to reduce data and improve processing performance. This accesses a smaller set of data rather than querying the whole table. *A query execution plan is the order of steps in which a query is processed.
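As a rough illustration of inspecting a plan, the sketch below asks SQLite for its query plan through Python's sqlite3 module. This comes with an important assumption: SQLite's EXPLAIN QUERY PLAN reports the strategy the planner intends to use without running the query, so it is closer to an estimated plan. Engines that expose an actual plan report what happened during execution; PostgreSQL's EXPLAIN ANALYZE, for instance, runs the query and reports measured timings. The table, column, and weights are hypothetical.

```python
import sqlite3

# Hypothetical pumpkin table, invented for illustration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE pumpkins (id INTEGER PRIMARY KEY, weight_kg REAL)")
conn.executemany(
    "INSERT INTO pumpkins (weight_kg) VALUES (?)",
    [(4.5,), (6.0,), (7.5,)],
)

query = "SELECT AVG(weight_kg) FROM pumpkins"

# Planned strategy only: the query itself is not executed by this statement.
plan = conn.execute("EXPLAIN QUERY PLAN " + query).fetchall()
details = [row[-1] for row in plan]  # last column holds the description

# Running the query is a separate step.
average = conn.execute(query).fetchone()[0]
print(details, average)
conn.close()
```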

Zane is a data analyst for the Dion and Bidgood Mortgage Company which operates in both Maryland and Delaware. Zane's supervisor has requested that he look to see if a significant difference in average operating cost exists when comparing offices in the two states. What is Zane's supervisor asking him about? A.) T-test B.) Dependent Variable C.) Independent Variable D.) Population

Zane is a data analyst for the Dion and Bidgood Mortgage Company which operates in both Maryland and Delaware. Zane's supervisor has requested that he look to see if a significant difference in average operating cost exists when comparing offices in the two states. What is Zane's supervisor asking him about? A.) T-test -------------------------------------------------------------- *A t-test is used to determine if there is a significant difference between the means of two groups. *There are two important variables when conducting a t-test: the dependent variable, which is what is being measured, and the independent variable that is different between the groups. *The dependent variable is the main data point and is used to determine the mean, median, mode, and standard deviation. *Population is a group of records that meet a certain criterion.
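The comparison of two group means can be sketched as follows. This computes Welch's t statistic, one common variant of the two-sample t-test that does not assume equal variances; the source does not say which variant is intended. The cost figures and the function name are hypothetical.

```python
from math import sqrt
from statistics import mean, variance

# Hypothetical operating costs per office; state is the independent variable,
# cost is the dependent variable.
maryland = [10.0, 12.0, 14.0]
delaware = [13.0, 15.0, 17.0]


def welch_t(a, b):
    """Welch's t statistic: difference in means over its standard error."""
    standard_error = sqrt(variance(a) / len(a) + variance(b) / len(b))
    return (mean(a) - mean(b)) / standard_error


t_stat = welch_t(maryland, delaware)
print(round(t_stat, 4))
```

Deciding whether the difference is significant means comparing the statistic against a t distribution, which this sketch leaves out.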

Zoe is an amateur marathon runner who works as a data analyst for a dietary supplement retailer. While updating the company database she sees identical data that is stored in multiple places. This is an example of what? A.) Null values B.) Redundant data C.) Duplicate data D.) Invalid data

Zoe is an amateur marathon runner who works as a data analyst for a dietary supplement retailer. While updating the company database she sees identical data that is stored in multiple places. This is an example of what? B.) Redundant data -------------------------------------------------------------- *Redundant data is identical data that is stored in multiple places. *Duplicated data is data that is repeated in the same data set. *A null value means that there is no value in a field. *Invalid data is data that is incorrect.

Zoe is an amateur marathon runner who works as a data analyst for a dietary supplement retailer. While updating the company database she sees identical data that is stored in multiple places. This is an example of what? A.) Invalid Data B.) Duplicate Data C.) Redundant Data D.) Null Values

Zoe is an amateur marathon runner who works as a data analyst for a dietary supplement retailer. While updating the company database she sees identical data that is stored in multiple places. This is an example of what? C.) Redundant Data -------------------------------------------------------------- *Redundant data is identical data that is stored in multiple places. *Duplicated data is data that is repeated in the same data set. *A null value means that there is no value in a field. *Invalid data is data that is incorrect.
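The difference between duplicated and redundant data can be illustrated with the sketch below. Both tables and all of their values are hypothetical: a repeated record inside one table stands in for duplicated data, and an email stored in both tables stands in for redundant data.

```python
from collections import Counter

# Hypothetical tables: (customer_id, email) and (order_id, customer_id, email).
customers = [
    ("C1", "zoe@example.com"),
    ("C2", "max@example.com"),
    ("C1", "zoe@example.com"),  # same record repeated in the same data set
]
orders = [
    ("O1", "C1", "zoe@example.com"),
    ("O2", "C2", "max@example.com"),
]

# Duplicated data: a record that appears more than once within one data set.
duplicates = [record for record, count in Counter(customers).items() if count > 1]

# Redundant data: the same value stored in more than one place
# (here, emails kept in both the customers and the orders tables).
customer_emails = {email for _, email in customers}
redundant = sorted(email for _, _, email in orders if email in customer_emails)

print(duplicates, redundant)
```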

