Disc DA 3

A. Claim number

A data scientist is creating a table in a relational database regarding auto claims. The data fields include claim number, insured last name, date of loss, annual premium, and amount of loss. Which one of the following data fields would the data scientist select as the primary key? Select one: A. Claim number B. Amount of loss C. Insured last name D. Date of loss

C. Claims with the highest number of transactions are likely to provide the data scientist with more complicated life cycles to examine, which, in turn, will help the data scientist prepare for unexpected data and data idiosyncrasies.

A data scientist is selecting claims data for a test sample. Which one of the following best explains why the data scientist chooses claims with the highest number of transactions? Select one: A. Claims with the highest number of transactions will represent how and from whom the data was collected. This will help the data scientist determine whether he or she is making accurate assumptions about policyholders. B. Claims with the highest number of transactions will have a known value for any model's target variable. C. Claims with the highest number of transactions are likely to provide the data scientist with more complicated life cycles to examine, which, in turn, will help the data scientist prepare for unexpected data and data idiosyncrasies. D. Claims with the highest number of transactions will have come from different sources and need less adjustment in the database.

C. Access data of interest on large websites.

A data scientist typically uses an application programming interface (API) to Select one: A. Access United States government information only. B. Communicate with multiple websites simultaneously. C. Access data of interest on large websites. D. Communicate with Facebook users.

B. Creating another index.

An approach to managing problems associated with large amounts of data is Select one: A. Referring to an index many times to speed up the process. B. Creating another index. C. Organizing the data by many different indexes and columns. D. Increasing the amount of normalization.

D. BETWEEN

As a condition used with a WHERE statement, which one of the following is used to denote a range of values that includes the endpoints? Select one: A. NOT B. = C. >= D. BETWEEN
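A minimal sketch of the inclusive range, using Python's built-in sqlite3 module. The claims table and its amounts are hypothetical, made up only for illustration.

```python
import sqlite3

# Hypothetical in-memory claims table; the values are illustrative only.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims (claim_number INTEGER PRIMARY KEY, amount_of_loss REAL)"
)
conn.executemany(
    "INSERT INTO claims VALUES (?, ?)",
    [(1, 500.0), (2, 1000.0), (3, 2500.0), (4, 5000.0), (5, 7500.0)],
)

# BETWEEN keeps both endpoints, so the 1000 and 5000 claims are returned.
between_rows = conn.execute(
    "SELECT claim_number FROM claims "
    "WHERE amount_of_loss BETWEEN 1000 AND 5000 "
    "ORDER BY claim_number"
).fetchall()
print(between_rows)  # [(2,), (3,), (4,)]
```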

D. Attempt to gather a test sample that represents how and from whom the auto data was collected.

Because of the variety of auto coverages available and the possibility that multiple autos are covered by one or more policies, a data scientist who is creating an auto data test sample should Select one: A. Only use well-established data, such as credit scores. B. Test all the data in a large database. C. Choose data for the test sample based on a random attribute, such as policy number. D. Attempt to gather a test sample that represents how and from whom the auto data was collected.

C. The first row in one table will be combined in the same row with the first row in the second table.

By including the names of two tables in the FROM statement in a query using Structured Query Language (SQL), a data scientist can expect that Select one: A. The data will likely be matched correctly. B. The query is likely to produce fewer rows of data than required. C. The first row in one table will be combined in the same row with the first row in the second table. D. Only identified and selected rows in each table will be combined.
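The sketch below illustrates the point in SQLite, assuming two small hypothetical tables. With no matching condition, every row of one table is paired with every row of the other, so rows are combined whether or not they belong together. Adding a WHERE condition on the shared column restores the correct matches.

```python
import sqlite3

# Hypothetical policy and claims tables, for illustration only.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE policies (policy_id INTEGER PRIMARY KEY, insured TEXT);
    CREATE TABLE claims (claim_number INTEGER PRIMARY KEY, policy_id INTEGER);
    INSERT INTO policies VALUES (1, 'Adams'), (2, 'Baker');
    INSERT INTO claims VALUES (101, 1), (102, 2), (103, 2);
""")

# Two tables in FROM with no condition: 2 policies x 3 claims = 6 rows.
unmatched = conn.execute(
    "SELECT insured, claim_number FROM policies, claims"
).fetchall()

# Same FROM list, but limited to rows whose policy_id values agree: 3 rows.
matched = conn.execute(
    "SELECT insured, claim_number FROM policies, claims "
    "WHERE policies.policy_id = claims.policy_id"
).fetchall()

print(len(unmatched), len(matched))  # 6 3
```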

B. Has to match and combine information from the tables.

By performing a query to obtain data from a policy table and a claims data table without using Structured Query Language (SQL), a data scientist Select one: A. Efficiently processes data for thousands of policies and claims. B. Has to match and combine information from the tables. C. Performs a query for each table, saving time and effort. D. Can seamlessly combine the tables together.

B. Unknown data at the present time may be considered null.

Concerning data scientists and null values in Structured Query Language (SQL), Select one: A. Tables can only be designed using the null default. B. Unknown data at the present time may be considered null. C. "Null" indicates, without exception, no value for a particular data field. D. Awareness of the null default is unimportant.

A. Purchasing but not yet paying for the insurance.

In Structured Query Language (SQL), an insurer's table of current auto policy customers has a column that indicates the date a premium payment was received. This data field is null for customers who are Select one: A. Purchasing but not yet paying for the insurance. B. Not purchasing the insurance. C. Choosing to remain with the insurer. D. Not reporting any claims.
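A short sketch of the null payment date, assuming a hypothetical auto_policies table in SQLite. Note that null is tested with IS NULL; a comparison with = NULL matches nothing.

```python
import sqlite3

# Hypothetical table of current auto policy customers; data is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE auto_policies ("
    "policy_id INTEGER PRIMARY KEY, insured TEXT, payment_received TEXT)"
)
conn.executemany(
    "INSERT INTO auto_policies VALUES (?, ?, ?)",
    [
        (1, "Adams", "2024-01-15"),
        (2, "Baker", None),  # purchased, payment not yet received
        (3, "Clark", "2024-02-01"),
    ],
)

# IS NULL finds the customers whose payment date is still unknown.
unpaid = conn.execute(
    "SELECT insured FROM auto_policies WHERE payment_received IS NULL"
).fetchall()

# An equality test against NULL is never true, so this returns no rows.
equals_null = conn.execute(
    "SELECT insured FROM auto_policies WHERE payment_received = NULL"
).fetchall()

print(unpaid, equals_null)  # [('Baker',)] []
```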

D. Be a column serving as a unique identifier.

In using Structured Query Language (SQL), a primary key can Select one: A. Rarely be the basis for a designated index. B. Consist of a single column of data. C. Include null values. D. Be a column serving as a unique identifier.
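As an illustration of a primary key acting as a unique identifier, the sketch below uses SQLite with a hypothetical claims table. A repeated last name is accepted, but a repeated claim number is rejected.

```python
import sqlite3

# Hypothetical claims table keyed on claim_number; data is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims ("
    "claim_number INTEGER PRIMARY KEY, insured_last_name TEXT)"
)
conn.execute("INSERT INTO claims VALUES (1001, 'Adams')")
conn.execute("INSERT INTO claims VALUES (1002, 'Adams')")  # same name is fine

try:
    # Reusing claim number 1001 violates the primary key constraint.
    conn.execute("INSERT INTO claims VALUES (1001, 'Baker')")
    duplicate_rejected = False
except sqlite3.IntegrityError:
    duplicate_rejected = True

row_count = conn.execute("SELECT COUNT(*) FROM claims").fetchone()[0]
print(duplicate_rejected, row_count)  # True 2
```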

B. The statements are not case-sensitive.

Joel is a data scientist learning how to query data from a database using Structured Query Language (SQL). He is studying the statements in his manager's query. Which one of the following best represents what he would have learned? Select one: A. Use the symbol "=" to end the query. B. The statements are not case-sensitive. C. Group functions are the same as aggregate functions. D. To obtain data from an SQL database, start with the FROM statement.

D. Performing a series of tests on the data.

Quantitative understanding is gained by Select one: A. Testing all the data. B. Talking to various professionals. C. Having qualitative understanding of the data. D. Performing a series of tests on the data.

C. Programming language.

R is a Select one: A. Data warehouse. B. Spreadsheet. C. Programming language. D. Relational database.

C. Err on the side of being complex rather than simplified.

Test data construction should Select one: A. Be as simplified as possible. B. Include all the data in a large database. C. Err on the side of being complex rather than simplified. D. Allow quantitative, but not qualitative, understanding.

B. The standards of atomicity, consistency, isolation, and durability.

Testing data can help ensure that the data adheres to Select one: A. An overly simplified data understanding that is applicable to the real world. B. The standards of atomicity, consistency, isolation, and durability. C. Retrospective rating policies. D. The defects in the database.

C. FROM.

The simplest way to combine data from two tables to retrieve needed data using Structured Query Language (SQL) is to include the names of both tables in which one of the following statements in the query? Select one: A. EQUAL JOIN. B. SELECT. C. FROM. D. WHERE.

B. DISTINCT

To eliminate duplicate data, a data scientist uses which one of the following with the SELECT statement? Select one: A. WHERE B. DISTINCT C. FROM D. COUNT
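A minimal sketch of SELECT DISTINCT in SQLite, assuming a hypothetical claims table in which some last names repeat.

```python
import sqlite3

# Hypothetical claims table with repeated last names; data is illustrative.
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims ("
    "claim_number INTEGER PRIMARY KEY, insured_last_name TEXT)"
)
conn.executemany(
    "INSERT INTO claims VALUES (?, ?)",
    [(1, "Adams"), (2, "Baker"), (3, "Adams"), (4, "Clark"), (5, "Baker")],
)

# Without DISTINCT, every row is returned, duplicates included.
all_names = conn.execute(
    "SELECT insured_last_name FROM claims ORDER BY insured_last_name"
).fetchall()

# With DISTINCT, each last name appears once.
distinct_names = conn.execute(
    "SELECT DISTINCT insured_last_name FROM claims ORDER BY insured_last_name"
).fetchall()

print(len(all_names), distinct_names)  # 5 [('Adams',), ('Baker',), ('Clark',)]
```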

B. Uses an element to indicate the path to the file.

To import an HTML file from the internet, a data scientist Select one: A. Must request the data in the HTML language only. B. Uses an element to indicate the path to the file. C. Uses a form to provide structure for the data. D. Uses an API to transfer the data.

A. Unstructured data into structured data.

Web scraping transforms Select one: A. Unstructured data into structured data. B. Internet data into a library. C. Small amounts of data from the internet. D. Structured data into relational databases.
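The idea can be sketched with Python's standard-library html.parser, used here as a stand-in for a dedicated scraping library. The HTML snippet and the CellCollector class are hypothetical; a real scraper would fetch the page first and handle messier markup.

```python
from html.parser import HTMLParser


class CellCollector(HTMLParser):
    """Collects the text of table cells, one list per table row."""

    def __init__(self):
        super().__init__()
        self.rows = []
        self._current_row = None
        self._in_cell = False

    def handle_starttag(self, tag, attrs):
        if tag == "tr":
            self._current_row = []
        elif tag in ("td", "th") and self._current_row is not None:
            self._in_cell = True
            self._current_row.append("")

    def handle_data(self, data):
        # Only text inside a cell is kept; whitespace between tags is ignored.
        if self._in_cell:
            self._current_row[-1] += data.strip()

    def handle_endtag(self, tag):
        if tag in ("td", "th"):
            self._in_cell = False
        elif tag == "tr" and self._current_row is not None:
            self.rows.append(self._current_row)
            self._current_row = None


# Hypothetical page fragment standing in for downloaded HTML.
sample_html = """
<table>
  <tr><th>state</th><th>claims</th></tr>
  <tr><td>PA</td><td>12</td></tr>
  <tr><td>NJ</td><td>7</td></tr>
</table>
"""

parser = CellCollector()
parser.feed(sample_html)

# First row is the header; the rest become structured records.
header, *body = parser.rows
records = [dict(zip(header, row)) for row in body]
print(records)
```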

C. Must have known values for the model's target variable.

When a database is eventually used for modeling, both the training data and the test data Select one: A. Should be simple. B. Must have binary target variables. C. Must have known values for the model's target variable. D. Will be used to test the model.

B. Primary key and foreign key.

When joining two tables, the unique identifier in the primary table and the unique identifier in the other table, to which the primary data is joined, are respectively the Select one: A. Primary key and secondary key. B. Primary key and foreign key. C. RIGHT JOIN and LEFT JOIN. D. EQUAL JOIN and FULL OUTER JOIN.
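A short sketch of the two key roles in SQLite, with hypothetical tables: policy_id is the primary key of policies and appears in claims as a foreign key, and the join matches rows on it.

```python
import sqlite3

# Hypothetical tables; policy_id links each claim back to its policy.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE policies (policy_id INTEGER PRIMARY KEY, insured TEXT);
    CREATE TABLE claims (
        claim_number INTEGER PRIMARY KEY,
        policy_id INTEGER REFERENCES policies(policy_id),  -- foreign key
        amount_of_loss REAL
    );
    INSERT INTO policies VALUES (1, 'Adams'), (2, 'Baker');
    INSERT INTO claims VALUES (101, 1, 2500.0), (102, 2, 12000.0), (103, 2, 800.0);
""")

# Join the primary key in policies to the foreign key in claims.
joined = conn.execute(
    "SELECT p.insured, c.claim_number "
    "FROM policies AS p "
    "JOIN claims AS c ON c.policy_id = p.policy_id "
    "ORDER BY c.claim_number"
).fetchall()
print(joined)  # [('Adams', 101), ('Baker', 102), ('Baker', 103)]
```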

A. The tables in a relational database have a logical connection to each other.

Which one of the following best describes the table structure in a relational database? Select one: A. The tables in a relational database have a logical connection to each other. B. Each row in the table corresponds to a data field. C. Insurers exclusively make use of relational databases. D. Each column in the table represents a unique instance of data.

A. Beautiful Soup

Which one of the following can a data scientist use to read web languages and load their data into Python? Select one: A. Beautiful Soup B. R C. SQL D. HTML

D. Beautiful Soup

Which one of the following can be used for web scraping? Select one: A. Application Programming Interface B. HTML C. XML D. Beautiful Soup

C. Extract, transform, and load (ETL)

Which one of the following is a process a data scientist uses to transform data into the appropriate format for a data warehouse? Select one: A. Business intelligence B. Query C. Extract, transform, and load (ETL) D. Structured Query Language (SQL)

C. A data scientist can obtain HTML data from multiple websites on a repeated basis.

Which one of the following is correct regarding how a data scientist can obtain HTML data from the internet? Select one: A. A data scientist must request all the information contained on a website. B. A data scientist can only obtain HTML data from one website at a time. C. A data scientist can obtain HTML data from multiple websites on a repeated basis. D. A data scientist must submit a request each time HTML data is needed.

D. Custom APIs are permitted if they conform to the website owner's guidelines.

Which one of the following is correct regarding the APIs that data scientists use? Select one: A. The only APIs available are those designed and permitted by a website's owner. B. Only the United States government allows APIs to be used on its websites. C. There are no specific rules or guidelines regarding how APIs can be used. D. Custom APIs are permitted if they conform to the website owner's guidelines.

B. Excel is suitable for smaller projects, but not for large projects.

Which one of the following is correct regarding the use of Excel to analyze data? Select one: A. Excel can be used to analyze any type of data. B. Excel is suitable for smaller projects, but not for large projects. C. Excel cannot be used to analyze data. D. Excel is suitable for large projects, but not for smaller projects.

B. The optimization feature automatically selects the most effective index.

Which one of the following is true regarding indexes, one of the features of Structured Query Language (SQL)? Select one: A. A primary key is considered effective for any large table. B. The optimization feature automatically selects the most effective index. C. Small tables generally require indexes. D. A difficulty for data scientists is that few database servers allow users to look at indexes.

C. SQL has become a standard for relational databases.

Which one of the following is true regarding querying data using Structured Query Language (SQL)? Select one: A. SQL is widely used to work with data in any type of database. B. Few other programming languages include SQL capability. C. SQL has become a standard for relational databases. D. SQL is flexible without the rules and syntax required of other languages.

D. FULL JOIN

Which one of the following statements includes all rows from two tables being joined, in which common rows between the two tables are matched together and rows found on only one table include NULL values for the data fields from the other table? Select one: A. LEFT JOIN B. RIGHT JOIN C. EQUAL JOIN D. FULL JOIN
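The result of a full join can be sketched in SQLite as follows. Because older SQLite versions lack FULL JOIN, this sketch emulates it with a LEFT JOIN plus the unmatched rows from the other table; that emulation is an assumption made for portability, not the only way to write it. Tables and data are hypothetical.

```python
import sqlite3

# Hypothetical tables: policy 3 has no claim, claim 104 has no policy.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE policies (policy_id INTEGER PRIMARY KEY, insured TEXT);
    CREATE TABLE claims (claim_number INTEGER PRIMARY KEY, policy_id INTEGER);
    INSERT INTO policies VALUES (1, 'Adams'), (2, 'Baker'), (3, 'Clark');
    INSERT INTO claims VALUES (101, 1), (102, 2), (104, 9);
""")

# All policies with their claims (NULL claim where none exists), plus
# any claims that matched no policy (NULL policy fields).
full_rows = conn.execute("""
    SELECT p.policy_id, p.insured, c.claim_number
    FROM policies AS p
    LEFT JOIN claims AS c ON c.policy_id = p.policy_id
    UNION ALL
    SELECT p.policy_id, p.insured, c.claim_number
    FROM claims AS c
    LEFT JOIN policies AS p ON p.policy_id = c.policy_id
    WHERE p.policy_id IS NULL
""").fetchall()

for row in full_rows:
    print(row)
```

Matched rows carry values from both tables, while the Clark policy and claim 104 each appear once with None (NULL) in the fields from the other table.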

D. When multiple columns are designated in an index, SQL will search according to the order in which the columns are listed.

Which one of the following statements is true regarding Structured Query Language (SQL) advanced topics? Select one: A. Multiple columns can be used in designating a primary key, but not used in creating an index. B. A growing number of unique indexes helps guarantee efficient database performance. C. With all implementations of SQL, user-defined functions (UDFs) require a different language. D. When multiple columns are designated in an index, SQL will search according to the order in which the columns are listed.
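One way to see why column order in an index matters is to ask SQLite for its query plan. The sketch below assumes a hypothetical claims table and index name, and the exact plan wording varies between SQLite versions, so treat the printed text as indicative only.

```python
import sqlite3

# Hypothetical claims table with a two-column index on (state, loss_year).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims ("
    "claim_number INTEGER PRIMARY KEY, state TEXT, "
    "loss_year INTEGER, amount_of_loss REAL)"
)
conn.execute("CREATE INDEX idx_state_year ON claims (state, loss_year)")


def plan(sql):
    """Return the detail text of SQLite's query plan for the given query."""
    rows = conn.execute("EXPLAIN QUERY PLAN " + sql).fetchall()
    return " | ".join(row[-1] for row in rows)


# Filtering on the first indexed column can use the index.
leading_plan = plan("SELECT * FROM claims WHERE state = 'PA'")

# Filtering only on the second indexed column typically cannot.
trailing_plan = plan("SELECT * FROM claims WHERE loss_year = 2020")

print(leading_plan)
print(trailing_plan)
```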

C. SELECT

Which one of the following statements specifies the order in which requested columns will appear? Select one: A. OR B. AND C. SELECT D. WHERE
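A brief sketch in SQLite, with a hypothetical claims table: the result columns come back in the order listed after SELECT, not the order in which the table defines them.

```python
import sqlite3

# Hypothetical table defined as (claim_number, date_of_loss, amount_of_loss).
conn = sqlite3.connect(":memory:")
conn.execute(
    "CREATE TABLE claims ("
    "claim_number INTEGER PRIMARY KEY, date_of_loss TEXT, amount_of_loss REAL)"
)
conn.execute("INSERT INTO claims VALUES (101, '2024-03-01', 2500.0)")

# The SELECT list asks for amount_of_loss first, then claim_number.
cursor = conn.execute("SELECT amount_of_loss, claim_number FROM claims")
column_order = [description[0] for description in cursor.description]
first_row = cursor.fetchone()

print(column_order, first_row)  # ['amount_of_loss', 'claim_number'] (2500.0, 101)
```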

C. Packages

Which one of the following types of software can a data scientist use to enable R to read data directly from another source? Select one: A. Relational database B. Excel C. Packages D. SQL

B. WHERE

With an EQUAL JOIN statement, which one of the following is used to denote the conditions that specifically limit the data to be retrieved (such as greater than a loss of $10,000)? Select one: A. FROM B. WHERE C. JOIN D. SELECT
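A minimal sketch of a join combined with a limiting condition, in SQLite with hypothetical tables. The JOIN matches policies to claims, and WHERE then keeps only losses greater than $10,000.

```python
import sqlite3

# Hypothetical policy and claims tables; amounts are illustrative.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE policies (policy_id INTEGER PRIMARY KEY, insured TEXT);
    CREATE TABLE claims (
        claim_number INTEGER PRIMARY KEY,
        policy_id INTEGER,
        amount_of_loss REAL
    );
    INSERT INTO policies VALUES (1, 'Adams'), (2, 'Baker');
    INSERT INTO claims VALUES (101, 1, 2500.0), (102, 2, 12000.0), (103, 2, 800.0);
""")

# JOIN pairs each claim with its policy; WHERE limits the rows retrieved.
large_losses = conn.execute(
    "SELECT p.insured, c.claim_number, c.amount_of_loss "
    "FROM policies AS p "
    "JOIN claims AS c ON c.policy_id = p.policy_id "
    "WHERE c.amount_of_loss > 10000"
).fetchall()
print(large_losses)  # [('Baker', 102, 12000.0)]
```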

