SQL and Python

Select all of the valid math operators in SQL (select all that apply).

- + (addition)
- * (multiplication)
- - (subtraction)
- / (division)
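As a quick check of the four operators, here is a minimal sketch that runs them through Python's built-in sqlite3 module. The choice of SQLite and the literal values are assumptions made for illustration; division of two integers truncates in SQLite, and other database systems may return a decimal instead.

```python
import sqlite3

# In-memory SQLite database; nothing is written to disk.
conn = sqlite3.connect(":memory:")

# One SELECT exercising +, -, *, and / on literal values.
row = conn.execute("SELECT 7 + 2, 7 - 2, 7 * 2, 7 / 2, 7 / 2.0").fetchone()
print(row)  # (9, 5, 14, 3, 3.5) in SQLite: 7 / 2 is integer division

conn.close()
```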

Which of the following are Data Integration and Transformation tools? (Select all that apply.)

- Apache Kafka
- Apache NiFi
- Apache Airflow

Select the statements below that ARE NOT true of the ORDER BY clause (select all that apply).

- Can be anywhere in the select statement

IBM SPSS Modeler includes what kind of models?

- Classification models (for data with a categorical target)
- Regression models (for data with a continuous target)
- Clustering models (for data with no target variables)
- Other kinds of models

Which of the following is an aggregate function? (select all that apply)

- COUNT()
- MIN() and MAX()

What are common tasks in data science?

- Data Management
- Data Integration and Transformation
- Data Visualization
- Model Building
- Model Development
- Model Monitoring and Assessment

Which of the following is true of GROUP BY clauses? (Select all that apply.)

- Every column in your SELECT statement must be present in the GROUP BY clause, except for aggregated calculations.
- GROUP BY clauses can contain multiple columns.
- NULLs will be grouped together if your GROUP BY column contains NULLs.
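The GROUP BY points above can be seen in a small sketch using Python's sqlite3 module. The orders table, its columns, and its rows are hypothetical and exist only for this illustration.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table: two rows have a NULL region.
conn.execute("CREATE TABLE orders (region TEXT, product TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO orders VALUES (?, ?, ?)",
    [
        ("east", "pen", 10),
        ("east", "pen", 5),
        ("east", "ink", 7),
        (None, "pen", 3),
        (None, "pen", 4),
    ],
)

# Both non-aggregated columns appear in GROUP BY; SUM() does not need to.
rows = conn.execute(
    "SELECT region, product, SUM(amount) AS total "
    "FROM orders GROUP BY region, product"
).fetchall()

for row in rows:
    print(row)  # the two NULL-region rows collapse into one group

conn.close()
```

The two rows with a NULL region end up in a single group with a total of 7, alongside one group per distinct (region, product) pair.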

Profiling data is helpful for which of the following? (Select all that apply)

- Filter out unwanted data elements
- Understanding your data

When learning data science what open source tools are the most used?

- Jupyter Notebooks / JupyterLab
- RStudio

Which of the following are supported in SQL when dealing with strings?

- Lower
- Concatenate
- Substring
- Trim
- Upper
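A minimal sketch of these string operations, run through Python's sqlite3 module. The function spellings shown (SUBSTR, and || for concatenation) are SQLite's; other database systems may spell them differently, for example SUBSTRING or CONCAT.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Lower, upper, trim, substring (1-indexed start, length), and concatenation.
row = conn.execute(
    "SELECT LOWER('SQL'), UPPER('sql'), TRIM('  sql  '), "
    "SUBSTR('database', 1, 4), 'data' || 'base'"
).fetchone()
print(row)  # ('sql', 'SQL', 'sql', 'data', 'database')

conn.close()
```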

Which statements are true about Open Source and Free Software? (Select all that apply.)

- Most Free Software licenses also qualify as Open Source.
- Open Source Software can be modified without sharing the modified source code, depending on the Open Source license.

Examples of data management tools

- MySQL
- PostgreSQL

Examples of SQL Databases

- MySQL
- PostgreSQL
- Oracle

Filtering data is used to do which of the following? (select all that apply)

- Narrows down the results of the data
- Reduces the time it takes to run the query
- Removes unwanted data in a calculation

Which of the following is used to make Artificial Intelligence and Machine Learning possible? (Select all that apply.)

- PyTorch
- TensorFlow.js
- Apache Spark

Which are the three most used languages for data science? (Select all that apply.)

- Python
- R
- SQL

Which of the following languages can be used for data science?

- R
- Julia
- Java
- JavaScript
- Scala
- SQL

Which tool do most Python developers use?

- Jupyter Notebooks / JupyterLab

Case statements can only be used for which of the following statements (select all that apply)?

- Select
- Insert

Which of the following statements are true of Entity Relationship (ER) Diagrams?

- They show you the relationships between tables.
- They are usually a representation of a business process.

Which statements about IBM Watson Studio and OpenScale are correct? (Select all that apply.)

- Watson Studio together with Watson OpenScale covers the complete development life cycle for all data science, machine learning, and AI tasks.
- Watson Studio together with Watson OpenScale is available as a cloud offering, and also as a package called IBM Cloud Pak for Data that runs on top of Kubernetes / Red Hat OpenShift in a local data center.

Select all that are true regarding wildcards (Select all that apply.)

- Wildcards take longer to run compared to a logical operator
- Wildcards at the end of search patterns take longer to run

Which of the following is true regarding Aliases? (Select all that apply.)

- An alias only exists for the duration of a query
- Aliases are often used to make column names more readable
- SQL aliases are used to give a table, or a column in a table, a temporary name
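To illustrate the alias points above, here is a small sketch using Python's sqlite3 module. The employees table and its terse column names (emp_nm, sal) are hypothetical, chosen so the aliases have something to improve on.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table with hard-to-read column names.
conn.execute("CREATE TABLE employees (emp_nm TEXT, sal INTEGER)")
conn.execute("INSERT INTO employees VALUES ('Ada', 100)")

# 'e' is a table alias; employee_name and annual_salary are column aliases.
cur = conn.execute(
    "SELECT e.emp_nm AS employee_name, e.sal * 12 AS annual_salary "
    "FROM employees AS e"
)
columns = [d[0] for d in cur.description]  # names as seen in the result set
rows = cur.fetchall()

# The aliases lasted only for that query: the stored schema is unchanged.
table_cols = [r[1] for r in conn.execute("PRAGMA table_info(employees)")]

print(columns)     # ['employee_name', 'annual_salary']
print(rows)        # [('Ada', 1200)]
print(table_cols)  # ['emp_nm', 'sal']

conn.close()
```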

Select which of the following statements are true regarding inner joins. (Select all that apply)

- Performance will most likely worsen with the more joins you make
- There is no limit to the number of tables you can join with an inner join

Which of the following statements about Unions is true? (select all that apply)

- The columns must also have similar data types
- Each SELECT statement within UNION must have the same number of columns
- The UNION operator is used to combine the result set of two or more SELECT statements
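The UNION rules above can be tried out with a short sketch in Python's sqlite3 module. The customers and suppliers tables are hypothetical. Note one assumption about the engine: SQLite is dynamically typed and does not enforce the similar-data-types rule, so only the column-count rule is demonstrated here.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Two hypothetical tables with the same column layout.
conn.execute("CREATE TABLE customers (name TEXT, city TEXT)")
conn.execute("CREATE TABLE suppliers (name TEXT, city TEXT)")
conn.executemany("INSERT INTO customers VALUES (?, ?)",
                 [("Ann", "Oslo"), ("Bob", "Rome")])
conn.executemany("INSERT INTO suppliers VALUES (?, ?)",
                 [("Bob", "Rome"), ("Cy", "Lima")])

# UNION combines both result sets and drops the duplicate ('Bob', 'Rome') row.
rows = conn.execute(
    "SELECT name, city FROM customers "
    "UNION "
    "SELECT name, city FROM suppliers "
    "ORDER BY name"
).fetchall()
print(rows)  # [('Ann', 'Oslo'), ('Bob', 'Rome'), ('Cy', 'Lima')]

# A different number of columns on each side is rejected.
try:
    conn.execute("SELECT name, city FROM customers UNION SELECT name FROM suppliers")
    mismatch_rejected = False
except sqlite3.OperationalError:
    mismatch_rejected = True
print(mismatch_rejected)  # True

conn.close()
```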

What type of file format is a Jupyter Notebook?

.ipynb

What type of node is used to partition the data into a training and testing set in Modeler flows?

A partition node

What type of node is used to define metadata for features in Modeler flows?

A type node

Which of the following statements is true?

All of the above

What are some of the Data Refinery abilities when working with data?

Analyzes and transforms data quickly

What does the "BI" in BI Tools stand for?

Business Intelligence

What type of model would you use if you wanted to find the relationship between dependent and independent variables?

Regression model

How does Data Refinery help build repeatable Data Pipelines for workloads of almost any size?

Create a scheduled Job and use a custom environment to run the data flow/pipeline on different workloads.

What is data governance?

Creating processes and controls around the access of data

How does a data scientist and DBA differ in how they use SQL?

DBAs manage the database for other users

Fill in the blank: ________________ is the heart of every organization.

Data

Open Neural Network eXchange (ONNX) was originally created for what models?

Deep learning models.

A null and a zero value effectively mean the same thing. True or false?

False
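A minimal sketch of why NULL and zero differ, using Python's sqlite3 module. The scores table and its values are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table holding one real value, one zero, and one NULL.
conn.execute("CREATE TABLE scores (points INTEGER)")
conn.executemany("INSERT INTO scores VALUES (?)", [(10,), (0,), (None,)])

# COUNT(*) counts rows; COUNT(points) and AVG(points) skip the NULL.
count_all, count_points, avg_points = conn.execute(
    "SELECT COUNT(*), COUNT(points), AVG(points) FROM scores"
).fetchone()

# Zero matches '= 0'; NULL only matches 'IS NULL'.
zero_rows = conn.execute("SELECT COUNT(*) FROM scores WHERE points = 0").fetchone()[0]
null_rows = conn.execute("SELECT COUNT(*) FROM scores WHERE points IS NULL").fetchone()[0]

print(count_all, count_points, avg_points)  # 3 2 5.0
print(zero_rows, null_rows)                 # 1 1

conn.close()
```

The average is 5.0 rather than 3.33 because the NULL row is ignored, whereas the zero is included in the calculation.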

True or False: The Jupyter Notebook kernel must be installed on a local server.

False

You are only allowed to have one condition in a case statement. True or false?

False
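As an illustration of a CASE statement with more than one condition, here is a short sketch using Python's sqlite3 module. The tracks table and the length thresholds are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical table of track lengths in seconds.
conn.execute("CREATE TABLE tracks (title TEXT, seconds INTEGER)")
conn.executemany("INSERT INTO tracks VALUES (?, ?)",
                 [("a", 90), ("b", 200), ("c", 400)])

# Two WHEN conditions plus an ELSE; the first matching condition wins.
rows = conn.execute(
    "SELECT title, "
    "CASE WHEN seconds < 120 THEN 'short' "
    "     WHEN seconds < 300 THEN 'medium' "
    "     ELSE 'long' END AS length_class "
    "FROM tracks ORDER BY title"
).fetchall()
print(rows)  # [('a', 'short'), ('b', 'medium'), ('c', 'long')]

conn.close()
```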

_________ filters after the data is grouped

HAVING
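The difference between filtering before and after grouping can be sketched with Python's sqlite3 module. The sales table, its rows, and the thresholds are hypothetical.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical sales rows.
conn.execute("CREATE TABLE sales (region TEXT, amount INTEGER)")
conn.executemany(
    "INSERT INTO sales VALUES (?, ?)",
    [("east", 10), ("east", 20), ("west", 5), ("west", 2), ("north", 50)],
)

# WHERE drops individual rows first (the west row of 2),
# then HAVING drops whole groups after SUM() is computed (west, total 5).
rows = conn.execute(
    "SELECT region, SUM(amount) AS total "
    "FROM sales "
    "WHERE amount > 3 "
    "GROUP BY region "
    "HAVING SUM(amount) >= 30 "
    "ORDER BY region"
).fetchall()
print(rows)  # [('east', 30), ('north', 50)]

conn.close()
```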

What type of environment is RStudio?

Integrated Development Environment (IDE)

If you can accomplish the same outcome with a join or a subquery, which one should you always choose?

Joins are usually faster, but subqueries can be more reliable, so it depends on your situation.

Which tool unifies documentation, source code and data visualizations into a single document?

Jupyter Notebooks / JupyterLab

Which statement about JupyterLab is correct?

JupyterLab can run R and Python code in addition to other programming languages.

What type of assets does the Watson Knowledge Catalog let you discover in Watson Studio?

Machine Learning

PyTorch is what type of Python library?

Machine learning

Jupyter Notebooks is the tool most R developers use.

No

Storing data in tables is a function that RStudio provides.

No

What open source tool was developed and built by statisticians?

RStudio

What tool do most R developers use?

RStudio

Which statement about RStudio is correct?

RStudio is the primary choice for development in the R programming language.

SQL is what type of database management system?

Relational

Data scientists need to use joins in order to: (select the best answer)

Retrieve data from multiple tables.

In order to retrieve data from a table with SQL, every SQL statement must contain?

SELECT

Which of these is a database query language?

SQL

Which of these is a machine learning or deep learning library for Python?

Scikit-learn

When debugging a query, what should you always remember to do first?

Start simple and break it down first

Comma Separated Values (CSV) is a commonly used format to store:

Tabular data

What is the difference between a left join and a right join?

The only difference between a left and right join is the order in which the tables are related.
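One way to see this is to write the same LEFT JOIN twice with the table order swapped, sketched below with Python's sqlite3 module. The authors and books tables are hypothetical. RIGHT JOIN itself is avoided here because older SQLite builds do not support it; the second query returns what a right join of authors to books would.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical tables: Bob has no book, and one book has no matching author.
conn.execute("CREATE TABLE authors (id INTEGER, name TEXT)")
conn.execute("CREATE TABLE books (author_id INTEGER, title TEXT)")
conn.executemany("INSERT INTO authors VALUES (?, ?)", [(1, "Ann"), (2, "Bob")])
conn.executemany("INSERT INTO books VALUES (?, ?)", [(1, "SQL 101"), (3, "Orphan")])

# Left table is authors: every author is kept, unmatched books are dropped.
authors_left = conn.execute(
    "SELECT a.name, b.title FROM authors AS a "
    "LEFT JOIN books AS b ON a.id = b.author_id"
).fetchall()

# Left table is books: every book is kept, unmatched authors are dropped.
books_left = conn.execute(
    "SELECT a.name, b.title FROM books AS b "
    "LEFT JOIN authors AS a ON a.id = b.author_id"
).fetchall()

print(authors_left)
print(books_left)

conn.close()
```

With authors on the left, Bob appears with a NULL title; with books on the left, the unmatched book appears with a NULL author name.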

Is the following statement true or false: R integrates well with other computer languages like C++, Java, C, .Net and Python.

True

True or false? Jupyter Notebooks / JupyterLab support development in R.

True

True or false? RStudio supports development in Python.

True

What is the most important step before beginning to write queries?

Understanding your data

Data Refinery provides which of the following services?

Visualize and prepare data.

Is Keras a machine learning or deep learning library for Python?

Yes

Is it possible to use machine learning within a web browser with Javascript?

Yes

RStudio is the tool most R developers use.

Yes

When using the "CREATE TABLE" command and creating new columns for that table, which of the following statements is true?

You must assign a data type to each column

Fill in the blank: It's a best practice to remove or replace _____________ before publishing to GitHub.

credentials

Fill in the blank: In the __________ tab you can define the hardware size and software configuration for the runtime associated with Watson Studio tools such as Notebook.

environments

Which command is used to install packages in R?

install.packages("package name")

Fill in the blank: If you'd like to schedule a notebook in Watson Studio to run at a different time, you can create a(n) ________.

job

Which is the correct order of occurrence in a SQL statement?

select, from, where, group by, having

_________ always process the innermost query first and then work outward

Subqueries
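A small sketch of the inside-out reading of a subquery, using Python's sqlite3 module. The employees table and salaries are hypothetical; the inner query is also run on its own so its value can be compared with the outer filter.

```python
import sqlite3

conn = sqlite3.connect(":memory:")

# Hypothetical salaries.
conn.execute("CREATE TABLE employees (name TEXT, salary INTEGER)")
conn.executemany("INSERT INTO employees VALUES (?, ?)",
                 [("Ann", 100), ("Bob", 200), ("Cy", 300)])

# Inner query on its own: the average salary.
inner_value = conn.execute("SELECT AVG(salary) FROM employees").fetchone()[0]

# Outer query: conceptually, the inner result is computed first,
# then used as the threshold for the outer WHERE clause.
rows = conn.execute(
    "SELECT name FROM employees "
    "WHERE salary > (SELECT AVG(salary) FROM employees)"
).fetchall()

print(inner_value)  # 200.0
print(rows)         # [('Cy',)]

conn.close()
```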

