Alibaba Cloud

In each release of E-MapReduce, the software and software version are flexible. You can select multiple software versions.

False

Which of the following descriptions about MaxCompute security is not correct?

MaxCompute recognizes RAM users but cannot recognize permissions

There are three types of node instances in an E-MapReduce cluster: master, core, and _____.

task

MaxCompute SQL is suitable for processing massive data with low real-time requirements, and employs a syntax similar to that of SQL. The efficiency of data queries can be improved by creating proper indexes in the table.

False

The FTP data source in DataWorks allows you to read/write data to FTP, and supports configuring synchronization tasks in wizard and script mode.

True

Which of the following is not proper for granting the permission on an L4 MaxCompute table to a user? (L4 is a level in MaxCompute label-based security (LabelSecurity), a mandatory access control (MAC) policy at the project space level. It allows project administrators to control user access to column-level sensitive data with improved flexibility.)

The user needs to create a project in simple mode

A business flow in DataWorks integrates different node task types by business type; this structure facilitates business code development. Which of the following descriptions about the node type is INCORRECT?

A zero-load node is a control node that does not generate any data. The virtual node is generally used as the root node for planning the overall node workflow.

DataWorks provides powerful scheduling capabilities including time-based or dependency-based task trigger mechanisms to perform tens of millions of tasks accurately and punctually each day based on DAG relationships. It supports multiple scheduling frequency configurations like: (Number of correct answers: 4)

A. By Minute B. By Hour C. By Day D. By Week

The cost of an E-MapReduce product consists of the cost of _____, the cost of the _____, and the cost of the _____. (Number of correct answers: 3)

A. ECS B. E-MapReduce C. Download

If a MySQL database contains 100 tables, and Jack wants to migrate all those tables to MaxCompute using DataWorks Data Integration, the conventional method would require him to configure 100 data synchronization tasks. With the _______ feature in DataWorks, he can upload all tables at the same time.

Add data sources in Bulk Mode

Which of the following Hadoop ecosystem components can you choose to set up a streaming log analysis system? (Number of correct answers: 3)

Apache Flume, Apache Spark, Apache Lucene

A dataset includes the following items (time, region, sales amount). If you want to present the information above in a chart, ______ is applicable.

Bubble Chart

AliOrg Company plans to migrate its data with virtually no downtime. It wants all the data changes to the source database that occur during the migration to be continuously replicated to the target, allowing the source database to be fully operational during the migration process. After the database migration is completed, the target database will remain synchronized with the source for as long as you choose, allowing you to switch over the database at a convenient time. Which of the following Alibaba products is the right choice for this?

DTS (Data Transmission Service)

MaxCompute supports two kinds of charging methods: Pay-As-You-Go and Subscription (CU cost). Pay-As-You-Go means each task is measured according to the input size by job cost. In this charging method the billing items do not include charges due to ______.

Data download

MaxCompute is a general purpose, fully managed, multi-tenancy data processing platform for large-scale data warehousing, and it is mainly used for storage and computing of batch structured data. Which of the following is not a use case for MaxCompute?

Data Warehouse

A start-up company wants to use Alibaba Cloud MaxCompute to provide product recommendation services for its users. However, the company does not have many users at the initial stage, while the charge for MaxCompute is higher than that of ApsaraDB RDS, so the company should be recommended to use MaxCompute service until the number of its users increases to a certain size.

False

An enterprise uses Alibaba Cloud MaxCompute for storage of service orders, system logs and management data. Because the security levels for the data are different, the enterprise needs to register multiple Alibaba Cloud accounts for data management.

False

DataWorks can be used to create all types of tasks and configure scheduling cycles as needed. The supported granularity levels of scheduling cycles include days, weeks, months, hours, minutes and seconds.

False

Different versions of Spark can run in MaxCompute at the same time.

False

LabelSecurity is a workspace-level mandatory access control (MAC) policy that enables workspace administrators to control user access to row-level sensitive data more flexibly.

False

Which of the following is not proper for granting the permission on an L4 MaxCompute table to a user? (L4 is a level in MaxCompute label-based security (LabelSecurity), a mandatory access control (MAC) policy at the project space level. It allows project administrators to control user access to column-level sensitive data with improved flexibility.)

If no permissions have been granted to the user and the user does not belong to the project, add the user to the project. The user does not have any permissions before they are added to the project.

Your company stores user profile records in an OLTP database. You want to join these records with web server logs you have already ingested into the Hadoop file system. What is the best way to obtain and ingest these user records?

Ingest using Hive

E-MapReduce simplifies big data processing, making it easy, fast, scalable and cost-effective for you to provision distributed Hadoop clusters and process your data. This helps you to streamline your business through better decisions based on massive data analysis completed in real time. Which of the following descriptions about E-MapReduce is not true?

It supports the Pay-As-You-Go payment method, which means the cost of each task is measured according to the input size

Users can use major BI tools, such as Tableau and FineReport, to easily connect to MaxCompute projects and perform BI analysis or ad hoc queries. The quick query feature in MaxCompute, called _________, allows you to provide services by encapsulating project table data in APIs, supporting diverse application scenarios without data migration.

Lightning

DataV is a powerful yet accessible data visualization tool, which features geographic information systems allowing for rapid interpretation of data to understand relationships, patterns, and trends. When a DataV screen is ready, it can be embedded into the existing portal of the enterprise through ______.

MD5 code obtained after the release

MaxCompute Tunnel provides high-concurrency data upload and download services. Users can use the Tunnel service to upload data to or download data from MaxCompute. Which of the following descriptions about Tunnel is NOT correct?

MaxCompute provides two data import and export methods: using Tunnel Operation on the console directly or using TUNNEL written with Java

Which node type in DataWorks can edit Python code to operate on data in MaxCompute?

PyODPS

Alibaba Cloud Quick BI reporting tools support a variety of data sources, helping users analyze and present their data from different data sources. ______ is not supported as a data source yet.

Results returned from the API

Function Studio is a web project coding and development tool independently developed by the Alibaba Group for function development scenarios. It is an important component of DataWorks. Function Studio supports several programming languages and platform-based function development scenarios except for ______.

Scala

Where is the metadata (e.g., table schemas) stored in Hive?

Stored in an RDBMS such as MySQL

_______ instances in E-MapReduce are responsible for computing and can quickly add computing power to a cluster. They can also scale up and down at any time without impacting the operations of the cluster.

Task

DataWorks provides two billing methods: Pay-As-You-Go (post-payment) and subscription (pre-payment). When DataWorks is activated in Pay-As-You-Go mode, which of the following billing items will not apply?

Task nodes created by developer

Apache Spark, included in Alibaba E-MapReduce (EMR), is a fast and general-purpose cluster computing system. It provides high-level APIs in Java, Scala, Python and R, and an optimized engine that supports general execution graphs. It also supports a rich set of higher-level tools. Which of the following tools is not included in Spark?

TensorFlow for AI

MaxCompute SQL is suitable for scenarios where there is massive data (TB level) to be processed and the real-time requirement is not high. It takes seconds or even minutes to prepare and submit each job, so MaxCompute SQL is not acceptable for services that need to process thousands to tens of thousands of transactions per second. Which of the following descriptions about MaxCompute SQL is NOT correct?

The syntax of ODPS SQL is similar to SQL. It can be considered as a subset of standard SQL

Table is a data storage unit in MaxCompute. It is a two-dimensional logical structure composed of rows and columns. All data is stored in the tables. Operating objects of computing tasks are all tables. A user can perform create table, drop table, and tunnel upload as well as update the qualified data in the table.

True

Alibaba Cloud Elastic MapReduce (E-MapReduce) is a big data processing solution to quickly process huge amounts of data. Based on open source Apache Hadoop and Apache Spark, E-MapReduce flexibly manages your big data use cases such as trend analysis, data warehousing, and analysis of continuously streaming data.

True

All users, new and beta, must activate DataWorks in Pay-As-You-Go mode first.

True

Assume that Task 1 is configured to run at 02:00 each day. In this case, the scheduling system automatically generates a snapshot at the time predefined by the periodic node task at 23:30 each day. That is, the instance of Task 1 will run at 02:00 the next day. If the system detects the upstream task is complete, the system automatically runs the Task 1 instance at 02:00 the next day.

True
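
The run condition in the statement above can be sketched as a small check: an instance runs only once its scheduled time has arrived and every upstream instance has finished. This is an illustrative sketch, not DataWorks code; the `Instance` class, the `can_run` function, and the dates are hypothetical.

```python
from dataclasses import dataclass, field
from datetime import datetime


@dataclass
class Instance:
    """Hypothetical stand-in for one generated task instance."""
    name: str
    scheduled_at: datetime
    done: bool = False
    upstream: list = field(default_factory=list)


def can_run(instance: Instance, now: datetime) -> bool:
    """True when the scheduled time is reached and all upstream instances are done."""
    time_reached = now >= instance.scheduled_at
    upstream_done = all(u.done for u in instance.upstream)
    return time_reached and upstream_done


# Task 1 is scheduled for 02:00; its instance was generated the previous evening.
upstream = Instance("upstream", datetime(2024, 1, 2, 1, 0))
task1 = Instance("task1", datetime(2024, 1, 2, 2, 0), upstream=[upstream])

before = can_run(task1, datetime(2024, 1, 2, 2, 0))  # upstream not finished yet
upstream.done = True
after = can_run(task1, datetime(2024, 1, 2, 2, 0))   # time reached, upstream finished
```

Both conditions are required: reaching 02:00 alone does not start the instance, and a finished upstream task does not start it early.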

Data Migration Unit (DMU) is used to measure the amount of resources consumed by data integration, including CPU, memory, and network. One DMU represents the minimum amount of resources used for a data synchronization task.

True

If the DataWorks (MaxCompute) tables in your request belong to two owners, Data Guard (a DataWorks component) automatically splits your request into two by table owner.

True

JindoFS is a cloud-native file system that combines the advantages of OSS and local storage.

True

In a scenario where a large enterprise plans to use MaxCompute to process and analyze its data, tens of thousands of tables and thousands of tasks are expected for this project, and a project team of 40 members is responsible for the project construction and O&M. From the perspective of engineering, which of the following can considerably reduce the cost of project construction and management?

Use DataWorks

There are multiple connection clients for MaxCompute. Which of the following is the easiest way to configure workflow and scheduling for MaxCompute tasks?

Use IntelliJ IDEA

Tom is the administrator of a project prj1 in MaxCompute. The project involves a large volume of sensitive data such as user IDs and shopping records, and many data mining algorithms with proprietary intellectual property rights. Tom wants to properly protect this sensitive data and these algorithms. Specifically, project users can only access the data within the project, and all data flows only within the project. What operation should he perform?

Use Policy authorization to set the status to read-only for all users

DataWorks provides powerful scheduling capabilities including time-based or dependency-based task trigger mechanisms to perform tens of millions of tasks accurately and punctually each day based on DAG relationships. Which of the following descriptions about scheduling and dependencies in DataWorks is INCORRECT?

Users can configure an upstream dependency for a task. In this way, even if the current task instance reaches the scheduled time, the task runs only after the upstream task instance is completed.

Machine Learning Platform for Artificial Intelligence (PAI) node is one of the node types in DataWorks business flow. It is used to call tasks created on PAI and schedule production activities based on the node configuration. PAI nodes can be added to DataWorks only _________.

after PAI experiments are created on PAI

In the MaxCompute command line, if you want to view all tables in a project, you can execute the command:

desc tables

Distributed file systems like GFS and Hadoop are designed to have a much larger block (or chunk) size, such as 64 MB or 128 MB. Which of the following descriptions are correct? (Number of correct answers: 4)

leave out - it reduces the size of metadata
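
To see why a larger block size reduces metadata, count the blocks a file is split into, since the file system's master node tracks each block. The sketch below is rough arithmetic under stated assumptions: a 1 TiB file, and a 4 KiB block size picked here only as a small-block comparison point (it does not come from the question).

```python
def block_count(file_size_bytes: int, block_size_bytes: int) -> int:
    """Number of blocks needed to hold the file (ceiling division)."""
    return -(-file_size_bytes // block_size_bytes)


KIB = 1024
MIB = 1024 * KIB
TIB = 1024 * 1024 * MIB

file_size = 1 * TIB  # assumed example file size

small_blocks = block_count(file_size, 4 * KIB)     # small-block comparison point
blocks_64mb = block_count(file_size, 64 * MIB)     # 64 MB block size
blocks_128mb = block_count(file_size, 128 * MIB)   # 128 MB block size

print(small_blocks, blocks_64mb, blocks_128mb)
```

With fewer blocks per file, the master has fewer block entries to keep in memory.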

When odpscmd is used to connect to a project in MaxCompute, the command ______ can be executed to view the size of the space occupied by table table_a.

size table_a;

A company originally handled its local data services through Java programs. The local data has been migrated to MaxCompute on the cloud; now the data can be accessed by modifying the Java code and using the Java APIs provided by MaxCompute.

True

MaxCompute takes Project as a charged unit. The bill is charged according to three aspects: the usage of storage, computing resource, and data download respectively. You pay for compute and storage resources by the day with no long-term commitments.

True

There are various methods for accessing MaxCompute, for example, through the management console, the client command line, and the Java API. The command line tool odpscmd can be used to create, operate, or delete a table in a project.

True

A partition can be created through the following statement in MaxCompute SQL

True

By default, the resource group in DataWorks provides you with 50 slots, and each DMU occupies 2 slots.

True
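
Taking the figures in the statement at face value (50 slots in the default resource group, 2 slots per DMU), the number of DMUs that can run at the same time is a simple division. The function name below is hypothetical and only illustrates the arithmetic.

```python
def max_concurrent_dmus(total_slots: int = 50, slots_per_dmu: int = 2) -> int:
    """How many DMUs fit into the available slots at once."""
    if slots_per_dmu <= 0:
        raise ValueError("slots_per_dmu must be positive")
    return total_slots // slots_per_dmu


default_capacity = max_concurrent_dmus()
print(default_capacity)
```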

In MaxCompute, if an error occurs in Tunnel transmission due to the network or the Tunnel service, the user can resume the last update.

True
