Midterm 4770


What is MPI?

Message passing interface. Standardized message passing library.

What is Block-based PFS?

Objects are created as fixed-width blocks. (file can be split up)

What is Object-based PFS?

Objects are created as variable length regions of the file. A file has a constant number of objects

Lines starting with _____ will be interpreted by the PBS scheduler

#PBS

Use the ____ flag to name the pbs job

#PBS -N

Use the ______ flag to configure PBS to merge output and error logs

#PBS -j oe

Use the ____ flag to specify the desired resources

#PBS -l

What is a shared file?

(N-to-1), single file is created, and all application tasks write to that file

What is a file-per-process?

(N-to-N), each application task creates a separate file and writes only to that file

Which filesystems on Palmetto are parallel?

/scratch1

Which filesystems on Palmetto are general purpose?

/scratch2, /scratch3

What is the environment variable holding the index of a job launched as part of a PBS job array?

PBS_ARRAY_INDEX

What is the envvar for the name of the file containing list of compute nodes allocated for the job?

PBS_NODEFILE

What is the environment variable indicating the directory from which qsub was called?

PBS_O_WORKDIR
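
As a rough illustration of how a Python job might read the three PBS variables above, here is a minimal sketch. The helper name `pbs_job_info` and the dictionary keys are hypothetical; only the environment variable names come from the cards. Outside a PBS job the variables are simply unset, so the sketch returns `None` for them.

```python
import os


def pbs_job_info(env=None):
    """Collect a few PBS-provided variables; a value is None when unset.

    `env` defaults to the real process environment, but any mapping can be
    passed in (handy for trying the function outside a PBS job).
    """
    env = os.environ if env is None else env
    index = env.get("PBS_ARRAY_INDEX")
    return {
        # Directory from which qsub was called
        "workdir": env.get("PBS_O_WORKDIR"),
        # File listing the compute nodes allocated to the job
        "nodefile": env.get("PBS_NODEFILE"),
        # Index of this sub-job within a job array, as an integer
        "array_index": int(index) if index is not None else None,
    }


# Illustrative values only; a real job gets these from the scheduler.
example = pbs_job_info({"PBS_O_WORKDIR": "/home/user/job", "PBS_ARRAY_INDEX": "3"})
print(example)
```

Passing a plain dictionary instead of `os.environ` keeps the function easy to check on a laptop before submitting the job.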

MPI4PY uses ____ to facilitate sending and receiving of complex data

Pickle
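
To make the role of pickle concrete, the sketch below serializes a Python object to bytes and rebuilds it, which is conceptually what happens on either side of a lowercase mpi4py call such as `comm.send` / `comm.recv`. No MPI is involved here; `simulate_send` and `simulate_recv` are hypothetical names used only for illustration.

```python
import pickle


def simulate_send(obj):
    """Serialize a Python object to bytes, as a sender would before transmission."""
    return pickle.dumps(obj)


def simulate_recv(payload):
    """Rebuild the Python object from the received bytes."""
    return pickle.loads(payload)


# A "complex" object: nested containers with mixed types.
message = {"rank": 0, "values": [1.5, 2.5], "label": "partial sum"}

payload = simulate_send(message)   # raw bytes that could travel between processes
received = simulate_recv(payload)  # an equal, but separate, copy of the object

print(type(payload).__name__, received == message, received is message)
```

The receiver ends up with a copy rather than the same object, which matches the fact that each MPI process has its own memory.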

What is a GPU?

Processor on graphics card designed to support graphic rendering (numerical manipulation)

Pros and Cons of FPGA?

Pros: Power efficient, low heat. Cons: expensive and difficult to program

What is parallel efficiency?

Ratio of performance improvement per individual unit of computing resource

Amdahl's Law

S(p) = p/((p-1)f + 1), where p is the number of processors and f is the fraction of the program that isn't parallelizable
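
The formula on this card can be evaluated directly. The following is a minimal sketch; the function name `amdahl_speedup` and the sample value f = 0.1 are assumptions made for illustration, not part of the course material.

```python
def amdahl_speedup(p, f):
    """Speedup predicted by Amdahl's Law, S(p) = p / ((p - 1) * f + 1).

    p: number of processors (>= 1)
    f: fraction of the program that is NOT parallelizable (0 <= f <= 1)
    """
    if p < 1:
        raise ValueError("p must be at least 1")
    if not 0 <= f <= 1:
        raise ValueError("f must be between 0 and 1")
    return p / ((p - 1) * f + 1)


# Hypothetical program with a 10% serial portion.
for p in (1, 2, 4, 8, 16):
    print(p, round(amdahl_speedup(p, 0.1), 3))
```

With f = 0 the speedup equals p (perfect scaling), and with f = 1 it stays at 1 no matter how many processors are added.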

Equation for Speedup

S(p) = seq_run_time/par_run_time, where p is num processors
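
As a small worked example of the ratio above, the sketch below computes speedup from two wall-clock timings. The function name and the timings (120 s sequential, 40 s parallel) are made up for illustration.

```python
def measured_speedup(seq_run_time, par_run_time):
    """Speedup S(p) = sequential run time / parallel run time."""
    if seq_run_time <= 0 or par_run_time <= 0:
        raise ValueError("run times must be positive")
    return seq_run_time / par_run_time


# Hypothetical timings: 120 s on one processor, 40 s on four processors.
print(measured_speedup(120.0, 40.0))  # prints 3.0
```

A speedup of 3.0 on four processors is below the ideal value of 4, which is typical once serial code and communication overhead are included.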

What is MPI size? And what statement?

The number of MPI processes in the communicator group. size = comm.Get_size()

What is theoretical max?

The point where adding more compute resources doesn't help speedup or efficiency

What module should you load for MPI in jupyterhub?

openmpi/1.10.3

How do you delete a job?

qdel <jobID> (can only be run from the head node)

How do you view the status of your job from any node?

qstat -anu <username>

What command is used to submit a job to palmetto?

qsub

What flag is used to start an interactive job in palmetto?

qsub -I

In a scatter, the data is divided according to ____

rank
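
The rank-based split can be mimicked without MPI. In the sketch below, element i of the list held by the root goes to rank i, which is how `comm.scatter` hands out items when the list has one entry per process. `simulate_scatter` and `chunk` are hypothetical helpers, and the round-robin chunking is just one possible way to prepare the list.

```python
def chunk(data, size):
    """Split `data` into `size` pieces, one per rank (round-robin by index)."""
    return [data[rank::size] for rank in range(size)]


def simulate_scatter(data_array, size):
    """Return what each rank would receive: item `rank` of the root's list."""
    if len(data_array) != size:
        raise ValueError("scatter needs exactly one item per process")
    return {rank: data_array[rank] for rank in range(size)}


size = 4                            # pretend there are 4 MPI processes
pieces = chunk(list(range(10)), size)
received = simulate_scatter(pieces, size)
for rank, piece in received.items():
    print(rank, piece)
```

The length check mirrors the usual requirement that the scattered sequence has exactly as many items as there are processes in the communicator.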

What is the call for MPI broadcast?

recv_buffer = comm.bcast(data, root=<rprocess>)

What is the call to MPI scatter?

recv_data = comm.scatter(data_array, root=<rprocess>)

What PBS tag is used to specify the number of chunks?

select

What are the two MPI commands for point-to-point communication?

send and receive

What command is run to login to palmetto?

ssh [email protected]

What statement for MPI reduce?

sum = comm.reduce(rank, op=MPI.SUM, root=0)
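
To show what the statement above computes, here is an MPI-free sketch in which each "process" contributes its rank and the values are combined with addition. `simulate_reduce` is a hypothetical helper; it returns one entry per rank to reflect that, in mpi4py, only the root gets the reduced value and the other ranks get `None`.

```python
import operator
from functools import reduce


def simulate_reduce(contributions, root=0, op=operator.add):
    """Combine one contribution per rank; only `root` sees the result."""
    total = reduce(op, contributions)
    return [total if rank == root else None for rank in range(len(contributions))]


size = 4
ranks = list(range(size))          # each process contributes its own rank
results = simulate_reduce(ranks)   # op defaults to addition, like MPI.SUM
print(results)                     # prints [6, None, None, None]
```

For contributions 0..size-1 the sum is size * (size - 1) / 2, so four processes yield 6.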

What is MPI Rank? And what statement is used?

unique ID for individual processes within a communicator group. rank = comm.Get_rank()

What PBS tag is used to specify the amount of time to run the script?

walltime

What is the address of the file management node?

xfer01-ext.palmetto.clemson.edu

Which processes call MPI reduce?

All of them

Who calls Numpy MPI scatter?

All processes

Who calls Bcast?

All processes after the root value is set

What does a distributed file system do?

Allows transparent access to files stored on a remote disk

What are the benefits of file-per-process?

Avoids lock contention (though it can create a massive number of small files)

How do the MPI function names differ for Numpy variables?

Begin with a capital letter

What is the issue with a shared file?

Can create lock contention and reduce performance

What is the issue with file-per-process?

Can't support app restart on different number of tasks

What is the difference between DFS and PFS regarding workloads?

DFS is geared for loosely coupled, distributed apps (big data), PFS targets HPC applications that require coordinated I/O access with massive bandwidth requirements

What is the difference between DFS and PFS regarding symmetry?

DFS runs on architecture where storage is with the application, PFS separates storage from compute system

What is the difference between DFS and PFS regarding data distribution?

DFS stores entire file on a node, PFS splits the file across many nodes

What is the difference between DFS and PFS regarding fault-tolerance?

DFS takes on fault-tolerance responsibilities; PFS runs on enterprise shared storage (no built-in fault tolerance, relies on hardware quality)

Efficiency equation

E = 1/((p-1)f + 1) * 100%, where E is the ratio of speedup to the number of processors, i.e. E = S(p)/p
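
The efficiency formula follows from dividing the Amdahl speedup by p, and the sketch below checks that numerically. The function names and the sample values (p = 4, f = 0.5) are assumptions chosen for illustration; the functions return a fraction, so multiply by 100 for a percentage.

```python
def amdahl_speedup(p, f):
    """S(p) = p / ((p - 1) * f + 1), with f the non-parallelizable fraction."""
    return p / ((p - 1) * f + 1)


def amdahl_efficiency(p, f):
    """E = 1 / ((p - 1) * f + 1), returned as a fraction between 0 and 1."""
    return 1 / ((p - 1) * f + 1)


p, f = 4, 0.5
efficiency = amdahl_efficiency(p, f)

# Efficiency is speedup divided by the number of processors.
assert abs(efficiency - amdahl_speedup(p, f) / p) < 1e-12

print(f"{efficiency * 100:.0f}%")  # prints 40%
```

With half of the program serial, four processors are each used at only 40% efficiency, and the figure keeps dropping as p grows.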

What does a parallel file system do?

Enable parallel access to files

(T/F) MPI processes are launched as they are needed during program execution

False, all launched at the beginning of execution

(T/F) Only the process broadcasting should call comm.bcast

False, all processes, the source is specified in the function call

(T/F) A resource request (qsub) can be done from any node?

False, only login001

(T/F) All MPI processes share memory space

False, each process has its own memory space; all processes run the same source code

TOP500 ranks supercomputers based on their ____ score

LINPACK

What statement should be used to initialize the communicator?

comm = MPI.COMM_WORLD

What is the call for Numpy MPI broadcast?

comm.Bcast(values, root=<rootrank>)

What is the call to Numpy MPI gather?

comm.Gather(smallarray, bigarray, root=<rootrank>)

What is the call for Numpy MPI receive?

comm.Recv(values, source=<srcrank>, tag=MPI.ANY_TAG, status=status)

What is the call to Numpy MPI reduce?

comm.Reduce(localval, reducedval, op=<MPI.OP>, root=<rootrank>)

What is the call for Numpy MPI scatter?

comm.Scatter(bigarray, localarray, root=<rootrank>)

What is the call for a Numpy MPI send?

comm.Send(values, dest=<destrank>, tag=<tag>)

How is MPI send called?

comm.send(data, dest_rank)

How is MPI receive called?

data = comm.recv(source=source_rank)

GRAPH500 ranks systems based on benchmarks designed for ______ computing

data-intensive

GREEN500 ranks supercomputers with an emphasis on ______

energy usage (LINPACK/ power consumption)

What is FPGA?

Field-programmable gate array. A dynamically reconfigurable integrated circuit

What is an MPI communicator used for?

groups for point-to-point and collective communications

The ____ node is the only gateway to access Palmetto nodes from outside

login

What PBS tag is used to specify the amount of ram per chunk?

mem

Processes handle their own ____, data is passed between processes via _____

memory, messages

Processes communicate via ____

messages

What is the package to enable mpi in Python?

mpi4py

What PBS tag is used to specify the number of MPI processes per chunk?

mpiprocs

What call is made to execute an MPI program with 4 processes?

mpirun -np 4 <executable>

How do you get the name of the host for an MPI process?

name = MPI.Get_processor_name()

What PBS tag is used to specify the number of cpus per chunk?

ncpus

What are the two aspects of a parallel PROBLEM?

1. Can be broken into discrete pieces of work that can be solved simultaneously 2. Can be solved in less time with multiple compute resources than a single compute resource

What is one aspect of a parallel EXECUTION FRAMEWORK?

1. Can execute multiple program instructions concurrently at any moment in time

What are the aspects of a parallel COMPUTE RESOURCE?

1. Might be a single computer with multiple processors 2. Might be an arbitrary number of computers connected by a network 3. May be a GPU

What are the limiting factors of distributed computing?

1. Non-parallelizable code 2. Communication overhead

How is data distributed in PFS?

1. Original file is converted into a sequence of offsets 2. Offsets are mapped to objects 3. Objects are distributed across PFS servers
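
The three steps above can be sketched in a few lines. This is only an illustration under stated assumptions: objects are fixed-size (the block-based case) and are placed on servers round-robin; the function names, the object size, and the server count are all hypothetical.

```python
def map_offset(offset, object_size, num_servers):
    """Map a byte offset to (object index, server index)."""
    obj = offset // object_size   # step 2: offset -> object
    server = obj % num_servers    # step 3: object -> server (round-robin)
    return obj, server


def distribute(file_size, object_size, num_servers):
    """Return {server: [object indices]} for a file of `file_size` bytes."""
    num_objects = -(-file_size // object_size)  # ceiling division
    placement = {server: [] for server in range(num_servers)}
    for obj in range(num_objects):
        placement[obj % num_servers].append(obj)
    return placement


# Hypothetical 10-byte file, 4-byte objects, 2 servers.
print(distribute(10, 4, 2))   # prints {0: [0, 2], 1: [1]}
print(map_offset(9, 4, 2))    # prints (2, 0)
```

Because consecutive objects land on different servers, a large sequential read can pull from several servers at once.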

What are the 4 models in Flynn's taxonomy?

1. SISD 2. SIMD 3. MISD 4. MIMD

What character is used to separate tags on a PBS job?

:

What character is used to set the value of tags on a PBS job?

=
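
Putting the PBS cards together, the sketch below assembles the header lines of a job script from the tags covered so far (`select`, `ncpus`, `mpiprocs`, `mem`, `walltime`), using `:` between chunk tags and `=` to set values. The helper `pbs_header` and all sample values are hypothetical, and attaching `walltime` after a comma on the same `-l` line is an assumption about PBS Pro syntax that the cards do not state.

```python
def pbs_header(name, select, ncpus, mpiprocs, mem, walltime):
    """Build the directive lines at the top of a PBS job script."""
    # Chunk-level tags are joined with ':' and each value is set with '='.
    chunk = ":".join([
        f"select={select}",
        f"ncpus={ncpus}",
        f"mpiprocs={mpiprocs}",
        f"mem={mem}",
    ])
    return "\n".join([
        "#!/bin/bash",
        f"#PBS -N {name}",                       # job name
        f"#PBS -l {chunk},walltime={walltime}",  # requested resources
        "#PBS -j oe",                            # merge output and error logs
    ])


# Hypothetical request: 2 chunks of 8 cores and 16 GB each, for 30 minutes.
print(pbs_header("mpi_test", 2, 8, 8, "16gb", "00:30:00"))
```

The generated text would sit at the top of the bash script, followed by the commands to run, such as an `mpirun` line.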

What is a Distributed Computing System?

A collection of individual computing devices that can communicate with each other

What is HPCC?

High Performance Computing Challenge. A benchmark suite that runs a large number of different tests.

What is parallel speedup?

How much faster the program becomes once some computing resources are added

What is TestDFSIO?

I/O performance of mapreduce/hadoop dfs

What is the benefit of a shared file?

Increased usability: only one file is needed

What is the issue with round robin object placement on PFS?

Scalability. Two-dimensional distribution and limits on the number of servers per file

What is SHOC? and what is it used for?

Scalable Heterogeneous Computing, used for non-traditional systems (GPUs)

What are the two file access mechanisms in PFS?

Shared file (N-to-1) and file-per-process (N-to-N)

(T/F) All processes make the call to MPI scatter

True

(T/F) MPI broadcast is a blocking call for all processes

True

(T/F) The number of MPI processes is user-specified

True

What is a PBS script?

A bash script that can be submitted to Palmetto to run, once resources are available, for a requested amount of time

Who calls Numpy MPI gather?

all processes

What statement for MPI gather?

all_data = comm.gather(data, root=<rprocess>)
