SAS Question Bank 1 (DPH)


In an INPUT statement, character variables are denoted with _____?

$ (dollar sign)

Which file is a permanent SAS file? a. work.reality b. reality c. sasuser.reality d. enough.reality e. c & d f. none of the above

E (C&D)

The SAS Institute was founded in _____________________. (year)

1976 (*first developed in 1966 with NIH grant, incorporated as SAS Institute in 1976)

IDRE

=Institute for Digital Research and Education (at UCLA).

3. The LINESIZE= option a. specifies the width of the print line for your procedures output b. specifies the width of the print line for your log output c. specifies the width of the print line for your enhanced editor d. is not available in SAS 9.2 e. none of the above

A

6. In the PROC below, which of the following statements IS true: Proc sort data=new.clinic out=work.health; By weight age; Run; a. The values will be in ascending order of age within weight. b. The values will be in descending order of age within weight. c. The values will be in ascending order of weight within age. d. All values will be in descending order. e. none of the above.

A

9. Which of the following PROC PRINT steps is correct if you are going to print temporary variable labels? a. Proc print data=sashelp.ses label; label ses= "Socioeconomic status"; run; b. Proc print data=sashelp.ses; label ses= "Socioeconomic status"; run; c. Proc print data sashelp.ses label noobs; run; d. Proc print sashelp.ses label; label ses= "Socioeconomic status"; run;

A

To display $5,678.00 you will have to use the following format: a. dollar9.2 b. dollar8.0 c. comma9.2 d. comma 8.2 e. dollar8.2 f. none of the above

A

Usually PROC FREQ crosstabs are more informative if the variables are: a. categorical b. numerical c. character d. continuous e. date values

A

Which of the following items is NOT created during the compilation phase? a. observation 1 b. program data vector c. data descriptor d. _N_ e. _ERROR_ f. none of the above

A

A libref can be no more than how many characters? a. 8 b. 16 c. 32 d. 54 e. 256

A (8)

Which variable names are valid? a. _12 b. q12 c. q12_ d. 1_2 e. &1_2 f. 12_ g. q1-2

A (_12), B (q12), and C (q12_). RULES FOR SAS NAMES: - Can contain letters, numbers, or underscores - Can be up to 32 chars long - Must begin with a letter or underscore (NOT number) - No dashes or other symbols
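
The naming rules above can be checked with the NVALID function. This is a minimal sketch, assuming the 'v7' rule set corresponds to the rules listed on this card; the variable names name and valid are made up for illustration.

```sas
data _null_;
  length name $8;
  /* Candidate names from the question; single quotes keep &1_2 away from the macro processor */
  do name = '_12', 'q12', 'q12_', '1_2', '&1_2', '12_', 'q1-2';
    valid = nvalid(name, 'v7');   /* 1 = valid SAS name, 0 = not valid */
    put name= valid=;
  end;
run;
```

The log should show valid=1 for _12, q12, and q12_, and valid=0 for the other four.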

How many program steps are executed when the program below is submitted? data lisa.jobs; infile fabulous; input last $ first $ age; run; proc sort data=lisa.jobs; by last; run; proc freq data=lisa.jobs; tables age; run; a. three b. four c. five d. eight e. ten

A - 3

5. In the following dataset, what type of variable is AGE? LAST GENDER AGE Smith 1 30 Johnson . Randell 2 23 Liu 3 58 a. Numeric b. Date c. Character d. Cannot tell because values are missing

A - numeric

Missing numeric data are identified by...

A period

TRANWRD

A search and replace function, which will replace all occurrences of a substring in a character string
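
A short sketch of TRANWRD replacing every occurrence of a substring. The variable names text and fixed and the sample string are made up for illustration.

```sas
data _null_;
  length text fixed $40;
  text  = 'UCLA beat USC; USC lost';
  /* tranwrd(source, target, replacement) replaces all occurrences of target */
  fixed = tranwrd(text, 'USC', 'Stanford');
  put fixed=;   /* fixed=UCLA beat Stanford; Stanford lost */
run;
```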

What should you put within the brackets in the array statement below? ARRAY enough{ } x1-x6; a. I b. J c. * d. 5 e. 6 f. dim

E

_N_ and _Error_ are ________________________ variables, created during the ____________________ phase

Automatic variables; Compilation phase

10. Which of the following programs correctly references a SAS dataset called statisticalanalysis? a. Data uclabruin.statisticalanalysis; set ucla.healthdata; if age > 35; run; b. data ucla.bruins; set ucla.statisticalanalysis; if age > 65; run; c. xproc print data=statisticalanalysis.healthdata; var id last first; run; d. proc freq data=2012data.statisticalanalysis; tables gender*food/cmh; run;

B

If you ran your program and it generated errors, what is the value of the automatic variable _ERROR_ for the fourth error-generating observation in the dataset? a. 0 b. 1 c. 2 d. 3 e. 4

B

If you submitted the following program, what would happen? PROC SORT data=enough.already; Run; PROC PRINT data=enough.already label; var gender race age; where testtype = 'blood'; Run; a. PROC SORT permanently sorts the dataset ALREADY. b. PROC SORT will not run, but PROC PRINT generates unsorted output. c. PROC SORT runs successfully, but PROC PRINT does not. d. PROC PRINT runs successfully and the variables GENDER, RACE and AGE are sorted. e. none of the above

B

RTF is typically used in an ODS statement to generate: a. Excel workbook b. Word document c. PDF d. all of the above e. none of the above

B

The libname statement remains in effect until it is modified, canceled, or if the session ends. Therefore, a libname is a ________ statement. a. permanent b. global c. optional d. all of the above

B

Which statement converts the 3-byte character variable DAYS to a numeric? a. days=input(days, 3.); b. newdays=input(days, 3.); c. newdays=put(days, 3.); d. days=put(days, 3.); e. none of the above

B (INPUT converts character to numeric; best to create a new variable)
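
A minimal sketch of both conversion directions, assuming a 3-byte character variable like the one in the question. The names newdays and daystxt are hypothetical.

```sas
data _null_;
  days = '120';                 /* character variable, length 3 */
  newdays = input(days, 3.);    /* INPUT function: character to numeric */
  daystxt = put(newdays, 3.);   /* PUT function: numeric back to character */
  put newdays= daystxt=;
run;
```

Writing the result to a new variable matters because a variable's type cannot change once it has been defined in the DATA step.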

A libref cannot contain a(n) a. underscore b. dollar sign c. numbers d. a & c e. all of the above

B (a libref can contain letters, digits, and underscores)

1. After running your SAS program, you see five DATA step errors. What is the value of the automatic variable _ERROR_ when the observation that contains the fifth error is processed? a. 0 b. 1 c. 3 d. 4 e. 5

B - 1

How many statements does the following SAS program contain? PROC FREQ data=enough.already; Tables income*death/cmh; Where gender = 'F'; Weight count; Title 'The effect of income on mortality'; Run; a. five b. six c. three d. four e. seven

B - 6

In the array below, the variables are: ARRAY text{5} age time1 salary days1 days2; Do I = 1 to 5; Newage{I} = text{I} * 1.5; End; Drop I; a. character b. numeric c. dates d. both character and numeric e. need more information

B - numeric

5. By default, PROC MEANS creates output for a. All character variables b. All numeric variables, except ID numbers, which have no statistical meaning c. All numeric variables d. none of the above

C

By default, PROC MEANS produces which statistic? a. range b. coefficient of variation c. standard deviation d. standard error of the mean e. none of the above

C

To generate a WORD document (.doc), you use which following ODS statement? a. ods pdf file= / ods pdf close b. ods html file= / ods html close c. ods rtf file= / ods rtf close d. ods html file = / ods close e. ods doc file= / ods doc close

C

To insert a page break, you would use the following syntax: a. options charform = ('-'); b. options insertbreak = ('-'); c. options formdlim = ('-'); d. options pagebreak = ('-'); e. none of the above

C

Which input style would be able to read the three observations below? Johnson UCLA 38 Ritz USC 29 Washington NC 30 a. column b. formatted (=absolute column pointers) c. list d. mixed e. none of the above

C

Which statement is TRUE? a. During the execution phase each SAS statement is scanned for syntax errors. b. During the compilation phase, PROC statements are generated. c. During the execution phase, the DATA step writes observations to the new dataset. d. all of the above e. none of the above

C

5. In the following dataset, what type of variable is GENDER? LAST GENDER AGE Smith 1 30 Johnson . Randell 2 23 Liu 3 58 a. Numeric b. Date c. Character d. Cannot tell because values are missing

C - CHARACTER (can tell from how the missing value is coded; also note that it's a categorical variable).

In SAS, which function is used to convert the Microsoft ACCESS date format datetime18. to a valid SAS date? a. DATAPART b. DATETRIM c. DATEPART d. PARTDATE e. TRIM

C - DATEPART
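
A minimal sketch of DATEPART pulling the date out of a datetime value. The datetime literal and the names dt and d are made up for illustration.

```sas
data _null_;
  dt = '15MAR2012:10:30:00'dt;   /* datetime value: seconds since 01JAN1960:00:00:00 */
  d  = datepart(dt);             /* date value: days since 01JAN1960 */
  put d= date9.;                 /* d=15MAR2012 */
run;
```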

5. What typically occurs when SAS encounters a syntax error? a. SAS continues processing the step and the log messages indicate that the program ran successfully. b. SAS continues to process the step, but the log window displays a warning message. c. SAS stops processing the step, and the log window displays the error message. d. SAS stops processing the step, and the output window displays the error message.

C - SAS stops processing

2. The PAGESIZE= option a. specifies how many lines each page of the program contains b. specifies how many lines each page of the log contains c. specifies how many lines each page of the output contains d. is not available in SAS 9.2 e. none of the above

C - Specifies how many lines each page of the output contains

In the dataset below, the variable IDnum is: IDnum Factor 45397_1 X _63434 Jones Y Martin Z a. numeric b. date c. character d. numeric or character e. none of the above

C - character

What is the length of the variable TESTTYPE below? DATA enough.now; Set enough.already; If male = '1' then testtype = 'Blood'; Else testtype = 'Samedaytest'; Length testtype $11; Run; a. 11 b. 8 c. 5 d. 4 e. none of the above

C - 5

A variable can either be ________________ or ________________

Character or numeric

Concatenate

Combines two or more datasets, stacking them one on top of the other into a single dataset.

Descriptor info is created in the ____ phase and can be displayed using _____ _______.

Compilation phase; PROC CONTENTS

INVALUE Definition

Creates an informat for character variables, which allows you to tell SAS how to read in special character values

What color are numbers in SAS?

Cyan (blue-greenish)

3. Which statement is FALSE with regard to VALUE statements? a. A single value, such as 28 or 'S', is valid. b. A range of numeric values, such as 0-5000, is valid. c. A range of character values, such as 'B-X', is valid. d. A list of numeric and character values separated by commas, such as 40, "F", "7", "R", is valid.

D

4. In the table below, what is the default length for the numeric value COST? LAST COST Woo 43.75 Parvi 142.36 Watson 270.53 Crick 1424.62 a. 5 b. 6 c. 7 d. 8 e. hard to tell because the values do not contain a dollar sign

D

4. Which keyword can be used to label missing values and any values that are not specified in a range? a. HIGH b. MISS c. MISSING d. OTHER e. ERROR

D

7. If you ran the following program, which variables appear in the dataset HEART? (The dataset "eligible" consists of the following variables: gender, age, and race.) data heart (drop=gender age); set ucla.eligible (keep=gender race); run; Proc print data=heart; run; a. Gender, Age, Race b. Age, Race c. Age d. none of the above

D

A label can contain no more than how many characters? a. 80 b. 156 c. 200 d. 256 e. 32,756

D

The dataset enough.already contains the variables below. Which variable's mean would be meaningless? a. age b. weight c. days hospitalized d. idno e. none of the above

D

Unless otherwise indicated, the DATA step executes: a. once for each variable in the dataset. b. once for each statement in the data step. c. once for each variable in the execution step. d. once for each observation in the input file. e. none of the above

D

Which function calculates the average of variables x1, x2, x3 and x4 in SAS? a. mean(of x1, x4) b. mean(x1, x4) c. mean(x1-x4) d. mean(of x1-x4) e. (x1 + x2 + x3 + x4) / 4 f. none of the above

D (*BEST answer - if there are missing data, it will calculate the average of the remaining data instead of marking the mean as missing)
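
A minimal sketch of why the OF keyword matters, using made-up values with one missing. The names avg_of, avg_sum, and diff are hypothetical.

```sas
data _null_;
  x1 = 10; x2 = .; x3 = 20; x4 = 30;
  avg_of  = mean(of x1-x4);            /* mean of non-missing values: (10+20+30)/3 = 20 */
  avg_sum = (x1 + x2 + x3 + x4) / 4;   /* missing, because x2 is missing */
  diff    = mean(x1-x4);               /* no OF: this is the mean of one value, x1 minus x4 = -20 */
  put avg_of= avg_sum= diff=;
run;
```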

What happens if you submit the following program in which you merged datasets 1 and 2? DATASET 1 DATASET 2 ID GENDER AGE ID AGE RESULT 100 M 38 100 38 Positive 101 F 43 101 43 Negative 102 M 28 102 28 Positive DATA enough.today; Merge enough.dataset1 enough.dataset2 (IN=OK); By ID; If OK; Run; a. The sorted datasets merge on the unique ID number. b. The merged dataset contains 3 observations, 4 variables. c. The merged dataset contains 6 observations, 3 variables. d. Nothing happens.

D (according to Dr. Smith, but I ran it and it actually did run. The point is that you *shouldn't* run it, since the datasets have an overlapping variable, AGE.)

2. What is wrong with this program? Data ucla.update; infile jazz Input IDnum $ 1-10 Gender $ 11 Months 12 Coinfect 13 Total=Months+Coinfect; Run; a. Missing colon on the second line b. Missing colon on the third line c. a & b d. none of the above

D (missing semicolons, not colons)

What is the value of the index variable that references DAYS1 in the statement below? ARRAY text{5} age time1 salary days1 days2; Do I = 1 to 5; Newage{I} = text{I} * 1.5; End; Drop I; a. 1 b. 2 c. 3 d. 4 e. 5 f. none of the above

D - 4

5. How many observations and variables does the data below contain? LAST GENDER AGE Smith 1 30 Johnson . Randell 2 23 Liu 3 58 a. 3 observations, 4 variables b. 3 observations, 3 variables c. 4 observations, 2 variables d. 4 observations, 3 variables e. cannot tell because values are missing

D - 4 obs, 3 vars

What is this PROC FREQ designed to process? PROC FREQ data=enough.already; Tables income*death/cmh; Where gender = 'F'; Weight count; Title 'The effect of income on mortality'; Run; a. Frequency counts b. Line listing of observations c. Aggregate data d. A & c e. All of the above

D - A & C

What statistics will the following SAS program produce? PROC FREQ data=enough.already; Tables income*death/cmh; Where gender = 'F'; Weight count; Title 'The effect of income on mortality'; Run; a. Risk ratio b. Odds ratio c. T-test d. A & b e. All of the above

D - A (risk ratio) & B (odds ratio) - key here is the CMH option, aka Cochran-Mantel-Haenszel (chi square), which auto-outputs ORs & RRs.

11. What happens if you submit the following program? Proc sort data=health.study out=obese; by gender; Run; Proc print data=obese label double; Label stomach = 'stomach cancer; Var id gender race age; Where site = 'c' and rate > 90; Sum cost; Run; a. Program runs successfully; log shows no errors. b. A log message indicates that an option is invalid or not recognized. c. A "PROC SORT running" message appears at the top of the active (program) window. d. A "PROC PRINT running" message appears at the top of the active (program) window. e. None of the above.

D - PROC PRINT running (learned by testing). EXPLANATION: Missing right apostrophe after 'stomach cancer - SAS will keep running looking for the other apostrophe

The TABLE statement is used in which PROC(s)? a. PROC FREQ b. PROC PRINT c. PROC LOGISTIC d. PROC TABULATE e. PROC MEANS

D - PROC TABULATE (proc freq uses TABLES)

1. The QUIT statement is typically used in which PROC? a. TABULATE b. SORT c. SETINIT d. DATASETS e. none of the above

D - datasets

Which of these WHERE statements will subset a dataset? a. where school = 'UCLA' or 'USC' or 'Harvard'; b. where school in 'UCLA' or 'USC' or 'Harvard'; c. where school in (UCLA, USC, Harvard); d. where school in ('UCLA', 'USC', 'Harvard'); e. a & d

D is *best answer* (seems parentheses help ensure all are counted, avoid overwriting... or something). According to Dr. Smith D is "more powerful" and "less prone to errors"

______ step creates SAS datasets

DATA

SAS program typically contains 2 parts: __ and ___

DATA and PROC steps

The __________________ of the SAS dataset contains information about the data: dataset names, creation time and date, and number of variables and observations.

Descriptor portion

10. Which PROC PRINT statement(s) created the following output? Day Classes Distance Score 3 4 10 60 4 2 8 20 5 4 2 20 10 20 100 a. var day classes distance; sum classes distance score; b. var day classes; sum classes distance score; c. var day score; sum classes distance score; d. var day; sum classes distance score; e.all of the above f. none of the above

E

11. Which statement is true about the following program? Proc format; picture soc low-high = '999-99-9999'; run; Data soc; Input social; cards; 123456789 234567899 345678999 ; Run; Proc print data = soc; format social soc.; run; A. The PICTURE keyword does not exist in SAS code. B. "Low-high" tells SAS to apply the format to all values. C. Output will be generated, but without dashes between the numbers. D. The "999-99-9999" is called digit originators. E. All of the above.

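E (per the answer key, but CONFIRM THIS ONE: PICTURE is a valid PROC FORMAT statement, so option A is false)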

8. What statement is true concerning the DESCENDING option? Proc sort data=center.numbers out=change; by descending weight age; run; Proc print data=change; var age height weight cost; where age >29; run; a. Age will be sorted in ascending order within descending order of Weight b. Age will be sorted in descending order within descending order of Weight c. Age will not be sorted, but Weight will d. The DESCENDING option only applies to the variable Weight e. a & d f. none of the above

E

In SAS, nonstandard numeric data is considered to be: a. fractions b. commas c. dollar signs d. percent signs e. all of the above f. none of the above

E

Which statement is NOT true about value statements? a. A value statement cannot end with a number. b. A value statement cannot end with a period. c. A value statement cannot be a name of a variable in the dataset. d. A value statement cannot be more than 8 characters long e. c & d f. none of the above

E

8. During the compilation phase, variables are created in the program data vector (PDV) and the values are set to: a. Missing b. 0 c. _N_ d. Blank e. none of the above

E (... CONFIRM THIS ONE. What's in the book & what Dr. Lisa says differ.)

Which is the best way to order your SAS program? a. formats, informats, DATA statement, labels, and new variables. b. PROC FORMAT, DATA statement, new variables, SET statement, labels and formats. c. DATA statement, PROC FORMAT, new variable, labels, and formats. d. DATA statement, SET statement, PROC FORMAT, new variable, labels, and formats e. none of the above

E (B comes closest, but SET should come before new variables)

6. By default the following code generates the output below: ods rtf file='F:\mydata.doc'; ods pdf file = 'F:\mydate.pdf'; a. RTF and PDF b. RTF only c. PDF only d. RTF, PDF and listing

E - *note: ods close statement is best practice but not required.

A standard numeric value may contain all of the following EXCEPT a. numbers b. scientific notation c. decimal point d. plus and minus signs e. dollar signs

E - dollar signs

If you did not choose to label your variables in the DATA step, which of the following statements would generate labels. a. PROC PRINT data=enough.already label noobs; Run; b. PROC PRINT; Label sex = 'gender'; Run; c. PROC PRINT data=enough.already title; Label sex = 'gender'; Run; d. all the above e. none of the above

E - none of the above (there will be no labels to generate unless you have both the LABEL option on the PROC PRINT statement (i.e. proc print data=mydata label;) AND a LABEL statement in the PROC PRINT step (e.g. label sex='gender';)).

In the code below, what are x1-x6 called? ARRAY enough{ } x1-x6;

Elements

DO WHILE

Executes statement in a DO-loop repetitively while a condition is true
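
A minimal sketch of a DO WHILE loop. The condition is checked at the top of each pass, so the loop body runs only while it is true. The names balance and years and the 10% growth rate are made up for illustration.

```sas
data _null_;
  balance = 1000;
  years   = 0;
  /* Condition is evaluated before each iteration */
  do while (balance < 2000);
    balance = balance * 1.10;   /* grow by 10% per pass */
    years + 1;                  /* sum statement: count the passes */
  end;
  put years= balance=;
run;
```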

Variable values are set to missing [in the PDV] during the ______________ phase. [to make way for next iteration/new obs]

Execution

If you want to label missing numeric variables including any values not specified in a range within a PROC FORMAT statement, you should use: a. low b. high c. missing d. MISS e. . (period) f. other g. none of the above

F

Your supervisor did not use the LIBRARY= option in his PROC FORMAT statement, so which statement is true? a. Formats are stored in work.formats. b. Formats exist for the current DATA step. c. Formats exist for the current procedure. d. Formats exist for the current SAS session. e. a & b f. a & d.

F

TRUE OR FALSE: A LIBNAME statement is used to read text data.

FALSE

TRUE OR FALSE: Column input can read nonstandardized numeric data in SAS.

FALSE

TRUE OR FALSE: In order to concatenate data, you must sort the datasets first.

FALSE

TRUE OR FALSE: Length statements are typically used for numeric characters.

FALSE

TRUE OR FALSE: The CATX function is same as the concatenator operator (||).

FALSE

TRUE OR FALSE: When you delete a libname statement, you delete the SAS folder.

FALSE

TRUE OR FALSE: ODS stands for output delivery syntax.

FALSE (output delivery system)

TRUE OR FALSE: The CATX function joins two or more character strings, leaving leading and/or trailing blanks unchanged.

FALSE (trims leading & trailing spaces, adds separator)
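
A minimal sketch contrasting CATX with the concatenation operator, assuming two padded character variables. The names first, last, with_catx, and with_bars are hypothetical.

```sas
data _null_;
  length first last $10;
  first = '  Ann';   /* leading blanks, plus trailing padding up to length 10 */
  last  = 'Lee';
  with_catx = catx('-', first, last);   /* strips blanks, inserts separator: Ann-Lee */
  with_bars = first || '-' || last;     /* keeps all leading and trailing blanks */
  len_catx  = length(with_catx);
  put with_catx= len_catx=;
  put with_bars= $quote40.;             /* quoted so the embedded blanks are visible */
run;
```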

TRUE OR FALSE: In logistic regression, the BY statement and parameterization option automatically creates dummy variables.

FALSE - CLASS and PARAM=ref are used to automatically generate dummy variables. CODE: proc logistic data="c:\data\binary" descending; class rank / param=ref; model admit = gre gpa rank; run;

TRUE OR FALSE: The DATETIMEn. function stores dates from ACCESS as SAS dates.

FALSE - DATEPART does this

TRUE OR FALSE: Typically, it is easier to display four variables simultaneously in PROC FREQ than PROC TABULATE.

FALSE - Easier with PROC TABULATE (proc freq for 2 vars only)

TRUE OR FALSE: A PUT statement can be used to convert character variables to numeric variables.

FALSE - PUT does numeric to character, INPUT does character to numeric

TRUE OR FALSE: Text data are first read into the product vector buffer then into the input buffer.

FALSE - input buffer first, should be program data vector

TRUE OR FALSE: A SAS dataset name cannot exceed 40 characters.

FALSE - max 32 characters

TRUE OR FALSE: The BY statement in PROC MEANS replaces the need to use PROC SORT.

FALSE - still have to do SORT BY first. Use CLASS to group without sorting.

TRUE OR FALSE: When on the left side of the equal sign, the SUBSTR function will extract a specific string.

FALSE - will replace. (e.g. SUBSTR(x,1,5)="hello" will replace characters 1-5 of x with "hello")

TRUE OR FALSE: All character variables have a default storage length of 8 bytes.

FALSE. Numeric vars have default storage of 8 bytes. Character vars have default storage same as FIRST OCCURRENCE of this variable.

TRUE OR FALSE: To permanently associate a format with a variable you should place the FORMAT statement in the PROC step.

FALSE (should place in DATA step)

Input statement starts with the keyword...

INPUT

Input (function) vs Input (keyword)

Input (FUNCTION): Used to convert character values to numeric or other character values (e.g. dates). Input (KEYWORD): Used to read raw data from an external file or in-stream data.

3 items created during the compilation phase

Input buffer, program data vector (PDV), & descriptor information/portion

Informat

Instructs SAS how to read data, such as date values, into SAS variables

Format

Instructs SAS how to write the values of SAS variables

The stored SAS date value of 1 corresponds to what calendar date?

January 2, 1960. (1/1/1960 = 0)
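
A minimal sketch showing that SAS date values are day counts from January 1, 1960. The variable names d0, d1, dneg, and ref are made up for illustration.

```sas
data _null_;
  d0 = 0; d1 = 1; dneg = -1;
  /* Display the raw day counts as calendar dates */
  put d0= date9. d1= date9. dneg= date9.;   /* d0=01JAN1960 d1=02JAN1960 dneg=31DEC1959 */
  ref = '01JAN1960'd;                       /* date literal stored as a number */
  put ref=;                                 /* ref=0 */
run;
```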

1. PROC CONTENTS is called _______________, which is data about data.

Metadata

SAS was created at ___________ University as a project to analyze agricultural research.

NC State

What is wrong with the program below? Data rob; input date : mmddyy10; datalines; 12/30/1959 12/31/1959 01/01/1960 01/02/1960 ; Run; Proc print; run;

No period after format; no data specified for proc print

Set of all data values for one item

Observation

PROC to determine the expiration date of your SAS software.

PROC SETINIT

All formats end with a __________________

Period

File extensions for SAS programs & SAS datasets

Programs: .sas; Datasets: .sas7bdat (SAS v7 binary data)

Put (function) vs Put (keyword)

Put (FUNCTION): Returns a value using a specified format; used to convert a numeric value to a character value. Put (KEYWORD): Writes lines to the SAS log, to the SAS output window, or to an external location that is specified in the most recent FILE statement; valid in a DATA step.

Where to go for help with SAS

SAS help & documentation, SAS knowledge base, IDRE, Google

Every SAS statement ends with a _____

Semicolon

Call Routine

Similar to functions, but different in that you cannot use them in assignment statements or expressions

Infile

Specifies an external file to read with an INPUT statement, valid in a DATA step

TRUE OR FALSE: The picture option is a way for you to create your own numeric formats by providing SAS with a picture of what you want the output to look like.

TRUE

TRUE OR FALSE: To update a dataset, you must first sort the dataset by the BY variable.

TRUE

TRUE OR FALSE: Unless specified, PROC FREQ creates frequencies and percentages for both numeric and character variables.

TRUE

TRUE OR FALSE: Values of variables are initialized in the execution phase.

TRUE

TRUE OR FALSE: When you do not intend to sort the data permanently, use the OUT= option.

TRUE

TRUE OR FALSE: The double trailing at sign (@@) must be the last item specified in the INPUT statement.

TRUE - but study more

TRUE OR FALSE: A colon can be used to read informats.

TRUE - used in list input

TRUE OR FALSE: The IN operator is used to subset data and the IN= option is used to merge data.

TRUE: Use WHERE/IN to subset data, IN= specifies whether data is in one dataset or another

Each column of data is called...

Variable

Three important windows in the SAS data systems are the: ________________, _________________, and _________________________.

[Program] Editor, Log, and Output

Color for comments in SAS

light green

