PSTAT130 Review

Comment lines start with an asterisk ____ and end with a semicolon. Create multiple-line comments using ____.

(*); /* and */

In the Data Portion, missing numeric values are replaced with a ____, missing character values are replaced with ______.

a period (.), also called a "dot"; left blank

SAS Libraries

- A reference or nickname for storing SAS files - Points to a location on a disk drive - You assign most library names, except: SASUSER - permanent library; WORK - temporary library - A library reference (libref) must be 8 characters or less and must start with a letter or underscore
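
A minimal sketch of how a libref is assigned and used. The libref data1 and the folder path are assumptions borrowed from examples elsewhere in these notes; substitute a folder that exists on your system.

```sas
/* Assign the libref data1 to a folder (the path is an assumption). */
libname data1 'X:\PStat 130\data1';

/* A two-level name (libref.dataset) points to a permanent data set. */
proc print data=data1.empdata;
run;

/* A one-level name is stored in the temporary WORK library. */
data empcopy;              /* same as work.empcopy */
    set data1.empdata;
run;
```

The libref lasts only for the current SAS session, so the LIBNAME statement is usually kept at the top of the program.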

Command line

- Allows execution of program commands (Submit, Recall, Include)

Ranges for User Defined Formats:

- Can be single values or ranges of values

Labels for User Defined Formats:

- Can be up to 32,767 characters in length - Are typically enclosed in quotes, although it is not required

Datalines Properties:

- Can only be used once in a DATA step - Each data line is a separate observation - Default delimiter is a blank space - Is the last statement in the DATA step

Title and Footnote Options:

- Color= / C= - Set the color (i.e. red, green, blue) - Height= / H= - Set the size of the text (can also specify unit of measurement) - Font= / F= - Set the system font - Bold - Italic - Justify = Left | Center | Right

SAS Datasets

- Data are stored in SAS format - Each dataset has a: Descriptor portion - information about the overall dataset; Variable attributes - information about each variable (name, type, length, position, label, informat, format) - SAS datasets can be read by SAS but not by most other programs

Formatting Data Values

- Display formatted values using SAS formats in a list report -Create user-defined formats using the FORMAT procedure - Apply user-defined formats to variables in a list report

Title and Footnotes info:#2

- Footnotes appear at the bottom of the page of output - There is no default footnote - You can have more than one footnote - The value of n can be from 1 to 10 and refers to the footnote line - An unnumbered FOOTNOTE is equivalent to FOOTNOTE1 - Footnotes remain in effect until they are changed - The null FOOTNOTE statement, footnote; cancels all footnotes

Format Name for User Defined Formats:

- Names the format you are creating (e.g., $codefmt or money) - Cannot be more than 8 characters - For character values, must have a dollar sign ($) as the first character, a letter or underscore as the second character, and no more than 6 additional characters, numbers, and underscores - For numeric values, must have a letter or underscore as the first character, and no more than 7 additional characters, numbers, and underscores - Cannot end in a number - Cannot be the same as a SAS format - Does not end with a period in the VALUE statement

Datalines Statement:

- Requires the use of an input statement - Identifies the order of values in the data lines - Creates variable names - Assigns variable types - Assigns input values to corresponding variables - Variables take one of two types - Character ($) - Numeric
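
A small sketch of the DATALINES method with list input. The data set name and the three data lines are made up for illustration; only the INPUT/DATALINES pattern comes from the notes above.

```sas
/* Read raw data placed directly in the program. */
data work.test_emp;                  /* hypothetical data set name */
    input Name $ JobCode $ Salary;   /* $ marks character variables */
    datalines;
Smith FLTAT 28000
Jones PILOT 64000
Lee FLTAT 31000
;
run;

proc print data=work.test_emp;
run;
```

Each data line becomes one observation, and blanks separate the values, so none of the values here contain spaces.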

proc print ex: (3)

- Simple Version: proc print; run; - With Data File: proc print data=data1.empdata; run; - With Variable Selection: proc print data=data1.empdata; var JobCode EmpID Salary; run;

Title and Footnotes info:

- Titles appear at the top of the page of output - The default title is "The SAS System" - You can have more than one title line - The value of n can be from 1 to 10 and refers to the title line - An unnumbered TITLE is equivalent to TITLE1 - Titles remain in effect until they are changed - The null TITLE statement, title; cancels all titles

PROC PRINT Default Output:

- Use variable names as column headings - Display all variables contained in data set - Display all observations contained in data set - Display variable values in their "native" format

Proc Print Default Option:

- Uses variables as column headers - Displays variable values in their "native" format - Displays observation numbers - Displays all variables contained in the data set - Displays all observations contained in the data set

Proc Step

- perform "Utility" operations on a data set (like Print or Sort) - analyze data - output results or reports

title and footnote ex:

- title1 'PSTAT 130 Homework #1'; - footnote2 'Confidential';
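
A sketch that combines numbered titles, a few of the text options listed earlier, and the null statements. The second title's text is hypothetical, and the styling options are assumed to be sent to an ODS destination such as HTML, where they take effect.

```sas
title1 'PSTAT 130 Homework #1';
title2 color=blue height=14pt bold 'Employee Listing';   /* hypothetical text */
footnote1 justify=left 'Confidential';

proc print data=data1.empdata;
run;

title;       /* null TITLE statement cancels all titles */
footnote;    /* null FOOTNOTE statement cancels all footnotes */
```

Without the two null statements, the titles and footnote would carry over to every later procedure in the session.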

Benefits for Datalines method:

-Allows you to see your data directly -Is good for small amounts of data -Is often used to create "test" data sets

Special Operators:

- BETWEEN-AND - CONTAINS (?) - LIKE - SOUNDS LIKE (=*) - IS MISSING (or IS NULL)

List Input Attributes:

-Character and numeric values cannot contain spaces -Character values cannot be longer than 8 characters -Numeric values cannot contain commas or dollar signs -Dates will be read as characters rather than date values

Formatted Input Attributes:

-Data can be in "non-standard" format -Numbers can contain commas and dollar signs -Dates can be read as numeric variables -Data can be free-form or fixed text files

Output/results Viewer:

-Displays the results of the SAS procedures -Output window: results as a text listing -Results Viewer: results in HTML or PDF format -Output is directed to the HTML Results Viewer by default

5 Main Windows of SAS:

-Editor window -Log Window -Output/results Viewer -Results Window -Explorer Window

Import an Excel file:

-Imports the file 'X:\PStat 130\data1\DallasLA.xls' -Outputs the data set to work.tdfwlax -Specifies the type of file to import as XLS (dbms) -Overwrites an existing SAS data set (replace) -Specifies which sheet SAS should import (default is 1st sheet) -Specifies to SAS to use the first row of data as variable names
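
The bullets above describe a PROC IMPORT step that is not shown in these notes. One way it might look is sketched below; the sheet name is an assumption, since the notes say only that a sheet can be specified.

```sas
proc import datafile='X:\PStat 130\data1\DallasLA.xls'   /* file to import */
        out=work.tdfwlax                                 /* output data set */
        dbms=xls                                         /* file type */
        replace;                                         /* overwrite existing data set */
    sheet='Sheet1';     /* assumed sheet name; the default is the first sheet */
    getnames=yes;       /* use the first row as variable names */
run;
```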

Benefits for Infile method:

-Is more common -Is necessary if your data comes from an outside source (i.e. client, download, website, etc.) -Is preferred for large data sets -Allows you to easily re-run your programs on updated data

Types of raw data input:

-List input -Column Input -Formatted Input

To Convert Numeric to Character values, use the _____ Function. To convert Character to Numeric values, use the _____ Function.

-PUT -INPUT
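
A short sketch of both conversions. The variable names and values are made up for illustration.

```sas
data _null_;
    NumValue   = 1234;
    CharValue  = put(NumValue, 4.);       /* numeric -> character, using a format */
    CharDigits = '5678';
    NumBack    = input(CharDigits, 4.);   /* character -> numeric, using an informat */
    put CharValue= NumBack=;              /* log: CharValue=1234 NumBack=5678 */
run;
```

PUT always returns a character value and takes a format; INPUT takes an informat and returns whatever type that informat produces.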

The SORT procedure:

-Rearranges the observations in a SAS data set -Can create a new SAS data set containing the rearranged observations -Can sort on multiple variables -Can sort in ASCENDING (default) or DESCENDING order -Does not generate printed output -Treats missing values as the smallest possible value
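
A sketch of a sort on two variables that writes to a new data set. The output name emp_sorted is hypothetical; data1.empdata and its variables appear in other examples in these notes.

```sas
proc sort data=data1.empdata out=work.emp_sorted;
    by JobCode descending Salary;   /* ascending JobCode, then highest Salary first */
run;

/* PROC SORT prints nothing, so list the result explicitly. */
proc print data=work.emp_sorted;
run;
```

DESCENDING applies only to the variable that follows it, so it must be repeated for each variable that should sort high to low.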

The ____ function is used to extract words from a character value when you know the order of the words. The ____ function is used when you know the exact position of the substring to extract.

-Scan -Substr

Column Input Attributes:

-The data values must occupy the same columns within each observation (this is called "fixed" or "aligned") -Character variables can be longer than 8 characters and can contain spaces -You can skip some data fields, if desired -The data must be in "standard" format -Numbers may not contain commas or dollar signs -Dates will be read as character, instead of numeric, variables

Examples of creating new variables using arithmetic operators:

-TotalComp = Salary + Bonus -NetPay = GrossPay - Tax -NewPay = Salary * (1 + Raise) -Percent = Score/Maximum

To create a SAS data set from an existing SAS data set, you need to:

-Use a DATA statement to name the output SAS data set -Use the SET statement (in the DATA step) to read the existing SAS data set -SET can refer to an existing SAS data set, temporary, or permanent Ex: data work.temp_empdata; set data1.empdata; run;

If you're using data from excel and want to import it into SAS, what option do you use:

-Use the IMPORT option on the File menu to read it in and create a new SAS data set -You will need to tell SAS -What format the data are in -Where the file is stored -What character is used to separate data values (i.e. the "delimiter") -Save your SAS data set in an existing library or create a new library.

Statistical Analysis with SAS

-input Data -Refine data -Statistical Analysis -Generate output

You can use the DATASETS procedure to modify variable attributes such as:

-name -label -format -informat Ex: data work.repertory; set data1.repertory; run; proc datasets library=work; modify repertory; format date date9.; label play='Name of Play'; run; quit;

User Defined Formats Code Ex:

- Character format: proc format; value $codefmt 'FLTAT'='Flight Attendant'; run; proc print data=data1.empdata; format Jobcode $codefmt.; run;
- Numeric format: proc format; value money low-<25000='Less than 25,000' 25000-50000='25,000 to 50,000' 50000<-high='More than 50,000'; run; proc print data=data1.empdata; format Salary money.; run;

Data Step

-read in a data file - assign variable names, label and formats - select specific observations

How SAS stores date values:

As the number of days from January 1, 1960: 0 = 01/01/1960; -1 = 12/31/1959; 1 = 01/02/1960; ...

SAS first released

1971

Finish the ARRAY statement below to create temporary array elements that have initial values of 9000, 9300, 9600, and 9900.
array goal{4} ... ;
a. _temporary_ (9000 9300 9600 9900)
b. temporary (9000 9300 9600 9900)
c. _temporary_ 9000 9300 9600 9900
d. (temporary) 9000 9300 9600 9900

A To create temporary array elements, specify _TEMPORARY_ after the array name and dimension. Specify an initial value for each element, separated by either blanks or commas, and enclose the values in parentheses.
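
A sketch of a temporary array in use. The goal array matches the question; the sale array and its values are hypothetical.

```sas
data _null_;
    array goal{4} _temporary_ (9000 9300 9600 9900);
    array sale{4} _temporary_ (9100 9200 9700 9800);   /* hypothetical values */
    do i = 1 to 4;
        met = (sale{i} >= goal{i});   /* 1 if the goal was met, 0 otherwise */
        put i= met=;
    end;
run;
```

Temporary array elements have no variable names and are not written to an output data set, which makes them suited to lookup values like these.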

Datalines:

Allows raw data to be placed into SAS program

Operators:

Arithmetic symbols (+, -, **, *, /, ...) and SAS functions

What type of variable is the variable AcctNum in the data set below?
AcctNum  Gender
3456_1   M
2451_2
Roman    F
Cho      M
a. numeric
b. character
c. can be either character or numeric
d. can't tell from the data shown

B It must be a character variable, because the values contain letters and underscores, which are not valid characters for numeric values.

Special Operators code: (done in proc print)

BETWEEN-AND: where Salary between 50000 and 70000; (equivalent to where 50000 <= Salary <= 70000;)
CONTAINS: where LastName ? 'LAM'; (LAMBERT, BELLAMY, and ELAM are selected)
LIKE: where Code like 'E_U%'; (selects observations where the value of Code begins with an E, followed by a single character, followed by a U, followed by any number of characters)
SOUNDS LIKE: where Name=*'SMITH';
IS NULL or IS MISSING: where Flight is missing; where Flight is null;

How many observations and variables does the data set below contain?
Name  Sex  Age
Pick  M    32
Flet       28
Rom   F    .
Cho   M    42
a. 3 observations, 4 variables
b. 3 observations, 3 variables
c. 4 observations, 3 variables
d. can't tell because some values are missing

C Rows in the data set are called observations, and columns are called variables. Missing values don't affect the structure of the data set.

What is the value of the index variable that references Jul in the statements below?
array quarter{4} Jan Apr Jul Oct;
do i=1 to 4;
   yeargoal=quarter{i}*1.2;
end;
a. 1
b. 2
c. 3
d. 4

C The index value represents the position of the array element. In this case, the third element is Jul.

For the observation shown below, what is the result of the IF-THEN statements?
if status='OK' and type=3 then Count+1;
if status='S' or action='E' then Control='Stop';
Status  Type  Count  Action  Control
Ok      3     12     E       Go
a. Count = 12, Control = Go
b. Count = 13, Control = Stop
c. Count = 12, Control = Stop
d. Count = 13, Control = Go

C You must enclose character values in quotation marks, and you must specify them in the same case in which they appear in the data set. The value Ok is not identical to OK, so the value of Count is not changed by the IF-THEN statement.
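
The case-sensitive comparison can be checked with a small sketch that rebuilds the observation by hand. The LENGTH statement is an addition, needed so that Control can hold the four-character value Stop.

```sas
data _null_;
    length Control $ 4;
    Status = 'Ok'; Type = 3; Count = 12; Action = 'E'; Control = 'Go';
    if status='OK' and type=3 then Count+1;            /* false: 'Ok' is not 'OK' */
    if status='S' or action='E' then Control='Stop';   /* true: Action is 'E' */
    put Count= Control=;                               /* log: Count=12 Control=Stop */
run;
```

If the data may be in mixed case, comparing upcase(status) to 'OK' is one way to make the test case-insensitive.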

Two types of Variables:

Character Numeric

Variable is in the:

Column

Character Variable:

Contains any value: letters, numbers, special characters, and blanks. Character values are stored with a length of 1 to 32,767 bytes. One byte equals one character.

Both the @ and @@ must be placed in what part of the DATA step?
a. INPUT statement
b. IF statement
c. WHILE statement
d. anywhere within the DATA step

Correct answer: A Whether an INPUT statement is inside a DO, IF, or WHILE statement, the @ and @@ must be placed at the end of the INPUT statement.
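
A sketch of the double trailing @ at the end of an INPUT statement. The data set name and values are made up for illustration.

```sas
/* @@ holds the data line across iterations, so one line yields several observations. */
data work.scores;                 /* hypothetical data set name */
    input Name $ Score @@;
    datalines;
Ann 90 Bob 85 Cat 77
;
run;

proc print data=work.scores;      /* three observations */
run;
```

A single trailing @ holds the line only until the current iteration of the DATA step ends, so it is used when a second INPUT statement in the same iteration needs to read from the same line.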

What type of variable is the variable Wear in the data set below, assuming that there is a missing value in the data set?
Brand  Wear
Acme   43
Ajax   34
Atlas  .
a. numeric
b. character
c. can be either character or numeric
d. can't tell from the data shown

Correct answer: A It must be a numeric variable, because the missing value is indicated by a period rather than by a blank.

How many program steps are executed when the program below is processed?
data user.tables;
   infile jobs;
   input date name $ job $;
run;
proc sort data=user.tables;
   by name;
run;
proc print data=user.tables;
run;
a. three
b. four
c. five
d. six

Correct answer: A When it encounters a DATA, PROC, or RUN statement, SAS stops reading statements and executes the previous step in the program. The program above contains one DATA step and two PROC steps, for a total of three program steps.

When you specify an engine for a library, you always specify:
a. the file format for files that are stored in the library.
b. the version of SAS that you are using.
c. access to other software vendors' files.
d. instructions for creating temporary SAS files.

Correct answer: A A SAS engine is a set of internal instructions that SAS uses for writing to and reading from files in a SAS library. Each engine specifies the file format for files that are stored in the library, which in turn enables SAS to access files with a particular format. Some engines access SAS files, and other engines support access to other vendors' files.

Which time span is used to interpret two-digit year values if the YEARCUTOFF= option is set to 1950?
a. 1950-2049
b. 1950-2050
c. 1949-2050
d. 1950-2000

Correct answer: A The YEARCUTOFF= option specifies which 100-year span is used to interpret two-digit year values. The default value of YEARCUTOFF= is 1920. However, you can override the default and change the value of YEARCUTOFF= to the first year of another 100-year span. If you specify YEARCUTOFF=1950, then the 100-year span will be from 1950 to 2049.
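
A sketch that shows the 100-year window in action. The two date strings are made up to fall on either side of the cutoff.

```sas
options yearcutoff=1950;

data _null_;
    d1 = input('01/01/49', mmddyy8.);   /* two-digit year 49 */
    d2 = input('01/01/50', mmddyy8.);   /* two-digit year 50 */
    y1 = year(d1);
    y2 = year(d2);
    put y1= y2=;                        /* log: y1=2049 y2=1950 */
run;
```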

Which statement prints a summary of all the files stored in the library named Area51?
a. proc contents data=area51._all_ nods;
b. proc contents data=area51 _all_ nods;
c. proc contents data=area51 _all_ noobs
d. proc contents data=area51 _all_.nods;

Correct answer: A To print a summary of library contents with the CONTENTS procedure, use a period to append the _ALL_ option to the libref. Adding the NODS option suppresses detailed information about the files.

Which of the following PROC PRINT steps is correct if labels are not stored with the data set?
a. proc print data=allsales.totals label; label region8='Region 8 Yearly Totals'; run;
b. proc print data=allsales.totals; label region8='Region 8 Yearly Totals'; run;
c. proc print data allsales.totals label noobs; run;
d. proc print allsales.totals label; run;

Correct answer: A You use the DATA= option to specify the data set to be printed. The LABEL option specifies that variable labels appear in output instead of variable names.

Suppose you have submitted a SAS program that contains spelling errors. Which set of steps should you perform, in the order shown, to revise and resubmit the program?
a. Correct the errors. Clear the Log window. Resubmit the program. Check the Log window.
b. Correct the errors. Resubmit the program. Check the Output window. Check the Log window.
c. Correct the errors. Clear the Log window. Resubmit the program. Check the Output window.
d. Correct the errors. Clear the Output window. Resubmit the program. Check the Output window.

Correct answer: A To modify programs that contain errors, if you use the Program Editor window, you usually need to recall the submitted statements from the recall buffer to the Program Editor window, where you can correct the problems. After correcting the errors, you can resubmit the revised program. However, before doing so, it's a good idea to clear the messages from the Log window so that you don't confuse the old error messages with the new messages. Remember to check the Log window again to verify that your program ran correctly.

Which of the following programs correctly references a SAS data set named SalesAnalysis that is stored in a permanent SAS library?
a. data saleslibrary.salesanalysis; set mydata.quarter1sales; if sales>100000; run;
b. data mysales.totals; set sales_99.salesanalysis; if totalsales>50000; run;
c. proc print data=salesanalysis.quarter1; var sales salesrep month; run;
d. proc freq data=1999data.salesanalysis; tables quarter*sales; run;

Correct answer: B Librefs must be 1 to 8 characters long, must begin with a letter or underscore, and can contain only letters, numerals, or underscores. After you assign a libref, you specify it as the first level in the two-level name for a SAS file.

SAS does not automatically make adjustments for daylight saving time, but it does make adjustments for:
a. leap seconds
b. leap years
c. Julian dates
d. time zones

Correct answer: B SAS automatically makes adjustments for leap years.

Which of the following programs contains a syntax error?
a. proc sort data=sasuser.mysales; by region; run;
b. dat sasuser.mysales; set mydata.sales99; run;
c. proc print data=sasuser.mysales label; label region='Sales Region'; run;
d. none of the above

Correct answer: B The DATA step contains a misspelled keyword (dat instead of data). However, this is such a common (and easily interpretable) error that SAS produces only a warning message, not an error.

How can you tell whether you have specified an invalid option in a SAS program?
a. A log message indicates an error in a statement that seems to be valid.
b. A log message indicates that an option is not valid or not recognized.
c. The message "PROC running" or "DATA step running" appears at the top of the active window.
d. You can't tell until you view the output from the program.

Correct answer: B When you submit a SAS statement that contains an invalid option, a log message notifies you that the option is not valid or not recognized. You should recall the program, remove or replace the invalid option, check your statement syntax as needed, and resubmit the corrected program.

How can you create SAS output in HTML format on any SAS platform?
a. by specifying system options
b. by using programming statements
c. by using SAS windows to specify the result format
d. you can't create HTML output on all SAS platforms

Correct answer: B You can create HTML output using programming statements on any SAS platform. In addition, on all except mainframe platforms, you can use SAS windows to specify HTML as a result format.

Which of the following commands opens a file in the code editing window?
a. file 'd:\programs\sas\newprog.sas'
b. include 'd:\programs\sas\newprog.sas'
c. open 'd:\programs\sas\newprog.sas'
d. all of the above

Correct answer: B One way of opening a file in the code editing window is by using the INCLUDE command. Using the INCLUDE command enables you to open a single program or combine stored programs in a single window. To save a SAS program, you can use the FILE command.

Suppose you submit a short, simple DATA step. If the active window displays the message "DATA step running" for a long time, what probably happened?
a. You misspelled a keyword.
b. You forgot to end the DATA step with a RUN statement.
c. You specified an invalid data set option.
d. Some data values weren't appropriate for the SAS statements that you specified.

Correct answer: B Without a RUN statement (or a following DATA or PROC step), the DATA step doesn't execute, so it continues to run. Unbalanced quotation marks can also cause the "DATA step running" message if relatively little code follows the unbalanced quotation mark. The other three problems above generate errors in the Log window.

How many statements does the following SAS program contain?
proc print data=new.prodsale
           label double;
   var state day price1 price2; where state='NC';
   label state='Name of State';
run;
a. three
b. four
c. five
d. six

Correct answer: C The five statements are: 1) the PROC PRINT statement (two lines long); 2) the VAR statement; 3) the WHERE statement (on the same line as the VAR statement); 4) the LABEL statement; and 5) the RUN statement

Which of the following variable names is valid?
a. 4BirthDate
b. $Cost
c. _Items_
d. Tax-Rate

Correct answer: C Variable names follow the same rules as SAS data set names. They can be 1 to 32 characters long, must begin with a letter (A-Z, either uppercase or lowercase) or an underscore, and can continue with any combination of numerals, letters, or underscores.

If you submit the following program, how does the output look?
options pagesize=55 nonumber;
proc tabulate data=clinic.admit;
   class actlevel;
   var age height weight;
   table actlevel,(age height weight)*mean;
run;
options linesize=80;
proc means data=clinic.heart min max maxdec=1;
   var arterial heart cardiac urinary;
   class survive sex;
run;
a. The PROC MEANS output has a print line width of 80 characters, but the PROC TABULATE output has no print line width.
b. The PROC TABULATE output has no page numbers, but the PROC MEANS output has page numbers.
c. Each page of output from both PROC steps is 55 lines long and has no page numbers, and the PROC MEANS output has a print line width of 80 characters.
d. The date does not appear on output from either PROC step.

Correct answer: C When you specify a system option, it remains in effect until you change the option or end your SAS session, so both PROC steps generate output that is printed 55 lines per page with no page numbers. If you don't specify a system option, SAS uses the default value for that system option.

SAS date values are the number of days since which date?
a. January 1, 1900
b. January 1, 1950
c. January 1, 1960
d. January 1, 1970

Correct answer: C A SAS date value is the number of days from January 1, 1960, to the given date.
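
A quick sketch that confirms the reference date by printing a few date constants as raw numbers.

```sas
data _null_;
    d0  = '01JAN1960'd;     /* the reference date */
    d1  = '02JAN1960'd;     /* one day later */
    dm1 = '31DEC1959'd;     /* one day earlier */
    put d0= d1= dm1=;       /* log: d0=0 d1=1 dm1=-1 */
    put d1= date9.;         /* log: d1=02JAN1960 (same value, formatted) */
run;
```

Because dates are plain numbers, subtracting two date values gives the number of days between them.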

If you want to sort your data and create a temporary data set named Calc to store the sorted data, which of the following steps should you submit?
a. proc sort data=work.calc out=finance.dividend; run;
b. proc sort dividend out=calc; by account; run;
c. proc sort data=finance.dividend out=work.calc; by account; run;
d. proc sort from finance.dividend to calc; by account; run;

Correct answer: C In a PROC SORT step, you specify the DATA= option to specify the data set to sort. The OUT= option specifies an output data set. The required BY statement specifies the variable(s) to use in sorting the data.

A syntax error occurs when:
a. some data values are not appropriate for the SAS statements that are specified in a program.
b. the form of the elements in a SAS statement is correct, but the elements are not valid for that usage.
c. program statements do not conform to the rules of the SAS language.
d. none of the above

Correct answer: C Syntax errors are common types of errors. Some SAS system options and features of the code editing window can help you identify syntax errors. Other types of errors include data errors and logic errors.

What generally happens when a syntax error is detected?
a. SAS continues processing the step.
b. SAS continues to process the step, and the Log window displays messages about the error.
c. SAS stops processing the step in which the error occurred, and the Log window displays messages about the error.
d. SAS stops processing the step in which the error occurred, and the Output window displays messages about the error.

Correct answer: C Syntax errors generally cause SAS to stop processing the step in which the error occurred. When a program that contains an error is submitted, messages regarding the problem also appear in the Log window. When a syntax error is detected, the Log window displays the word ERROR, identifies the possible location of the error, and gives an explanation of the error.

Assuming you are using SAS code and not special SAS windows, which one of the following statements is false?
a. LIBNAME statements can be stored with a SAS program to reference the SAS library automatically when you submit the program.
b. When you delete a libref, SAS no longer has access to the files in the library. However, the contents of the library still exist on your operating system.
c. Librefs can last from one SAS session to another.
d. You can access files that were created with other vendors' software by submitting a LIBNAME statement.

Correct answer: C The LIBNAME statement is global, which means that librefs remain in effect until you modify them, cancel them, or end your SAS session. Therefore, the LIBNAME statement assigns the libref for the current SAS session only. You must assign a libref before accessing SAS files that are stored in a permanent SAS data library.

What should you do after submitting the following program in the Windows or UNIX operating environment?
proc print data=mysales;
   where state='NC;
run;
a. Submit a RUN statement to complete the PROC step.
b. Recall the program. Then add a quotation mark and resubmit the corrected program.
c. Cancel the submitted statements. Then recall the program, add a quotation mark, and resubmit the corrected program.
d. Recall the program. Then replace the invalid option and resubmit the corrected program.

Correct answer: C This program contains an unbalanced quotation mark. When you have an unbalanced quotation mark, SAS is often unable to detect the end of the statement in which it occurs. Simply adding a quotation mark and resubmitting your program usually does not solve the problem. SAS still considers the quotation marks to be unbalanced. To correct the error, you need to resolve the unbalanced quotation mark before you recall, correct, and resubmit the program.

What is the default length for the numeric variable Balance?
Name      Balance
Adams     105.73
Geller    107.89
Martinez  97.45
Noble     182.5
a. 5
b. 6
c. 7
d. 8

Correct answer: D The numeric variable Balance has a default length of 8. Numeric values (no matter how many digits they contain) are stored in 8 bytes of storage unless you specify a different length.

As you write and edit SAS programs it's a good idea to:
a. begin DATA and PROC steps in column one.
b. indent statements within a step.
c. begin RUN statements in column one.
d. do all of the above.

Correct answer: D Although you can write SAS statements in almost any format, a consistent layout enhances readability and enables you to understand the program's purpose. It's a good idea to begin DATA and PROC steps in column one, to indent statements within a step, to begin RUN statements in column one, and to include a RUN statement after every DATA step or PROC step.

In order for the date values 05May1955 and 04Mar2046 to be read correctly, what value must the YEARCUTOFF= option have?
a. a value between 1947 and 1954, inclusive
b. 1955 or higher
c. 1946 or higher
d. any value

Correct answer: D As long as you specify an informat with the correct field width for reading the entire date value, the YEARCUTOFF= option doesn't affect date values that have four-digit years.

What is a SAS library?
a. a collection of SAS files, such as SAS data sets and catalogs
b. in some operating environments, a physical collection of SAS files
c. a group of SAS files in the same folder or directory
d. all of the above

Correct answer: D Every SAS file is stored in a SAS library, which is a collection of SAS files, such as SAS data sets and catalogs. In some operating environments, a SAS library is a physical collection of files. In others, the files are only logically related. In the Windows and UNIX environments, a SAS library is typically a group of SAS files in the same folder or directory.

Which of the following statements selects from a data set only those observations for which the value of the variable Style is RANCH, SPLIT, or TWOSTORY?
a. where style='RANCH' or 'SPLIT' or 'TWOSTORY';
b. where style in 'RANCH' or 'SPLIT' or 'TWOSTORY';
c. where style in (RANCH, SPLIT, TWO-STORY);
d. where style in ('RANCH','SPLIT','TWOSTORY');

Correct answer: D In the WHERE statement, the IN operator enables you to select observations based on several values. You specify values in parentheses and separated by spaces or commas. Character values must be enclosed in quotation marks and must be in the same case as in the data set.

What happens if you submit the following program?
proc sort data=clinic.stress out=maxrates;
   by maxhr;
run;
proc print data=maxrates label double noobs;
   label rechr='Recovery Heart Rate;
   var resthr maxhr rechr date;
   where toler='I' and resthr>90;
   sum fee;
run;
a. Log messages indicate that the program ran successfully.
b. A "PROC SORT running" message appears at the top of the active window, and a log message may indicate an error in a statement that seems to be valid.
c. A log message indicates that an option is not valid or not recognized.
d. A "PROC PRINT running" message appears at the top of the active window, and a log message may indicate that a quoted string has become too long or that the statement is ambiguous.

Correct answer: D The missing quotation mark in the LABEL statement causes SAS to misinterpret the statements in the program. When you submit the program, SAS is unable to resolve the PROC step, and a "PROC PRINT running" message appears at the top of the active window.

In a DATA step, how can you reference a temporary SAS data set named Forecast?
a. Forecast
b. Work.Forecast
c. Sales.Forecast (after assigning the libref Sales)
d. only a and b above

Correct answer: D To reference a temporary SAS file in a DATA step or PROC step, you can specify the one-level name of the file (for example, Forecast) or the two-level name using the libref Work (for example, Work.Forecast).

Which of the following files is a permanent SAS file?
a. Sashelp.PrdSale
b. Sasuser.MySales
c. Profits.Quarter1
d. all of the above

Correct answer: D To store a file permanently in a SAS data library, you assign it a libref other than the default Work. For example, by assigning the libref Profits to a SAS data library, you specify that files within the library are to be stored until you delete them. Therefore, SAS files in the Sashelp and Sasuser and Profits libraries are permanent files.

Which is an example of standard numeric data?
a. -34.245
b. $24,234.25
c. 1/2
d. 50%

Correct answer: a A standard numeric value can contain numbers, scientific notation, decimal points, and plus and minus signs. Nonstandard numeric data includes values that contain fractions or special characters such as commas, dollar signs, and percent signs.

What is wrong with this program?
data perm.update;
   infile invent
   input Item $ 1-13 IDnum $ 15-19 Instock 21-22 BackOrd 24-25;
   total=instock+backord;
run;
a. missing semicolon on second line
b. missing semicolon on third line
c. incorrect order of variables
d. incorrect variable type

Correct answer: a A semicolon is missing from the second line. It will cause an error because the INPUT statement will be interpreted as invalid INFILE statement options.

Which statement is false regarding an ARRAY statement?
a. It is an executable statement.
b. It can be used to create variables.
c. It must contain either all numeric or all character elements.
d. It must be used to define an array before the array name can be referenced.

Correct answer: a An ARRAY statement is not an executable statement; it merely defines an array.

In the SAS windowing environment for Microsoft Windows and UNIX, what is the purpose of closing the HTML destination in the code shown below?
ods html close;
ods rtf ... ;
a. It conserves system resources.
b. It simplifies your program.
c. It makes your program compatible with other hardware platforms.
d. It makes your program compatible with previous versions of SAS.

Correct answer: a By default, in the SAS windowing environment for Microsoft Windows and UNIX, SAS programs produce HTML output. If you want only RTF output, it's a good idea to close the HTML destination before creating RTF output, as an open destination uses system resources.
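
A sketch of the full open/close pattern for RTF-only output. The file name report.rtf is an assumption, and sashelp.class is used only because it ships with SAS.

```sas
ods html close;                      /* stop producing the default HTML output */
ods rtf file='report.rtf';           /* assumed output file name */

proc print data=sashelp.class;
run;

ods rtf close;                       /* finish writing the RTF file */
ods html;                            /* optionally reopen HTML for later steps */
```

The RTF file is not complete until its destination is closed, so the closing statement matters as much as the opening one.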

During the compilation phase, SAS scans each statement in the DATA step, looking for syntax errors. Which of the following is not considered a syntax error?
a. incorrect values and formats
b. invalid options or variable names
c. missing or invalid punctuation
d. missing or misspelled keywords

Correct answer: a Syntax checking can detect many common errors, but it cannot verify the values of variables or the correctness of formats.

When the code shown below is run, what file will be referenced by the links in D:\Output\contents.html? ods html body='d:\output\body.html' contents='d:\output\contents.html' frame='d:\output\frame.html'; D:\Output\body.html D:\Output\contents.html D:\Output\frame.html There are no links from the file D:\Output\contents.html.

Correct answer: a The CONTENTS= option creates a table of contents containing links to the body file, D:\Output\body.html.

Which statement is false regarding DO UNTIL statements? The condition is evaluated at the top of the loop, before the enclosed statements are executed. The enclosed statements are always executed at least once. SAS statements in the DO loop are executed until the specified condition is true. The DO loop must have a closing END statement.

Correct answer: a The DO UNTIL condition is evaluated at the bottom of the loop, so the enclosed statements are always executed at least once.
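
The difference between bottom-of-loop and top-of-loop evaluation can be sketched with a small DATA _NULL_ step. The variable names (n, untilRuns, whileRuns) are made up for illustration; the condition is deliberately already satisfied before either loop starts.

```sas
data _null_;
   n = 10;

   /* DO UNTIL tests at the bottom: the body runs once even though
      n >= 5 is already true before the loop starts. */
   untilRuns = 0;
   do until (n >= 5);
      untilRuns + 1;
   end;

   /* DO WHILE tests at the top: the body never runs because
      n < 5 is false on entry. */
   whileRuns = 0;
   do while (n < 5);
      whileRuns + 1;
   end;

   put untilRuns= whileRuns=;   /* log: untilRuns=1 whileRuns=0 */
run;
```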

Which of the following statements is true regarding direct access of data sets? You cannot specify END= with POINT=. You cannot specify OUTPUT with POINT=. You cannot specify STOP with END=. You cannot specify FIRST. with LAST.

Correct answer: a The END= option and POINT= option are incompatible in the same SET statement. Use one or the other in your program.

The variable Address2 contains values such as Piscataway, NJ. How do you assign the two-letter state abbreviations to a new variable named State? State=scan(address2,2); State=scan(address2,13,2); State=substr(address2,2); State=substr(address2,13,2);

Correct answer: a The SCAN function is used to extract words from a character value when you know the order of the words, when their position varies, and when the words are marked by some delimiter. In this case, you don't need to specify delimiters, because the blank and the comma are default delimiters.
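
A minimal sketch of the SCAN call from the answer, using a hard-coded value instead of a real data set. The LENGTH statements are an addition for clarity and are not part of the original question.

```sas
data _null_;
   length Address2 $ 20 State $ 2;
   Address2 = 'Piscataway, NJ';

   /* Blank and comma are both default delimiters, so the second
      "word" is the state abbreviation. */
   State = scan(Address2, 2);

   put State=;   /* log: State=NJ */
run;
```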

In the following program, complete the statement so that the program stops generating observations when Distance reaches 250 miles or when 10 gallons of fuel have been used. data work.go250; set perm.cars; do gallons=1 to 10 ... ; Distance=gallons*mpg; output; end; run; while(Distance<250) when(Distance>250) over(Distance le 250) until(Distance=250)

Correct answer: a The WHILE expression causes the DO loop to stop executing when the value of Distance becomes equal to or greater than 250.

What is the length of the variable Type, as created in the DATA step below? data finance.newloan; set finance.records; TotLoan+payment; if code='1' then Type='Fixed'; else Type='Variable'; length type $ 10; run; 5 8 10 it depends on the first value of Type

Correct answer: a The length of a new variable is determined by the first reference in the DATA step, not by data values. In this case, the length of Type is determined by the value Fixed, so it is 5. The LENGTH statement is in the wrong place; it must occur before any other reference to the variable in the DATA step. If the step were written as: data finance.newloan; set finance.records; TotLoan+payment; length type $ 10; if code='1' then Type='Fixed'; else Type='Variable'; run; then the length of Type would be 10.

Based on the DATA step shown below, in what order will the variables be stored in the new data set? data perm.update; infile invent; input IDnum $ Item $ 1-13 Instock 21-22 BackOrd 24-25; Total=instock+backord; run; IDnum Item InStock BackOrd Total Item IDnum InStock BackOrd Total Total IDnum Item InStock BackOrd Total Item IDnum InStock BackOrd

Correct answer: a The order in which variables are defined in the DATA step determines the order in which the variables are stored in the data set.

Which of the following statements will store your formats in a permanent catalog? libname library 'c:\sas\formats\lib'; proc format lib=library ...; libname library 'c:\sas\formats\lib'; format lib=library run; ...; library='c:\sas\formats\lib'; proc format library ...; library='c:\sas\formats\lib'; proc library ...;

Correct answer: a To store formats in a permanent catalog, you first write a LIBNAME statement to associate the libref with the SAS data library in which the catalog will be stored. Then add the LIB= (or LIBRARY=) option to the PROC FORMAT statement, specifying the name of the catalog.

The data set Survey.Health includes the following variables. Which is a poor candidate for PROC MEANS analysis? IDnum Age Height Weight

Correct answer: a Unlike Age, Height, or Weight, the values of IDnum are unlikely to yield any useful statistics.

Which of the following statements is true regarding BY group processing? BY variables must be either indexed or sorted. Summary statistics are computed for BY variables. BY group processing is preferred when you are categorizing data that contains few variables. BY group processing overwrites your data set with the newly grouped observations.

Correct answer: a Unlike CLASS processing, BY group processing requires that your data already be indexed or sorted in the order of the BY variables. You might need to run the SORT procedure before using PROC MEANS with a BY group.

On January 1 of each year, $5000 is invested in an account. Complete the DATA step below to determine the value of the account after 15 years if a constant interest rate of ten percent is expected. data work.invest; ... Capital+5000; capital+(capital*.10); end; run; do count=1 to 15; do count=1 to 15 by 10%; do count=1 to capital; do count=capital to (capital*.10);

Correct answer: a Use a DO loop to perform repetitive calculations starting at 1 and looping 15 times.
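
Putting the chosen DO statement into the step gives the sketch below. The PROC PRINT step is an addition so the result can be inspected; it is not part of the original question.

```sas
data work.invest;
   do count = 1 to 15;
      Capital + 5000;              /* deposit on January 1 */
      capital + (capital * .10);   /* add ten percent interest */
   end;
run;

/* Only one observation is written (automatic output at the end of
   the step). Count is 16 in that observation, because the index is
   incremented once more before the loop test fails. */
proc print data=work.invest;
run;
```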

Which SAS statement associates the fileref Crime with the raw data file C:\States\Data\Crime.dat? filename crime 'c:\states\data\crime.dat'; filename crime c:\states\data\crime.dat; fileref crime 'c:\states\data\crime.dat'; filename 'c:\states\data\crime' crime.dat;

Correct answer: a You assign a fileref by using a FILENAME statement in the same way that you assign a libref by using a LIBNAME statement.

Which set of statements is equivalent to the code shown below? if code='1' then Type='Fixed'; if code='2' then Type='Variable'; if code^='1' and code^='2' then Type='Unknown'; if code='1' then Type='Fixed'; else if code='2' then Type='Variable'; else Type='Unknown'; if code='1' then Type='Fixed'; if code='2' then Type='Variable'; else Type='Unknown'; if code='1' then type='Fixed'; else code='2' and type='Variable'; else type='Unknown'; if code='1' and type='Fixed'; then code='2' and type='Variable'; else type='Unknown';

Correct answer: a You can write multiple ELSE statements to specify a series of mutually exclusive conditions. The ELSE statement must immediately follow the IF-THEN statement in your program. An ELSE statement executes only if the previous IF-THEN/ELSE statement is false.

If ODS is set to its default settings in the SAS windowing environment for Microsoft Windows and UNIX, what types of output are created by the code below? ods html file='c:\myhtml.htm'; ods pdf file='c:\mypdf.pdf'; HTML and PDF PDF only HTML, PDF, and listing No output is created because ODS is closed by default.

Correct answer: a HTML output is created by default in the SAS windowing environment for Microsoft Windows and UNIX, so these statements create HTML and PDF output.

Which SAS statement reads the value for code (the first field), and holds the record until an INPUT statement reads the remaining value from the same record in the same iteration of the DATA step? input code $2. @; input code $ 2. @@; retain code; none of the above

Correct answer: a An INPUT statement is used to read the value for code. The single @ sign at the end of the INPUT statement holds the current record for a later INPUT statement in the same iteration of the DATA step.
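
As a rough illustration of the trailing @, the step below reads a one-character record code, holds the record, and then picks an informat based on that code. The data set name, the codes C and D, and the layout are all hypothetical.

```sas
data work.payments;
   input Code $1. @;                 /* read the code, hold the record */
   if Code = 'C' then input @3 Amount comma7.;   /* value with a comma */
   else input @3 Amount 7.;                      /* plain numeric value */
   datalines;
C 1,250
D 300
;
run;
```

The held record is released automatically when the next iteration of the DATA step begins.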

Which SAS statement correctly reads the values for Fname, Lname, Address, City, State and Zip in order? 1---+----10---+----20--- Lawrence Caldwell 1010 lake Street Anaheim CA 94122 Rachel Chevont 3719 Olive View Road Hartford Ct 06183 input Fname $ Lname $ / Address $ 20. / City $ State $ Zip $; input Fname $ Lname $ /; Address $ 20. /; City $ State $ Zip $; input / Fname $ Lname $ / Address $ 20. City $ State $ Zip $; input / Fname $ Lname $; / Address $ 20.; City $ State $ Zip $;

Correct answer: a The INPUT statement uses the / line pointer control to move the input pointer forward from the first record to the second record, and from the second record to the third record. The / line pointer control only moves the input pointer forward and must be specified after the instructions for reading the values in the current record. You should place a semicolon only at the end of a complete INPUT statement.

Which option below, when used in a DATA step, writes an observation to the data set after each value for Activity has been read? do choice = 1 to 3; input Activity : $ 10. @; output; end; run; do choice = 1 to 3; input Activity : $ 10. @; end; output; run; do choice = 1 to 3; input Activity : $ 10. @; end; run; both a and b

Correct answer: a The OUTPUT statement must be in the loop so that each time a value for Activity is read, an observation is immediately written to the data set.
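
A self-contained sketch of option a, with made-up names and activities supplied through DATALINES. Because OUTPUT sits inside the loop, each input record should produce three observations, one per activity.

```sas
data work.activities;
   input Name $ @;                   /* hold the record after the name */
   do choice = 1 to 3;
      input Activity : $10. @;       /* read the next activity, keep holding */
      output;                        /* write one observation per activity */
   end;
   datalines;
Ann swim run bike
Bob chess golf ski
;
run;
```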

What is the default value of the YEARCUTOFF = system option? 1920 1910 1900 1930

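Correct answer: a The default value of YEARCUTOFF= is 1920 in the SAS release this question is based on. Later releases may ship with a different default, so check the current setting with PROC OPTIONS.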

Which SAS statement repetitively executes several statements when the value of an index variable named count ranges from 1 to 50, incremented by 5? do count = 1 to 50 by 5; do while count = 1 to 50 by 5; do count = 1 to 50 + 5; do while (count = 1 to 50 + 5);

Correct answer: a The iterative DO statement begins the execution of a loop based on the value of an index variable. Here, the loop executes when the value of count ranges from 1 to 50, incremented by 5.

Which of the following FORMAT procedures is written correctly? proc format lib=library value colorfmt; 1='Red' 2='Green' 3='Blue' run; proc format lib=library; value colorfmt 1='Red' 2='Green' 3='Blue'; run; proc format lib=library; value colorfmt; 1='Red' 2='Green' 3='Blue' run; proc format lib=library; value colorfmt 1='Red'; 2='Green'; 3='Blue'; run;

Correct answer: b A semicolon is needed after the PROC FORMAT statement. The VALUE statement begins with the keyword VALUE and ends with a semicolon after all the labels have been defined.

You can place the FORMAT statement in either a DATA step or a PROC step. What happens when you place it in a DATA step? You temporarily associate the formats with variables. You permanently associate the formats with variables. You replace the original data with the format labels. You make the formats available to other data sets.

Correct answer: b By placing the FORMAT statement in a DATA step, you permanently associate the defined format with variables.

Which of the following is not created during the compilation phase? the data set descriptor the first observation the program data vector the _N_ and _ERROR_ automatic variables

Correct answer: b During the compilation phase, the program data vector is created. The program data vector includes the two automatic variables _N_ and _ERROR_. The descriptor portion of the new SAS data set is created at the end of the compilation phase. The descriptor portion includes the name of the data set, the number of observations and variables, and the names and attributes of the variables. Observations are not written until the execution phase.

For the program below, select an iterative DO statement to process all elements in the contrib array. data work.contrib; array contrib{4} qtr1-qtr4; ... contrib{i}=contrib{i}*1.25; end; run; do i=4; do i=1 to 4; do until i=4; do while i le 4;

Correct answer: b In the DO statement, you specify the index variable that represents the values of the array elements. Then specify the start and stop positions of the array elements.

Which of the following would you use to compare the result of investing $4,000 a year for five years in three different banks that compound interest monthly? Assume a fixed rate for the five-year period. DO WHILE statement nested DO loops DO UNTIL statement a DO group

Correct answer: b Place the monthly calculation in a DO loop within a DO loop that iterates once for each year. The DO WHILE and DO UNTIL statements are not used here because the number of required iterations is fixed. A non-iterative DO group would not be useful.

What is the purpose of the URL= suboptions shown below? ods html body='d:\output\body.html' (url='body.html') contents='d:\output\contents.html' (url='contents.html') frame='d:\output\frame.html'; To create absolute link addresses for loading the files from a server. To create relative link addresses for loading the files from a server. To allow HTML files to be loaded from a local drive. To send HTML output to two locations.

Correct answer: b Specifying the URL= suboption in the file specification provides a URL that ODS uses in the links it creates. Specifying a simple (one name) URL creates a relative link address to the file.

There are 500 observations in the data set Usa. What is the result of submitting the following program? data work.getobs5; obsnum=5; set company.usa(keep=manager payroll) point=obsnum; stop; run; an error an empty data set continuous loop a data set that contains one observation

Correct answer: b The DATA step outputs observations at the end of the DATA step. However, in this program, the STOP statement stops processing before the end of the DATA step. An explicit OUTPUT statement is needed before STOP in order to produce an observation. STOP must stay in place, because direct access with POINT= never reaches an end-of-file condition and the step would otherwise loop continuously. The corrected step is: data work.getobs5; obsnum=5; set company.usa(keep=manager payroll) point=obsnum; output; stop; run;

Which of the following statements is false regarding the program shown below? data work.invest; do year=1990 to 2004; Capital+5000; capital+(capital*.10); output; end; run; The OUTPUT statement writes current values to the data set immediately. The last value for Year in the new data set is 2005. The OUTPUT statement overrides the automatic output at the end of the DATA step. The DO loop performs 15 iterations.

Correct answer: b The OUTPUT statement overrides the automatic output at the end of the DATA step. On the last iteration of the DO loop, the value of Year, 2004, is written to the data set.

How is the variable Amount labeled and formatted in the PROC PRINT output? data credit; infile creddata; input Account $ 1-5 Name $ 7-25 Type $ 27 Transact $ 29-35 Amount 37-50; label amount='Amount of Loan'; format amount dollar12.2; run; proc print data=credit label; label amount='Total Amount Loaned'; format amount comma10.; run; label Amount of Loan, format DOLLAR12.2 label Total Amount Loaned, format COMMA10. label Amount, default format The PROC PRINT step does not execute because two labels and two formats are assigned to the same variable.

Correct answer: b The PROC PRINT output displays the label Total Amount Loaned for the variable Amount and formats this variable using the COMMA10. format. Temporary labels or formats that are assigned in a PROC step override permanent labels or formats that are assigned in a DATA step.
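
The override behavior can be tried with a small sketch. The data set work.loans and its values are invented for illustration.

```sas
data work.loans;
   input Amount;
   label amount = 'Amount of Loan';
   format amount dollar12.2;         /* stored with the data set */
   datalines;
1500
27250.5
;
run;

/* Uses the permanent label and DOLLAR12.2 format. */
proc print data=work.loans label;
run;

/* Overrides both for this step only; the data set is unchanged. */
proc print data=work.loans label;
   label amount = 'Total Amount Loaned';
   format amount comma10.;
run;
```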

Suppose you need to create the variable FullName by concatenating the values of FirstName, which contains first names, and LastName, which contains last names. What's the best way to remove extra blanks between first names and last names? data work.maillist; set retail.maillist; length FullName $ 40; fullname=trim firstname||' '||lastname; run; data work.maillist; set retail.maillist; length FullName $ 40; fullname=trim(firstname)||' '||lastname; run; data work.maillist; set retail.maillist; length FullName $ 40; fullname=trim(firstname)||' '||trim(lastname); run; data work.maillist; set retail.maillist; length FullName $ 40; fullname=trim(firstname||' '||lastname); run;

Correct answer: b The TRIM function removes trailing blanks from character values. In this case, extra blanks must be removed from the values of FirstName. Although answer c also works, the extra TRIM function for the variable LastName is unnecessary. Because of the LENGTH statement, all values of FullName are padded to 40 characters.
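
A minimal sketch of the effect of TRIM, using hard-coded names rather than the Retail.Maillist data set. The variable Untrimmed is added only for comparison.

```sas
data _null_;
   length FirstName LastName $ 12 FullName $ 40;
   FirstName = 'Ada';
   LastName  = 'Lovelace';

   /* Without TRIM, the nine trailing blanks of FirstName are kept. */
   Untrimmed = FirstName || ' ' || LastName;

   /* With TRIM, only the single separating blank remains. */
   FullName = trim(FirstName) || ' ' || LastName;

   put Untrimmed= / FullName=;       /* second line: FullName=Ada Lovelace */
run;
```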

Suppose you run a program that causes three DATA errors. What is the value of the automatic variable _ERROR_ when the observation that contains the third error is processed? 0 1 2 3

Correct answer: b The default value of _ERROR_ is 0, which means there is no data error. When an error occurs, whether one error or multiple errors, the value is set to 1.

Select the ARRAY statement that defines the array in the following program. data coat; input category high1-high3 / low1-low3; array compare{2,3} high1-high3 low1-low3; do i=1 to 2; do j=1 to 3; compare{i,j}=round(compare{i,j}*1.12); end; end; datalines; 5555 9 8 7 6 43 21 8888 21 12 34 64 13 14 15 16 ; run; array compare{1,6} high1-high3 low1-low3; array compare{2,3} high1-high3 low1-low3; array compare{3,2} high1-high3 low1-low3; array compare{3,3} high1-high3 low1-low3;

Correct answer: b The nested DO loops indicate that the array is named compare and is a two-dimensional array that has two rows and three columns.

The table of contents created by the CONTENTS= option contains a numbered heading for: each procedure. each procedure that creates output. each procedure and DATA step. each HTML file created by your program.

Correct answer: b The table of contents contains a numbered heading for each procedure that creates output.

If you submit the following program, which variables appear in the new data set? data work.cardiac(drop=age group); set clinic.fitness(keep=age weight group); if group=2 and age>40; run; none Weight Age, Group Age, Weight, Group

Correct answer: b The variables Age, Weight, and Group are specified using the KEEP= option in the SET statement. After processing, Age and Group are dropped in the DATA statement.

The format JOBFMT was created in a FORMAT procedure. Which FORMAT statement will apply it to the variable JobTitle in the program output? format jobtitle jobfmt; format jobtitle jobfmt.; format jobtitle=jobfmt; format jobtitle='jobfmt';

Correct answer: b To associate a user-defined format with a variable, PLACE A PERIOD AT THE END OF THE FORMAT NAME when it is used in the FORMAT statement.

There is no end-of-file condition when you use direct access to read data, so how can your program prevent a continuous loop? Do not use a POINT= variable. Check for an invalid value of the POINT= variable. Do not use an END= variable. Include an OUTPUT statement.

Correct answer: b To avoid a continuous loop when using direct access, either include a STOP statement or use programming logic that executes a STOP statement when the data step encounters an invalid value of the POINT= variable. If SAS reads an invalid value of the POINT= variable, it sets the automatic variable _ERROR_ to 1. You can use this information to check for conditions that cause continuous looping.
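
One way to combine both safeguards is sketched below: a loop that deliberately runs past the last observation, checks _ERROR_ after each read, and ends with STOP. The data set work.usa here is a ten-observation stand-in built just for the example.

```sas
/* Stand-in input data: 10 observations. */
data work.usa;
   do Manager = 1 to 10;
      Payroll = Manager * 1000;
      output;
   end;
run;

data work.sample;
   do obsnum = 1 to 20 by 3;         /* 13, 16, 19 do not exist */
      set work.usa point=obsnum;
      if _error_ then do;            /* invalid POINT= value was read */
         _error_ = 0;                /* reset to avoid a log dump */
         leave;
      end;
      output;
   end;
   stop;                             /* no end-of-file with POINT= */
run;
```

With this input, observations 1, 4, 7, and 10 should be written before the invalid value 13 ends the loop.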

Which DO statement would not process all the elements in the factors array shown below? array factors{*} age height weight bloodpr; do i=1 to dim(factors); do i=1 to dim(*); do i=1,2,3,4; do i=1 to 4;

Correct answer: b To process all the elements in an array, you can either specify the array dimension or use the DIM function with the array name as the argument.
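
A short sketch of the DIM approach, with one invented data line and a hypothetical 10 percent adjustment applied to every element.

```sas
data work.scaled;
   input age height weight bloodpr;
   array factors{*} age height weight bloodpr;

   /* DIM returns the number of elements, so the loop keeps working
      if variables are added to the ARRAY statement later. */
   do i = 1 to dim(factors);
      factors{i} = factors{i} * 1.1;
   end;

   drop i;
   datalines;
40 170 65 120
;
run;
```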

Which program contains an error? data clinic.stress(drop=timemin timesec); infile tests; input ID $ 1-4 Name $ 6-25 RestHR 27-29 MaxHR 31-33 RecHR 35-37 TimeMin 39-40 TimeSec 42-43 Tolerance $ 45; TotalTime=(timemin*60)+timesec; SumSec+totaltime; run; proc print data=clinic.stress; label totaltime='Total Duration of Test'; format timemin 5.2; drop sumsec; run; proc print data=clinic.stress(keep=totaltime timemin); label totaltime='Total Duration of Test'; format timemin 5.2; run; data clinic.stress; infile tests; input ID $ 1-4 Name $ 6-25 RestHR 27-29 MaxHR 31-33 RecHR 35-37 TimeMin 39-40 TimeSec 42-43 Tolerance $ 45; TotalTime=(timemin*60)+timesec; keep id totaltime tolerance; run;

Correct answer: b To select variables, you can use a DROP or KEEP statement in any DATA step. You can also use the DROP= or KEEP= data set options following a data set name in any DATA or PROC step. However, you cannot use DROP or KEEP statements in PROC steps. In program b, the statement drop sumsec; in the PROC PRINT step is the error.

Within the data set Hrd.Temp, PayRate is a character variable and Hours is a numeric variable. What happens when the following program is run? data work.temp; set hrd.temp; Salary=payrate*hours; run; SAS converts the values of PayRate to numeric values. No message is written to the log. SAS converts the values of PayRate to numeric values. A message is written to the log. SAS converts the values of Hours to character values. No message is written to the log. SAS converts the values of Hours to character values. A message is written to the log.

Correct answer: b When this DATA step is executed, SAS automatically converts the character values of PayRate to numeric values so that the calculation can occur. Whenever data is automatically converted, a message is written to the SAS log stating that the conversion has occurred.

A typical value for the character variable Target is 123,456. Which statement correctly converts the values of Target to numeric values when creating the variable TargetNo? TargetNo=input(target,comma6.); TargetNo=input(target,comma7.); TargetNo=put(target,comma6.); TargetNo=put(target,comma7.)

Correct answer: b You explicitly convert character values to numeric values by using the INPUT function. Be sure to select an informat that can read the form of the values.
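
The width matters because the comma counts as a character. A minimal sketch with a hard-coded value:

```sas
data _null_;
   Target = '123,456';               /* seven characters including the comma */

   /* COMMA7. covers the whole value. */
   TargetNo = input(Target, comma7.);

   /* COMMA6. would read only the first six characters, 123,45,
      and silently lose the last digit. */
   TooShort = input(Target, comma6.);

   put TargetNo= TooShort=;
run;
```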

Which statement identifies a raw data file to be read with the fileref Products and specifies that the DATA step read only records 1-15? infile products obs 15; infile products obs=15; input products obs=15; input products 1-15;

Correct answer: b You use an INFILE statement to specify the raw data file to be read. You can specify a fileref or an actual filename (in quotation marks). The OBS= option in the INFILE statement enables you to process only records 1 through n.

Which INPUT statement correctly reads the values for City, State, and Zip? 1---+----10---+----20--- Dina Fields 904 Maple Circle Durham NC 27713 Elizabeth Garrison 1293 Oak Avenue Chapel Hill NC 27614 David Harrington 2426 Elmwood Lane Raleigh NC 27803 input #3 City $ State $ Zip $; input #3 City & $ 11. State $ Zip $; input #3 City $ 11. + 2 State $ 2. + 2 Zip $ 5.; all of the above

Correct answer: b A combination of modified and simple list input can be used to read the values for City, State, and Zip. You need to use modified list input to read the values for City, because one of the values is longer than eight characters and contains an embedded blank. You cannot use formatted input, because the values do not begin and end in the same column in each record. NOTE: The ampersand (&) format modifier indicates that a character value can contain one or more single embedded blanks. It reads the value from the next non-blank column until the pointer reaches two consecutive blanks, the defined length of the variable, or the end of the input line, whichever comes first. Example: Chapel Hill.

Which pointer control can be used to read records non-sequentially? @n #n + n /

Correct answer: b The #n line pointer control is used to read records non-sequentially. The #n specifies the absolute number of the line to which you want to move the pointer.

Which SAS statement checks for the condition that Record equals C and executes a single statement to read the values for Amount? if record = c then input @3 Amount comma7.; if record='C' then input @3 Amount comma7.; if record='C' then do input @3 Amount comma7.; if record = C then do input @3 Amount comma7.;

Correct answer: b The IF-THEN statement defines the condition that Record equals C and executes an INPUT statement to read the value for Amount when the condition is true. C must be enclosed in quotation marks and must be specified exactly as shown because it is a character value.

Which SAS statement correctly reads the values for Flavor and Quantity? Make sure the length of each variable can accommodate the values shown. chocolate chip 10,453 oatmeal 10,187 peanut butter 11,546 sugar 12,331 input Flavor & $ 9. Quantity : comma.; input Flavor & $ 14. Quantity : comma.; input Flavor : $ 14. Quantity & comma.; input Flavor $ 14. Quantity : comma.;

Correct answer: b The INPUT statement uses list input with format modifiers and informats to read the values for each variable. The ampersand (&) modifier enables you to read character values that contain single embedded blanks. The colon (:) modifier enables you to read nonstandard data values and character values that are longer than eight characters, but which contain no embedded blanks.

The minimum width of the TIMEw. informat is: 4 5 6 7

Correct answer: b The minimum acceptable field width for the TIMEw. informat is five. If you specify a w value less than five, you will receive an error message in the SAS log.

Which choice below is an example of a sum statement? totalpay = 1; totalpay + 1; totalpay* 1; totalpay by 1;

Correct answer: b The sum statement adds the result of an expression to an accumulator variable. The + sign between the variable and the expression is a required part of the sum statement.
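
A small sketch of a running total built with a sum statement. The data set name and values are invented; the missing value is included to show that the sum statement skips it instead of propagating a missing total.

```sas
data work.running;
   input Pay;

   /* Sum statement: TotalPay starts at 0, is retained across
      iterations, and ignores missing values of Pay. */
   TotalPay + Pay;

   datalines;
100
.
250
;
run;
```

TotalPay should be 100, 100, and 350 across the three observations.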

Suppose your program creates two variables from an input file. Both variables are stored as SAS date values: FirstDay records the start of a billing cycle, and LastDay records the end of that cycle. The code for calculating the total number of days in the cycle would be: TotDays = lastday-firstday; TotDays = lastday-firstday + 1; TotDays = lastday/ firstday; You cannot use date values in calculations.

Correct answer: b To find the number of days spanned by two dates, subtract the first day from the last day and add one. Because SAS date values are numeric values, they can easily be used in calculations.
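
A quick check of the plus-one rule with two hard-coded date constants (the dates themselves are arbitrary):

```sas
data _null_;
   FirstDay = '01JAN2024'd;
   LastDay  = '31JAN2024'd;

   /* Subtraction gives the gap (30); adding 1 counts both end days. */
   TotDays = LastDay - FirstDay + 1;

   put TotDays=;   /* log: TotDays=31 */
run;
```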

A record that is being held by a single trailing at sign (@) is automatically released when: the input pointer moves past the end of the record. the next iteration of the DATA step begins. another INPUT statement that has an @ executes. another value is read from the observation.

Correct answer: b Unlike the double trailing at sign (@@), the single trailing at sign (@) is automatically released when control returns to the top of the DATA step for the next iteration. The trailing @ does not toggle on and off. If another INPUT statement that has a trailing @ executes, the holding effect is still on.

Select the DO WHILE statement that would generate the same result as the program below. data work.invest; capital=100000; do until(Capital gt 500000); Year+1; capital+(capital*.10); end; run; do while(Capital ge 500000); do while(Capital=500000); do while(Capital le 500000); do while(Capital>500000);

Correct answer: c Because the DO WHILE loop is evaluated at the top of the loop, you specify the condition that must exist in order to execute the enclosed statements.
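
To see that the two forms agree here, the sketch below runs both loops in one step and counts iterations. The counter names yearsUntil and yearsWhile are made up for the comparison.

```sas
data _null_;
   /* DO UNTIL version from the question. */
   capital = 100000;
   yearsUntil = 0;
   do until (capital gt 500000);
      yearsUntil + 1;
      capital + (capital * .10);
   end;

   /* DO WHILE version with the logically opposite condition. */
   capital = 100000;
   yearsWhile = 0;
   do while (capital le 500000);
      yearsWhile + 1;
      capital + (capital * .10);
   end;

   put yearsUntil= yearsWhile=;   /* both counters should be 17 */
run;
```

The two forms match only because the starting capital already satisfies the WHILE condition; if it did not, DO UNTIL would still run once and DO WHILE would not run at all.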

Suppose the YEARCUTOFF= system option is set to 1920. Which MDY function creates the date value for January 3, 2020? MDY(1,3,20) MDY(3,1,20) MDY(1,3,2020) MDY(3,1,2020)

Correct answer: c Because the YEARCUTOFF= system option is set to 1920, SAS sees the two-digit year value 20 as 1920. Four-digit year values are always read correctly.
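
A minimal sketch of the two-digit versus four-digit behavior. Note that the OPTIONS statement changes the setting for the rest of the SAS session.

```sas
options yearcutoff=1920;

data _null_;
   TwoDigit  = mdy(1, 3, 20);      /* falls in the 1920-2019 window */
   FourDigit = mdy(1, 3, 2020);    /* unaffected by YEARCUTOFF= */

   put TwoDigit= date9. FourDigit= date9.;
   /* log: TwoDigit=03JAN1920 FourDigit=03JAN2020 */
run;
```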

Frequency distributions work best with variables that contain continuous values. numeric values. categorical values. unique values.

Correct answer: c Both continuous values and unique values can result in lengthy, meaningless tables. Frequency distributions work best with categorical values.

By default, PROC FREQ creates a table of frequencies and percentages for which data set variables? character variables numeric variables both character and numeric variables none: variables must always be specified

Correct answer: c By default, PROC FREQ creates a table for all variables in a data set.

The default statistics produced by the MEANS procedure are n, mean, minimum, maximum, and... median range standard deviation standard error of the mean.

Correct answer: c By default, the MEANS procedure produces the n, mean, minimum, maximum, and standard deviation.

Column input specifies the variable's name, followed by a dollar ($) sign if the values are character values, and the beginning and ending column locations of the raw data values. Which is not an advantage of column input? It can be used to read character variables that contain embedded blanks. No placeholder is required for missing data. Standard as well as nonstandard data values can be read. Fields do not have to be separated by blanks or other delimiters.

Correct answer: c Column input is useful for reading standard values only.

Which statement is false regarding the use of DO loops? They can contain conditional clauses. They can generate multiple observations. They can be used to combine DATA and PROC steps. They can be used to read data.

Correct answer: c DO loops are DATA step statements and cannot be used in conjunction with PROC steps.

If you don't specify the LIBRARY= option on the PROC FORMAT statement, your formats are stored in Work.Formats, and they exist ... only for the current procedure. only for the current DATA step. only for the current SAS session. permanently.

Correct answer: c If you do not specify the LIBRARY= option, formats are stored in a default format catalog named Work.Formats. As the libref Work implies, any format that is stored in Work.Formats is a temporary format that exists only for the current SAS session.
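
A short sketch of a session-only format: it is created without LIBRARY=, so it lands in Work.Formats, and it is then applied with the trailing period that format names require. The data set work.colors is invented for illustration.

```sas
proc format;                         /* no LIBRARY=: stored in Work.Formats */
   value colorfmt 1='Red'
                  2='Green'
                  3='Blue';
run;

data work.colors;
   input Code;
   format Code colorfmt.;            /* note the period after the name */
   datalines;
1
3
;
run;

proc print data=work.colors;
run;
```

In a new SAS session the format would have to be created again before this data set could be printed with its labels.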

Consider the IF-THEN statement shown below. When the statement is executed, which expression is evaluated first? if finlexam>=95 and (research='A' or (project='A' and present='A')) then Grade='A+'; finlexam>=95 research='A' project='A' and present='A' research='A' or (project='A' and present='A')

Correct answer: c Logical comparisons that are ENCLOSED IN PARENTHESES are evaluated as true or false before they are compared to other expressions. In the example above, the AND comparison within the NESTED PARENTHESES IS EVALUATED BEFORE BEING COMPARED TO THE OR COMPARISON.

The DATA step executes: continuously if you use the POINT= option and the STOP statement. once for each variable in the output data set. once for each observation in the input data set. until it encounters an OUTPUT statement.

Correct answer: c The DATA step executes once for each observation in the input data set. You use the POINT= option with the STOP statement to prevent continuous looping.

Unless otherwise directed, the DATA step executes... once for each compilation phase. once for each DATA step statement. once for each record in the input file. once for each variable in the input file.

Correct answer: c The DATA step executes once for each record in the input file, unless otherwise directed.

When the code shown below is run, what will the file D:\Output\frame.html display? ods html body='d:\output\body.html' contents='d:\output\contents.html' frame='d:\output\frame.html'; The file D:\Output\contents.html. The file D:\Output\frame.html. The files D:\Output\contents.html and D:\Output\body.html. It displays no other files.

Correct answer: c The FRAME= option creates an HTML file that integrates the table of contents and the body file.

Due to growth within the 919 area code, the telephone exchange 555 is being reassigned to the 920 area code. The data set Clients.Piedmont includes the variable Phone, which contains telephone numbers in the form 919-555-1234. Which of the following programs will correctly change the values of Phone? data work.piedmont(drop=areacode exchange); set clients.piedmont; Areacode=substr(phone,1,3); Exchange=substr(phone,5,3); if areacode='919' and exchange='555' then scan(phone,1,3)='920'; run; data work.piedmont(drop=areacode exchange); set clients.piedmont; Areacode=substr(phone,1,3); Exchange=substr(phone,5,3); if areacode='919' and exchange='555' then phone=scan('920',1,3); run; data work.piedmont(drop=areacode exchange); set clients.piedmont; Areacode=substr(phone,1,3); Exchange=substr(phone,5,3); if areacode='919' and exchange='555' then substr(phone,1,3)='920'; run; data work.piedmont(drop=areacode exchange); set clients.piedmont; Areacode=substr(phone,1,3); Exchange=substr(phone,5,3); if areacode='919' and exchange='555' then phone=substr('920',1,3); run;

Correct answer: c The SUBSTR function replaces variable values if it is placed on the left side of an assignment statement. When placed on the right side, the function extracts a substring.
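
A minimal sketch of both uses of SUBSTR. The hard-coded phone number stands in for a value from Clients.Piedmont and is an assumption for illustration only:

```sas
data _null_;
   length phone $12;
   phone = '919-555-1234';            /* sample value, not real data */

   /* Right side: SUBSTR extracts characters */
   areacode = substr(phone, 1, 3);
   exchange = substr(phone, 5, 3);

   /* Left side: SUBSTR replaces characters in place */
   if areacode = '919' and exchange = '555' then
      substr(phone, 1, 3) = '920';

   put phone=;                        /* writes the updated value to the log */
run;
```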

How many records will be read for each execution of the DATA step? 1---+----10---+----20--- skirt black Cotton 036499 $44.98 Skirt Navy Linen 36899 $51.50 Dress Red Silk 037299 $76.98 data spring.sportswr; infile newitems; input #1 Item $ Color $ #3 @8 Price comma6. #2 Fabric $ #3 SKU $ 1-6; run; one two three four

Correct answer: c The first time the DATA step executes, the first three records are read, and an observation is written to the data set. During the second iteration, the next three records are read, and the second observation is written to the data set. During the third iteration, the last three records are read, and the final observation is written to the data set.

During each execution of the following DO loop, the value of Earned is calculated and is added to its previous value. How many times does this DO loop execute? data finance.earnings; Amount=1000; Rate=.075/12; do month=1 to 12; Earned+(amount+earned)*rate; end; run; 0 1 12 13

Correct answer: c The number of iterations is determined by the DO statement's stop value, which in this case is 12.

Based on the ARRAY statement below, select the array reference for the array element q50. array ques{3,25} q1-q75; ques{q50} ques{1,50} ques{2,25} ques{3,0}

Correct answer: c This two-dimensional array would consist of three rows of 25 elements. The first row would contain q1 through q25, the second row would start with q26 and end with q50, and the third row would start with q51 and end with q75.
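
One way to check the row and column of q50 is to mark it and search the array. This is only a sketch; the loop variables row and col are names chosen for illustration:

```sas
data _null_;
   array ques{3,25} q1-q75;
   q50 = 1;                           /* mark the element we are looking for */
   do row = 1 to 3;
      do col = 1 to 25;
         /* elements fill row by row: q1-q25, q26-q50, q51-q75 */
         if ques{row,col} = 1 then put 'q50 found at ' row= col=;
      end;
   end;
run;
```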

Choose the statement below that selects rows in which the amount is less than or equal to $5000 and either the account is 101-1092 or the rate equals 0.095. where amount <= 5000 and account='101-1092' or rate = 0.095; where (amount le 5000 and account='101-1092') or rate = 0.095; where amount <= 5000 and (account='101-1092' or rate eq 0.095); where amount <= 5000 or account='101-1092' and rate = 0.095;

Correct answer: c To ensure that the compound expression is evaluated correctly, you can use parentheses to group account='101-1092' or rate eq 0.095.

If SAS detects syntax errors, then... data set variables will contain missing values. the DATA step does not compile. the DATA step still compiles, but it does not execute. the DATA step still compiles and executes.

Correct answer: c When SAS detects syntax errors, the DATA step still compiles, but it does not execute.

When the code shown below is run, what will the file D:\Output\body.html contain? ods html body='d:\output\body.html'; proc print data=work.alpha; run; proc print data=work.beta; run; ods html close; The PROC PRINT output for Work.Alpha. The PROC PRINT output for Work.Beta. The PROC PRINT output for both Work.Alpha and Work.Beta. Nothing. No output will be written to D:\Output\body.html.

Correct answer: c When multiple procedures are run while HTML output is open, PROCEDURE OUTPUT IS APPENDED TO THE SAME BODY FILE.

Which of the following programs correctly reads the data set Orders and creates the data set FastOrdr? data catalog.fastordr(drop=ordrtime); set july.orders(keep=product units price); if ordrtime<4; Total=units*price; run; data catalog.orders(drop=ordrtime); set july.fastordr(keep=product units price); if ordrtime<4; Total=units*price; run; data catalog.fastordr(drop=ordrtime); set july.orders(keep=product units price ordrtime); if ordrtime<4; Total=units*price; run; none of the above

Correct answer: c You specify the data set to be created in the DATA statement. The DROP= data set option prevents variables from being written to the data set. Because you use the variable OrdrTime when processing your data, you cannot drop OrdrTime in the SET statement. If you use the KEEP= option in the SET statement, then you must list OrdrTime as one of the variables to be kept.

What happens if you submit the following program? proc sort data=clinic.diabetes; run; proc print data=clinic.diabetes; var age height weight pulse; where sex='F'; run; The PROC PRINT step runs successfully, printing observations in their sorted order. The PROC SORT step permanently sorts the input data set. The PROC SORT step generates errors and stops processing, but the PROC PRINT step runs successfully, printing observations in their original (unsorted) order. The PROC SORT step runs successfully, but the PROC PRINT step generates errors and stops processing.

Correct answer: c The BY statement is required in PROC SORT. Without it, the PROC SORT step fails. However, the PROC PRINT step prints the original data set as requested.

Which SAS program correctly creates a separate observation for each block of data? 1---+----10---+----20---+----30---+----40---+ 1001 apple 1002 banana 1003 cherry 1004 guava 1005 kiwi 1006 papaya 1007 pineapple 1008 raspberry 1009 strawberry data perm.produce; infile fruit; input Item $ Variety : $ 10.; run; data perm.produce; infile fruit; input Item $ Variety : $ 10. @; run; data perm.produce; infile fruit; input Item $ Variety : $ 10. @@; run; data perm.produce; infile fruit @@; input Item $ Variety : $ 10.; run;

Correct answer: c Each record in this file contains three repeating blocks of data values for Item and Variety. The INPUT statement reads a block of values for Item and Variety, and then holds the current record by using the double-trailing at sign (@@). The values in the program data vector are written to the data set as the first observation. In the next iteration, the INPUT statement reads the next block of values for item and variety from the same record.

You can position the input pointer on a specific record by using: column pointer controls. column specifications. line pointer controls. line hold specifiers.

Correct answer: c Information for one observation can be spread out over several records. You can write one INPUT statement that contains line pointer controls to specify the record(s) from which values are read.

What input style should be used to read free-formatted data? column formatted list mixed

Correct answer: c List input should be used to read data that is free-format because you do not need to specify the column locations of the data.

Which SAS statement repetitively executes while the value of Cholesterol is greater than 200? do cholesterol > 200; do cholesterol gt 200; do while (cholesterol > 200); do while cholesterol > 200;

Correct answer: c The DO WHILE statement checks for the condition that Cholesterol is greater than 200. The expression must be enclosed in parentheses. The expression is evaluated at the top of the loop before the loop executes. If the condition is true, the DO WHILE loop executes. If the expression is false the first time it is evaluated, the loop does not execute.

Which pointer control is used to read multiple records sequentially? @n + n / all of the above

Correct answer: c The forward slash (/) line pointer control is used to read multiple records sequentially. Each time a / pointer is encountered, the input pointer advances to the next line. @n and + n are column pointer controls.

Suppose your input data file contains the date expression 13APR2009. The YEARCUTOFF = system option is set to 1910. SAS will read the date as: 13APR1909 13APR1920 13APR2009 13APR2020

Correct answer: c The value of the YEARCUTOFF = system option does not affect four-digit year values. Four-digit values are always read correctly.

Which program does not read the values in the first record as a variable named Item and the values in the second record as two variables named Inventory and Type? 1---+----10---+----20--- Colored Pencils 12 Boxes Watercolor Paint 8 Palettes Drawing Paper 15 Pads data perm.supplies; infile instock pad; input Item & $ 16. / Inventory 2. Type $ 8.; run; data perm.supplies; infile instock pad; input Item & $ 16. / Inventory 2. Type $ 8.; run; data perm.supplies; infile instock pad; input #1 Item & $ 16. Inventory 2. Type $ 8.; run; data perm.supplies; infile instock pad; input Item & $ 16. #2 Inventory 2. Type $ 8.; run;

Correct answer: c The values for Item in the first record are read, then the following / or #n line pointer control advances the input pointer to the second record, to read the values for Inventory and Type.

Suppose the YEARCUTOFF= system option is set to 1920. An input file contains the date expression 12/08/1925, which is being read with the MMDDYY8. informat. Which date will appear in your data? 08DEC1920 08DEC1925 08DEC2019 08DEC2025

Correct answer: c The w value of the informat MMDDYY8. is too small to read the entire value, so the last two digits of the year are truncated. The last two digits thus become 19 instead of 25. Because the YEARCUTOFF= system option is set to 1920, SAS interprets this year as 2019. To avoid such errors, be sure to specify an informat that is wide enough for your date expressions.
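
The truncation can be sketched with the INPUT function. The variable names too_narrow and wide_enough are made up for this example:

```sas
options yearcutoff=1920;

data _null_;
   /* MMDDYY8. sees only the first 8 characters: 12/08/19 */
   too_narrow  = input('12/08/1925', mmddyy8.);
   /* MMDDYY10. is wide enough for the four-digit year */
   wide_enough = input('12/08/1925', mmddyy10.);
   put too_narrow= date9. wide_enough= date9.;
run;
```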

Which is true for the following statements (X indicates a header record)? if code='X' then do; if _n_ > 1 then output; Total = 0; input Name $ 3-20; end; _N_ equals the number of times the DATA step has begun to execute. When code='X' and _n_ > 1 are true, an OUTPUT statement is executed. Each header record causes an observation to be written to the data set. a and b

Correct answer: d _N_ is an automatic variable whose value is the number of times the DATA step has begun to execute. The expression _n_ > 1 defines a condition where the DATA step has executed more than once. When the conditions code='X' and _n_ > 1 are true, an OUTPUT statement is executed, and Total is initialized to zero. Thus, each header record except the first one causes an observation to be written to the data set.

Which keyword, when added to the PROC FORMAT statement, will display all the formats in your catalog? CATALOG LISTFMT FMTCAT FMTLIB

Correct answer: d Adding the keyword FMTLIB to the PROC FORMAT statement displays a list of all the formats in your catalog, along with descriptions of their values.

At the start of DATA step processing, during the compilation phase, variables are created in the program data vector (PDV), and observations are set to: blank missing 0 there are no observations.

Correct answer: d At the bottom of the DATA step, the compilation phase is complete, and the descriptor portion of the new SAS data set is created. There are no observations because the DATA step has not yet executed.

In the data set Work.Invest, what would be the stored value for Year? data work.invest; do year=1990 to 2004; Capital+5000; capital+(capital*.10); end; run; missing 1990 2004 2005

Correct answer: d At the end of the fifteenth iteration of the DO loop, the value for Year is incremented to 2005. Because this value exceeds the stop value, the DO loop ends. At the bottom of the DATA step, the current values are written to the data set.

Assuming that the data set Company.USA has five or more observations, what is the result of submitting the following program? data work.getobs5; obsnum=5; set company.usa(keep=manager payroll) point=obsnum; output; stop; run; an error an empty data set a continuous loop a data set that contains one observation

Correct answer: d By combining the POINT= option with the OUTPUT and STOP statements, your program can output a single observation.

Which statement below is false regarding the use of arrays to create variables? The variables are added to the program data vector during the compilation of the DATA step. You do not need to specify the array elements in the ARRAY statement. By default, all character variables are assigned a length of eight. Only character variables can be created.

Correct answer: d Either numeric or character variables can be created by an ARRAY statement.

Formatted input can be used to read: standard free-format data standard data in fixed fields nonstandard data in fixed fields both standard and nonstandard data in fixed fields

Correct answer: d Formatted input can be used to read both standard and nonstandard data in fixed fields.

Filerefs remain in effect until . . . you change them. you cancel them. you end your SAS session. all of the above

Correct answer: d Like LIBNAME statements, FILENAME statements are global; they remain in effect until you change them, cancel them, or end your SAS session.

Which keyword can be used to label missing numeric values as well as any values that are not specified in a range? LOW MISS MISSING OTHER

Correct answer: d MISS and MISSING are invalid keywords, and LOW does not include missing numeric values. The keyword OTHER can be used in the VALUE statement to label missing values as well as any values that are not specifically included in a range.

The data sets Ensemble.Spring and Ensemble.Summer both contain a variable named Blue. How do you prevent the values of the variable Blue from being overwritten when you merge the two data sets? data ensemble.merged; merge ensemble.spring(in=blue) ensemble.summer; by fabric; run; data ensemble.merged; merge ensemble.spring(out=blue) ensemble.summer; by fabric; run; data ensemble.merged; merge ensemble.spring(blue=navy) ensemble.summer; by fabric; run; data ensemble.merged; merge ensemble.spring(rename=(blue=navy)) ensemble.summer; by fabric; run;

Correct answer: d Match-merging overwrites same-named variables in the first data set with same-named variables in subsequent data sets. TO PREVENT OVERWRITING, RENAME VARIABLES BY USING THE RENAME= DATA SET OPTION IN THE MERGE STATEMENT.

The COMMAw.d informat can be used to read which of the following values? 12,805 $177.95 18% all of the above

Correct answer: d The COMMAw.d informat strips out special characters, such as commas, dollar signs, and percent signs, from numeric data and stores only numeric values in a SAS data set.

Now consider the revised program below. What is the value of Count after the third observation is read? 1. 10 2. 20 3. 4. 40 5. 50 data work.newnums; infile numbers; input Tens 2-3; retain Count 100; count+tens; run; missing 0 100 130

Correct answer: d The RETAIN statement assigns an initial value of 100 to the variable Count, so the value of Count in the third observation would be 100+10+20+0, or 130.

The variable IDCode contains values such as 123FA and 321MB. The fourth character identifies sex. How do you assign these character codes to a new variable named Sex? Sex=scan(idcode,4); Sex=scan(idcode,4,1); Sex=substr(idcode,4); Sex=substr(idcode,4,1);

Correct answer: d The SUBSTR function is best used when you know the exact position of the substring to extract from the character value. You specify the position to start from and the number of characters to extract.

Which of the following can determine the length of a new variable? the length of the variable's first reference in the DATA step the assignment statement the LENGTH statement all of the above

Correct answer: d The length of a variable is determined by its first reference in the DATA step. When creating a new character variable, SAS allocates as many bytes of storage space as there are characters in the reference to that variable. The first reference to a new variable can also be made with a LENGTH statement or an assignment statement.

When creating a format with the VALUE statement, the new format's name cannot end with a number, cannot end with a period, cannot be the name of a SAS format, and... cannot be the name of a data set variable. must be at least two characters long. must be at least eight characters long. must begin with a dollar sign ($) if used with a character variable.

Correct answer: d The name of a format that is created with a VALUE statement must begin with a dollar sign ($) if it applies to a character variable.

How many observations will the data set Work.Earn contain? data work.earn; Value=2000; do year=1 to 20; Interest=value*.075; value+interest; output; end; run; 0 1 19 20

Correct answer: d The number of observations is based on the number of times the OUTPUT statement executes. The new data set has 20 observations, one for each iteration of the DO loop.

At the beginning of the execution phase, the value of _N_ is 1, the value of _ERROR_ is 0, and the values of the remaining variables are set to: 0 1 undefined missing

Correct answer: d The remaining variables are initialized to missing. Missing numeric values are represented by periods, and missing character values are represented by blanks.

Consider the small raw data file and program shown below. What is the value of Count after the fourth record is processed? 1. 10 2. 20 3. 4. 40 5. 50 data work.newnums; infile numbers; input Tens 2-3; Count+tens; run; missing 0 30 70

Correct answer: d The sum statement adds the result of the expression that is on the right side of the plus sign to the numeric variable that is on the left side. The new value is then retained for subsequent observations. The sum statement ignores the missing value, so the value of Count in the fourth observation would be 10+20+0+40, or 70.
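
A small sketch of the same idea with instream data. It uses list input and a period for the missing third value, which is an assumption that differs from the column input in the card:

```sas
data work.newnums;
   input Tens;
   Count + tens;        /* sum statement: Count starts at 0, is retained, and skips missing values */
   datalines;
10
20
.
40
50
;
run;

proc print data=work.newnums;
run;
```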

What belongs within the braces of this ARRAY statement? array contrib{?} qtr1-qtr4; quarter quarter* 1-4 4

Correct answer: d The value in braces indicates the number of elements in the array. In this case, there are four elements.

Which statement will limit a PROC MEANS analysis to the variables Boarded, Transfer, and Deplane? by boarded transfer deplane; class boarded transfer deplane; output boarded transfer deplane; var boarded transfer deplane;

Correct answer: d To specify the variables that PROC MEANS analyzes, add a VAR statement and list the variable names.

Which function calculates the average of the variables Var1, Var2, Var3, and Var4? mean(var1,var4) mean(var1-var4) mean(of var1,var4) mean(of var1-var4)

Correct answer: d Use a variable list to specify a range of variables as the function argument. When specifying a variable list, be sure to precede the list with the word OF. If you omit the word OF, the function argument might not be interpreted as expected.
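
A quick sketch of why OF matters. The values 10 through 40 are made up for illustration:

```sas
data _null_;
   var1 = 10; var2 = 20; var3 = 30; var4 = 40;
   with_of    = mean(of var1-var4);   /* variable list: averages var1, var2, var3, var4 */
   without_of = mean(var1-var4);      /* subtraction: averages the single value var1 minus var4 */
   put with_of= without_of=;
run;
```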

Within the data set Furnitur.Bookcase, the variable Finish contains values such as ash/cherry/teak/matte-black. Which of the following creates a subset of the data in which the values of Finish contain the string walnut? Make the search for the string case-insensitive. data work.bookcase; set furnitur.bookcase; if index(finish,walnut) = 0; run; data work.bookcase; set furnitur.bookcase; if index(finish,'walnut') > 0; run; data work.bookcase; set furnitur.bookcase; if index(lowcase(finish),walnut) = 0; run; data work.bookcase; set furnitur.bookcase; if index(lowcase(finish),'walnut') > 0; run;

Correct answer: d Use the INDEX function in a subsetting IF statement, enclosing the character string in quotation marks. Only those observations in which the function locates the string and returns a value greater than 0 are written to the data set.

Which of the following statements is false about BY-group processing? When you use the BY statement with the SET statement: The data sets listed in the SET statement must be indexed or sorted by the values of the BY variable(s). The DATA step automatically creates two variables, FIRST. and LAST., for each variable in the BY statement. FIRST. and LAST. identify the first and last observation in each BY group, respectively. FIRST. and LAST. are stored in the data set.

Correct answer: d When you use the BY statement with the SET statement, the DATA step creates the TEMPORARY variables FIRST. and LAST. THEY ARE NOT STORED IN THE DATA SET.

Using ODS statements, how many types of output can you generate at once? 1 (only HTML output) 2 3 as many as you want

Correct answer: d You can generate any number of output types as long as you open the ODS destination for each type of output you want to create.

Which of these is false? Ranges in the VALUE statement can specify... a single value, such as 24 or 'S'. a range of numeric values, such as 0-1500. a range of character values, such as 'A'-'M'. a list of numeric and character values separated by commas, such as 90,'B',180,'D',270.

Correct answer: d You can list values separated by commas, BUT THE LIST MUST CONTAIN EITHER ALL NUMERIC VALUES OR ALL CHARACTER VALUES. Data set variables are either numeric or character.

A typical value for the numeric variable SiteNum is 12.3. Which statement correctly converts the values of SiteNum to character values when creating the variable Location? Location=dept||'/'||input(sitenum,3.1); Location=dept||'/'||input(sitenum,4.1); Location=dept||'/'||put(sitenum,3.1); Location=dept||'/'||put(sitenum,4.1);

Correct answer: d You explicitly convert numeric values to character values by using the PUT function. Be sure to select a format that is wide enough to write the values: 12.3 takes four characters including the decimal point, so use 4.1.
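
A minimal sketch of the conversion. The Dept value 'ENG' and the length of Location are assumptions for illustration:

```sas
data _null_;
   dept    = 'ENG';                   /* hypothetical department code */
   sitenum = 12.3;
   length Location $12;
   /* PUT writes the numeric value as text using the 4.1 format */
   Location = dept || '/' || put(sitenum, 4.1);
   put Location=;
run;
```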

What is the purpose of the PATH= option? ods html path='d:\output' (url=none) body='body.html' contents='contents.html' frame='frame.html'; It creates absolute link addresses for loading HTML files from a server. It creates relative link addresses for loading HTML files from a server. It allows HTML files to be loaded from a local drive. It specifies the location of HTML file output.

Correct answer: d You use the PATH= option to specify the location for HTML files to be stored. When you use the PATH= option, you don't need to specify the full path name for the body, contents, or frame files.

Which of the following actions occurs at the beginning of an iteration of the DATA step? The automatic variables _N_ and _ERROR_ are incremented by one. The DATA step stops execution. The descriptor portion of the data set is written. The values of variables created in programming statements are re-set to missing in the program data vector.

Correct answer: d By default, at the end of the DATA step, the values in the program data vector are written to the data set as an observation, control returns to the top of the DATA step, the value of the automatic variable _N_ is incremented by one, and the values of variables created in programming statements are reset to missing. The automatic variable _ERROR_ is reset to 0 if necessary.

Which program correctly reads instream data? data finance.newloan; input datalines; if country='JAPAN'; MonthAvg=amount/12; 1998 US CARS 194324.12 1998 US TRUCKS 142290.30 1998 CANADA CARS 10483.44 1998 CANADA TRUCKS 93543.64 1998 MEXICO CARS 22500.57 1998 MEXICO TRUCKS 10098.88 1998 JAPAN CARS 15066.43 1998 JAPAN TRUCKS 40700.34 ; data finance.newloan; input Year 1-4 Country $ 6-11 Vehicle $ 13-18 Amount 20-28; if country='JAPAN'; MonthAvg=amount/12; datalines; run; data finance.newloan; input Year 1-4 Country 6-11 Vehicle 13-18 Amount 20-28; if country='JAPAN'; MonthAvg=amount/12; datalines; 1998 US CARS 194324.12 1998 US TRUCKS 142290.30 1998 CANADA CARS 10483.44 1998 CANADA TRUCKS 93543.64 1998 MEXICO CARS 22500.57 1998 MEXICO TRUCKS 10098.88 1998 JAPAN CARS 15066.43 1998 JAPAN TRUCKS 40700.34 ; data finance.newloan; input Year 1-4 Country $ 6-11 Vehicle $ 13-18 Amount 20-28; if country='JAPAN'; MonthAvg=amount/12; datalines; 1998 US CARS 194324.12 1998 US TRUCKS 142290.30 1998 CANADA CARS 10483.44 1998 CANADA TRUCKS 93543.64 1998 MEXICO CARS 22500.57 1998 MEXICO TRUCKS 10098.88 1998 JAPAN CARS 15066.43 1998 JAPAN TRUCKS 40700.34 ;

Correct answer: d To read instream data, you specify a DATALINES statement and data lines, followed by a null statement (single semicolon) to indicate the end of the input data. Program a contains no DATALINES statement, and the INPUT statement doesn't specify the fields to read. Program b contains no data lines, and the INPUT statement in program c doesn't specify the necessary dollar signs for the character variables Country and Vehicle.

A great advantage of storing dates and times as SAS numeric date and time values is that: they can easily be edited. they can easily be read and understood. they can be used in text strings like other character values. they can be used in calculations like other numeric values.

Correct answer: d In addition to tracking time intervals, SAS date and time values can be used in calculations like other numeric values. This lets you calculate values that involve dates much more easily than in other programming languages.

When you write a DATA step to create one observation per detail record you need to: distinguish between header and detail records. keep header data as a part of each observation until the next header record is encountered. hold the current record so other values in the record can be read. all of the above

Correct answer: d In order to create one observation per detail record, it is necessary to distinguish between header and detail records. Use a RETAIN statement to keep header data as part of each observation until the next header record is encountered. You also need to use the @ line-hold specifier to hold the current record so other values in the record can be read.

Shown below are date and time expressions and corresponding SAS datetime informats. Which date and time expression cannot be read by the informat that is shown beside it? 30May2000:10:03:17.2 DATETIME20. 30May00 10:03:17.2 DATETIME18. 30May2000/10:03 DATETIME15. 30May2000/1003 DATETIME14.

Correct answer: d In the time value of a date and time expression, you must use delimiters to separate the values for hour, minutes, and seconds.

An input data file has date expressions in the form 10222001. Which SAS informat should you use to read these dates? DATE6. DATE8. MMDDYY6. MMDDYY8.

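Correct answer: d The SAS informat MMDDYYw. reads dates in month-day-year order. The value 10222001 is eight characters wide, so the width must be 8: MMDDYY8.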

Which is true for the double trailing at sign (@@)? It enables the next INPUT statement to read from the current record across multiple iterations of the DATA step. It must be the last item specified in the INPUT statement. It is released when the input pointer moves past the end of the record. all of the above

Correct answer: d The double trailing at sign (@@) enables the next INPUT statement to read from the current record across multiple iterations of the DATA step. It must be the last item specified in the INPUT statement. A record that is being held by the double trailing at sign (@@) is not released until the input pointer moves past the end of the record, or until an INPUT statement that has no line-hold specifier executes.

Which INPUT statement correctly reads the values for ID in the fourth record, and then returns to the first record to read the values for Fname and Lname? 1---+----10---+----20--- George Chesson 3801 Woodside Court Garner NC 27529 XM065 Floyd James Coldwell 123-A Tarbert Apex NC 27529 XM065 Lawson input #4 ID $ 5. #1 Fname $ Lname $; input #4 ID $ 1-5 #1 Fname $ Lname $; input #4 ID $ #1 Fname $ Lname $; all of the above

Correct answer: d The first #n line pointer control enables you to read the values for ID from the fourth record. The second #n line pointer control moves back to the first record and reads the values for Fname and Lname. You can use formatted input, column input, or list input to read the values for ID. Formatted: input #4 ID $5. Column: input #4 ID $ 1-5 List: input #4 ID $ (Modified list input would look like: input Item : $10.)

Which SAS program reads the values for ID and holds the record for each value of Quantity, so that three observations are created for each record? 1---+----10---+----20---+----30 2101 21,208 19,047 22,890 2102 18,775 20,214 22,654 2103 19,763 22,927 21,862 data work.sales; infile unitsold; input ID $; do week = 1 to 3; input Quantity : comma.; output; end; run; data work.sales; infile unitsold; input ID $ @@; do week = 1 to 3; input Quantity : comma.; output; end; run; data work.sales; infile unitsold; input ID $ @; do week = 1 to 3; input Quantity : comma.; output; end; run; data work.sales; infile unitsold; input ID $ @; do week = 1 to 3; input Quantity : comma. @; output; end; run;

Correct answer: d This raw data file contains an ID field followed by repeating fields. The first INPUT statement reads the values for ID and uses the @ line-hold specifier to hold the current record for the next INPUT statement in the DATA step. The second INPUT statement reads the values for Quantity. When all of the repeating fields have been read, control returns to the top of the DATA step, and the record is released.
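
The same program can be tried with instream data. Swapping the INFILE statement for DATALINES is an assumption made so the sketch is self-contained:

```sas
data work.sales;
   input ID $ @;                      /* read ID and hold the record */
   do week = 1 to 3;
      input Quantity : comma. @;      /* read the next Quantity, keep holding the record */
      output;                         /* one observation per Quantity */
   end;
   datalines;
2101 21,208 19,047 22,890
2102 18,775 20,214 22,654
2103 19,763 22,927 21,862
;
run;

proc print data=work.sales;
run;
```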

What type of input should be used to read values that contain embedded blanks and nonstandard numeric values? Column Formatted List Modified List

Correct answer: d Modified list input can be used to read values that contain embedded blanks and nonstandard values.

What is the result of submitting the following program? data work.addtoend; set clinic.stress2 end=last; if last; run; an error an empty data set a continuous loop a data set that contains one observation

Correct answer: d This program uses the END= option to name a temporary variable that contains an end-of-file marker. That variable, last, is set to 1 when the SET statement reads the last observation of the data set, so the subsetting IF keeps only that observation.

How many characters can be used in a label? 40 96 200 256

Correct answer: d When specifying a label, enclose it in quotation marks and limit the label to 256 characters.

How to get Data into SAS

Data Sources: 1. Data entry using SAS Table Editor 2. Include raw data within SAS Program ("datalines") 3. Read in raw data from an external file ("infile") 4. Import data from another software product (e.g. Excel) 5. Read in a pre-existing SAS dataset (permanent or created within the program)

Two Parts of a SAS Program

DATA step and PROC step

To read raw data that is included in the program, you use the ____ keyword:

Datalines; EX:
data work.sample;
   input firstname $ gender $ age;
   datalines;
John Male 22
Jane Female 19
;
run;

Log Window:

Displays status messages created when SAS executes (runs a program)

You use the _____ and ____ statements to change variable attributes permanently in the _____ step.

Format, Label, DATA step. Because the change is permanent, it is stored in the descriptor portion of the dataset. ex: DATA work.bonus; set data1.empdata; Bonus = Salary * .1; Label Bonus = 'Annual Bonus'; Format Bonus Dollar12.2; Run;

Descriptor Portion View:

General data Set Information *data set name * data set label *date/time created *storage information *number of observations Information for each variable *name *Type *Length *Position *format *Informat *label

Context-sensitive Help

Highlight a keyword and press F1

Replace the Obs (observation number) column with a variable by using the _____ statement.

ID statement: proc print data=data1.empdata; id JobCode; var EmpID Salary; run;

Input Data from a separate text file, use the ___ statement:

Infile; Ex: data work.sample; infile 'D:\UCSB\sample.txt'; input name $ gender $ age; run;

Assigning Column labels:

LABEL Units='Number of Units' DOB='Date of Birth';

You use the _____ and ____ to change variables temporarily in the _____ step.

Label, Format, proc step ex: PROC PRINT data=work.bonus label; Label Bonus = 'Incentive Bonus'; Format Bonus Dollar12.; Run;

SAS was developed at:

NC State University

Data Portion View:

Observations for each variable:
Name  Age  Height  Weight
John  19   69      180
Mary  22   63      130
John  21   67      165

Options:

code ex: Options nodate nonumber ls=72;

What is ODS?

Output Delivery System: ODS is designed to overcome the limitations of traditional SAS output. It provides a method of delivering output in a variety of formats, and makes the formatted output easy to access.

Creating User defined Formats: General Form:

PROC FORMAT; VALUE format-name range1='label' range2='label' ...; RUN;
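
As a minimal sketch of the general form above, the following defines a format with three ranges and applies it in a list report. The format name salfmt and the salary cut-offs are hypothetical, chosen only for illustration; data1.empdata, EmpID, and Salary are the names used elsewhere in this set.

```sas
/* Hypothetical format name and cut-offs, for illustration only */
proc format;
   value salfmt
      low -< 30000  = 'Low'      /* below 30000 */
      30000 - 60000 = 'Medium'   /* 30000 through 60000 inclusive */
      60000 <- high = 'High';    /* above 60000 */
run;

/* Apply the user-defined format temporarily in a PROC step */
proc print data=data1.empdata;
   var EmpID Salary;
   format Salary salfmt.;
run;
```

Note the period after the format name when it is used in the FORMAT statement, and that a format name cannot end in a digit.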

Import an Excel File code:

PROC IMPORT DATAFILE='X:\PStat 130\data1\DallasLA.xls' OUT=WORK.tdfwlax DBMS=XLS REPLACE; SHEET='DFWLAX'; GETNAMES=YES; RUN;

How to display the descriptor portion of the dataset?

Proc Contents: proc contents data=data1.empdata; run;

Observation is in the:

Row

SAS stands for:

Statistical Analysis System

Numeric Variable:

Stored as floating point numbers in 8 bytes of storage by default. Eight bytes of floating point storage provide space for 16 to 17 significant digits. You are not restricted to 8 digits.

Request Column totals by ___ statement:

Sum Statement; proc print data=data1.empdata noobs; var JobCode EmpID Salary; sum Salary; run;

What two statements are used to create a raw data file?

The keyword _NULL_ in the DATA statement enables you to use the power of the DATA step without actually creating a SAS data set. You use FILE and PUT statements to write observations from a SAS data set to a raw data file. The FILE statement specifies the raw data file and the PUT statement describes the lines to write to the raw data file. The filename and location specified in the FILE statement must be enclosed in quotation marks.
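
A minimal sketch of this pattern, assuming the data1.empdata data set used elsewhere in this set; the output file name is hypothetical.

```sas
/* _NULL_ runs the DATA step without creating a SAS data set */
data _null_;
   set data1.empdata;
   file 'empdata.txt';        /* hypothetical raw data file to write */
   put EmpID JobCode Salary;  /* one line per observation, blank-delimited */
run;
```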

What happens when you merge two datasets using a BY statement?

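Observations are combined where the BY variable values match between the two datasets. Observations without a match are still included, with missing values for the variables from the other dataset. For example:
Data1: ID name / 1 Ash / 2 Nic / 3 sam
Data2: ID Sex / 1 F / 2 F
Data3 (merged by ID): ID name Sex / 1 Ash F / 2 Nic F / 3 sam (Sex is missing)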
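
The example can be sketched as a runnable program. The WORK data set names are assumptions; both inputs are already in ID order, which a match-merge requires.

```sas
data work.data1;
   input ID name $;
   datalines;
1 Ash
2 Nic
3 sam
;
run;

data work.data2;
   input ID Sex $;
   datalines;
1 F
2 F
;
run;

/* Match-merge: ID 3 has no match in data2, so Sex is missing (blank) */
data work.data3;
   merge work.data1 work.data2;
   by ID;
run;

proc print data=work.data3 noobs;
run;
```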

What happens when you merge two data sets that have variables with the same name?

For same-named variables (other than the BY variables), values read from the second data set overwrite those read from the first data set.

Drop Statement:

Use in the DATA step when you want to keep most variables and drop only a few. data work.tempdata; set data1.empdata; drop EmpID Hire; run;

Editor window:

Used to edit, execute and save SAS programs

Operands:

Variable names and constants

By default PROC PRINT displays all observations. To print a subset of the data, use the _____ statement in PROC PRINT (ex: with logical operators: |, &, ^=, ...):

Where statement: proc print data=data1.stresstest noobs; where MaxHR>=170; run; Other WHERE examples: where JobCode='FLTAT' and Salary>50000; where JobCode='FLTAT' & Salary>50000; where JobCode='PILOT' or JobCode='FLTAT'; where JobCode='PILOT' | JobCode='FLTAT'; where JobCode not in('PILOT','FLTAT'); where JobCode ^ in('PILOT','FLTAT');

What option is used in the ODS HTML statement to change the appearance of the HTML output?

You can change the appearance of HTML output by using the STYLE= option in the ODS HTML statement. The style name doesn't need quotation marks. Ex: style=brown
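
A minimal sketch of the option in use, assuming the data1.admit data set from elsewhere in this set; the HTML file name is hypothetical.

```sas
/* Open the HTML destination with a named style (no quotation marks) */
ods html file='admit.html' style=brown;   /* hypothetical file name */

proc print data=data1.admit;
run;

/* Close the destination so the file is written */
ods html close;
```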

What will produce the same as PROC MEANS?

You can use either PROC MEANS or PROC SUMMARY to create the table. Adding a PRINT option to the PROC SUMMARY statement produces the same report as if you used PROC MEANS.
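
As a sketch, the two steps below should produce the same report. The choice of data1.empdata and Salary is only for illustration.

```sas
/* PROC MEANS prints its report by default */
proc means data=data1.empdata;
   var Salary;
run;

/* PROC SUMMARY prints nothing by default; PRINT requests the report */
proc summary data=data1.empdata print;
   var Salary;
run;
```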

What will this produce? data actors.props3; set actors.props1 actors.props2; by actor; run;

An interleaved data set, ordered by Actor:
actors.props1: Actor Prop / Curly Anvil / Larry Ladder / Moe Poker
actors.props2: Actor Prop / Curly Ladder / Moe Pliers
actors.props3: Actor Prop / Curly Anvil / Curly Ladder / Larry Ladder / Moe Poker / Moe Pliers

Explorer window:

allows you to navigate to libraries, datasets and other SAS objects

To create Subgroups, use the ___ statement.

by statement; proc print data=data1.admit; BY Sex; run;

Data Portion:

Contains your actual data values. To display the data, use PROC PRINT.

Informat:

controls the way SAS reads in the data

Format:

controls the way SAS displays (outputs) data values

Using the set statement, how to create a temporary dataset:

data work.dfwlax; set ia.dfwlax; run;

Variable Statement Ex:

data work.students;
input firstname $ gender $ age;
datalines;
John Male 19
Wendy Female 22
;
run;

Datalines Code:

General form: datalines; < data > ;
Ex:
DATA AgeInfo;
input age;
datalines;
18
19
20
21
22
;
run;
PROC print;
run;

Formats:

dollar10.2, comma10.2, MMDDYY10. = 10/16/2016, DATE9. = 16OCT2001, WORDDATE. = December 31, 1959

Column Input:

each data value is in a fixed location
data work.students;
input Name $ 1-6 Gender $ 9-14 Age 18-20;
datalines;
David   Male     19
Amelia  Female   23
Ravi    Male     17
Ashley  Female   20
Jim     Male     26
;
run;

List Input:

each data value is separated by a space John Male 22 Wendy Female 19

Use a _____ to convert a date to a SAS date value (number of days since 01/01/1960); the text portion has to be written in the form _____ or _____.

SAS date constant (quoted date followed by the letter d), ddmmmyyyy, ddmmmyy ex: evaldate = '5APR2016'd; (here evaldate is just a variable name)
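
A short sketch that shows the stored value is a number of days, written to the log both unformatted and with the DATE9. format. The variable name evaldate comes from the card; nothing else is assumed.

```sas
data _null_;
   evaldate = '5APR2016'd;   /* date constant: quoted ddmmmyyyy followed by d */
   put evaldate=;            /* unformatted: days since 01JAN1960 */
   put evaldate= date9.;     /* formatted for display */
run;
```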

libname command Ex:

libname data1 "X:\PSTAT 130\airlines one";

Using the set statement, how to create a permanent dataset:

libname mydrive 'D:\'; data mydrive.dfwlax; set ia.dfwlax; run;

Descriptor Portion:

Metadata on the dataset: file name, when it was created, whether and when it was sorted, labels, variables, lengths, etc. To display it, use PROC CONTENTS.

Suppress the Obs Columns:

noobs in proc print: PROC PRINT DATA=SAS-data-set NOOBS; RUN;

Width Options:

proc print data=data1.empdata width=uniform; run;

Only Selected variables, For the data set empdata (found in the library data1), print only the salary and last names of the employees (in that order). Code:

proc print data=data1.empdata; var Salary LastName; run;

format code ex:

proc print data=ia.empdata label; label Lastname='Last Name' Firstname='First Name' Salary='Annual Salary'; format Salary dollar11.2; title1 'Salary Report'; run;

Assigning Column labels #2:

proc print data=ia.empdata label; label Lastname='Last Name' Firstname='First Name' Salary='Annual Salary'; title1 'Salary Report'; run;

title ex:

proc print data=work.march; title1 'The First Line'; title2 'The Second Line'; run; proc print data=work.april; title2 'The Next Line'; run; proc print data=work.may; title 'The Top Line'; run; proc print data=work.june; title3 'The Third Line'; run; proc print data=work.july; title; run;

Print the data set grouped by ActLevel with a subtotal for the Fee column for each ActLevel:

proc sort data=data1.admit out=work.admit; by ActLevel; run; proc print data=work.admit; by ActLevel; sum Fee; run;

Specify JobCode in the BY and ID statements to change the report format. EX:

proc sort data=data1.empdata out=work.empdata; by JobCode; run; proc print data=work.empdata; by JobCode; id JobCode; sum Salary; run;

Proc Sort code Ex:

proc sort data=data1.empdata; by Salary; run; proc sort data=data1.empdata out=work.jobsal; by Salary; run;

Each statement starts with a keyword

proc, var, options, input, ... and always ends with a semicolon (;)

Results window:

provides bookmarks to each section of SAS output

What is required for one-to-one matching?

requires multiple SET statements. Where same-named variables occur, values that are read from the second data set replace those read from the first data set. Also, the number of observations in the new data set is the number of observations in the smallest original data set.
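
A minimal sketch of this technique. The data set names work.first, work.second, and work.combined are hypothetical.

```sas
/* One-to-one reading: each iteration reads one observation from each SET */
data work.combined;
   set work.first;    /* hypothetical input data sets */
   set work.second;   /* same-named variables take values from this one */
run;
```

The step stops as soon as either SET statement runs out of observations, which is why the result has as many observations as the smaller input.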

Variable Statement:

The INPUT statement tells SAS the name and type of each variable. Use it in the DATA step.

Title and Footnote Options Code:

title1 color=red height=4 bold justify=left 'The First Line';

Formatting Data Values ex:

to format salary to have commas and dollar signs, date format,...

Keep Statement:

Use in the DATA step when you want to keep only a few variables. data work.tempdata; set data1.empdata; keep Firstname Lastname; run;

Formatted Input:

uses SAS informats to read the data
data students;
input Name $ Gender $ Age Enroll mmddyy8.;
datalines;
David Male 19 06/18/10
Amelia Female 23 08/02/10
Ravi Male 17 07/22/10
Ashley Female . 09/14/10
Jim Male 26 08/26/10
;
run;

Opening and saving files:

File types: .sas (a SAS program, i.e., executable statements); .log (contents of the Log window); .lst (contents of the Output window)

