SAS Programming
what are SAS Variable naming conventions?
(1) 1-32 characters long (2) Must start with a letter or underscore; subsequent characters can be letters, numbers, or underscores (3) Can be upper, lower, or mixed case (4) Not case sensitive
explain the naming conventions for a libref name:
(1) 1-8 characters (2) must start with letter or underscore. Subsequent characters can be letters, underscores, or numerals (3) can be uppercase, lowercase, or mixed (not case sensitive)
list the formatting recommendations for SAS program:
(1) Begin each statement in a new line (2) use white space to separate words and steps (3) Indent Statements within a step (4) Indent continued lines in a multi-line statement.
define the SAS program Structure:
(1) Statements can begin and end in any column (2) Single statement can span multiple lines (3) Several statements can appear on the same line (4) Unquoted values can be lowercase, uppercase, or mixed case "Unconventional formatting"
what are missing values?
(1) Blanks are missing character values (2) Periods are missing numeric values
what is a SAS character variable?
(1) contains any value such as letters and numbers, special characters, blanks (2) 1-32,767 characters length (3) 1 byte per character
What is the SAS Programming Process
(1) define the business need (2) Write SAS Program (3) Run the Program (4) Review the Results (5) debug between step 2 and 4
What is the Sort procedure?
(1) replaces the original data set or creates a new one (2) can sort on multiple variables (3) sorts in ascending (default) or descending order (4) does not generate printed output. The input data set is overwritten unless the OUT= option is used to specify an output data set.
What is a SAS numeric variable?
(1) stores numeric values in floating-point (binary) representation (2) has 8 bytes of default storage (3) can store up to 16-17 significant digits.
If you submit a program containing unbalanced quotation marks in SAS Studio, you can simply correct the error and resubmit the program. 1. True 2. False
1. True Also, although SAS allows either single or double quotation marks, you can't mix the types. If you begin with a single quotation mark, you must end with a single quotation mark; otherwise, SAS considers the quotation marks unbalanced.
what is an appropriate format name?
A format name -can be up to 32 characters in length -for character formats, must begin with a dollar sign ($), followed by a letter or underscore -for numeric formats, must begin with a letter or underscore -cannot end in a number -cannot be given the name of a SAS format -cannot include a period in the VALUE statement.
What is one to one match merging?
A single observation in one data set is related to exactly one observation in another data set based on the values of one or more selected variables.
what is one to many merging?
A single observation in one data set is related to more than one observation in another data set based on the values of one or more selected variables.
What is a target variable?
A target variable is a variable to which the result of the function is assigned
What are DO loops?
A DO loop lets you repeat a group of statements a set number of times, from a start value to a stop value. DO index-variable=start TO stop <BY increment>; iterated SAS statements... END; ex: do i=1 to 20; Yearly+(Yearly+Amount)*Rate; end; The values of start, stop, and increment: -must be numbers or expressions that yield numbers -are established before executing the loop -if omitted, increment defaults to 1. Details of index-variable: -The index-variable is written to the output data set by default. -At the termination of the loop, the value of index-variable is one increment beyond the stop value.
what is non-match merging?
At least one observation in one data set is unrelated to any observation in another data set
What format does automatic numeric-to-character conversion use?
The BEST12. format. The result is right-aligned, so it may contain leading blanks.
In the example below, the variable X is summed during each iteration of the DO loop. What is the value of X at the completion of this DATA step? data test; do i=1 to 5; do j=1 to 4; x+1; end; end; run; a. 0 b. 5 c. 20 d. 21
c. 20 The inner loop runs 4 times for each of the 5 outer iterations, so x+1 executes 5*4=20 times.
What types of variables are supported by SAS?
Character and numeric
What is a SAS Library?
Collection of SAS files that are referenced and stored as a unit.
What does the SAS descriptor portion do?
Contains metadata: general properties such as the data set name and number of observations, and variable properties such as name, type, and length.
What are the possible uses for an IF statement?
Data set statement, Data assignment statement
What does the CAT function do?
Does not remove leading or trailing blanks from the arguments before concatenating them.
What is a footnote statement?
FOOTNOTEn "footnote text"; -Footnotes appear at the bottom of the page. -No footnote is printed unless one is specified. -The value of n can be from 1 to 10. -An unnumbered FOOTNOTE is equivalent to FOOTNOTE1. -Footnotes remain in effect until they are changed, canceled, or you end your SAS session.
When you use the SAS/ACCESS LIBNAME statement to assign a libref to a Microsoft Excel workbook, you can open and view the workbook using PROC CONTENTS and Microsoft Excel at the same time. True False
False If SAS has a libref assigned to an Excel workbook, the workbook cannot be opened in Excel. To disassociate a libref, use a LIBNAME statement and specify the libref and the CLEAR option. SAS disconnects from the data source and closes any resources that are associated with that libref's connection.
When you use the SAS/ACCESS LIBNAME statement to assign a libref to a Microsoft Excel workbook, SAS treats each worksheet within the workbook as a library. True False
False SAS treats the workbook as a library, and each worksheet as a SAS data set.
explain the drop and keep options in the input dataset
If you use KEEP= on an input data set, only the listed variables are read in. Any variables not listed are dropped on input and not available in the PDV. Using KEEP= on an input data set is more efficient - only read in the variables you need. When you specify the DROP= or KEEP= option in the SET statement, SAS does not read the excluded variables into the program data vector. If you work with a large data set, you can construct a more efficient DATA step by not reading unneeded variables from the input data set.
Can an output statement be used with a when statement?
Multiple values can be listed in the WHEN expression and use the dataset specified for the output statement
please explain the following variable list code: | Qtr1 | Name | Q3 | Fourth | Total=sum(of _Numeric_)
Name not included
Can you change an established variable type?
No, A variable is character or numeric. After the variable's type is established, it cannot be changed. By following three steps, you can create a new variable with the same name and a different type.
How can you generate a separate observation for each year?
Place an explicit OUTPUT statement inside the DO loop
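A minimal sketch of the idea, assuming a hypothetical data set name (`invest_by_year`) and made-up amounts:

```sas
data invest_by_year;
   Capital=0;
   do Year=1 to 3;
      Capital+5000;     /* sum statement: add 5000 each iteration */
      output;           /* write one observation per year */
   end;
run;

proc print data=invest_by_year;
run;
```

Because the explicit OUTPUT statement overrides the implicit output at the bottom of the step, the data set contains three observations (Year 1 to 3) rather than a single observation with Year=4.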
please explain the following variable list code: | TotJan | Qtr2 | TotFeb | TotMar | Total=sum(of Tot:)
Qtr2 not included
What does the CATS function do?
Removes leading and trailing blanks from the arguments.
what does the CATT function do?
Removes trailing blanks from the arguments.
What is a SAS format?
SAS formats can be used in a PROC step to change how values are displayed in a report. A format is an instruction to write data values. -A format changes the appearance of a variable's value in a report. -The values stored in the data set are not changed.
Describe the SAS Execution Stage:
SAS initializes the PDV by making everything missing
What types of SAS libraries are there?
SAS work/temporary library and the permanent library
Describe the SAS compilation phase:
Scans the program for syntax errors; translates the program into machine language. Creates the program data vector (PDV) to hold one observation. Creates the descriptor portion of the output data set.
what is a title statement?
TITLEn "title name"; -Titles appear at the top of the page. -The value of n can be from 1 to 10. -An unnumbered TITLE is equivalent to TITLE1. -Titles remain in effect until they are changed, canceled, or you end your SAS session.
What does the CATX function do?
The CATX function removes leading and trailing blanks, inserts delimiters, and returns a concatenated character string. NewVar=CATX(separator,string1,.....,string-n); ex: FullName=catx(' ',Title,FMName,LName);
What does the CEIL function do?
The CEIL function returns the smallest integer greater than or equal to the argument. NewVar=CEIL(argument); ex: x=ceil(4.4); output:5
What does the Compress function do?
The COMPRESS function removes the characters listed in the chars argument from the source. ex: New_ID=compress(ID,' -'); NewVar=compress(var,'delimiters wanted removed');
How do Do loops with items work?
The DO loop is executed once for each item in the list. -The list must be comma separated. ex: do Month='JAN','FEB','MAR'; ... end;
What is a drop statement?
The DROP statement specifies the variables to exclude from the output data set.
What major steps make a SAS program?
The Data step and Proc step
What does the Find function do?
The FIND function searches a target string for a specified substring. VarName or PositionName=find(string,substring,modifiers,startpos); ex: pos5=find(Text,'us','I',10);
What is the FIRSTOBS= data set option?
The FIRSTOBS= data set option specifies a starting point for processing an input data set. This option specifies the number of the first observation to process. FIRSTOBS= and OBS= are often used together to define a range of observations to be processed.
What does the FLOOR function do?
The FLOOR function returns the greatest integer less than or equal to the argument. NewVar=FLOOR(argument); ex: y=floor(3.6); output: 3
What does the ID statement do?
The ID statement specifies the variable or variables to print at the beginning of each row instead of an observation number
Explain the IN data set option:
The IN= data set option creates a variable that indicates whether the data set contributed to building the current observation. The variable is a temporary numeric variable that has two possible values: 0 indicates that the data set did not contribute to the current observation. 1 indicates that the data set did contribute to the current observation. ex: data empsauc; merge empsau(in=Emps) phonec(in=Cell); by EmpID; if Emps=0 or Cell=0; run;
What does the INT function do?
The INT function returns the integer portion of the argument. NewVar=INT(argument); z=int(3.9); output: 3
What is a keep statement?
The KEEP statement specifies all variables to include in the output data set.
What does the Length function do?
The LENGTH function returns the length of a non-blank character string, excluding trailing blanks. General form of the LENGTH function: NewVar=Length(argument); ex: last_position=length(code);
what is match merging?
The MERGE statement in a DATA step joins observations from two or more SAS data sets into single observations. A BY statement indicates a match-merge and lists the common variable or variables to match. data empsauh; merge empsau phoneh; by EmpID; run;
What is the OBS data set option?
The OBS= data set option specifies an ending point for processing an input data set. This option specifies the number of the last observation to process, not how many observations should be processed. ex: OBS=100 processes observations 1 through 100.
What does the Round Function do?
The ROUND function returns a value rounded to the nearest multiple of rounding unit. NewVar=round(argument, rounding-unit); ex: NewVar=round(42.65,10);
what does the scan function do?
The SCAN function returns the nth word of a character value. NewVar=scan(var,n,delimiter); ex: Fname=scan(Name,2,','); When you use the SCAN function: -A missing value is returned if there are fewer than n words in the string. -If n is negative, the SCAN function selects the word in the character string starting from the end of the string. -The length of the created variable is the length of the first argument. -Delimiters before the first word have no effect. -Any character or set of characters can serve as delimiters. -Two or more contiguous delimiters are treated as a single delimiter.
What does the split option do?
The SPLIT= option in PROC PRINT specifies a split character to control line breaks in column headings. proc print data=orion.sales split='*'; var Last_Name; label Last_Name='Last*Name'; run;
what does the TRANWRD function do?
The TRANWRD function replaces or removes all occurrences of a given word (or a pattern of characters) within a character string. NewVar=tranwrd(source,target,replacement); ex: Product=tranwrd(Product,'Luci','Lucky'); These details apply when you use the TRANWRD function: -The TRANWRD function does not remove trailing blanks from target or replacement. -If NewVar is not previously defined, it is given a length of 200. -If the target string is not found in the source, then no replacement occurs.
what is an assignment statement?
The assignment statement evaluates an expression and assigns the result to a new or existing variable. new variable= expression
what is a concatenation operator?
The concatenation operator is another way to join character strings. General form of the concatenation operator: NewVar=string1 !! string2;
What is an explicit output statement?
The explicit OUTPUT statement writes the contents of the program data vector (PDV) to the data set or data sets being created. The presence of an explicit OUTPUT statement overrides implicit output. You can direct output to a specific data set or data sets by listing the data set names in the OUTPUT statement.
What is a subsetting if statement?
The subsetting IF statement tests a condition to determine whether the DATA step should continue processing the current observation. subsetting if statement is only valid in a data step
How to change Titles?
To change a title line, submit a TITLE statement with the same number but different text. -This replaces a previous title with the same number. -It cancels all titles with higher numbers.
How do you change the displayed name (column heading) of a variable?
Use a LABEL statement and the LABEL option to display descriptive column headings instead of variable names. The variable name itself is not changed. proc print data=orion.sales label; label Employee_ID='SalesID'; run;
How to create a library?
Use a LIBNAME statement: libname libref "path of data"; The libref is not enclosed in quotation marks. (1) not required to be in a DATA or PROC step (2) does not require a RUN statement (3) it executes immediately (4) it remains in effect until changed or canceled, or until the session ends
what is the value statement?
VALUE format-name range1=' formatted-value1 ' range2=' formatted-value2 ' Each range can be: -a single value -a range of values -a list of values. Formatted Values: -can be up to 32,767 characters in length -are enclosed in quotation marks.
please explain the following variable list code: | Qtr1 | Qtr2 | Var1 | Qtr3 | Qtr4 | Total=sum(of Qtr1-Qtr4)
Var1 is omitted from the output
please explain the following variable list code: | Qtr1 | Second | Q3 | Fourth | Var2 | Total=sum(of Qtr1--Fourth)
Var2 is omitted from the output
What is a function?
You can use functions in DATA step statements anywhere that an expression can appear. ex: avgscore=mean(exam1,exam2,exam3)
What is the value of x at the completion of the DATA step? data test; x=15; do until(x>12); x+1; end; run; a. . (missing) b. 13 c. 15 d. 16
d. 16 A DO UNTIL condition is evaluated at the bottom of the loop. The initial value of x is 15. The DO loop executes one time, even though 15 is greater than 12, because the condition is not checked until the bottom of the loop. Therefore, 15 becomes 16 before the condition is checked. Remember, the statements in a DO UNTIL loop are executed at least once.
Look at the values of the two variables below. Which functions would you use to extract the portion of the string marked with quotation marks from each variable? ShipCode | SiteMgr: Tr"112"k_T | Montry, "75"; TD"107"M_R | Lee, "124" a. Use the SUBSTR function to extract values from ShipCode, and the SCAN function to extract values from SiteMgr b. Use the SCAN function to extract values from ShipCode, and the SUBSTR function to extract values from SiteMgr
a. Use the SUBSTR function to extract values from ShipCode, and the SCAN function to extract values from SiteMgr The three numbers in ShipCode are in the same position within each string, so you can use the SUBSTR function to extract them from the string. The numbers in SiteMgr are in different locations, but they are in the second position after a delimiter that can be specified. So you can use the SCAN function to extract them.
Select the situation(s) in which you can use the WHERE statement to subset observations. Select all that apply. (There are more than one correct answers, no partial credits given) a. in a PROC step b. in a DATA step, when the variable in the condition is created by an assignment statement c. in a DATA step, when the variable in the condition is in the input data set
a. in a PROC step c. in a DATA step, when the variable in the condition is in the input data set You can use a WHERE statement to subset observations in situations a and c. A subsetting IF statement can be used in situations b and c.
What is the result of this assignment statement given the value of var1 and var2? num=var1+var2/2; var1 | var2 . 10 a. .(missing) b. 0 c. 5 d. 10
a. .(missing) If an operand in an arithmetic expression has a missing value, the result is a missing value.
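The rule above can be checked with a short sketch. The variable `total` and the use of the SUM function are additions for contrast and are not part of the original question; SUM ignores missing arguments, while the arithmetic operator propagates them.

```sas
data _null_;
   var1=.;                    /* missing numeric value */
   var2=10;
   num=var1+var2/2;           /* an operand is missing, so num is missing */
   total=sum(var1,var2/2);    /* SUM skips the missing argument: 5 */
   put num= total=;           /* writes num=. total=5 to the log */
run;
```

The log also carries a note that missing values were generated by an operation on missing values.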
How many times does this DO loop execute? data work.test; x=15; do while (x<12); x+1; end; run; a. 0 b. 1 c. 12 d. x
a. 0 The DO WHILE expression is evaluated at the top of the loop. Because the expression is false, the DO loop does not execute.
If you specify the colon as the delimiter for the SCAN function, how many words appear in the text string below? Washington:New York:California:New Mexico a. 4 b. 6
a. 4 The SCAN function uses the colon as the delimiter and divides the string into four words.
Which value is returned from the following statement? x=int(7.89); a. 7 b. 7.8 c. 8
a. 7 The INT function returns the integer portion of the argument. In this example the argument is 7.89. So, the INT function returns a value of 7 for x.
Which statement is true concerning the execution phase of the DATA step? a. Data is processed in the program data vector (PDV). b. An implicit OUTPUT occurs at the top of the DATA step. c. An implicit REINITIALIZE occurs at the bottom of the DATA step. d. Variables read from the input data set are set to missing values when SAS returns to the top of the DATA step.
a. Data is processed in the program data vector (PDV). Feedback: During execution, data manipulation occurs in the PDV. An implicit OUTPUT and RETURN (not REINITIALIZE) occurs at the bottom of the DATA step. When SAS returns to the top of the DATA step, variables read from the input table are retained and computed variables are set to missing.
Does this comment contain syntax errors? /* Report created for profit sharing and no one is to observe the code */ proc print data= work.sales; run; a. No. The comment is correctly specified. b. Yes. Every comment line must end with a semicolon. c. Yes. The comment text incorrectly begins on line one. d. Yes. The comment contains a semicolon, which causes an error message.
a. No. The comment is correctly specified. A block comment begins with a forward slash and an asterisk, followed by your comment text, and ends with an asterisk and forward slash. A block comment can be any length, and can contain semicolons.
Consider the values in the First, Middle, and Last variables as shown below (leading or trailing blanks are indicated in "_"). First. |. Middle |. Last. | Sue__. _L_ Farmer___ What is the result after using the following SAS function? CATS(First,Middle,Last) a. SueLFarmer b. Sue L Farmer c. Sue L Farmer
a. SueLFarmer The CAT function does not remove leading or trailing blanks. The CATS function strips leading and trailing blanks. The CATX function removes leading and trailing blanks and separates each string with the specified delimiter.
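A minimal sketch comparing the three functions on the values from the question. The LENGTH statement and the result variable names (`a`, `s`, `x`) are assumptions made only so the blanks are reproducible.

```sas
data _null_;
   length First $5 Middle $3 Last $9;
   First='Sue';                      /* padded with trailing blanks to length 5 */
   Middle=' L';                      /* one leading blank, one trailing blank */
   Last='Farmer';                    /* padded with trailing blanks to length 9 */
   a=cat(First,Middle,Last);         /* all blanks are kept */
   s=cats(First,Middle,Last);        /* SueLFarmer */
   x=catx(' ',First,Middle,Last);    /* Sue L Farmer */
   put a= / s= / x=;
run;
```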
Which statement is false? a. The KEEP statement controls which variables are in the input data set. b. The DROP statement controls which variables to exclude from the output data set. c. The KEEP= option in the DATA statement names the variables to include in the output data set. d. The DROP= option in the SET statement names the variables to exclude from being read into the PDV.
a. The KEEP statement controls which variables are in the input data set. The KEEP statement controls which variables are in the output data set, not the input data set.
You can use either < or > to define a non-inclusive range in a VALUE statement. a. True b. False
b. False You can use only the < symbol to define a non-inclusive range (for example, 50-<100 excludes 100).
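As a sketch of how the < symbol is used in ranges: the format name `tier`, the cut points, and the variable names `Score` and `Band` are hypothetical and chosen only for illustration.

```sas
proc format;
   value tier                  /* hypothetical format name */
      low -< 50  = 'Low'       /* up to, but not including, 50 */
      50  -< 100 = 'Mid'       /* includes 50, excludes 100 */
      100 - high = 'High';     /* includes 100 */
run;

data _null_;
   do Score=49.9, 50, 100;
      Band=put(Score,tier.);   /* apply the user-defined format */
      put Score= Band=;
   end;
run;
```

Placing < after the hyphen excludes the ending value of the range; placing it before the hyphen (as in 50<-100) excludes the starting value.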
If you run this DATA step, what observations does the data set bonuses contain? data bonuses; merge managers (in=M) staff (in=S); by EmpID; if M=0 and S=1; run; a. only the observations from staff that have no match in managers b. only the observations from managers that have no match in staff c. all observations from both managers and staff, whether or not they match d. no observations
a. only the observations from staff that have no match in managers The subsetting IF statement selects all observations from staff that have no match in managers.
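The selection can be tried on two tiny data sets. The EmpID values below are made up for illustration; only the merge logic comes from the question.

```sas
data managers;
   input EmpID;
   datalines;
101
102
;
run;

data staff;
   input EmpID;
   datalines;
102
103
;
run;

data bonuses;
   merge managers(in=M) staff(in=S);
   by EmpID;             /* both inputs are already in EmpID order */
   if M=0 and S=1;       /* keep staff rows with no match in managers */
run;

proc print data=bonuses;
run;
```

With these values, bonuses contains a single observation, EmpID=103.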
In this DATA step, which SCAN function completes the assignment statement to correctly extract the four-digit year from the Text variable? Select all that apply. (More than one answer is correct; no partial credit is given.) data Scan_Quiz; Text="New Year's Day, January 1st, 2007"; Year=________________________________; run; a. scan(Text,-1) b. scan(Text,4) c. scan(Text,6,', ')
a. scan(Text,-1) c. scan(Text,6,', ') The correct answers are a and c. The SCAN function can extract the first word from the end of the string using the default delimiters, or extract the 6th word from the beginning of the string using the comma and the blank as delimiters.
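A short sketch that runs all three candidate calls. The result variable names (`Year_a`, `Word4`, `Year_c`) are assumptions added for illustration.

```sas
data _null_;
   Text="New Year's Day, January 1st, 2007";
   Year_a=scan(Text,-1);        /* first word counting from the end: 2007 */
   Word4=scan(Text,4);          /* fourth word from the start: January */
   Year_c=scan(Text,6,', ');    /* sixth word; comma and blank are delimiters: 2007 */
   put Year_a= Word4= Year_c=;
run;
```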
In this SAS program, is salesbonus a temporary SAS data set? proc means data=salesbonus; class Job_Title; var Amount; run; a. yes b. no
a. yes The data set salesbonus is a temporary data set because it is referenced using a one-level name. SAS assumes that the data set is stored in the temporary work library.
Can you include multiple VALUE statements in a single PROC FORMAT step? a. yes b. no
a. yes You can create multiple user-defined formats in the same PROC FORMAT step by specifying multiple VALUE statements.
how to cancel titles and footnotes?
adding a null title; or footnote; statement at the end of the code
What value will be assigned to Units? data work.comp; set work.sales; Units=Total+Bonus/Qty; run; Partial PDV: Total=140 | Qty=10 | Bonus=50 | Units=. a. 19 b. 145 c. 3 d. Missing
b. 145 SAS executes the expression on the right side of the assignment statement following normal operator precedence, and the result is assigned to Units. The division occurs first (50/10 is 5), and then the addition occurs (140+5 is 145).
Which statement about SAS libraries is true? a. You refer to a SAS library by a logical name called LIBNAME. b. A SAS library is a collection of one or more SAS files that are referenced and stored as a unit. c. A single SAS library can contain files that are stored in different physical locations. d. At the end of each session, SAS deletes the contents of all SAS libraries.
b. A SAS library is a collection of one or more SAS files that are referenced and stored as a unit. The correct answer is b. You refer to a SAS library by a logical name called a libref. A single SAS library cannot contain files that are stored in different physical locations. And SAS deletes the contents of temporary SAS libraries but not permanent SAS libraries.
When you use the subsetting IF statement, how are observations excluded? a. If the expression is true, SAS excludes the observation from the input data set. b. If the expression is false, SAS excludes the observation from the output data set. c. If the expression is false, SAS excludes the observation from the PDV. d. If the expression is true, SAS excludes the observation from the PDV.
b. If the expression is false, SAS excludes the observation from the output data set.
What happens if you submit the following program? porc print data=work.newsalesemps; run; a. SAS does not execute the step. b. SAS assumes that the keyword PROC is misspelled and executes the PROC PRINT step.
b. SAS assumes that the keyword PROC is misspelled and executes the PROC PRINT step The log will indicate that SAS assumed that the keyword PROC was misspelled, corrected it temporarily, and executed the PROC step.
Within the data set hrd.temp, PayRate is a character variable and Hours is a numeric variable. What happens when the following program is submitted? data temp; set hrd.temp; Salary=PayRate*Hours; run; a. SAS converts the values of PayRate to numeric values. No message is written to the log. b. SAS converts the values of PayRate to numeric values. A message is written to the log. c. SAS converts the values of Hours to character values. A message is written to the log.
b. SAS converts the values of PayRate to numeric values. A message is written to the log. When this DATA step executes, SAS automatically converts the character values of PayRate to numeric values so that the calculation can occur. Whenever data is automatically converted, a message is written to the SAS log stating that the conversion has occurred.
Which statement correctly describes the result of submitting the DATA step that is shown below? The variables TotalPay and Commission are numeric. The variable Pay is character. data reg.newsales; set reg.sales; TotalPay=Pay+Commission; run; a. The values of the numeric variables TotalPay and Commission are converted to character values because these variables are used in a character context. b. The values of the character variable Pay are converted to numeric values because this variable is used in a numeric context.
b. The values of the character variable Pay are converted to numeric values because this variable is used in a numeric context. The character variable Pay appears in an arithmetic operation, so its values are converted to numeric to complete the calculation.
What result would you expect from submitting this step? proc print data=work.newsalesemps run; a. an HTML report of the work.newsalesemps data set b. an error message in the log c. a LISTING report of the work.newsalesemps data set d. the creation of the temporary data set work.newsalesemps
b. an error message in the log There is a missing semicolon following the data set name. When this step runs, SAS will interpret the word RUN as an option in the PROC PRINT statement (because of the missing semicolon). As a result, the PROC PRINT step will not execute and an error message will be displayed in the log.
Within the data set furn.bookcase, the variable Finish contains values such as ash, cherry, teak, and matte-black. Which of the following creates a subset of the data in which the values of Finish contain the string walnut? Use the I modifier to make the search for the string case-insensitive. a. data work.bookcase; set furn.bookcase; if find(Finish,'walnut',I)>0; run; b. data work.bookcase; set furn.bookcase; if find(Finish,'walnut','I')>0; run;
b. data work.bookcase; set furn.bookcase; if find(Finish,'walnut','I')>0; run; When you use the 'I' modifier in the FIND function to make the search case-insensitive, you must enclose it in quotation marks. So the correct answer is b.
Which PROC CONTENTS step prints only general information about a SAS library and a listing of the members of the library? a. proc contents data=orion.country nods; run; b. proc contents data=orion._all_ nods; run; c. proc contents data=orion._all_; run; d. proc contents data=orion.nods _all_; run;
b. proc contents data=orion._all_ nods; run; The correct answer is b. This PROC CONTENTS step generates the specified output. After the keyword _ALL_, you add a space and then the NODS keyword to suppress the descriptor data for each individual file in the library.
Which program changes the values of the variable Examples, found in the data set Before, to the values shown in After? Before (Examples): 326.54, 98.2, -32.66, 1401.75. After (Examples): 327, 98, -33, 1402. a. data after; set before; Examples=int(Examples); run; b. data after; set before; Examples=round(Examples); run; c. both a and b are correct
b. data after; set before; Examples=round(Examples); run; The values shown in After have been rounded to the nearest integer. This is done by using the ROUND function without a round-off unit.
What is the value of X at the completion of the DATA step? data work.test; x=15; do while(X<12); x+1; end; run; a. 12 b. 15 c. 16 d. unknown
b. 15 The value of X is set to 15 at the beginning of the DATA step, and because the DO loop never executes, the value of X never increases.
How many observations and variables are in the output data set ShippingZones given the following information?The input data set Shipping contains 5 observations and 3 variables (Product, BoxSize, and Rate). data ShippingZones; set Shipping; Zone=1; output; Zone=2; Rate=(Rate*1.5); run; a. 5 observations and 3 variables b. 5 observations and 4 variables c. 10 observations and 3 variables d. 10 observations and 4 variables
b. 5 observations and 4 variables The explicit OUTPUT statement is sending the ZONE=1 row to the output data set. There is no explicit OUTPUT statement after ZONE=2, so those rows are not making it to the output data set. An implicit OUTPUT is not at the bottom of the DATA step due to the explicit OUTPUT. The four variables are Product, BoxSize, Rate, and Zone.
What is the value of Result? ProdCode=" 808:9670-100"; Result=substr(left(ProdCode),1,3); a. 3 blanks b. 808 c. 100 d. 8.9
b. 808 The LEFT function moves the leading blanks to the end of the value. Then, the SUBSTR function extracts a three-character string starting from position 1
What happens when you submit the following code? data temperate tropical; set flora; output; run; a. Because the OUTPUT statement does not specify a data set, both output data sets are empty. b. All observations in the flora data set are written to both output data sets. c. The DATA step writes the first observation to each output data set and then stops processing because no RETURN statement is specified.
b. All observations in the flora data set are written to both output data sets. An OUTPUT statement writes the current observation to every data set being created unless an explicit data set reference is made.
If the value of Region is missing (.), do these two DATA steps produce the same output? (DATA step 1 uses a SELECT group with WHEN statements; DATA step 2 uses IF-THEN statements.) a. Yes b. No
b. No In the first DATA step, values that do not match any WHEN statement are ignored. In the second DATA step, values that do not match an IF-THEN condition are written to both data sets listed in the DATA statement.
Which of the following steps is typically used to generate reports and graphs? a. DATA b. PROC c. REPORT d. RUN
b. PROC PROC steps are typically used to process SAS data sets (that is, generate reports, graphs, and statistics).
Which variables are in the final output data set work.boots? data work.boots(drop=Product); set sashelp.shoes(keep=Product Subsidiary Sales Inventory); where Product='Boot'; drop Sales Inventory; Total=sum(Sales,Inventory); run; a. Subsidiary b. Subsidiary and Total c. Product and Subsidiary d. Product, Subsidiary, Sales, and Inventory
b. Subsidiary and Total The variable Subsidiary from the input data set and the calculated variable Total are in the final output data set. Product, Sales, and Inventory are dropped.
A typical value for the character variable Target is 123,456. Which statement correctly converts the values of Target to numeric values when creating the variable TargetNo? a. TargetNo=input(Target,comma6.); b. TargetNo=input(Target,comma7.); c. TargetNo=put(Target,comma6.); d. TargetNo=put(Target,comma7.);
b. TargetNo=input(Target,comma7.); You explicitly convert character values to numeric values by using the INPUT function. Be sure to select an informat that can read the form of the values.
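A minimal sketch of the conversion, assuming a single hard-coded value rather than a real input data set:

```sas
data _null_;
  Target = '123,456';                  /* character value, 7 characters wide */
  TargetNo = input(Target, comma7.);   /* COMMA7. removes the comma while reading */
  put TargetNo=;                       /* log: TargetNo=123456 */
run;
```

COMMA6. is one character too narrow for this value, so it would read only the first six characters.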
Which statement is false? a. The DO UNTIL loop executes until a condition is true. b. The DO WHILE loop always executes at least one time. c. The DO WHILE loop checks the condition at the top of the loop. d. The DO UNTIL loop checks the condition at the bottom of the loop.
b. The DO WHILE loop always executes at least one time.
The sashelp.class data set contains 19 observations: 9 observations with a Sex value of F and 10 observations with a Sex value of M. Which statement about the output of the following step is true? data Females Other; set sashelp.class; if sex='F' then output Females; output Other; run; a. The Other data set contains 10 observations. b. The Other data set contains 19 observations. c. The Females data set contains 10 observations. d. The Females data set contains 19 observations.
b. The Other data set contains 19 observations
Examine the following program and then decide which statement is true. data us; set orion.sales; where Country='US'; run; a. The program reads a temporary data set and creates a permanent data set. b. The program reads a permanent data set and creates a temporary data set. c. The program contains a syntax error and will not execute. d. The program will not execute because you cannot work with permanent and temporary data sets in the same step.
b. The program reads a permanent data set and creates a temporary data set. The DATA statement doesn't specify a libref, so it's creating the temporary data set us. The SET statement reads orion.sales, which is a permanent data set. There are no syntax errors in the program.
What is the result of running the following DATA step? data work.boots; set sashelp.shoes(keep=Sales Product Subsidiary); NewSales=Sales*1.25; run; a. The step produces work.boots with three variables. b. The step produces work.boots with four variables. c. The step produces an error due to invalid syntax for the KEEP= option. d. The step produces an error because the Sales variable is not read in from the sashelp.shoes data set.
b. The step produces work.boots with four variables. The data set work.boots is created with the variables of Product, Subsidiary, NewSales, and Sales.
What is the syntax error in this DATA step? data returns_qtr1; set returns_jan(rename=(ID=CustID)(Return=Item)) returns_feb(rename=(Dt=Date)) returns_mar; run; a. You cannot specify more than two data sets in the SET statement. b. There are too many sets of parentheses in the RENAME= option. c. You cannot specify multiple variables in the RENAME= option. d. The BY statement is missing.
b. There are too many sets of parentheses in the RENAME= option. The RENAME= option after the data set returns_jan has an extra set of parentheses. The correct code should be: (rename=(ID=CustID Return=Item))
How will a value of 50000 be displayed if the TIERS format below is applied to the value? proc format; value tiers 20000-<50000='Tier1' 50000-<100000='Tier2'; run; a. Tier1 b. Tier2 c. 50000 d. a missing value
b. Tier2 In Tier1, the less-than symbol is before 50000, which means that value will be excluded from the range. In Tier2, however, 50000 is the starting value of the range and will be included in the tier.
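The boundary behavior can be checked with a short sketch. It assumes the ranges run from 20000 up to but not including 50000, and from 50000 up to but not including 100000; the variable names Amount and Label are made up for illustration.

```sas
proc format;
  value tiers 20000-<50000  = 'Tier1'
              50000-<100000 = 'Tier2';
run;

data _null_;
  do Amount = 49999, 50000;
    Label = put(Amount, tiers.);   /* apply the user-defined format */
    put Amount= Label=;            /* 49999 -> Tier1, 50000 -> Tier2 */
  end;
run;
```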
According to the data set shown, what type of variable is ActLevel? (ID | DOB | ActLevel): 124 | 05MAR59 | (blank); . | 22MAY41 | 3; 224 | . | 4; 298 | 12DEC43 | 2 a. numeric b. character c. can't tell from the data shown
b. character The variable ActLevel has a missing value in row one that is represented with a blank. Missing character values in SAS are represented with blanks. In addition, the values for ActLevel are left justified, which also indicates that ActLevel is a character variable. If the MISSING= system option had been set to display missing numeric values as blanks, the missing values for ID and DOB would not show periods, as they do here.
Which of the following programs concatenates the data sets, sales and products, in that order? a. data newsales; set products sales; run; b. data newsales; set sales products; run; c. data newsales; set sales; set products; run;
b. data newsales; set sales products; run; You list the data sets that you want to concatenate in the SET statement in the same order that you want to concatenate them. You list both data sets in one SET statement.
Which of the following statements contains valid syntax? a. do 1 to 10 by 2; b. do while (Year>2025); c. do until Earnings<=100000; d. do date='01JAN2019' to '31JAN2019';
b. do while (Year>2025); When WHILE or UNTIL is used in the DO statement, the expression must be in a set of parentheses. In answer choice a, the index variable is missing. In answer choice c, the parentheses are missing around the expression. In answer choice d, the DATE values are character instead of numeric ('01JAN2019'd).
How many observations and variables does the data set shown here contain? (Company | Region | Sales): A&MRadio | N | 63500; Jack's TV | S | 45800; Sound City | S | 38900; Music Ltd. | N | 99500 a. three observations, three variables b. four observations, three variables c. three observations, four variables
b. four observations, three variables The correct answer is b. The SAS data set contains four observations and three variables. Recall that in SAS, observations are the rows in a data set, and variables are the columns in a data set.
Which of the following correctly assigns the libref myfiles to a SAS library in the c:/mysasfiles folder? a. libname orion myfiles 'c:/mysasfiles'; b. libname myfiles 'c:/mysasfiles'; c. libref orion myfiles 'c:/mysasfiles'; d. libref myfiles 'c:/mysasfiles';
b. libname myfiles 'c:/mysasfiles'; The correct answer is b. This LIBNAME statement begins with the keyword LIBNAME, followed by the name of the libref, which is myfiles. It then specifies the physical location of the library, in quotation marks, which is c:/mysasfiles.
Which SET statement has correct syntax? a. set empscn(rename(Country=Location)) empsjp(rename(Region=Location)); b. set empscn(rename=(Country=Location)) empsjp(rename=(Region=Location)); c. set empscn rename=(Country=Location) empsjp rename=(Region=Location);
b. set empscn(rename=(Country=Location)) empsjp(rename=(Region=Location));
Which WHERE statement correctly subsets on the numeric values for May, June, or July and missing character names? a. where Months in (5-7) and Names=.; b. where Months in (5,6,7) and Names=' '; c. where Months in ('5','6','7') and Names='.';
b. where Months in (5,6,7) and Names=' '; The correct answer is b. You specify the value list in the IN operator in parentheses, and separate the values by either commas or blanks. Only character values must be enclosed in quotation marks, and a blank represents a missing character value.
You want to apply the following user-defined format to the numeric variable Age. The values of Age are stored with one decimal place. Which of the following statements is true regarding the PROC FORMAT step? proc format; value $agegp low-65='Non Retirement' 66<- high='Retirement'; run; a. The value 65 is not included in either of the specified ranges. b. The value 66 will be displayed as Retirement. c. The format name does not match the variable type. d. The text strings for the formatted values cannot include spaces.
c. The format name does not match the variable type. A user-defined format that applies to numeric values cannot start with $. The first value range is an inclusive range, which means it includes the first value and the last value, so 65 will be included in the range. The value 66 will not be displayed as Retirement because the less-than symbol appears directly after it, so it will be excluded from the range. The formatted value is always a character string, no matter whether the format applies to character values or numeric values. A character string can consist of any type of character.
If you run this DATA step, what observations does the data set bonuses contain? data bonuses; merge managers staff; by EmpID; run; a. all of the observations from managers, and only those observations from staff with matching values for EmpID b. all of the observations from staff, and only those observations from managers with matching values for EmpID c. all observations from staff and all observations from managers, whether or not they have matching values d. only those observations from staff and managers with matching values for EmpID
c. all observations from staff and all observations from managers, whether or not they have matching values By default, the output data set of a DATA step that includes a MERGE statement and a BY statement includes all of the observations from all of the input data sets that are listed in the MERGE statement.
How many observations are in the bikeinfo2 output data set given the following input data set work.bikeinfo and code? work.bikeinfo (name | bike): Marco | 12; Angela | 10 data bikeinfo2; set bikeinfo; do month=1 to 3; do week=1 to 4; bike=bike+2; end; output; end; run; a. 2 b. 3 c. 6 d. 12 e. 24
c. 6 For each observation read in, 3 observations are created (1 for each of 3 months). So, 2 observations read * 3 months = 6 observations in total.
Suppose the data set shipping contains 25 observations and 3 variables (Product, BoxSize, and Rate). How many observations and variables does the data set shippingzones have? data shippingzones; set shipping; Zone=1; output; Zone=2; Rate=(Rate*1.5); output; Zone=3; Rate=(Rate*1.5); output; run; a. 25 observations, 4 variables b. 75 observations, 3 variables c. 75 observations, 4 variables
c. 75 observations, 4 variables For every observation read, there are three observations created (25*3=75). The variable Zone is added to the existing three variables to create four variables in total.
Which statement about this PROC SORT step is true? proc sort data=orion.staff; out=work.staff_sort; by descending Salary Manager_ID; run; a. The sorted data set overwrites the input data set. b. The observations are sorted by Salary in descending order, and then by Manager_ID in descending order. c. A semicolon should not appear after the input data set name. d. The sorted data set contains only the variables specified in the BY statement.
c. A semicolon should not appear after the input data set name. This PROC SORT step has a syntax error: a semicolon in the middle of the PROC SORT statement. If you correct this syntax error, this step sorts orion.staff by Salary in descending order and by Manager_ID in ascending order. The step then creates the temporary data set staff_sort that contains the sorted observations and all variables.
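A corrected step of the same shape might look like the sketch below. It uses sashelp.class (shipped with SAS) instead of orion.staff so that it runs anywhere; the output name class_sort is made up.

```sas
proc sort data=sashelp.class out=work.class_sort;  /* one statement, one semicolon */
  by descending Age Name;   /* DESCENDING applies only to Age; Name sorts ascending */
run;
```

Because OUT= is specified, sashelp.class itself is left unchanged.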
Which statement reads CityCountry and correctly assigns a value to Country? (CityCountry | Country): Athens, Greece | Greece; New Delhi, India | India; Auckland, New Zealand | New Zealand a. Country=scan(CityCountry, 2); b. Country=scan(CityCountry, -1); c. Country=scan(CityCountry, 2, ','); d. Country=scan(CityCountry, 2, ', ');
c. Country=scan(CityCountry, 2, ','); The SCAN function should return the second word using only the comma as a delimiter. If a blank were also a delimiter, as in answer choice d, New Delhi and New Zealand would each be split into two words.
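A small sketch comparing the two delimiter lists on one hard-coded value. The STRIP call and the variable name Wrong are additions for illustration: STRIP removes the leading blank that remains after the comma.

```sas
data _null_;
  length CityCountry $ 30 Country Wrong $ 20;
  CityCountry = 'New Delhi, India';
  Country = strip(scan(CityCountry, 2, ','));   /* comma only */
  Wrong   = scan(CityCountry, 2, ', ');         /* comma and blank */
  put Country= Wrong=;                          /* log: Country=India Wrong=Delhi */
run;
```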
Which SAS format creates the displayed value shown here? $5,950.35 a. DOLLAR4.2 b. COMMA8.2 c. DOLLAR9.2 d. $12
c. DOLLAR9.2 The correct answer is c. The DOLLARw.d format writes numeric values with a leading dollar sign, a comma that separates every three digits, and a period that separates the decimal fraction. The displayed value is nine characters wide, so the total format width, w, is set to 9. This includes the special characters and decimal places. The displayed value contains two decimal places, so d is set to 2.
If the value of PctDec is -3.272, which statement will NOT return a value of -3 for Decrease? a. Decrease=round(PctDec,1); b. Decrease=ceil(PctDec); c. Decrease=floor(PctDec); d. Decrease=int(PctDec);
c. Decrease=floor(PctDec); The FLOOR function returns the greatest integer less than or equal to the argument, which is -4 here. The ROUND function returns a value rounded to the nearest multiple of the round-off unit. For values less than 0, CEIL and INT return the same value.
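The four functions can be compared side by side in the log; the short variable names are made up for this sketch.

```sas
data _null_;
  PctDec = -3.272;
  r = round(PctDec, 1);   /* nearest multiple of 1 */
  c = ceil(PctDec);       /* smallest integer >= argument */
  f = floor(PctDec);      /* greatest integer <= argument */
  i = int(PctDec);        /* integer part, truncated toward zero */
  put r= c= f= i=;        /* log: r=-3 c=-3 f=-4 i=-3 */
run;
```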
Which statement is false regarding nested DO loops? a. Each DO statement must have a corresponding END statement. b. Each DO loop must have its own index variable. c. Each DO loop must use the same increment value. d. Each DO loop can contain iterated SAS statements.
c. Each DO loop must use the same increment value. When you nest DO loops, you must use different index variables for each loop, and you must be certain that each DO statement has a corresponding END statement. Each DO loop can use different increment values.
Which function returns the greatest integer less than or equal to the argument? a. CEIL b. INT c. FLOOR
c. FLOOR The FLOOR function returns the greatest integer less than or equal to the argument
What is the new value of Code? code="HNL:96701-006"; code=substr(code,1,length(code)-4); a. 96701-006 b. HNL: c. HNL:96701 d. 1-006
c. HNL:96701 When functions are nested, the innermost function executes first and passes its result to an outer function. The LENGTH function returns the length of the string (13) to the SUBSTR function. The SUBSTR function extracts a nine-character string (13-4) starting from position 1.
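Run as a complete step, with the closing parenthesis of SUBSTR in place, the nested call behaves as described:

```sas
data _null_;
  code = "HNL:96701-006";                      /* length 13 */
  code = substr(code, 1, length(code) - 4);    /* keep positions 1 through 9 */
  put code=;                                   /* log: code=HNL:96701 */
run;
```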
If you specify the period, comma, and blank as delimiters for the SCAN function, what is the fourth word in the text string below? MR. JONATHAN E. MATTHEWS, PERSONNEL DIRECTOR a. E b. PERSONNEL DIRECTOR c. MATTHEWS d. PERSONNEL
c. MATTHEWS This text string would be divided into six words: MR, JONATHAN, E, MATTHEWS, PERSONNEL, and DIRECTOR. MATTHEWS is the fourth word.
Which assignment statement replaces all occurrences of the string MISS with the string MS. in values of the variable Name? a. Name=transwrd(Name,"MISS","MS."); b. Name=tranwrd(Name,"MS.","MISS"); c. Name=tranwrd(Name,"MISS","MS.");
c. Name=tranwrd(Name,"MISS","MS."); The TRANWRD function requires arguments in this order: the variable name, the string to search for, and then the string to use as a replacement string.
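A quick sketch with one made-up name value:

```sas
data _null_;
  length Name $ 30;
  Name = 'MISS JANE DOE';
  Name = tranwrd(Name, 'MISS', 'MS.');   /* source, target, replacement */
  put Name=;                             /* log: Name=MS. JANE DOE */
run;
```

TRANWRD matches substrings rather than whole words, so a value such as MISSY would also be changed.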
What happens if you submit the following program to merge donors1 and donors2, shown below? data merged; merge donors1 donors2; by ID; run; donors1 (ID | Type | Units): 304 | o | 16; 129 | a | 48; 129 | a | 50; 129 | a | 57; 486 | b | 63 donors2 (ID | Code | Units): 488 | 65 | 27; 129 | 63 | 32; 438 | 62 | 39; 304 | 61 | 45; 387 | 64 | 67 a. The merged data set contains some missing values because not all observations have matching observations in the other data set. b. The merged data set contains eight observations. c. The DATA step produces errors.
c. The DATA step produces errors. The two input data sets are not sorted by values of the BY variable, so the DATA step produces errors and stops processing.
Which statement is true regarding the iterative DO loop? a. The start and stop values can be character or numeric values. b. If an increment value is not specified, the default increment is 0. c. The index variable is incremented at the bottom of each DO loop. d. The index variable is not written to the output data set unless specifically kept.
c. The index variable is incremented at the bottom of each DO loop. The index variable is incremented at the bottom of each DO loop. The start and stop values must be numeric when used with the keyword TO. The default increment is 1. The index variable is in the final table unless specifically dropped.
Which user-defined format name is valid? a. $STFMT9 b. $3LEVELS c. _4YEARS d. DOLLAR
c. _4YEARS A character format name begins with a dollar sign, which must be followed by a letter or underscore, so answer choice b is invalid. A format name cannot end with a number, as in answer choice a. Also, a user-defined format cannot have the name of an existing SAS format, as in answer choice d.
After merging data sets empsau and phonec we obtain the following results. Which data set(s) contributed information to the first observation in the output data set empsauc? Partial empsauc (First | Gender | EmpID | Phone): Togar | M | 121150 | 5555-179; Kylie | F | 121151 | (blank); Birin | M | 121152 | 5555-167; (blank) | (blank) | 121153 | 5555-138 a. empsau b. phonec c. both empsau and phonec d. insufficient information
c. both empsau and phonec Both data sets contributed to the first observation. If one of the data sets had not contributed, you would see missing values for at least one variable in the observation.
Which FORMAT statement formats the variable values as shown below? (Birth_Date | Hire_Date | Term_Date): 28/09/1968 | 01/10/1989 | 01/31/09 a. format Birth_Date Emp_Hire_Date mmddyy10. Emp_Term_Date ddmmyy10.; b. format Birth_Date Emp_Hire_Date ddmmyy. Emp_Term_Date mmmyyyy.; c. format Birth_Date Emp_Hire_Date ddmmyy10. Emp_Term_Date mmddyy8.;
c. format Birth_Date Emp_Hire_Date ddmmyy10. Emp_Term_Date mmddyy8.; The correct answer is c. The variables Birth_Date and Emp_Hire_Date are both displayed as a two-digit day, a two-digit month, and a four-digit year; the day precedes the month. The displayed values have a length of 10. This is the DDMMYY10. format. The variable Emp_Term_Date is displayed as a two-digit month, a two-digit day, and a two-digit year; the month precedes the day. The displayed value has a length of 8. This is the MMDDYY8. format. There is no DDMMYYYY or MMMYYYY SAS format.
Which of the following librefs is valid? a. _orionstar b. orion/01 c. or_01 d. 1_or_a
c. or_01 The correct answer is c. This libref follows all three rules for valid librefs. It has a length of one to eight characters, it begins with a letter or underscore, and its remaining characters are letters, numbers, or underscores.
How many step boundaries does this program contain? data work.staff; length First_Name $ 12 Last_Name $ 18 Job_Title $ 25; infile "&path\newemployees.csv" dlm=','; input First_Name $ Last_Name $ Job_Title $ Salary; run; proc print data=work.staff; run; proc means data=work.staff; run; a. four b. five c. six d. seven
c. six RUN, QUIT, DATA, and PROC statements function as step boundaries, which determine when SAS statements take effect and indicate the end of the current step or the beginning of a new step.
Which SUM statement will produce column totals for the variables Quantity and Total_Retail_Price? a. sum=Quantity, Total_Retail_Price; b. sum Quantity, Total_Retail_Price; c. sum Quantity Total_Retail_Price; d. sum=Quantity sum=Total_Retail_Price;
c. sum Quantity Total_Retail_Price; You specify the variable names separated by blanks to display totals for variables in your report.
What does a DATA step typically create? a. raw data file b. program file c. SAS data set d. report
c. SAS data set A DATA step typically creates a SAS data set. However, you can use DATA steps to create raw data, program files, and reports. The DATA step is very flexible.
what are the types of comments?
Block comment: /* hello world */ Comment statement: * hello world;
What does the Put function do?
converts numeric to character CharVar=PUT(source,format); ex: data hrdata; keep Phone Code Mobile; set orion.convert; Phone='(' !! put(Code,3.) !! ') ' !! Mobile; run;
what does the Data step do?
creates SAS data set
what does the propcase function do?
Converts a string to proper case: the first letter of each word is capitalized and the remaining letters are lowercased, where words are separated by the given delimiter(s). NewVar=propcase(argument,delimiter(s)); ex: pname=propcase(Name,' &');
What types of files can a DATA step read as input data? a. SAS data sets b. Microsoft Excel worksheets c. raw data files d. all of the above
d. all of the above A DATA step can read a SAS data set, an Excel worksheet, or a raw data file as input data.
What is the final value of Year given the following step? data invest; do Year=2010 to 2019; Capital+5000; Capital+(Capital*.03); end; run; a. . (missing) b. 2010 c. 2019 d. 2020
d. 2020 The final value of Year is 2020, which is one increment beyond the stop value of 2019.
There are 300 observations in the trials data set. How many observations does the test data set contain? data test; set trials (firstobs=150 obs=200); run; a. 200 b. 201 c. 50 d. 51
d. 51 The first observation processed is 150, and the last observation processed is 200, so 200-150+1=51 observations are read.
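The same counting rule can be tried on sashelp.class, which has 19 observations. The numbers 5 and 10 are arbitrary choices for this sketch.

```sas
data test;
  set sashelp.class(firstobs=5 obs=10);   /* OBS= is the last observation number, not a count */
run;
/* test contains observations 5 through 10: 10-5+1 = 6 observations */
```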
What is the result of the following assignment statement? num=4+10/2 a. .(missing) b. 0 c. 7 d. 9
d. 9 Division is performed before addition: 10/2=5, and 4+5=9.
Which assignment statement for Total correctly uses a SAS variable list and the SUM function to add the values for Year1, Year2, Year3, and Year4? a. Total=sum(of Year1-Year4); b. Total=sum(of Year1--Year4); c. Total=sum(of Year:); d. All of the above
d. All of the above In this situation, you can use a numbered range list, a name range list, or a name prefix list to specify the arguments. Any of these assignment statements would give the correct value for Total.
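A sketch that evaluates the three list styles on made-up values. The name range list (double hyphen) depends on the order in which the variables were created, and the prefix list picks up every variable whose name starts with Year.

```sas
data _null_;
  Year1 = 10; Year2 = 20; Year3 = 30; Year4 = 40;
  a = sum(of Year1-Year4);    /* numbered range list */
  b = sum(of Year1--Year4);   /* name range list */
  c = sum(of Year:);          /* name prefix list */
  put a= b= c=;               /* log: a=100 b=100 c=100 */
run;
```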
Which of the following is a SAS syntax requirement? a. Begin each statement in column one. b. Put only one statement on each line. c. Separate each step with a line space. d. End each statement with a semicolon. e. Put a RUN statement after every DATA or PROC step.
d. End each statement with a semicolon. SAS statements usually begin with an identifying keyword, and they must end with a semicolon. Although it is recommended to end steps with a RUN statement, it is optional. The other listed items are related to formatting your programs to make them easier to read.
A typical value for the numeric variable SiteNum is 12.3. Which statement correctly converts the values of SiteNum to character values when creating the variable Location? a. Location=Dept!!'/'!!put(SiteNum, $3); b. Location=Dept!!'/'!!put(SiteNum,4); c. Location=Dept!!'/'!!put(SiteNum,3.1); d. Location=Dept!!'/'!!put(SiteNum,4.1);
d. Location=Dept!!'/'!!put(SiteNum,4.1); When you use the PUT function, you must specify a format that can write the form of the values. The numeric format 4.1 correctly writes the values of SiteNum.
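A minimal sketch, assuming a made-up Dept value of ACC:

```sas
data _null_;
  Dept = 'ACC';        /* hypothetical department code */
  SiteNum = 12.3;
  Location = Dept !! '/' !! put(SiteNum, 4.1);   /* width 4 covers two digits, the period, one decimal */
  put Location=;       /* log: Location=ACC/12.3 */
run;
```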
Which of the following assignment statements correctly calculates the average of Rest1, Rest2, Rest3, and Rest4? a. RestAvg=mean of Rest1-Rest4; b. RestAvg=mean(Rest1 Rest2 Rest3 Rest4); c. RestAvg=sum(Rest1,Rest2,Rest3,Rest4); d. RestAvg=mean(Rest1,Rest2,Rest3,Rest4);
d. RestAvg=mean(Rest1,Rest2,Rest3,Rest4); The MEAN function calculates the arithmetic mean (average) of the arguments. The arguments must be enclosed in parentheses and separated by commas.
Which of the following is not true of SAS DATE values? a. They are numeric. b. They can be positive or negative values. c. They represent the number of days between the day being stored and a base date. d. The base date is January 1, 1900.
d. The base date is January 1, 1900. All of these are true of SAS date values except that SAS stores date values as the number of days between January 1, 1960, and a specific date.
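Writing a few date constants to the log without a format shows the stored numbers directly:

```sas
data _null_;
  base   = '01JAN1960'd;
  before = '31DEC1959'd;
  later  = '01JAN1961'd;
  put base= before= later=;   /* log: base=0 before=-1 later=366 (1960 was a leap year) */
run;
```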
Which BY statement in a PROC SORT step can produce the output shown here? (Obs | Postal_Code | Employee_ID): 1 | 92173 | 120807; 2 | 92131 | 120661; 3 | 92128 | 121128; 4 | 92128 | 120755; 5 | 92128 | 120730; 6 | 92124 | 121029; 7 | 92124 | 121021 a. by Postal_Code Employee_ID; b. by descending Postal_Code Employee_ID; c. by Postal_Code descending Employee_ID; d. by descending Postal_Code descending Employee_ID;
d. by descending Postal_Code descending Employee_ID; In the output, the observations are sorted in descending order for Postal_Code and, within each postal code, in descending order for Employee_ID. The BY statement must specify the keyword DESCENDING before each variable.
explain the concatenation process in the data step:
data empsall1; set empsdk empsfr; run; §The SET statement reads observations from each data set in the order in which they are listed. Any number of data sets can be included in the SET statement.
please explain the Rename data set option:
data empsall2; set empscn empsjp(rename=(Region=Country)); run; §The RENAME= option must be specified in parentheses immediately after the appropriate SAS data set name. §The name change affects the PDV and the output data set. It has no effect on the input data set.
provide an example of a nested DO loop:
data invest(drop=Quarter); do Year=1 to 5; Capital+5000; do Quarter=1 to 4; Capital+(Capital*(.045/4)); end; output; end;run;
What does the proc print statement do?
displays the data portion of a SAS dataset.
how many iterations will the do loop make in a 1-5 range?
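do i=1 to 5; ... end; 5 iterations. The index variable i takes the values 1 2 3 4 5 inside the loop. It is incremented to 6 at the bottom of the fifth iteration, which exceeds the stop value and ends the loop, so i is 6 after the loop.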
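One way to check the iteration count is to count the passes through the loop and print the index afterward. The counter name count is made up for this sketch.

```sas
data _null_;
  do i = 1 to 5;
    count + 1;      /* sum statement: count starts at 0 and is retained */
  end;
  put i= count=;    /* log: i=6 count=5 */
run;
```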
Which of the following statements is true about merging SAS data sets by using the DATA step? a. Merging combines observations from two or more data sets into a single observation in a new data set. b. SAS can merge data sets based on the position of observations in the original data set or by the values of one or more common variables. c. Match-merging is merging by values of one or more common variables. d. To match-merge data sets, all input data sets must be sorted or indexed on the BY variables. e. All of the above
e. All of the above All of these statements about merging SAS data sets by using the DATA step are true.
Which of the following can represent a step boundary? a. a RUN statement b. a QUIT statement c. a PROC statement d. a DATA statement e. all of the above
e. all of the above All of these statements can represent a step boundary by indicating either the end of a step or the beginning of a new step.
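How do you convert a variable to another data type?
Rename the original variable with the RENAME= data set option, create a new variable with the original name by using the INPUT or PUT function, and drop the renamed variable. ex (renaming only): data hrdata; set orion.convert(rename=(GrossPay=CharGross)); run;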
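A hedged sketch of the full rename-and-convert pattern. The data set work.convert and the value '52,000' are stand-ins invented here, since the contents of orion.convert are not shown; the COMMA6. informat is likewise an assumption about how the values look.

```sas
data work.convert;           /* hypothetical stand-in for orion.convert */
  GrossPay = '52,000';       /* character value */
run;

data hrdata(drop=CharGross);                        /* discard the old character copy */
  set work.convert(rename=(GrossPay=CharGross));    /* free up the name GrossPay */
  GrossPay = input(CharGross, comma6.);             /* new numeric variable, same name */
run;
```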
What does Substr function do?
Extracts characters from a character value. NewVar=substr(string,start<,length>); ex: org_code=substr(cacti_code,4,1); On the left side of an assignment, it replaces characters instead: substr(string,start<,length>)='new value';
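Both uses of SUBSTR in one sketch; the value 'ABC4XYZ' is made up.

```sas
data _null_;
  cacti_code = 'ABC4XYZ';
  org_code = substr(cacti_code, 4, 1);   /* right side: extract 1 character at position 4 */
  substr(cacti_code, 1, 3) = 'ZZZ';      /* left side: overwrite positions 1 through 3 */
  put org_code= cacti_code=;             /* log: org_code=4 cacti_code=ZZZ4XYZ */
run;
```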
what does the Proc step do?
generates reports, graphs, and manages data
What does the INPUT function do?
it converts character to numeric NumVar=INPUT(source,informat);
how do you view the contents of all the files under a library?
proc contents data=libref._all_; run; (libref is the name of the library, written without quotation marks)
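For example, using the sashelp library that ships with SAS. The NODS option is an optional extra that limits the report to the list of members instead of printing each data set's descriptor.

```sas
proc contents data=sashelp._all_ nods;   /* _ALL_ = every member of the library */
run;
```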
what are the possible use cases for a where statement?
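In a PROC step, and in a DATA step that reads a SAS data set (for example, with a SET statement), to subset the observations that are read.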
What is a SAS work/temporary library?
Temporary SAS data sets are stored in the Work library and can be accessed for the duration of the session. SAS deletes the files in the Work library when the session ends. To store a data set in the Work library, give it a one-level name or a name that begins with 'work.'
Describe a SAS statement:
A SAS statement usually begins with an identifying keyword and always ends with a semicolon.
How do you change or cancel a libref?
To change a libref, submit a LIBNAME statement with the same libref and a different path. To cancel a libref, submit a LIBNAME statement with the libref and the CLEAR option (libname orion clear;).
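The three cases in sequence. Both folder paths are placeholders and would need to exist on the system where this runs.

```sas
libname orion 'c:/mysasfiles';    /* assign the libref (hypothetical path) */
libname orion 'c:/otherfiles';    /* reassign the same libref to another hypothetical path */
libname orion clear;              /* cancel the libref; the files themselves are not deleted */
```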
Explain the Do Until Statement:
§The DO UNTIL statement executes statements in a DO loop repetitively until a condition is true. §The value of expression is evaluated at the bottom of the loop. §The statements in the loop are executed at least once. DO index-variable=start TO stop <BY increment> WHILE | UNTIL (expression);
explain the Do While Statement:
§The DO WHILE statement executes statements in a DO loop repetitively while a condition is true. §The value of expression is evaluated at the top of the loop. §The statements in the loop never execute if expression is initially false. DO index-variable=start TO stop <BY increment> WHILE | UNTIL (expression);
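The difference between checking at the top and checking at the bottom shows up when the condition already rules out any work. In this sketch (variable names made up), x starts at 10, so the WHILE condition is false and the UNTIL condition is true from the start.

```sas
data _null_;
  x = 10;
  n_while = 0;
  do while (x < 5);     /* checked at the top: body never runs */
    n_while + 1;
  end;
  n_until = 0;
  do until (x >= 5);    /* checked at the bottom: body runs once */
    n_until + 1;
  end;
  put n_while= n_until=;   /* log: n_while=0 n_until=1 */
run;
```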
Explain the requirements for match merging:
§Two or more data sets are listed in the MERGE statement. §The variables in the BY statement must be common to all data sets. §The data sets must be sorted by the variables listed in the BY statement. (use PROC SORT)
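These requirements can be sketched end to end with two tiny, made-up data sets that start out unsorted:

```sas
data managers;
  input EmpID Name $;
  datalines;
3 Cho
1 Ames
;

data staff;
  input EmpID Dept $;
  datalines;
2 HR
1 IT
;

proc sort data=managers; by EmpID; run;   /* sort each input by the BY variable */
proc sort data=staff;    by EmpID; run;

data bonuses;
  merge managers staff;   /* two or more data sets */
  by EmpID;               /* BY variable common to both */
run;
/* bonuses has 3 observations: EmpID 1 (both), 2 (staff only), 3 (managers only) */
```

Without the two PROC SORT steps, the DATA step would stop with an error because the BY values are out of order.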