SAS Exam 3 (CH5-CH12)
Assigning date values to variables in an assignment statement by using date constants
'ddmmmyy'd or 'ddmmmyyyy'd, for example '01JAN60'd
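A minimal sketch of date constants in assignment statements. The data set and variable names are hypothetical; the stored values follow from SAS dates being counted in days from 01JAN1960.

```sas
data work.dates;
   /* A date constant is stored as the number of days since 01JAN1960 */
   StartDate = '01jan1960'd;   /* stored value is 0 */
   NextDay   = '02jan1960'd;   /* stored value is 1 */
   /* Without a format the values would print as plain numbers */
   format StartDate NextDay date9.;
run;
```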
By default sum statement variables initialize to what
0
Variable Naming Rules
1 to 32 characters long; begins with a letter or underscore; continues with any combination of numerals, letters, or underscores
Column input is used when the data has
1. standard character or numeric values 2. fixed fields
A character format name is how long
31 characters in length
A numeric format name is how long
32 characters in length
Comparison Operators
=, ^=, <, >, <=, >=
SAS name literal
A name token expressed as a string within quotation marks, followed by the uppercase or lowercase letter n. The name literal tells SAS to allow the special character $ in the data set name, for example libref.'sheetname$'n
Another way to conditionally process other than if-then/else
A select group
Logical Operators
AND (&) OR (|)
Concatenating
Appends the observations of one data set to those of another, stacking them vertically; the data sets do not need to have the same variables
The default format SAS uses for writing numeric values
BESTw. - will display the most information about a value according to the available field width
Why do we use libname libref clear;
Because if SAS has a libref assigned to an excel workbook the workbook cannot be opened in excel
How do BY and CLASS differ in PROC MEANS?
BY processing requires that your data already be sorted or indexed in the order of the BY variables. BY also produces a separate small table for each BY group, whereas CLASS produces a single large table
Input Buffer
Created to hold a record from an external file
What needs to be done before you use the by statement with a set statement
Datasets that are listed in the set statement must be sorted by the values of the by variables
SAS datetime constant in an assignment statement
DateTime='18jan2005:9:27:05'dt
what does the _NULL_ keyword do in data _null_;
Enables you to use the DATA step without actually creating a SAS data set
Does the data step execute once for each record in the input file or once for each file?
Executes once for each record in the input file
. in terms of logical operators
FALSE
0 in terms of logical operators
FALSE
What do you do in column input if the values occupy only one column
Gender $ 14 where gender is a character variable that only occupies column 14
Which ODS destination is open by default?
HTML
If then delete syntax
IF expression THEN DELETE;
When to use DROP= or KEEP= in the DATA statement
When you need to reference a variable from the original data set during processing but do not want it written to the new data set
Where are formats stored by default?
In a default format catalog named Work.Formats
How to permanently store formats?
In a permanent catalog named formats when you specify the library option in the proc format statement proc format library=libref;
Interleaving
Intersperses observations from two or more datasets based on one or more common variables statements: set, by
What does the SAS/ACCESS libname statement do
It associates SAS name with an excel workbook file by pointing to its location. The workbook becomes a new library in SAS and the worksheets in the workbook become the individual SAS datasets in the library
Reference a SAS library
LIBNAME statement libname libref 'path'
What input can be used to read raw data files that contains values that are not in fixed fields?
List input
Match merging
Matches observations from two or more data sets into a single observation in a new data set according to the values of a common variable. Statements: MERGE, BY
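A small sketch of match-merging, assuming two made-up data sets (work.demog and work.visit) that share the variable ID. Both inputs are sorted by the BY variable before the MERGE.

```sas
data work.demog;
   input ID $ Age;
   datalines;
A02 41
A01 35
;

data work.visit;
   input ID $ Visit;
   datalines;
A01 1
A02 1
A01 2
;

/* Each input must be sorted (or indexed) by the BY variable */
proc sort data=work.demog; by ID; run;
proc sort data=work.visit; by ID; run;

/* Observations with the same ID are joined into one observation */
data work.merged;
   merge work.demog work.visit;
   by ID;
run;
```

Here A01 appears twice in work.visit, so its Age value is carried onto both matching observations.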
Can you use column input to read a file that contains data that is not arranged in columns?
No
Can you use multiple datalines statements in a data step?
No
To create simple HTML output do you have to specify the ODS HTML statement?
No
Do you need a run statement following the null semicolon statement in a datalines?
No - the datalines statement functions as a step boundary so the data step is executed as soon as SAS encounters it
Can column input be used to read in data with commas?
No, values that contain commas are nonstandard data
If the expression in a subsetting IF statement is false
No further statements are processed for that observation, control returns to the top of the data step
Do invalid data errors cause SAS to stop processing the program?
No it does not. Only syntax errors cause SAS to stop processing the program
Is an input buffer created when a SAS dataset is read?
No, an input buffer is created only when raw data is read
When reading variables from a SAS dataset, does SAS set the values to missing for every row?
No, only before the first cycle of execution of the DATA step
Do you need a period when referring to the SAS format COMMA9.2?
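No extra period is needed. Every format reference contains a period; in COMMA9.2 it falls between the width (9) and the number of decimal places (2). Formats written without decimal places, including user-defined formats, take the period at the end of the name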
Which keyword can be used to label missing numeric values as well as any values that are not specified in a range
OTHER
ODS stands for
Output Delivery System
Order in which arithmetic operations are performed
PEMDAS
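What are the arithmetic operators
** (exponentiation), * (multiplication), / (division), + (addition), - (subtraction), evaluated in PEMDAS order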
What should you remember when associating a format with a variable in a data or proc step
Place a period at the end of the format name when it is used in the format statement
PDV
Program Data Vector - area of memory where SAS holds one observation at a time
Named Range
Range of cells within a worksheet that you define in Excel and assign a name to - the named range and its parent worksheet will appear in the SAS explorer window as separate datasets, except that the dataset created from the named range will have no dollar sign appended to its name
What happens if the results of all SELECT-WHEN comparisons are false and no OTHERWISE statement is present
SAS issues an error message and stops executing the DATA step
What happens if you do not execute a file statement before a put statement in a data step?
SAS writes the lines to the SAS log
How to read Microsoft Excel data
SAS/ACCESS LIBNAME statement or Import Wizard
In what format does SAS read data that is stored in excel files?
The data gets read just as it is stored in excel
How is an if statement processed...if the expression is true
The data step continues to process that observation
How many observations in one to one merging
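When the data sets are combined with multiple SET statements, the number of observations in the smallest dataset, because observations are combined based on their relative position in each data set and the step stops when any SET statement reaches the end of its file. A MERGE statement without BY instead continues to the end of the largest dataset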
SAS time constant in an assignment statement
Time='9:25't
If your data contains semicolons but you want to read it in using datalines what can you do
Use Datalines4 and then a null statement that consists of 4 semicolons
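A sketch of the DATALINES4 form, with made-up data and a hypothetical variable name. Because the data lines contain semicolons, the block is closed by a line of four semicolons instead of one.

```sas
data work.notes;
   infile datalines truncover;
   /* $CHAR40. keeps the whole line, semicolons and blanks included */
   input Text $char40.;
   datalines4;
first value; second value
a line with; several; semicolons
;;;;
```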
length statement
Use a length statement to assign a length that accommodates the longest value of a variable. Otherwise, when the variable is created with if-then/else statements, its length is set by the first value assigned and longer values might get truncated.
What needs to be done to prevent continuous looping with point=
Use a stop statement to prevent continuous looping. The stop statement causes SAS to stop processing the current data step immediately and to resume processing statements after the end of the current DATA step
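A minimal sketch of direct access with POINT= and STOP. The data set names and the choice of observation 3 are assumptions for illustration.

```sas
/* Build a five-row data set: Value = 10, 20, 30, 40, 50 */
data work.five;
   do Value = 10 to 50 by 10;
      output;
   end;
run;

data work.pick;
   ObsNum = 3;                     /* observation number to fetch */
   set work.five point=ObsNum;     /* read that observation directly */
   output;                         /* explicit OUTPUT: STOP skips the implicit one */
   stop;                           /* no end-of-file is detected with POINT=, so stop here */
run;
```

The POINT= variable is temporary and is not written to work.pick, which ends up with the single observation Value=30.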
How to permanently associate a format with a variable
Use format in a data step
How to temporarily associate a format with a variable
Use format in a proc step
What should you do if you want an apostrophe to appear in a label?
Use two single quotation marks
When is the proc freq useful
When the dataset has variables whose values can be described as categorical, whose values are best summarized by counts rather than by averages
Put statement
When the source of program errors is not apparent, use the PUT statement to examine variable values and print your own message in the log:
data work.test;
   infile loan;
   input Code $ 1 Amount 3-10 Rate 12-16 Account $ 18-25 Months 27-28;
   if code='1' then type='variable';
   else if code='2' then type='fixed';
   else type='unknown';
   if type='unknown' then put 'MY NOTE: invalid value: ' code=;
run;
What is the main difference between reading an existing data set with the SET statement and reading raw data
While reading an existing data set SAS retains the values of existing variables from one observation to the next
How do you write a SAS dataset from a raw data file?
Write a data step program - location or name of external file, name for new SAS dataset, reference that identifies the external file, description of the data value to be read
Are libname and filename statements global?
Yes
Can you nest do statements within do groups?
Yes
Can you place a format statement in both a data step and a proc step?
Yes
Can you use if-then/else or select statements in do groups?
Yes
if you assign temporary labels or formats within a proc step do they override permanent labels or formats created previously?
Yes
Can you specify a variable in a put statement?
Yes - you can specify but it will only print the value of the variable. To write both the variable name and value in the log add an equal sign to the variable name
Can you overwrite the value of an existing variable in an assignment statement?
Yes, by naming the same variable on both sides, like resthr=resthr+(resthr*10);
Can you subset data as you read it in?
Yes use a subsetting if statement
DROP/KEEP vs. DROP=KEEP=
You cannot use the DROP and KEEP statements in SAS procedure steps; the DROP= and KEEP= data set options can be used there
The two automatic variables in PDV that are used in processing but not written to the dataset as part of an observation
_N_ - counts the number of times the data step executes; _ERROR_ - signals the occurrence of an error that is caused by data during execution - 0 = no error, 1 = one or more errors
Statistics for grouped observations, ie separate analyses for grouped observations in the MEANS procedure
add a CLASS statement to the MEANS procedure
To specify only a certain number of variables to focus on for proc means
add a var statement
Arithmetic operator +
addition
Appending
adds observations in the second data set directly to the end of the original data set procedure:APPEND
If you want to create a new variable when reading a raw data file does it go before or after the input statement?
After the input statement, as follows:
data sasuser.stress;
   infile tests;
   input ID $ 1-4 Name $ 6-25 RestHR 27-29 MaxHR 31-33 RecHR 35-37 TimeMin 39-40 TimeSec 42-43 Tolerance $ 45;
   TotalTime=(timemin*60)+timesec;
run;
When a by group contains only one observation, do first. and last. both get 1 or 0
both first. and last. will be 1
Another statement in proc means that groups variable
by
What happens if you use if-then statements without the ELSE statement
causes SAS to evaluate all IF-THEN statements
What happens if you use if-then statements with the ELSE statement
causes SAS to evaluate IF-THEN statements only until it encounters the first true condition
When crosstabulations are specified in a proc freq tables statement, cells contain
cell frequency, percentage of total freq, percentage of row freq, percentage of col freq
When are statements scanned for syntax errors - compilation or execution phase
compilation
raw data file
contain data values that are organized in fields
One to One merging
contains all variables from each dataset, combines observations based on their relative position in each dataset
BODY= or FILE= specification in the ODS HTML statement
specifies a custom name for the HTML body file that contains the procedure results
CONTENTS= and FRAME= options in ODS HTML
creates table of contents that links to your HTML output
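A sketch of an ODS HTML statement that writes body, contents, and frame files. The three file names are hypothetical, and sashelp.class is a sample data set shipped with SAS.

```sas
/* Open the HTML destination with custom file names (hypothetical) */
ods html body='body.html'
         contents='contents.html'
         frame='frame.html';

proc print data=sashelp.class;
run;

/* Close the destination so the files are completed */
ods html close;
```

Opening frame.html in a browser should show the table of contents alongside the body file.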
Writing observations to a raw data file
data _NULL_; set libref.dataname; file 'name' or 'path'; put vars and cols; run;
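A filled-in sketch of that pattern, assuming the sample data set sashelp.class as input and a hypothetical output file name.

```sas
data _null_;                   /* no SAS data set is created */
   set sashelp.class;
   file 'class.txt';           /* hypothetical output file */
   /* Column output: Name in columns 1-10, Age in columns 12-13 */
   put Name $ 1-10 Age 12-13;
run;
```

Without the FILE statement, the same PUT lines would go to the SAS log.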
Name a SAS data set
data statement data libref.name
END=option in the set statement
data work.addtoend(drop=timemin timesec);
   set sasuser.stress2(keep=timemin timesec) end=last;
   TotalMin+timemin;
   TotalSec+timesec;
   TotalTime=totalmin*60+totalsec;
   if last;
run;
proc print data=work.addtoend noobs;
run;
How to read instream data
datalines statement as the last statement in the data step and immediately preceding the data lines
input statement
describes the fields of raw data to be read and placed in a SAS dataset input variable <$> startcol-endcol;
What gets outputted with fmtlib option in proc format
description of formats, length of longest label, number of values defined by format, version of SAS used to create format, date and time of creation
Getnames=yes|no libname option
determines whether SAS will use the first row of data in an excel worksheet or range as column names
Crosslist option to the tables statement
displays crosstabulation tables in ODS column format - creates a table that has table definition that you can customize using the TEMPLATE procedure
Arithmetic operator /
division
simple do group
do; sas statements end; do statement begins do group processing end terminates do group processing
DO UNTIL statement
executes statements in a DO loop repetitively until a condition is true, checking the condition after each iteration of the DO loop
DO WHILE statement
executes statements in a DO loop repetitively while a condition is true, checking the condition before each iteration of the DO loop
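A small sketch contrasting the two loops, with hypothetical variable names. Both conditions are chosen so that the difference in when the test happens is visible after a single pass.

```sas
data work.loops;
   /* DO WHILE tests at the top: the body never runs if the condition starts false */
   WhileCount = 0;
   do while (WhileCount < 0);
      WhileCount = WhileCount + 1;
   end;

   /* DO UNTIL tests at the bottom: the body always runs at least once */
   UntilCount = 0;
   do until (UntilCount >= 0);
      UntilCount = UntilCount + 1;
   end;
run;
```

WhileCount stays 0 because its loop body is skipped, while UntilCount ends at 1.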
Arithmetic operator **
exponentiation
Reference an external file
filename statement filename tests 'path'
How to use an ODS statement
for each type of output, use an ODS statement to open the destination; at the end of the program use another ODS statement to close the destination: ODS open-destination; ODS close-destination close;
Using drop= or keep= in the set statement
if you never reference those variables and don't want them to appear in the new dataset
Identify an external file
INFILE statement: infile file-specification; where file-specification is either a fileref that names a previously defined file reference or a quoted filename that gives the actual name and location of the file. If you have defined the fileref tests, read the raw data file with infile tests; without a predefined fileref, use infile 'path';
dbmax_text=n libname option
length of longest character string
Proc fslist
lets you view content and structure of raw data files
the SAS/ACCESS libname statement to reference an excel workbook file
libname libref 'c:\users\exercise.xlsx';
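A sketch of reading one worksheet through that libref. It assumes SAS/ACCESS Interface to PC Files is licensed, that the workbook holds a sheet named Sheet1 (hypothetical), and that the libref name results is free; some installations may also need an engine name on the LIBNAME statement.

```sas
libname results 'c:\users\exercise.xlsx';

/* The name literal lets the $ appear in the data set name */
proc print data=results.'Sheet1$'n;
run;

/* Release the workbook so it can be opened in Excel again */
libname results clear;
```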
Create a new worksheet from a SAS dataset and save it to a path
libname libref 'path' mixed=yes; data libref.worksheetname; set work.datasetname; run;
Options in a libname statement
libname libref 'path' options;
Entire structure for reading in an external raw data file using column input
libname libref 'path';   /* optional */
filename name 'path';
data newdataname;
   infile externalfilename;
   input variable colstart-colend;
run;
Aliases for datalines
lines or cards
to limit decimal place in proc means
maxdec=option proc means data=dataset statistics maxdec=n; run;
What type of errors are detected during compilation phase
misspelled keywords and data set names, unbalanced quotes, invalid options - if an error occurs during compilation execution will not occur
Arithmetic operator *
multiplication
proc means default prints
n count, mean, standard deviation, min and max of every numeric variable in the data set
^ logical operator
not
What types of error warnings occur during execution
note or error message, values stored in PDV are displayed, processing continues or stops
Standard numeric data contain
numerals decimals numbers in scientific notation plus or minus signs
open and close all ODS destinations
ods rtf file=filepath; ods pdf file=filepath; ods _all_ close; ods html;
output statement in proc means
writes the statistics computed by proc means to a SAS output data set
proc freq procedure
proc freq data=sas-data-set <NLEVELS>; tables variables; run; TABLES specifies which frequency tables to produce. NLEVELS displays a table that provides the number of distinct values for each variable named in the tables statement
proc means procedure
proc means data=sas-data-set <statistics>; var variables; run;
Difference between proc means and proc summary
proc means produces a report by default; proc summary produces no report unless you include the print option in the proc summary statement, so it is mainly used to create an output dataset
Another way to create summarized output data set
proc summary
proc freq
produces one-way and n-way frequency tables and creates crosstabulation tables that summarize data for two or more categorical variables by showing the number of observations for each combination of variable values: proc freq data=sas-data-set; run;
Put statement to write all variable names and variable values including automatic variables to the log
put 'string' _ALL_;
Select group syntax
select when otherwise end
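A sketch of a complete SELECT group, using made-up data and the same Code/Type idea as the PUT statement card. The OTHERWISE statement catches every value that no WHEN matches.

```sas
data work.typed;
   length Type $ 8;            /* long enough for the longest label */
   input Code $ 1;
   select (Code);
      when ('1') Type = 'variable';
      when ('2') Type = 'fixed';
      otherwise  Type = 'unknown';
   end;
   datalines;
1
2
9
;
```

Dropping the OTHERWISE statement here would make the step fail on the third record, since no WHEN matches '9'.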
How do worksheet names appear in the SAS library representing the excel workbook?
sheetname$
purpose of the path option in ODS
specifies the location of the HTML file output
mixed=yes|no libname option
specifies whether to import data with both character and numeric values and convert all data to character
scantext=yes|no libname option
specifies whether to read the entire data column and use length of longest string found as the SAS column width
scantime=yes|no
specifies whether to scan all row values in a date/time column and automatically determine the TIME. format if only time values exist
usedate=yes|no
specifies whether to use the DATE9. format for date/time values in Excel workbooks
COLUMN INPUT FOR
standard values in fixed fields (columns)
Arithmetic operator -
subtraction or negative prefix
sum vs assignment main difference
the sum statement ignores missing values, whereas an assignment statement that adds a missing value sets the result to missing as well
nofreq, nopercent, norow, nocol options
suppress the cell frequency, overall percentage, row percentage, and column percentage, respectively, in the cells of a crosstabulation table: proc freq data=data; tables vars / nofreq norow nocol; format; run;
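A short sketch of these options on a two-way table, assuming the sample data set sashelp.class. Only the cell frequencies remain in each cell.

```sas
proc freq data=sashelp.class;
   /* Sex forms the rows, Age forms the columns */
   tables Sex*Age / nopercent norow nocol;
run;
```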
More than two variables in tables in proc freq
tables var1*var2*var3; where var1 forms the levels (strata), var2 the rows, and var3 the columns
NOCUM option to your tables statement
tables variable(s)/nocum; suppresses display of the cumulative frequencies and cumulative percentages in one way frequency tables and in list output
An else statement executes only if...
the condition in the previous if-then/else statement is false
To specify variables to be processed by the freq procedure use
the tables statement
What gets created when there is a by statement
two temporary variables first.byvariable last.byvariable with values 1 or 0
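A sketch of how FIRST. and LAST. are typically used to total a value within each BY group. The data set and variable names are made up, and the input is already in Region order.

```sas
data work.sales;
   input Region $ Amount;
   datalines;
East 10
East 20
West 5
;

data work.totals;
   set work.sales;
   by Region;
   if first.Region then Total = 0;   /* reset at the start of each group */
   Total + Amount;                   /* sum statement accumulates */
   if last.Region;                   /* keep one row per group */
   keep Region Total;
run;
```

West has a single observation, so first.Region and last.Region are both 1 for it. The result has one row for East (Total=30) and one for West (Total=5).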
What should you do to initialize a sum variable to a value other than 0
use RETAIN statement - assigns an initial value to a retained variable. prevents variables from being initialized each time the DATA step executes
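A minimal sketch of RETAIN supplying a starting value for a sum statement variable. The names and the starting value of 100 are hypothetical.

```sas
data work.running;
   retain Balance 100;      /* start the accumulator at 100 instead of 0 */
   input Deposit;
   Balance + Deposit;       /* missing deposits are ignored */
   datalines;
25
.
50
;
```

Balance is 125 after the first record, stays 125 for the missing deposit, and reaches 175 on the last record.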
To define several formats in a single proc format step
use multiple value statements - each value statement defines a different format
To access observations directly by their observation number use what
use the point=option in the SET statement
proc format statement
used to define your own formats proc format <options>; options - library - specifies the libref for a SAS data library to contain a permanent catalog of user defined formats fmtlib - displays all of the formats in your catalog along with descriptions of their values
Executing a group of statements as a unit
using do groups
Suppressing the default printed report from proc means and only getting the output data set
using the noprint keyword proc means data=data noprint;
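A sketch that combines NOPRINT with CLASS, VAR, and OUTPUT, assuming the sample data set sashelp.class. The output data set name and the statistic variable names are hypothetical.

```sas
proc means data=sashelp.class noprint;
   class Sex;                         /* group the analysis by Sex */
   var Height Weight;                 /* analyze only these variables */
   /* MEAN= names are matched to the VAR variables in order */
   output out=work.classstats mean=AvgHeight AvgWeight;
run;

/* The report is suppressed, so print the output data set instead */
proc print data=work.classstats;
run;
```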
value statement rules and syntax
value format-name range1=label1 range2=label2 ..; format name begins with $ if char valid SAS name cannot be an existing format cannot end in number does not end in period when specified in a VALUE statement
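A sketch of one PROC FORMAT step that defines a numeric and a character format, then applies both. The format names, labels, and data are made up for illustration.

```sas
proc format;
   /* Numeric format: no $ prefix, no trailing period in the VALUE statement */
   value agefmt low-<18 = 'Minor'
                18-high = 'Adult'
                other   = 'Unknown';   /* catches missing values */
   /* Character format: name begins with $ */
   value $sexfmt 'F' = 'Female'
                 'M' = 'Male';
run;

data work.people;
   input Sex $ Age;
   /* The period is required when the format is referenced */
   format Sex $sexfmt. Age agefmt.;
   datalines;
F 34
M 12
F .
;
```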
Non-standard numeric data includes
values that contain special characters like % $ , date and time values; data in fraction, integer binary, real binary, or hexadecimal forms
sum statement general form
variable+expression where variable specifies the name of the accumulator variable expression is any valid SAS expression
SAS sets the value of each variable in a DATA STEP to missing except for when
variables are named in a retain statement, variables are created in a sum statement, data elements in a temporary array, any variables that are created with options in the file or infile statements, automatic variables
What does a frame file from ODS display
will display the contents and body file
can you add other optional statistics in the proc means?
yes, proc means data=dataset statistics; run;