SAS Interview Questions
It frees the memory that was allocated for a Perl regular expression used in character-string pattern matching.
Can you explain the CALL PRXFREE routine?
The main purpose of PROC CALENDAR is to display data from a SAS data set in a monthly calendar format.
Can you explain the purpose of PROC CALENDAR?
With a BY statement, PROC MEANS produces subgroup statistics only when the input data has already been sorted by the BY variables. PROC SUMMARY can produce statistics for all subgroups in one run, giving you the information you would otherwise get by repeatedly sorting the data set by the variables that define each subgroup and running PROC MEANS. PROC SUMMARY does not print any output by default, so you have to use the OUTPUT statement to create a new data set and then use PROC PRINT to see the computed statistics.
Could you explain the difference between PROC MEANS and PROC SUMMARY?
Central to every data set is the data itself. In SAS, data is stored in tabular form: variables occupy the columns and observations occupy the rows. SAS has only two data types: numbers are stored as numeric data and everything else as character data. A SAS date is therefore a numeric value, equal to the number of days since January 1, 1960.
Could you explain how dates work in SAS data?
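A minimal sketch of the date answer above. The variable names are illustrative only; the date literals and the DATE9. format are standard SAS.

```sas
data _null_;
  day0 = '01JAN1960'd;   /* the reference date is stored as 0 */
  day1 = '02JAN1960'd;   /* one day later is stored as 1 */
  put day0= day1=;       /* writes the raw numeric values to the log */
  put day1= date9.;      /* the same value displayed through a date format */
run;
```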
Generally, KEEP and DROP (as statements or as the KEEP= and DROP= data set options) are used to limit the variables written to a data set.
What is the right way to limit the variables written to the output data set in a DATA step?
The CALL MISSING routine assigns a missing value to each of the numeric or character variables that you specify. Numeric variables are set to the numeric missing value (.) and character variables are set to blank.
Could you explain the CALL MISSING routine in detail?
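A short sketch of CALL MISSING, assuming two illustrative variables (`name` and `age`) that are not part of the original answer:

```sas
data _null_;
  name = 'Smith';            /* character variable */
  age  = 42;                 /* numeric variable */
  call missing(name, age);   /* sets name to blank and age to . */
  put name= age=;            /* both now show as missing in the log */
run;
```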
ODS HTML PATH = '/folders/myfolders/sasuser.v94/TutorialsPoint/' FILE = 'CARS2.html' STYLE = EGDefault;
proc SQL;
select make, model, invoice from sashelp.cars
where make in ('Audi','BMW') and type = 'Sports';
quit;
ODS HTML CLOSE;
How do you create HTML output with ODS?
If you do not want to process certain variables and do not want them to appear in the new data set, specify the DROP= data set option in the SET statement. If you want to process certain variables but do not want them to appear in the new data set, specify the DROP= data set option in the DATA statement.
Explain the difference between using the DROP= data set option in the SET statement and in the DATA statement.
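The two placements of DROP= can be sketched as follows. The output data set names and the `discount` variable are hypothetical; `sashelp.cars` and its `msrp` and `invoice` variables ship with SAS.

```sas
/* DROP= on SET: msrp is never read, so it cannot be used in this step */
data work.no_msrp_read;
  set sashelp.cars(drop=msrp);
run;

/* DROP= on DATA: msrp is available for processing but is not written out */
data work.no_msrp_written(drop=msrp);
  set sashelp.cars;
  discount = msrp - invoice;   /* msrp can still be used here */
run;
```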
The SUBSTR function is used for extracting a substring from, or replacing part of, a character value.
Explain the use of SUBSTR function?
The TRANWRD function replaces or removes all occurrences of a pattern of characters within a character string.
Explain what the TRANWRD function does.
PROC GLM performs simple and multiple regression, analysis of variance (ANOVA), analysis of covariance, multivariate analysis of variance and repeated measures analysis of variance.
Explain what Proc glm does?
PROC PRINT displays the contents of a SAS data set and is used to confirm that the data were read into SAS correctly, while PROC CONTENTS displays information about a SAS data set, such as its variables and their attributes.
Explain what PROC PRINT and PROC CONTENTS are used for.
The PDV, or Program Data Vector, is a logical area in memory where SAS builds a data set, one observation at a time. When a program reads raw data, SAS creates an input buffer at compile time to hold a record from the external file. The PDV is created after the input buffer.
Explain what is PDV?
SAS informats are used to read, or input, data from external files (known as flat files, ASCII files, text files or sequential files). The informat tells SAS how to read the data into SAS variables.
Explain what SAS informats are.
The basic structure of a SAS program consists of the DATA step, which retrieves and manipulates data, and the PROC step, which analyzes and reports on the data.
Explain what is the basic structure of SAS program?
PROC GPLOT produces plots of one variable against another. Compared with PROC PLOT, it has more options and can create more colorful and fancier graphics.
Explain what is the use of PROC gplot?
The syntax of PROC SUMMARY is the same as that of PROC MEANS; it computes descriptive statistics on numeric variables in a SAS data set.
Explain the use of the SUMMARY procedure.
A DATA step MERGE does not create a Cartesian product in case of a many-to-many relationship, whereas PROC SQL does produce a Cartesian product.
How do DATA step MERGE and PROC SQL handle a many-to-many relationship?
Use a PUT statement in a DATA step, together with a FILE statement that specifies the comma as the delimiter (DLM=',' or DSD).
How can you write a SAS data set to a comma delimited file ?
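A minimal sketch of the PUT-statement approach, assuming the sample data set `sashelp.class` and a hypothetical output path `class.csv`:

```sas
data _null_;                        /* no output data set is needed */
  set sashelp.class;
  file 'class.csv' dsd;             /* hypothetical path; DSD writes comma-delimited values */
  put name sex age height weight;   /* one comma-separated line per observation */
run;
```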
data work;
  do i=1 to 10 until(Sum>=20000);
    Year+1;
    Sum+2000;
    Sum+Sum*.10;
  end;
run;
This iterative DO statement executes the DO loop until Sum is greater than or equal to 20000 or until the DO loop has executed 10 times, whichever occurs first.
How do you specify the number of iterations and a specific condition within a single DO loop?
You must create a differently named variable using the PUT function, for example: charvar = put(numvar, 7.);
How to convert a numeric variable to a character variable?
Use PROC SQL with COUNT(DISTINCT variable_name) and a GROUP BY clause to determine the number of unique values of a column within each group.
How to count unique values by a grouping variable?
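As an illustration, the query below counts the distinct vehicle types per make in the sample data set `sashelp.cars`; the column alias `n_types` is a name chosen for this sketch.

```sas
proc sql;
  select make,
         count(distinct type) as n_types   /* unique values of type within each make */
  from sashelp.cars
  group by make;
quit;
```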
Several system options can be used to debug SAS macros: MPRINT, MLOGIC and SYMBOLGEN.
How to debug SAS macros?
Use the FIRSTOBS= and OBS= data set options (FIRSTOBS=4 OBS=8).
How to print observations 4 through 8 from a data set?
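A one-step sketch, assuming the sample data set `sashelp.class`:

```sas
/* OBS= is the number of the last observation to read, not a count */
proc print data=sashelp.class(firstobs=4 obs=8);
run;
```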
Use PROC PRINTTO
How to save log in an external file?
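A minimal sketch of routing the log to a file and back again. The file name `myprog.log` is a hypothetical path.

```sas
proc printto log='myprog.log' new;   /* hypothetical path; NEW replaces any existing file */
run;

/* steps submitted here write their log to the file */
data _null_;
  put 'This line goes to the external log file';
run;

proc printto;                        /* no options: restore the default log destination */
run;
```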
By using TABLES Statement.
How to specify variables to be processed by the FREQ procedure?
INFILE statement is used to identify an external file
INFILE
INPUT statement is used to describe your variables
INPUT
By using the MISSOVER option on the INFILE statement.
If reading a variable-length file with fixed input, how would you prevent SAS from reading the next record if the last variable didn't have a value?
In SAS, the WHERE statement does not perform automatic conversions in comparisons.
In SAS, which statement does not perform automatic conversions in comparisons?
The INPUT function uses a SAS informat to convert a character string into a number.
Input function:
Some key concepts of SAS include: the SORT procedure; missing values; the KEEP= and DROP= data set options; DATA step logic; reset to missing, or the RETAIN statement; the log; the FORMAT procedure for creating value formats; data types; and the IN= data set option.
List out some key concepts of SAS.
Five ways to do a "table lookup" in SAS are: PROC SQL, match merging, direct access, format tables and arrays.
Mention 5 ways to do a "table lookup" in SAS?
Common programming errors committed in SAS are: missing semicolons, not checking the log after submitting a program, not using debugging techniques, and not using the FSVIEW option vigorously.
Mention common programming errors committed in SAS.
You read the variables using the INPUT statement with column and line pointers, informats and length specifiers.
Mention how do you read the variables that you need?
To create a permanent SAS data set, two steps are necessary: assign a library and engine, then create the data. Make sure to give the data set both a library (other than WORK) and a data set name to make it permanent.
Mention how to create a permanent SAS data set?
By using the MAXDEC= option you can limit the number of decimal places displayed for the statistics.
Mention how to limit decimal places for the variable using PROC MEANS?
To remove duplicates using PROC SQL, use the following step:
proc sql noprint;
  create table inter.merged1 as
  select distinct * from inter.readin;
quit;
Mention how to remove duplicates using PROC SQL?
For debugging in SAS, use the DEBUG option after a slash (/) in the DATA statement.
Mention how to test the debugging in SAS?
You can generate test data with no input data by using a PUT statement in a DATA _NULL_ step.
Mention how will you generate test data with no input data?
SAS informats are placed in three categories: character informats ($INFORMATw.), numeric informats (INFORMATw.d) and date/time informats (INFORMATw.).
Mention the category in which SAS Informats are placed?
The FLOOR function returns the greatest integer less than or equal to the argument, whereas the CEIL function returns the smallest integer greater than or equal to the argument.
Mention the difference between CEIL and FLOOR functions in SAS?
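A small sketch with one positive and one negative value, since the negative case is where the two functions are most often confused. The variable names are illustrative.

```sas
data _null_;
  floor_pos = floor(2.3);    /* 2  */
  ceil_pos  = ceil(2.3);     /* 3  */
  floor_neg = floor(-2.3);   /* -3 */
  ceil_neg  = ceil(-2.3);    /* -2 */
  put floor_pos= ceil_pos= floor_neg= ceil_neg=;
run;
```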
The difference between SAS functions and procedures is that procedures expect one variable value per observation, whereas functions expect values to be supplied across an observation.
Mention the difference between SAS functions and procedures?
For DATA steps: data dataset_name / debug; and data dataset_name / stmtchk; For macros: options mprint mlogic symbolgen;
Mention the validation tools used in SAS?
To check errors, use the Log and for data validation use things like Proc Freq, Proc Means or sometimes Proc print to see how data looks.
Mention what SAS features do you use to check errors and data validation?
The default statistics that PROC MEANS produces are N, MIN, MAX, MEAN and STD DEV.
Mention what are the default statistics that PROC MEANS produce?
A common scrubbing procedure in SAS is PROC SORT with the NODUPKEY option, which eliminates observations that have duplicate BY values.
Mention what are the scrubbing procedures in SAS?
The special input delimiter options used in SAS are DLM= and DSD.
Mention what are the special input delimiters used in SAS?
In SAS, PROC steps analyze and process data in the form of a SAS data set. They call on a library of routines that perform tasks on SAS data sets, such as sorting, summarizing and listing.
Mention what is PROC in SAS?
SLIBREF is a server-libref. It specifies the libref that is used by the server to identify the SAS data library when no physical name is determined and the server libref is different from the client libref.
Mention what is SLIBREF?
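The MISSING function tests whether a single value is missing, for example is_missing = MISSING(field1); To count the missing values across several variables, use NMISS (numeric arguments) or CMISS (numeric or character arguments), for example missing_values = CMISS(field1, field2, field3);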
Mention what is the command used to find missing values?
A one-to-one merge (MERGE without a BY statement) pairs observations by position, so it is suitable only if both data sets in the MERGE statement are sorted by id and each observation in one data set has a corresponding observation in the other data set. If the observations do not line up in this way, a match merge (MERGE with a BY statement) is suitable.
Mention what is the difference between Match Merge and One to One Merge?
The difference between NODUP and NODUPKEY is that NODUP compares all the variables in the data set, while NODUPKEY compares just the BY variables.
Mention what is the difference between nodupkey and nodup options?
Good SAS programming practice for processing large data sets is to sort them only once, and to use FIRSTOBS= and OBS= to test code on a subset of observations.
Mention what are good SAS programming practices for processing large data sets.
%INCLUDE statement reads an entire file into the current SAS program you are running and submits that file to the SAS System immediately.
Mention what is the use of %include statement?
With POINT=, SAS reads observations by direct access and never detects an end-of-file condition, so a STOP statement is needed to prevent continuous looping of the DATA step.
Mention why a STOP statement is needed for the POINT= option on a SET statement?
The PUT function performs numeric-to-character conversion; it is the counterpart of the INPUT function.
Output function
PROC UNIVARIATE is used for elementary analysis of numeric variables. It helps you examine how the data are distributed.
PROC UNIVARIATE
The double trailing at sign (@@) tells SAS to hold the current input record for the execution of the next INPUT statement, rather than advancing to a new record, even across iterations of the DATA step.
Purpose of double trailing @@ in Input Statement ?
Statistical Analysis System
SAS
MIN, MAX, MEDIAN, RANUNI, INTCK, WEEKDAY, LOWCASE, UPCASE, SCAN, INT, ROUND
SAS Functions
It is a function that extracts a substring from, or replaces part of, a character value.
SUBSTR function
SYMPUT puts a value from a data set into a macro variable, whereas SYMGET gets the value of a macro variable back into the data set.
What are SYMGET and SYMPUT?
The number of observations is limited only by the computer's capacity to handle and store them.
What can be the size of largest dataset in SAS?
The CATX function concatenates character strings, removes leading and trailing blanks, and inserts separators.
What does the CATX function do?
The ANYDIGIT function searches a character string for a digit and returns the position of the first digit that is found.
What is ANYDIGIT function?
BY-group processing is a method of processing observations that are grouped, ordered or indexed according to the values of one or more BY variables.
What is BY-Group processing?
RUN-group processing lets you submit a group of statements within a PROC step by using a RUN statement, without ending the procedure.
What is RUN-Group processing?
SCAN extracts words within a value that is marked by delimiters. SUBSTR extracts a portion of the value by stating the specific location. It is best used when we know the exact position of the sub string to extract from a character value.
What is the Difference between SCAN and SUBSTR?
PROC MEANS gives descriptive statistics and by default prints them in the output window. PROC SUMMARY does not print output by default; you need to specify the PRINT option to see the output.
What is the difference Between Proc Means and Proc Summary?
SUM function returns the sum of non-missing arguments whereas "+" operator returns a missing value if any of the arguments are missing.
What is the difference between '+' operator and SUM function?
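A minimal sketch of the contrast, using three illustrative variables of which one is missing:

```sas
data _null_;
  a = 1; b = .; c = 3;
  plus_result = a + b + c;      /* missing, because b is missing */
  sum_result  = sum(a, b, c);   /* 4, because SUM ignores missing arguments */
  put plus_result= sum_result=;
run;
```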
BY processing requires that your data already be sorted or indexed in the order of the BY variables, whereas the CLASS statement does not require sorted input.
What is the difference between CLASS statement and BY statement in proc means?
DO WHILE expression is evaluated at the top of the DO loop. If the expression is false the first time it is evaluated, then the DO loop never executes. Whereas DO UNTIL executes at least once.
What is the difference between do while and do until?
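The difference shows up when the condition is already decided before the first pass. In this sketch (counter names are illustrative) the DO WHILE body never runs, while the DO UNTIL body runs once:

```sas
data _null_;
  i = 10;
  do while (i < 5);      /* tested at the top: false immediately, body is skipped */
    put 'DO WHILE body executed';
    i + 1;
  end;

  j = 10;
  do until (j >= 5);     /* tested at the bottom: body runs once before the test */
    put 'DO UNTIL body executed';
    j + 1;
  end;
run;
```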
The main difference is that when reading an existing data set with the SET statement, SAS retains the values of the variables from one observation to the next. When reading raw data from an external file, variables read with the INPUT statement are reset to missing at the start of each iteration, and the variables have to be described in the INPUT statement because the file carries no variable definitions.
What is the difference between reading data from an external file and reading data from an existing dataset?
Write OPTIONS OBS=0 at the beginning of the code to check the program without processing any observations. When you then run it, any problems are reported in the log, where errors and warnings are highlighted in color.
What is the right way to validate the SAS program?
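2 bytes and 1 byte. (The 2-byte numeric minimum applies to mainframe systems; on Windows and UNIX the minimum numeric length is 3 bytes.)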
What is the smallest length for a numeric and character variable respectively ?
Five
What would be the denominator value used by the mean function if two out of seven arguments are missing?
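A quick check of the denominator, using seven illustrative arguments of which two are missing. The non-missing values sum to 20, so the mean is 20 / 5 = 4.

```sas
data _null_;
  n_used = n(1, 2, ., 4, ., 6, 7);      /* 5 non-missing arguments */
  avg    = mean(1, 2, ., 4, ., 6, 7);   /* 20 / 5 = 4 */
  put n_used= avg=;
run;
```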
The WHERE statement.
Which SAS statement does not perform automatic conversions in comparisons?
INTNX function advances a date, time, or datetime value by a given interval, and returns a date, time, or datetime value
Which date function advances a date, time or datetime value by a given interval?
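A short sketch of INTNX with an illustrative start date. By default the result is aligned to the beginning of the interval; the optional fourth argument `'same'` keeps the same day of the month.

```sas
data _null_;
  start      = '15JAN2024'd;
  next_month = intnx('month', start, 1);           /* 01FEB2024: start of next month */
  same_day   = intnx('month', start, 1, 'same');   /* 15FEB2024: same day, next month */
  put next_month= date9. same_day= date9.;
run;
```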
BOR is a bitwise logical function that returns the bitwise logical OR of two arguments.
Explain BOR function?
The COMPRESS= data set option compresses the observations in the output data set to reduce the storage it needs.
Explain COMPRESS data set option?
When you specify DSD, SAS treats two consecutive delimiters as a missing value and removes quotation marks from character values.
Explain how SAS treat the DSD delimiters?
You can debug and test your SAS program by using OBS=0 and system options that trace the program execution in the log.
Explain how you can debug and test your SAS program?
The purpose of the STOP statement is to stop processing the current DATA step immediately. SAS then resumes processing with the statements that follow the end of the current DATA step.
Explain in details about function of Stop statement in a SAS Program
The STOP statement causes SAS to stop processing the current DATA step immediately and resume processing statements after the end of the current DATA step.
Define the function of Stop statement in a SAS Program?
%EVAL performs integer arithmetic only and cannot handle operands that have floating-point values. That is where the %SYSEVALF function comes into the picture.
Difference between %EVAL and %SYSEVALF
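A minimal sketch of the two macro functions applied to the same expression. The macro variable names are illustrative.

```sas
%let int_result   = %eval(7/2);       /* integer arithmetic: the result is 3 */
%let float_result = %sysevalf(7/2);   /* floating-point arithmetic: the result is 3.5 */
%put int_result=&int_result float_result=&float_result;
```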
Take a character variable PayRate whose values start with a dollar sign ($). When SAS tries to convert PayRate to a numeric value automatically, the dollar sign blocks the conversion, so the value cannot be converted and the result is missing. That is why it is always advised to use the PUT and INPUT functions explicitly in programs where a conversion takes place.
Do you have any example in which SAS does not convert or fails to convert the character value to numeric one?
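One way to handle such a value explicitly is the INPUT function with the COMMAw. informat, which removes dollar signs and commas before converting. This is a sketch; the sample value and the name `rate_num` are assumptions made for illustration.

```sas
data _null_;
  PayRate  = '$1,250';                    /* character value with a dollar sign */
  rate_num = input(PayRate, comma10.);    /* COMMAw. strips $ and commas, giving 1250 */
  put rate_num=;
run;
```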
If the target variable has not already been given a length, the SCAN function assigns it a length of 200. (In recent SAS releases, a DATA step instead gives the variable the length of the first argument.)
Do you know what length the SCAN function assigns to the target variable?
SUBSTR, SCAN, CATX, TRIM, INDEX, FIND, TRANWRD and SUM.
Do you know any SAS functions?
SAS has two data types: numeric and character. Dates are stored as numeric values, and there are dedicated functions and formats for working with them.
Do you know the data types present in SAS?
Two commonly used character handling functions are UPCASE and LOWCASE.
Do you know the functions that are used for Character handling functions?
CALL PRXCHANGE routine helps to perform the pattern matching replacement
Do you know what CALL PRXCHANGE routine is?
The loop continues till the UNTIL condition becomes True.
do until
The loop continues till the while condition becomes false.
do while
A RETAIN statement tells SAS not to set variables to missing when going from the current iteration of the DATA step to the next. Instead, SAS retains the values.
For what purpose would you use the RETAIN statement?
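A typical use is a running total. In this sketch, the output data set `work.running` and the variable `total_weight` are hypothetical names; `sashelp.class` ships with SAS.

```sas
data work.running;
  set sashelp.class;
  retain total_weight 0;                  /* initialized once, then kept across iterations */
  total_weight = total_weight + weight;   /* accumulates weight observation by observation */
run;
```

Without the RETAIN statement, `total_weight` would be reset to missing at the start of every iteration and the sum would never accumulate.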