R Programming

NaN

Undefined mathematical operation

&&

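Logical AND for single (length-one) values, with short-circuit evaluation. Older versions of R evaluated only the first member of a longer vector; R 4.3.0 and later raise an error instead.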

||

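Logical OR for single (length-one) values, with short-circuit evaluation. Older versions of R evaluated only the first member of a longer vector; R 4.3.0 and later raise an error instead.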
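
A minimal sketch of the scalar operators next to their element-wise counterparts. The values are made up for illustration, and only length-one operands are passed to && and || because the handling of longer vectors depends on the R version.

```r
# Scalar operators work on single TRUE/FALSE values.
a <- TRUE && FALSE    # FALSE
b <- TRUE || FALSE    # TRUE

# Short-circuiting: the right-hand side is not evaluated
# when the left-hand side already decides the result.
c1 <- FALSE && stop("never reached")   # FALSE, no error raised
c2 <- TRUE  || stop("never reached")   # TRUE, no error raised

# The element-wise forms & and | compare whole vectors.
v_and <- c(TRUE, TRUE, FALSE) & c(TRUE, FALSE, FALSE)   # TRUE FALSE FALSE
v_or  <- c(TRUE, TRUE, FALSE) | c(TRUE, FALSE, FALSE)   # TRUE TRUE FALSE
```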

Comparison (relational) operators

>=, >, <=, <, ==, !=

file.rename()

Change the name of a file

NA

Missing value

Test to see if objects are NA

is.na()

Isolate the positive elements of vector x without including NA

x[!is.na(x) & x > 0]
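
A short illustration of why the is.na() check matters, using a made-up vector x (not from the original card).

```r
x <- c(3, -1, NA, 7, 0, NA, 2)   # made-up data

# Keep only elements that are both non-missing and positive.
positives <- x[!is.na(x) & x > 0]   # 3 7 2

# Without the is.na() check, each NA position comes back as NA.
with_na <- x[x > 0]                 # 3 NA 7 NA 2
```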

Create an index vector that will show the first ten elements of vector x

x[1:10]

Four different flavors of index vectors

Logical vectors, vectors of positive integers, vectors of negative integers, and vectors of character strings
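
The four flavors can be sketched on one small named vector. The vector and its names are hypothetical; each line selects the same two elements.

```r
x <- c(a = 10, b = 20, c = 30, d = 40)   # hypothetical named vector

by_logical  <- x[c(TRUE, FALSE, TRUE, FALSE)]   # logical vector
by_positive <- x[c(1, 3)]                       # positive integers
by_negative <- x[c(-2, -4)]                     # negative integers (exclusion)
by_name     <- x[c("a", "c")]                   # character strings
```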

Reading in R code files (inverse of dput)

dget; the analogous function for writing data is dput
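
A minimal round-trip sketch, assuming a small made-up data frame and a temporary file so that nothing is written to the working directory.

```r
df <- data.frame(id = 1:3, label = c("x", "y", "z"))   # made-up data

path <- tempfile(fileext = ".R")
dput(df, file = path)    # write the object as R code
df2 <- dget(path)        # read it back in

# Compare the original and the restored object.
round_trip_ok <- isTRUE(all.equal(df, df2))

unlink(path)             # remove the temporary file
```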

|

Element-wise (vectorized) or

Reading in R code files (inverse of dump)

source; the analogous function for writing data is dump
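
A sketch of dump and source working together. The two environments (src, dest) are an assumption added here to keep the example isolated from the global workspace; in interactive use both functions are usually called without them.

```r
# Objects to save live in their own environment (assumption for isolation).
src <- new.env()
assign("x", "foo", envir = src)
assign("y", c(a = 1, b = 2), envir = src)

path <- tempfile(fileext = ".R")
dump(c("x", "y"), file = path, envir = src)   # write both objects as R code

# Re-create the objects in a fresh environment by running the file.
dest <- new.env()
source(path, local = dest)
restored_x <- get("x", envir = dest)
restored_y <- get("y", envir = dest)

unlink(path)
```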

Set the dimensions of my_vector to 4 rows and 5 columns, so it's now a matrix instead of a vector.

dim(my_vector) <- c(4,5)

Set the row names of the matrix "m" to "a" and "b" and the column names to "c" and "d"

dimnames(m) <- list(c("a", "b"), c("c", "d"))
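
A small sketch of both assignments, assuming my_vector holds 1:20 and m is a made-up 2 x 2 matrix.

```r
my_vector <- 1:20              # assumed contents
dim(my_vector) <- c(4, 5)      # filled column by column
is_matrix <- is.matrix(my_vector)
cell <- my_vector[2, 3]        # row 2, column 3

m <- matrix(1:4, nrow = 2)     # made-up matrix
dimnames(m) <- list(c("a", "b"), c("c", "d"))
val <- m["b", "c"]             # index by row and column name
```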

Delete a directory named "testdir2"

unlink("testdir2", recursive = TRUE)

ls()

List all of the objects in the local workspace

header

Logical indicating if the file has a header line

nrows

The number of rows to read from the dataset

Atomic vector

Atomic vectors contain exactly one data type, whereas lists can contain multiple data types. Atomic vectors can be numeric, logical (e.g., TRUE, FALSE, NA), character, integer, or complex.

file.info()

Get information about the file

setwd()

Move to a different working directory (put a path in the parentheses)

rep()

Replicate

Make the column names for the my_data data frame the cnames vector

colnames(my_data) <- cnames

?

Use this to prompt the help page (e.g., ?list.files will bring up the help page for the list.files function). Note that symbols must be enclosed in backticks after the ? (e.g., ?`:`).

How to determine the classes of each column after reading in a table named initial

initial <- read.table("database.txt", nrows = 100)
classes <- sapply(initial, class)
tabALL <- read.table("database.txt", colClasses = classes)
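
The same pattern can be tried end to end on a temporary file. The file contents and column names below are made up, and header = TRUE is added only because this sample file is written with a header line.

```r
# Write a made-up 200-row table to a temporary file.
path <- tempfile(fileext = ".txt")
sample_data <- data.frame(id = 1:200, score = runif(200), grp = "a")
write.table(sample_data, path, row.names = FALSE)

# Read a small slice first to let R work out the column classes.
initial <- read.table(path, header = TRUE, nrows = 100)
classes <- sapply(initial, class)

# Reuse those classes when reading the whole file.
tabAll <- read.table(path, header = TRUE, colClasses = classes)

unlink(path)
```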

Test to see if objects are NaN

is.nan()

Reading in saved workspaces

load; the analogous function for writing data is save
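
A minimal sketch of save and load, using made-up objects and a temporary file. Loading into a separate environment (env) is an assumption made here so the example does not overwrite anything in the workspace.

```r
a <- 1:5        # made-up objects
b <- "text"

path <- tempfile(fileext = ".RData")
save(a, b, file = path)            # write both objects in binary form

env <- new.env()
loaded <- load(path, envir = env)  # returns the names of the restored objects

unlink(path)
```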

Mean of column "Ozone" in data frame hw1, ignoring missing values

mean(hw1$Ozone, na.rm = TRUE) or mean(hw1$Ozone[!is.na(hw1$Ozone)])

Change the matrix my_matrix to a data frame named my_data

my_data <- data.frame(my_matrix)

Select 100 elements at random from vectors y and z

my_data <- sample(c(y, z), 100)

Create a matrix called my_matrix2 containing the numbers 1-20 and dimensions of 4 rows and 5 columns

my_matrix2 <- matrix(data = 1:20, nrow = 4, ncol = 5, byrow = FALSE, dimnames = NULL)

Get the element names of this numeric vector: vect <- c(foo = 11, bar = 2, norf = NA)

names(vect)

Assign names to the numeric vector: vect2 <- c(11, 2, NA)

names(vect2) <- c("foo", "bar", "norf")

Assign the value of the current working directory to a variable called "old.dir"

old.dir <- getwd()

Join the elements of a character vector called my_char together in one continuous character string.

paste(my_char, collapse = " ")
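
For illustration, with a made-up my_char vector:

```r
my_char <- c("My", "name", "is")          # made-up character vector

# collapse joins all elements into a single string.
joined <- paste(my_char, collapse = " ")  # "My name is"
```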

Create a sequence of numbers from pi to 10

pi:10 This will create 3.141593 4.141593 5.141593 6.141593 7.141593 8.141593 9.141593

Reading tabular data

read.table, read.csv; they are the same except that read.csv defaults to a comma separator and assumes the file has a header line; the analogous functions for writing data are write.table and write.csv

Reading lines of a text file

readLines; the analogous function for writing data is writeLines
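
A short round trip through a temporary file; the two lines of text are made up.

```r
path <- tempfile(fileext = ".txt")

writeLines(c("first line", "second line"), path)  # one element per line
lines <- readLines(path)                          # character vector, one element per line

unlink(path)
```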

Create a vector that contains 40 zeros

rep(0, times = 40)

Create a vector to contain 10 zeros, then 10 ones, then 10 twos

rep(c(0, 1, 2), each = 10)

Create a vector that contains 10 repetitions of the vector (0, 1, 2)

rep(c(0, 1, 2), times = 10)
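
The difference between times and each is easiest to see side by side. The variable names below are only for illustration.

```r
zeros  <- rep(0, times = 40)           # 40 zeros
blocks <- rep(c(0, 1, 2), each = 10)   # 0 x10, then 1 x10, then 2 x10
cycles <- rep(c(0, 1, 2), times = 10)  # 0 1 2 0 1 2 ...

first_blocks <- head(blocks, 11)       # ten 0s followed by a 1
first_cycles <- head(cycles, 6)        # 0 1 2 0 1 2
```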

A vector containing 1000 draws from a standard normal distribution

rnorm(1000)

Create a vector of numbers ranging from 0 to 10, incremented by 0.5

seq(0, 10, by=0.5) This will create 0.0 0.5 1.0 1.5 2.0 2.5 3.0 3.5 4.0 4.5 5.0 5.5 6.0 6.5 7.0 7.5 8.0 8.5 9.0 9.5 10.0

Create a sequence of 30 numbers between 5 and 10

seq(5, 10, length.out = 30)
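
A quick check of what this call produces. With 30 points there are 29 equal gaps between 5 and 10, so the step works out to 5/29.

```r
s <- seq(5, 10, length.out = 30)

n     <- length(s)     # 30
first <- s[1]          # 5
last  <- s[n]          # 10
step  <- diff(s)[1]    # 5/29, the spacing between neighbors
```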

Number of missing values in a data frame hw1

sum(is.na(hw1))

Mean of Solar.R column values when Temp column values are > 90 and Ozone column values are > 31

ugh <- complete.cases(hw1$Temp, hw1$Ozone, hw1$Solar.R) & hw1$Temp > 90 & hw1$Ozone > 31
mean(hw1$Solar.R[ugh])
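
To see the filter at work, here is a sketch on a five-row stand-in for hw1. The values are made up and are not the real homework data.

```r
# Made-up stand-in for hw1 (assumption: same column names as the card).
hw1 <- data.frame(
  Ozone   = c(40, NA, 35, 20, 50),
  Solar.R = c(200, 250, NA, 180, 300),
  Temp    = c(92, 95, 93, 85, 91)
)

# TRUE only for rows with no missing values that also meet both thresholds.
ugh <- complete.cases(hw1$Temp, hw1$Ozone, hw1$Solar.R) &
  hw1$Temp > 90 & hw1$Ozone > 31

result <- mean(hw1$Solar.R[ugh])   # mean of rows 1 and 5: (200 + 300) / 2
```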

Reading single R objects in binary form

unserialize; the analogous function for writing data is serialize
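
A minimal in-memory sketch with a made-up list. Passing connection = NULL makes serialize return a raw vector instead of writing to a connection.

```r
obj <- list(n = 1:3, s = "abc")                 # made-up object

raw_bytes <- serialize(obj, connection = NULL)  # raw vector holding the binary form
restored  <- unserialize(raw_bytes)             # rebuild the object from the bytes
```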

Subset all elements of vector x except the 2nd and 10th elements

x[c(-2, -10)] or x[-c(2, 10)]

Subset the 3rd, 5th, and 7th elements of vector x

x[c(3, 5, 7)]

LETTERS

A predefined variable in R containing a character vector of all 26 uppercase letters in the English alphabet

sep

A string indicating how the columns are separated

unlink()

Delete files or directories; deleting a directory requires recursive = TRUE

file.remove()

Delete a file

args()

Displays the argument names and corresponding default values of a function or primitive

break

Exit the loop

print()

Explicitly prints what you put in the parentheses. You can also auto-print something by typing it and pressing Enter. Note that the bracketed number R prints at the start of each output line (e.g., [1]) is the index of the first element shown on that line, not the number of elements in the vector.

getwd()

See the current working directory

stringsAsFactors

Should character variables be coded as factors? TRUE or FALSE

return

Signals the function should exit and return a given value

next

Skip an iteration in a for loop

Inf

Special number meaning infinity

How to specify an integer

Specify L suffix to get integer (i.e., 1L gives integer 1)
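
A quick comparison of a plain number and one written with the L suffix:

```r
plain_class  <- class(1)     # "numeric" (stored as a double)
suffix_class <- class(1L)    # "integer"

# The colon operator with whole-number endpoints also yields integers.
colon_is_int <- is.integer(1:3)
```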

Data Frames

Stores tabular data in rows and columns; columns can be of different classes; internally a data frame is a list in which every element (column) has to have the same length

Matrices

Stores tabular data in rows and columns; can only contain a single class of data

isTRUE()

This is a function that will take one argument. If that argument evaluates to TRUE, the function will return TRUE.

file.create()

Create a new file in the working directory

c()

Create a vector

If I have a data frame with 1,500,000 rows and 120 columns (all numeric data), roughly how much memory will be required to store this data frame?

1,500,000 rows x 120 columns x 8 bytes/numeric = 1,440,000,000 bytes
1,440,000,000 bytes / 2^20 bytes/MB = 1,373.29 MB
1,373.29 MB / 2^10 MB/GB = 1.34 GB
Rule of thumb: you need about twice as much memory as the data itself occupies, so you need about 2 x 1.34 = 2.68 GB
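
The same arithmetic can be reproduced in R. The variable names are only for illustration.

```r
rows <- 1500000
cols <- 120
bytes_per_numeric <- 8

bytes <- rows * cols * bytes_per_numeric   # total bytes for the raw values
mb    <- bytes / 2^20                      # bytes -> megabytes
gb    <- mb / 2^10                         # megabytes -> gigabytes
rule_of_thumb_gb <- 2 * gb                 # roughly double for working memory
```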

Create a sequence of numbers from 1 through 20

1:20 or seq(1,20) This will create 1 2 3 4 5 6 7 8 9 10 11 12 13 14 15 16 17 18 19 20

Generate a sequence of integers from 1 to N, where N represents the length of the my_seq vector

1:length(my_seq) or seq(along.with = my_seq) or seq_along(my_seq)
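
The three forms agree for a non-empty vector but differ when the vector is empty, which is one reason seq_along is often preferred in loops. The vectors below are made up.

```r
my_seq <- c(2.5, 5, 7.5, 10)      # made-up vector

same <- identical(1:length(my_seq), seq_along(my_seq))

empty <- numeric(0)
colon_on_empty <- 1:length(empty)   # counts down: 1 0
along_on_empty <- seq_along(empty)  # integer(0), so a loop body would not run
```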

comment.char

A character string indicating the comment character; set comment.char = "" if there are no commented lines in your file

colClasses

A character vector indicating the class of each column in the dataset

Ellipses

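All arguments that come after the ellipsis (...) in a function definition must be named explicitly when the function is called; they cannot be matched by position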

Order of operations

And is evaluated before or

file.exists()

Check to see if a file exists in the working directory

identical()

Check to see if two vectors are identical

cat("\014")

Clear the R console (works in RStudio)

cbind()

Combine columns

dir.create()

Create a directory in the current working directory

file.path()

Create a file path

list()

Create a list

The ... argument

Indicates a variable number of arguments that are usually passed on to other functions. Note that any arguments coming after ... must be explicitly named.
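
A small sketch of both points, using a hypothetical wrapper around paste (join_words is not a built-in function).

```r
# Hypothetical helper: everything in ... is passed on to paste().
join_words <- function(..., sep = " ") {
  paste(..., sep = sep)
}

# An unnamed third value is swallowed by ... and treated as another word.
unnamed <- join_words("a", "b", "-")        # "a b -"

# Naming the argument is the only way to reach sep.
named <- join_words("a", "b", sep = "-")    # "a-b"
```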

dir()

List all files in the working directory

list.files()

List all files in the working directory

Boolean values

Logical values of TRUE and FALSE in R

What is the maximum value in the "Ozone" column during month 5?

M5 <- complete.cases(hw1$Ozone) & hw1$Month == 5
max(hw1$Ozone[M5])

What is the mean of "Temp" when "Month" is equal to 6?

M6 <- complete.cases(hw1$Temp) & hw1$Month == 6
mean(hw1$Temp[M6])

file.copy()

Make a copy of a file

<-

The assignment operator will assign a value to a symbol

file

The name of a file, or a connection

skip

The number of lines to skip from the beginning

Why are dump and dput useful?

The resulting textual format is editable and, in the case of corruption, potentially recoverable. The metadata is preserved, unlike when writing out a table.

xor()

The xor() function stands for exclusive OR. If one argument evaluates to TRUE and one argument evaluates to FALSE, then this function will return TRUE, otherwise it will return FALSE.
xor(TRUE, TRUE) = FALSE
xor(TRUE, FALSE) = TRUE
xor(FALSE, FALSE) = FALSE
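
The full truth table, including the FALSE/TRUE case, can be generated rather than typed out. The truth data frame is just an illustrative name.

```r
# All four combinations of two logical inputs (x varies fastest).
truth <- expand.grid(x = c(TRUE, FALSE), y = c(TRUE, FALSE))

# xor() is vectorized, so it fills the whole column at once.
truth$xor <- xor(truth$x, truth$y)
```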

&

Element-wise (vectorized) and

Find the mean of columns with a for loop

columnmean <- function(y, removeNA = TRUE) {
  nc <- ncol(y)
  means <- numeric(nc)        # one slot per column
  for (i in seq_len(nc)) {    # seq_len() is safe when nc is 0
    means[i] <- mean(y[, i], na.rm = removeNA)
  }
  means
}
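
A usage sketch on a made-up matrix. The function is repeated so the block runs on its own, and the built-in colMeans is shown for comparison.

```r
columnmean <- function(y, removeNA = TRUE) {
  nc <- ncol(y)
  means <- numeric(nc)
  for (i in seq_len(nc)) {
    means[i] <- mean(y[, i], na.rm = removeNA)
  }
  means
}

# Made-up 3 x 2 matrix with one missing value in the first column.
m <- matrix(c(1, 2, NA, 4, 5, 6), nrow = 3)

with_removal    <- columnmean(m)                    # 1.5 5.0
without_removal <- columnmean(m, removeNA = FALSE)  # NA  5.0
built_in        <- colMeans(m, na.rm = TRUE)        # same values as with_removal
```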

Create a directory in the current working directory called "testdir2" and a subdirectory for it called "testdir3" all in one command

dir.create(file.path('testdir2', 'testdir3'), recursive = TRUE)
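
A sketch of creating and then removing nested directories. It works under a temporary base directory (an assumption made here) so that nothing is left behind in the working directory.

```r
base   <- tempfile("demo_")                        # temporary parent path
nested <- file.path(base, "testdir2", "testdir3")

# recursive = TRUE creates every missing directory along the path.
created       <- dir.create(nested, recursive = TRUE)
exists_before <- dir.exists(nested)

# recursive = TRUE is also needed for unlink() to delete directories.
unlink(base, recursive = TRUE)
exists_after <- dir.exists(base)
```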

