Lecture 1: Introductory Programming & Exploratory Data Analysis with R
What are vectors?
The simplest data structures in R: 1-dimensional data structures that contain a single data type, i.e. a series of values that are all of one type, e.g. 1, 2, 3, 4, 5
read.table(file = "file.csv", header = TRUE, sep = ",")
1. file = Name of the file. If the file is not in your current working directory, you must specify the location
When was S developed?
1970s at Bell Labs (same place where transistors, lasers, the Unix operating system, and the C programming language were invented)
1st rule of naming variables in R
Can be a combination of: letters, numbers, . (dot), _ (underscore)
What does a data frame consist of?
Each column in a data frame is a vector of the same length
What year was R developed?
during the 1990s
3rd rule of naming variables in R
If it starts with a dot, it cannot be followed by a number
What does the numeric data type consist of?
In R, the numeric data type consists of both integers (e.g. 4, 10, 1524, etc.) and floats (e.g. 4.2, 10.5, 1524.75647).
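A quick way to confirm this at the console is class(). The sketch below uses made-up variable names; note that R stores a plain number such as 4 as a double unless it is written with an L suffix.

```r
# Hypothetical example values; the names are illustrative only.
whole_number <- 4         # looks like an integer, stored as a double by default
decimal_number <- 4.2     # a floating-point value
true_integer <- 4L        # the L suffix asks R for integer storage

class(whole_number)       # "numeric"
class(decimal_number)     # "numeric"
class(true_integer)       # "integer"
is.numeric(true_integer)  # TRUE: integers also count as numeric
```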
2nd rule of naming variables in R
Must start with a letter or a dot
What does it mean when you use quotes in R?
R will see anything in quotes as a character. If you put a number in quotes it will have a data type of character, not numeric
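A minimal sketch of the effect of quotes, using invented variable names:

```r
quoted <- "42"     # quotes make this character data
unquoted <- 42     # no quotes, so this is numeric

class(quoted)      # "character"
class(unquoted)    # "numeric"
```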
4th rule of naming variables in R
Reserved words in R cannot be used to name a variable
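One way to try out the four naming rules is base R's make.names(), which rewrites a string into a syntactically valid name; a string that comes back unchanged already obeys the rules. This is only a sketch, and the candidate names are invented for illustration.

```r
# Candidate variable names (illustrative only).
candidates <- c("total_2", ".hidden", "2total", ".2x", "if")

# A name is valid if make.names() has nothing to repair.
is_valid <- make.names(candidates) == candidates
names(is_valid) <- candidates
is_valid
# total_2 and .hidden are valid;
# 2total starts with a number, .2x is a dot followed by a number,
# and if is a reserved word, so those three are not.
```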
Who created the R programming language?
Ross Ihaka and Robert Gentleman developed a new programming language based largely on S
What language came before R?
S
What was S designed for?
S was designed as a statistical computing language
What does S stand for?
Statistics
How do you store the results of your code?
To store the results of your code in a variable, use the assignment operator (<-). Entering your variable name will print the contents to the console.
Why should you use whitespace in your code?
Using it allows for better visual clarity
What are the common data structures in R?
Vector, List, Matrix, and Data Frame
Purpose of calculations in R
We often want one calculation to build on the result of a previous calculation
Can you use R for arithmetic?
Yes, you can use R to perform arithmetic operations and return the result. The result is printed to the screen but is not saved unless you assign it to a variable.
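The cards on arithmetic, stored results, and building on earlier calculations can be tied together in a short sketch. The variable names subtotal and total are made up for this example.

```r
3 + 4 * 2                # prints 11; the value is not kept anywhere

subtotal <- 3 + 4 * 2    # store the result with the assignment operator
total <- subtotal * 10   # build on the earlier result
total                    # entering the name prints 110
```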
What is a data type?
a particular kind of data item, as defined by the values it can take, the programming language used, or the operations that can be performed on it. ex: numeric, character, logical, and factor
Working with data of type character involves...
assigning data of type character to a variable similar to the way we did for data of type numeric
Vectors are created in R using the....
c() function; the c stands for concatenate or combine
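A minimal sketch of building vectors with c(), using invented values. The last vector illustrates the single-type rule: when types are mixed, R coerces everything to one type.

```r
ages <- c(21, 35, 48)          # numeric vector
first_names <- c("Ann", "Bo")  # character vector
mixed <- c(1, "two", 3)        # mixed input is coerced to a single type

class(ages)                    # "numeric"
length(ages)                   # 3
class(mixed)                   # "character"
```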
Key difference when assigning character data to variable...
character data must be in quotes
What is logical data type in R?
comprised solely of two values: TRUE and FALSE
Another extremely common data structure in R is.....
data frame
What is a data frame?
data frames are 2-dimensional data structures that can hold multiple data types
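As a rough illustration, a small data frame can be built from vectors of equal length with data.frame(). The dataset below is invented for this sketch.

```r
# Hypothetical toy dataset: three rows, three columns of different types.
patients <- data.frame(
  name = c("Ann", "Bo", "Cy"),
  age = c(21, 35, 48),
  smoker = c(FALSE, TRUE, FALSE)
)

dim(patients)     # 3 3  (rows, columns)
str(patients)     # shows the data type of each column
patients$age      # a single column is an ordinary vector
```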
What do all objects in R contain?
data of a particular data type (numeric, character, logical, factor, etc...) and have a data structure
Objects in R contain
data of some type(s) and attributes associated with them
How do you use functions in R?
with a function call
Functions are called with the following syntax:
function_name(argument_1, argument_2)
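For instance, the base R functions round() and sqrt() follow this pattern. The sketch below shows that arguments can be matched by position or by name.

```r
rounded_by_position <- round(3.14159, 2)       # arguments matched by position
rounded_by_name <- round(3.14159, digits = 2)  # same call with a named argument
root <- sqrt(16)                               # a function with one argument

rounded_by_position   # 3.14
rounded_by_name       # 3.14
root                  # 4
```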
What are logical operators?
group of operators that will all return data of type logical
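A short sketch of a few such operators, assuming the card means the comparison operators together with &, | and !. The value of x is arbitrary.

```r
x <- 5

x > 3            # TRUE
x == 10          # FALSE
x > 3 & x < 4    # FALSE: both conditions must hold
x > 3 | x < 4    # TRUE: at least one condition holds
!(x == 10)       # TRUE: negation

class(x > 3)     # "logical"
```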
What is RStudio?
an Integrated Development Environment (IDE) designed specifically for R
What are IDEs?
like RStudio, they are user interfaces that make writing, running, and debugging code much simpler than in a standard text editor
Strength of S includes...
lots of built-in statistical functionality + easy data visualization
What is the most common source of error in R?
not knowing an object's data type, or wrongly assuming that it is of a particular data type
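A typical case, sketched with an invented variable: a number that was stored as text. Checking class() first and converting with as.numeric() avoids the error.

```r
amount <- "10"                     # looks like a number, but is character data
class(amount)                      # "character"

# amount + 1                       # would fail: non-numeric argument to binary operator

amount_num <- as.numeric(amount)   # convert explicitly before doing arithmetic
amount_num + 1                     # 11
```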
Each row of a data frame is called an
observation
What are arguments?
pieces of information that the function needs in order to carry out the designated task
What are attributes?
provide information about the data and a means by which specific slices of the data can be specified, e.g. in a data frame, the column names and row names are attributes
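A minimal sketch of inspecting attributes and using them to pick out a slice of the data. The scores data frame and its names are invented for illustration.

```r
# Hypothetical data frame with explicit row names.
scores <- data.frame(
  math = c(90, 75),
  art = c(60, 88),
  row.names = c("Ann", "Bo")
)

names(scores)          # column names: "math" "art"
rownames(scores)       # row names: "Ann" "Bo"
attributes(scores)     # lists names, class, and row.names

scores["Ann", "art"]   # slice by row name and column name: 60
```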
The read.table() function allows us to...
read a local dataset from a file into our R environment
How do you specify arguments in read.table()?
read.table() has many possible arguments, but typically to get what we want we only need to specify three arguments: file = , header = , and sep = (argument names are case-sensitive, so they are written in lowercase)
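A self-contained sketch of those three arguments in use. So that it runs anywhere, it first writes a tiny comma-separated file to a temporary location; the file contents and the name people are invented for this example.

```r
# Create a small CSV file to read (illustrative data only).
csv_path <- tempfile(fileext = ".csv")
writeLines(c("name,age", "Ann,21", "Bo,35"), csv_path)

# file:   where the file lives
# header: TRUE because the first line holds the column names
# sep:    the delimiter between values, here a comma
people <- read.table(file = csv_path, header = TRUE, sep = ",")

people        # prints the two rows that were read
str(people)   # shows the data type chosen for each column
```

For a tab-delimited file, the same call would use sep = "\t" instead.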
What is factor data type?
special data type that is used for categorical data AND can be used to group your datasets into useful sub-groups for analysis
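A rough sketch of a factor being used to group values for analysis. The treatment labels and response values are made up.

```r
# Hypothetical categorical variable and a measurement for each case.
treatment <- factor(c("drug", "placebo", "drug", "placebo"))
response <- c(8, 5, 9, 4)

levels(treatment)                  # the categories: "drug" "placebo"
table(treatment)                   # counts per category: 2 and 2

# Mean response within each sub-group defined by the factor.
group_means <- tapply(response, treatment, mean)
group_means                        # drug 8.5, placebo 4.5
```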
True or False: read.table() is flexible in that it can read text files that are delimited in any way e.g. tab, comma, space etc.
true
Each column of a data frame is called a
variable