Learning R Lab 3 Part 1
#11. Investigate the contents of the BOD dataset that comes with gcookbook R package.
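?BOD #open the help page for the dataset
df3 = BOD
str(BOD)
#note: BOD is available from base R's datasets package, so this runs without loading gcookbook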
#7. Use the str() function to explore the diamonds dataset that comes with ggplot2 package.
diamonds
df2 = diamonds
str(df2)
#6. Use the function facet_wrap() to show the relationship between displ and hwy for each of the car types (class) in a separate graph.
ggplot(df, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(~class, nrow = 2) #draws one panel per class, arranged in two rows
#you can also facet wrap by two variables
ggplot(df, aes(displ, hwy)) +
  geom_point() +
  facet_wrap(cyl ~ class)
#5. Use the size aesthetic to show the different types of cars (class) instead of the color aesthetic.
ggplot(df, aes(displ, hwy)) +
  geom_point(aes(size = class)) #size can be mapped to a variable too, but the mapping still has to be within aes
#size is not recommended for distinguishing a categorical variable
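#A possible alternative for a categorical variable is the shape aesthetic. This is only a sketch:
#mpg has 7 classes and the default shape palette covers at most 6, so the manual scale with
#shape codes 1 to 7 (and the name p_shape) is an assumption added here, not part of the lab.
library(ggplot2)
p_shape = ggplot(mpg, aes(displ, hwy)) +
  geom_point(aes(shape = class)) +
  scale_shape_manual(values = 1:7) #one shape code per class
p_shape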
#3. Use color to distinguish between different types (class) of cars in the previous plot.
ggplot(df, aes(x = displ, y = hwy)) +
  geom_point(aes(color = class)) #color = class maps a variable, therefore it has to be within aes
#or
ggplot(df, aes(x = displ, y = hwy, color = class)) + #the mapping can go in the ggplot() call or in the geom, as long as it is within aes
  geom_point()
#2. Use the mpg dataset that comes with the ggplot2 package, and create a scatterplot that shows the relationship between the engine displacement displ and the highway miles per gallon variable hwy.
ggplot(df, aes(x = displ, y = hwy)) + #variables you want to map within ggplot have to be within aes
  geom_point() #scatterplot
#4. In the previous plot, we assigned a categorical variable to the color aesthetics. In this exercise, try to add the color aesthetics to the geom_point() line of code. Explain what happened.
ggplot(df, aes(x = displ, y = hwy, color = class)) +
  geom_point(color = "red") #even if you map color to class in aes, setting color in the geom_ without aes overrides that mapping
#or
ggplot(df, aes(x = displ, y = hwy)) +
  geom_point(color = "red") #whether color = class was coded or not didn't matter, the geom_ color overrode it
#9. Check what happens if you try to create a count bar chart for the variable carat.
ggplot(df2, aes(carat)) +
  geom_bar() #this is a no-no: carat is continuous, so geom_bar() draws one thin bar per distinct value; count bar charts suit categorical variables
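#For a continuous variable like carat, a histogram may be the better fit: it bins the values
#before counting them. A minimal sketch; the binwidth of 0.1 and the name p_hist are
#arbitrary choices for illustration, not part of the lab.
library(ggplot2)
p_hist = ggplot(diamonds, aes(carat)) +
  geom_histogram(binwidth = 0.1)
p_hist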
#8. Create a barchart that shows the count distribution for each of the cut levels of the diamonds.
ggplot(df2, aes(cut)) +
  geom_bar() #typically used with one x variable; the count (frequency) is computed automatically as the y variable
#15. Change the color of the bars in the previous graph into lightblue.
ggplot(df3, aes(factor(Time), demand)) +
  geom_col(fill = "lightblue", color = "red", width = 0.5) #note that fill colors the inside of the bars and color sets the outline
#14. Change the width of each bar in the previous graph to 0.5.
ggplot(df3, aes(factor(Time), demand)) +
  geom_col(width = 0.5)
#12. Create a bar chart that shows you the biochemical oxygen demand for each time in an evaluation of water quality.
ggplot(df3, aes(x = Time, y = demand)) +
  geom_col() #use geom_col() because you have an x and y var specified
#or
#ggplot(df3, aes(Time, demand)) +
#  geom_bar(stat = "identity")
#13. How can you improve the previous graph?
ggplot(df3, aes(x = factor(Time), y = demand)) +
  geom_col() #note that you can convert a variable to a factor in the dataset itself or directly in the plot call
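#Why factor(Time) helps: Time is numeric and has no measurement at 6, so a continuous x axis
#leaves an empty slot there. Converting to a factor keeps only the observed times as categories.
#A quick check, using BOD from base R's datasets package (time_levels is an illustrative name):
time_levels = levels(factor(BOD$Time))
time_levels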
#10. Load the gcookbook package to R.
install.packages("gcookbook") #only needed once per R installation
library(gcookbook)
#1. Load the ggplot2 package into R
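install.packages("ggplot2") #only needed once per R installation
library(ggplot2)
plot #typing a function name without parentheses prints its definition
hist
?ggplot
mpg
?mpg
df = mpg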