################################################################################
### R BASICS WORKSHOP                                                        ###
### EXERCISE 1.1: A Sample R Session                                         ###
###                                                                          ###
### Center for Conservation and Sustainable Development                      ###
### Missouri Botanical Garden                                                ###
### Website: rbasicsworkshop.weebly.com                                      ###
################################################################################

## OBJECTIVE:
## The purpose of this exercise is to become accustomed to R and the way it
## responds to line commands.


## PART 1 ##

# Here you will get a first introduction to several important concepts that we
# will cover in more detail during the rest of the workshop.

# The '<-' operator is used to indicate an assignment. It is often used to save
# data into an object:
x <- 50

# In this first command, you assigned the value of 50 to an object named 'x'.
# Objects in R are used to store information. To find what is stored in an
# object, you just need to type its name into the console:
x

# As you might have already noticed, R does not run lines that start with '#'.
# This fact is used to write comments.

# R already has values of some fundamental constants. For example, to find the
# value of pi just type:
pi

# You can also copy the value of pi to another object. For example:
y <- pi
y
pi

# Actions in R, like data manipulations, graphics and analyses, are conducted
# using functions, which are elements in R that perform specific actions.
# For example, the function 'rnorm' generates random values from a normal
# distribution:
rnorm(n=50)

# Here is another version of the line of code above, with a different value
# of 'n':
rnorm(n=10)

# Functions act on or are modified by arguments. Arguments define how a function
# will work. In the first example, the function 'rnorm' is modified by the
# argument 'n', which has the value of 50. As a consequence, you get 50 random
# values from a normal distribution. You can ask for as many values as you want:
rnorm(n=25)
rnorm(n=5)
rnorm(n=1)

# You can store the output of a function in an object.
# This output can then be used for other purposes:
z <- rnorm(n=50)

# You can now do things with the values stored in 'z'. For example, you can use
# the function 'mean' to calculate the mean of the values in 'z':
mean(z)

# You can calculate other statistics, like the standard deviation, or just create
# a summary of the values in 'z':
sd(z)
summary(z)

# You can change the order of the values:
sort(z)
sort(z, decreasing=TRUE)

# Note here, for example, that the function 'sort' takes two arguments: one is
# the values in 'z', the other is the value 'TRUE' given to 'decreasing'. You
# will learn more about arguments shortly in the workshop.

# You can also make a histogram of those values:
hist(z)

# R also has 'operators' that perform a multitude of actions. The most common are
# the arithmetic operators for addition '+', subtraction '-', multiplication '*',
# and division '/'. For example, we can multiply the values in 'z' by a constant:
z*2

# We can also write a single line of code that performs multiple actions and
# saves the output, for example:
y <- rnorm(50)*2

# This creates 50 random values from a normal distribution, then multiplies
# each value by 2 and finally stores the results in an object named 'y'.

# We can also create more complicated sequences of actions, for example:
y <- 0.5 + 1.5*z + rnorm(50)

# This 1) creates a random set of 50 values from a normal distribution, 2)
# multiplies the values in 'z' by 1.5, 3) adds element-by-element the results
# of (1) and (2), 4) adds 0.5 to each value in the result of (3), and 5)
# stores the results of these computations in 'y'. To see the values in 'y',
# just type:
y

# Now you can use the values in objects 'z' and 'y' to do a number of things.
# For example, to make a scatter-plot you use the function 'plot':
plot(z, y)

# This will open a graphics window automatically.
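
# As a small aside on the element-by-element arithmetic used to build 'y'
# above, the sketch below repeats the same kind of computation on a vector
# short enough to check by hand. It is only an illustration; the object names
# 'v.demo' and 'w.demo' are made up for this example and are not part of the
# original exercise:
v.demo <- c(1, 2, 3)
w.demo <- c(10, 20, 30)
v.demo * 2           # each value is doubled: 2 4 6
0.5 + 1.5 * v.demo   # multiply first, then add 0.5: 2.0 3.5 5.0
v.demo + w.demo      # values are added position by position: 11 22 33
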
# To find the correlation between 'z' and 'y':
cor(z, y)

# To produce a boxplot:
boxplot(z, y)

# To perform a t-test:
t.test(z, y)

# To perform a one-tailed t-test:
t.test(z, y, alternative="greater")

# Notice the difference in p-value.

# To check what is in your workspace thus far, type:
ls()

# Note that the objects you have created so far are listed ('x', 'y' and 'z').


## PART 2 ##

# In this second part, you will continue to play with various elements of R.
# Just run the code, and look at the output. It would be best if you try to
# type the code into the console rather than just copy-paste.

# To make a variety of graphs of sin(theta):
theta <- seq(0, 2*pi, length.out=100)
plot(theta, sin(theta))
par(new=TRUE)
plot(theta, sin(theta), type="h")
plot(theta, sin(theta), type="l")
plot(theta, sin(theta), type="s")
theta <- seq(0, 2*pi, length.out=10)
plot(theta, sin(theta), type="l")

# To see what these commands mean, type:
help(plot)

# To make some simple arithmetic and repeating sequences, type:
c(1:25)
seq(1, 25)
seq(25, 1, -1)
seq(1, 25, 2)
seq(1, 25, length.out=6)
seq(0, 2, 0.1)
rep(0, 25)
rep(1, 25)

# Make a vector of integers from 1 to 25:
n <- c(1:25)

# Make a vector of weights equal to the square root of n:
w <- sqrt(n)

# Simulate some response variables, and display them in a table:
r <- n + rnorm(n) * w
data.frame(n, r)

# Create a regression line, display the results, create a scatter-plot, and draw
# the regression line on the plot in red:
regress.rn <- lm(r ~ n)
summary(regress.rn)
plot(n, r)
abline(regress.rn, col="red")

# Note that the order of r and n for the regression line is reversed from the
# order in the plot.
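
# The note above about argument order can be sketched as follows. This is an
# illustration only, not part of the original exercise, and the object names
# 'n.demo', 'r.demo' and 'fit.demo' are made up so that nothing created earlier
# is overwritten. In a formula the response goes to the left of '~' and the
# predictor to the right, while 'plot(x, y)' takes the horizontal variable
# first. The function 'plot' also accepts a formula, so the two plotting calls
# below should draw the same scatter-plot:
n.demo <- c(1:25)
r.demo <- n.demo + rnorm(25) * sqrt(n.demo)
fit.demo <- lm(r.demo ~ n.demo)
plot(n.demo, r.demo)     # x first, y second
plot(r.demo ~ n.demo)    # response ~ predictor
abline(fit.demo, col="red")

# The fitted line is defined by two numbers, the intercept and the slope:
coef(fit.demo)
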
# Plot the residuals and put labels on the axes:
plot(fitted(regress.rn), resid(regress.rn), xlab="Fitted Values",
  ylab="Residuals", main="Residuals vs Fitted")

# Simulate 100 tosses of a fair coin and view the results:
x <- rbinom(100, 1, 0.5)
x

# Next, keep a running total of the number of heads and plot the result with
# steps (type = "s"). The total is stored in 'heads' rather than 'c' to avoid
# reusing the name of the function 'c':
heads <- cumsum(x)
plot(heads, type="s")

# Roll a fair die 1000 times and look at a summary:
fair <- sample(c(1:6), 1000, replace=TRUE)
summary(fair)

# Roll a biased die 1000 times and look at a summary:
biased <- sample(c(1:6), 1000, replace=TRUE,
  prob=c(1/12, 1/12, 1/12, 1/4, 1/4, 1/4))
summary(biased)

# The next data set arises from the famous Michelson-Morley experiment. There
# are five experiments (column 'Expt') and each has 20 runs (column 'Run') and
# 'Speed' is the recorded speed of light minus 299,000 km/sec. To see the
# dataset, type:
morley

# The data in the first two columns are labels. Make the experiment number a
# factor:
morley$Expt <- factor(morley$Expt)

# Now, make a labeled boxplot of the speed in column 3:
boxplot(morley[ ,3] ~ morley$Expt, main="Speed of Light Data",
  xlab="Experiment", ylab="Speed")

# Perform an analysis of variance to see if the measured speeds are
# significantly different between experiments:
anova.mm <- aov(Speed ~ Expt, data=morley)
summary(anova.mm)

# Draw a cubic:
x <- seq(-2, 2, 0.01)
plot(x, x^3-3*x, type="l")

# Draw a bell curve:
curve(dnorm(x), -3, 3)

# Look at the probability mass function for a binomial distribution:
x <- c(0:100)
prob <- dbinom(x, 100, 0.5)
plot(x, prob, type="h")

# To plot a parameterized curve, start with a sequence and give the x and y
# values:
angle <- seq(-pi, pi, 0.01)
x <- sin(3*angle)
y <- cos(4*angle)
plot(x, y, type="l")

# Now we will plot contour lines and a surface. First, we give a sequence of
# values.
# This time we specify the number of terms:
x <- seq(-pi, pi, length.out=50)
y <- x

# Then, we define a function of these x and y values and draw a contour map:
f <- outer(x, y, function(x, y) (cos(3*x) + cos(y)) / (1 + x^2 + y^2))
contour(x, y, f)

# To draw a surface plot:
persp(x, y, f, col="orange")

# To change the viewing angle:
persp(x, y, f, col="orange", theta=-30, phi=45)
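
# To see what 'outer' is doing in the contour and surface example, here is a
# minimal sketch on vectors small enough to check by hand. It is an
# illustration only; the object names 'x.demo', 'y.demo' and 'f.demo' are made
# up for this example. The function 'outer' applies the given function to every
# combination of a value from the first vector and a value from the second, and
# returns a matrix with one row per value of the first vector and one column per
# value of the second:
x.demo <- c(1, 2, 3)
y.demo <- c(10, 20)
f.demo <- outer(x.demo, y.demo, function(x, y) x + y)
f.demo
dim(f.demo)     # 3 rows and 2 columns

# The entry in row 2, column 1 combines the second value of 'x.demo' with the
# first value of 'y.demo':
f.demo[2, 1]    # 2 + 10 = 12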