############################################################
###                  R BASICS WORKSHOP                   ###
###      PRESENTATION 2.1: FUNCTIONS AND ARGUMENTS       ###
###                                                      ###
### Center for Conservation and Sustainable Development  ###
###              Missouri Botanical Garden               ###
###         Website: rbasicsworkshop.weebly.com          ###
###           Last modification: 2021-June-04            ###
############################################################


### A. WRITING IN R IS SIMILAR TO WRITING IN ENGLISH #######

# A command in English:
# "Skip three times forward"

# 1. *Skip* defines the requested action (verb)
# 2. *three times* and *forward* modify that action

# Another command in English:
# "Generate a sequence from 5 to 20 with values every 0.5"

# 1. *Generate a sequence* defines the action to be taken
# 2. *from 5* modifies the action, defining where to start
# 3. *to 20* defines the end of the sequence
# 4. *with values every 0.5* defines the spacing of values

# How do you give the above command to a computer?
# Using a programming language (like R)!

seq(from = 5, to = 20, by = 0.5)

# 1. *seq* is the name of a function and defines an action
# 2. *from* is the name of an argument that defines the
#    beginning of the sequence. In this case the value given
#    to this argument is *5*
# 3. *to* is a second argument that defines the end of the
#    sequence. In this example this argument takes a
#    value of *20*
# 4. *by* is a third argument that defines the separation
#    between values in the sequence. The argument *by* in
#    this example takes a value of *0.5*

## FUNCTION: an element in R that requests an action
## from your computer. Functions contain algorithms to
## perform particular tasks

## ARGUMENT: an element in R that specifies or modifies
## how a function works. Arguments are given inside
## parentheses after a function name


### B. BASIC ANATOMY OF A COMMAND IN R #####################

seq(from = 5, to = 20, by = 0.5)

# 1. Function name
# 2. Open parenthesis
# 3. Name of argument
# 4. Equal sign (=)
# 5. Value given to the argument
# 6. Comma (,)
# 7. Repeat 3, 4, and 5 for each argument
# 8. Close parenthesis

# Some other examples:

rep(x = 8, times = 3)       # The function *rep* repeats a piece
                            # of data
rep(x = 3, times = 8)

rnorm(n=10, mean=2, sd=1)   # The function *rnorm* generates
                            # random values from a normal
                            # distribution
rnorm(n=100, mean=-10, sd=3)

## Note the repeating structure of the commands! ##


### C. SOME RULES ABOUT WRITING COMMANDS IN R ##############

# RULE 1. Each function has its own arguments.

rep(x = 8, times = 3)             # *rep* has *x*, *times*
rnorm(n = 10, mean = 2, sd = 1)   # *rnorm* has *n*, *mean*, *sd*

# RULE 2. The description of each function and its arguments
#         can be found in the help page of the function. To
#         access this information, use the function *help*.

help(topic="seq")

# RULE 3. Arguments USUALLY have names (e.g.: from, to, by),
#         and values are passed to each argument using "=".

# RULE 4. Names of arguments can be omitted if values are
#         given in the pre-determined order. For example,
#         these three commands are equivalent:

seq(from = 5, to = 20, by = 0.5)  # Most explicit (with names)
seq(5, 20, 0.5)                   # Quickest, no argument names
seq(5, to = 20, 0.5)              # Mix, some with names some without

# But they are different from this:

seq(0.5, 5, 20)

# RULE 5. Each function has a pre-determined order for its
#         arguments. For example, for the function *seq* the
#         order is *from* first, then *to*, and then *by*.

help(topic="seq")   # This info is in the help page of the
                    # function *seq*

# RULE 6. The order of the arguments can be changed ONLY if
#         you use their names. For example, these commands
#         are equivalent:

seq(from = 5, to = 20, by = 0.5)
seq(by = 0.5, from = 5, to = 20)

# RULE 7. Some arguments have pre-determined values
#         (defaults)!

rnorm(n=10)

# What are the pre-determined values for the arguments in
# function *rnorm*?

# These arguments with pre-determined values don't need to
# be specified for the function to work, but one has to be
# careful. Make sure the defaults are what you want!

# RULE 8. When *...* appears in the help file of a function,
#         it frequently means multiple arguments with no
#         names. For example, in the function *c*, ... means
#         many values that will be concatenated:

help(c)   # Also could have been written as help(topic="c")

c(9, 5, 3, 5)

# RULE 9. R is case sensitive, so the function *seq* exists,
#         but the function *Seq* does not:

Seq(from = 5, to = 20, by = 0.5)

# RULE 10. In R, white space between the elements of a
#          command is ignored:

seq(from=5, to=20, by=0.5)
seq(from = 5, to = 20, by = 0.5)
seq(from=5, to=20, by= 0.5)


### D. SOME ADDITIONAL EXAMPLES OF SIMPLE COMMANDS #########

# *rep*
rep(x = "R", times = 10)
rep(times = 10, x = "R")
rep("R", 10)
rep(10, "R")   # Why doesn't this work?

# *rnorm*
rnorm(n=10)
rnorm(n=100, mean=10, sd=5)

# *rpois*
rpois(n=10, lambda=5)
rpois(n=10)   # Why doesn't this work?

# *paste*
paste("R", "Basics", "Workshop", sep="_")
paste("R", "Basics", "Workshop", sep=" ")

# *sum*
sum(19, 4, 2, 6, 2)
sum(6, 4, 19, 2, 2)

# *log*
log(x=10)   # What type of logarithm is calculated here?


### E. FUNCTIONS WITHIN FUNCTIONS ##########################

# It is very common in R to have commands (lines of code)
# that combine multiple functions, with functions nested
# inside other functions. For example:

c(19, 4, 2, 6, 2)          # Concatenates multiple values.

mean(x=c(19, 4, 2, 6, 2))  # Calculates the average of the
                           # values in the argument x.

mean(x=19, 4, 2, 6, 2)     # This version does NOT do the same!
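
# A minimal sketch of why the two versions of *mean* above
# differ (this note is an addition to the workshop script).
# With *c*, all five values are passed together to the
# argument *x*. Without *c*, only 19 is passed to *x*; the
# other values are matched by position to other arguments of
# *mean* (see help(mean)), so they are not averaged.

mean(x = c(19, 4, 2, 6, 2))   # 6.6, the average of all five values
sum(19, 4, 2, 6, 2) / 5       # 6.6, the same average done by hand
mean(x = 19, 4, 2, 6, 2)      # 19, only the value in *x* is used
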

mean(x=rnorm(100))

# Another example, a little more complex:

rnorm(n=50, mean=0, sd=1)   # Generates 50 values from a
                            # normal distribution with an
                            # average of 0 and a standard
                            # deviation of 1

boxplot(x=list(rnorm(n=50, mean=0, sd=1), rnorm(n=500, mean=3, sd=1)))

## THE FUNCTIONS "INSIDE" ARE ALWAYS EVALUATED (RUN) BEFORE
## THOSE "OUTSIDE"

# When you find these complex commands, it can be useful to
# translate them into English from the inside out

# In R:
boxplot(x=cbind(rnorm(n=50, mean=0, sd=1), rnorm(n=50, mean=3, sd=1)))

# In English:
# A. Obtain 50 random values from a normal distribution with
#    a mean of 0 and a sd of 1.
# B. Obtain another 50 values from a normal distribution
#    with a mean of 3 and a sd of 1.
# C. Combine the two sets of values by columns.
# D. Use the combined values to make a boxplot, with one box
#    per column.


### F. MAIN SOURCES OF HELP ON FUNCTIONS AND ARGUMENTS #####

# 1. Read the help file for the function
# 2. Search the web - use Google (it has all the answers:
#    https://www.youtube.com/watch?v=YuOBzWF0Aws)
# 3. Ask a question in an online forum -
#    www.r-project.org/mail.html
# 4. Study the code behind the function - type the name of
#    the function without the parentheses


### G. THE R HELP FILE FOR A FUNCTION ######################

help(lm)

# The most important parts of the help file:

# 1. DESCRIPTION - a brief description of what the
#    function does
# 2. USAGE - how to call the function, with its arguments
#    in their pre-determined order
# 3. ARGUMENTS - a description of each of the function
#    arguments
# 4. DETAILS - details about how the function works
# 5. VALUE - a description of the results of the function
# 6. SEE ALSO - a list of related functions
# 7. EXAMPLES - a series of examples on how to use the
#    function

## GET USED TO CONSULTING THE HELP FILES OFTEN! ##
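
# A small optional sketch to complement the help files (this
# note is an addition to the workshop script). The base R
# functions *args* and *formals* are not used elsewhere in
# this presentation; they show the argument names, order and
# pre-determined values of a function directly in the
# console:

args(rnorm)             # Prints the usage of *rnorm*
names(formals(rnorm))   # Argument names: "n" "mean" "sd"
formals(rnorm)$mean     # Pre-determined value of *mean*: 0
formals(rnorm)$sd       # Pre-determined value of *sd*: 1
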