Chapter 2 Functions
In programming, functions are like little blocks of code that perform a specific task. Think of them as reusable instructions that you can call whenever you need them.
Here’s why functions are super helpful:
- Avoid repetition: Instead of writing the same code multiple times, you can just call the function.
- Cleaner code: Your code becomes easier to read and maintain because functions help organize it better.
- Easier debugging: When something goes wrong, you only need to check the function itself rather than searching through your entire program.
Why Use Functions?
Imagine having to rewrite a set of instructions every time you need them! With functions, you write the code once and reuse it as many times as you want. A good rule of thumb is: if you expect to run a specific set of instructions more than twice, create a function for it.
What Can Functions Do?
Functions are flexible and can be used for many different purposes:
- Take input (called arguments)
- Process the input based on what the function is meant to do
- Return a result after completing the task
2.1 Writing Functions
Let’s take a tour of the different types of functions in R before diving deep into writing them. This will help you understand when to write your own function and when to use a readily available one. There are three main types of functions:
- User-Defined Functions (UDF) – Custom functions you write for your specific needs.
- Built-in functions – These come pre-loaded in R. Example: mean()
- Package functions – Functions from external R packages you can install. Examples: ggplot() and select() from ggplot2 and dplyr respectively.
2.1.1 User-Defined Functions
The best way to grasp how functions work in R is by creating your own! These are called User-Defined Functions (UDFs), and they allow you to design custom tasks that fit your needs.
In R, functions typically follow this format:
function_name <- function(argument_1, argument_2) {
  # Function body (your instructions go here)
  return(output)
}
Let’s break down the key elements:
- Function Name: This is how you’ll call your function later. When you create a function, you assign it a name and save it as a new object. For example, if you name your function calculate_mean, that’s the name you’ll use every time you want to run the function.
- Arguments (also called Parameters): Arguments are placed inside the parentheses. They tell the function what input to expect or how to modify its behavior. Think of them as placeholders for the data you’ll provide later when you run the function.
- Function Body: Inside the curly brackets {}, you’ll write the instructions that the function will follow to accomplish the task. This is the “heart” of the function.
- Return Statement: The return() function tells R what result to give you after the function finishes its job. It’s optional, because R returns the value of the last evaluated expression anyway, but writing it makes clear which result the function hands back.
Let’s write a simple function that calculates the mean (average) of two numbers:
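A minimal sketch of such a function, assuming the name calculate_mean mentioned earlier and the hypothetical argument names x and y:

```r
# Function to calculate the mean of two numbers
# (the argument names x and y are illustrative assumptions)
calculate_mean <- function(x, y) {
  # Add the two numbers and divide by 2
  mean_value <- (x + y) / 2
  # Return the result so we can use it later
  return(mean_value)
}
```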
How to Use the Function:
To find the mean of 10 and 20, simply call the function like this:
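Assuming the function is named calculate_mean and takes the two numbers as its arguments, the call might look like this (the one-line definition is repeated only so the snippet runs on its own):

```r
# Compact two-number mean, repeated here so the snippet is self-contained
calculate_mean <- function(x, y) (x + y) / 2

# Call the function with 10 and 20
calculate_mean(10, 20)
```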
## [1] 15
Let’s try another simple task: writing a function that calculates the difference between two numbers. Whenever you have two values and need to subtract one from the other, this function will do it for you!
# Function to calculate the difference between two numbers
calculate_difference <- function(x, y) {
  # Subtract the second number (y) from the first number (x)
  difference <- x - y
  # Return the difference result so we can use it later
  return(difference)
}
Notice that:
- x and y are our arguments: These are the two numbers we’ll use in our calculation.
- The subtraction happens inside the function: We simply subtract y from x and store the result in difference.
- Finally, we return the difference: This way, we can use the result when we call the function.
Now, let’s put it to the test! We’ll run the function with different sets of numbers and see what we get:
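The calls might look like the sketch below. The input pairs are hypothetical, chosen only so that the results match the three outputs shown next, and the one-line definition is repeated so the snippet runs on its own:

```r
# Compact version of calculate_difference(), repeated for self-containment
calculate_difference <- function(x, y) x - y

# Illustrative inputs (assumptions, not the original values)
calculate_difference(10, 5)
calculate_difference(25, 15)
calculate_difference(50, 30)
```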
## [1] 5
## [1] 10
## [1] 20
Notice how easy it is to calculate the difference between any two numbers by just calling our function? That’s the power of writing your own functions—they make life a lot easier!
Now, let’s make it more interesting! How about a function that greets you by name? We can do this in R by creating a simple function that takes someone’s name and returns a greeting.
Here is how we do it:
# Function to greet a student by their name
greet_student <- function(student_name) {
  # Create a personalized greeting
  greeting <- paste("Hello", student_name, "!")
  # Return the greeting so we can use it later
  return(greeting)
}
Remember!
- We use student_name as the argument: This is where you pass in the name of the student.
- We combine "Hello" with the name: The paste() function (a built-in function that we will discuss later in the course) helps us put the pieces together to form a full sentence. By default, paste() separates the pieces with a space, which is why a space appears before the exclamation mark in the output.
- Return the greeting: The function gives us back a customized message, ready to greet anyone!
Let’s try it out with different names:
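A sketch of the calls, using the three names that appear in the outputs (the definition is repeated in compact form so the snippet runs on its own):

```r
# Compact version of greet_student(), repeated for self-containment
greet_student <- function(student_name) paste("Hello", student_name, "!")

greet_student("John")
greet_student("Alice")
greet_student("Michael")
```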
## [1] "Hello John !"
## [1] "Hello Alice !"
## [1] "Hello Michael !"
Remember to try it out with your name!
Key Takeaways:
By writing these simple functions, you’ve already tackled a lot of important concepts in R! You now know:
- How to create a function.
- How to pass arguments (inputs/parameters) to a function.
- How to return a result that you can use later.
Practical Exercise
In this exercise, you’ll get hands-on practice creating your own functions in R. Follow the instructions below to write functions that perform specific tasks. Remember to test your functions with different input values!
- Create a function called add_numbers that takes two arguments, a and b, and returns their sum.
- Write a function named is_even that takes a single argument, num, and returns "Even" if the number is even, or "Odd" if it’s odd.
- Create a function called find_max that takes three arguments and returns the largest of the three numbers.
Solution
- Create a function called add_numbers that takes two arguments, a and b, and returns their sum.
# Function to calculate the sum of two numbers
add_numbers <- function(a, b) {
  total <- a + b
  return(total)
}
# Test the function with different values
add_numbers(5, 10) # Output: 15
## [1] 15
## [1] 50
## [1] 300
- Write a function named is_even that takes a single argument, num, and returns "Even" if the number is even, or "Odd" if it’s odd.
# Function to check if a number is even or odd
is_even <- function(num) {
  # %% gives the remainder of the division by 2
  if (num %% 2 == 0) {
    return("Even")
  } else {
    return("Odd")
  }
}
# Test the function with different numbers
is_even(4) # Output: "Even"
## [1] "Even"
## [1] "Odd"
## [1] "Even"
- Create a function called find_max that takes three arguments and returns the largest of the three numbers.
# Function to find the maximum of three numbers
find_max <- function(a, b, c) {
  max_value <- max(a, b, c) # Use the built-in max function
  return(max_value) # Return the maximum value
}
# Test the function with different values
find_max(10, 20, 5) # Output: 20
## [1] 20
## [1] 3
## [1] 15
________________________________________________________________________________
2.1.2 Built-in Functions
We have learned how to create our own user-defined functions (UDFs) to perform specific tasks. Now, let’s dive deeper into R’s capabilities by exploring its built-in functions. These handy tools are readily available for you to use anytime, making your coding experience even smoother.
R is packed with a treasure trove of built-in functions that allow you to perform a variety of tasks with just a few simple commands. Whether you’re crunching numbers or analyzing data, these functions are your best friends.
Here’s a sneak peek at some of the most useful built-in functions in R:
- print(): This function displays an R object right on your console. It’s like saying, “Hey, look at this!” For example, print("Hello Mum") gives:
## [1] "Hello Mum"
- min() and max(): Need to find the smallest or largest number in a bunch? These functions will do just that for a numeric vector.
- sum(): Want to add up a series of numbers? Use sum() to get the total of a numeric vector.
- mean(): This function calculates the average of your numbers. Perfect for when you need to find the middle ground!
- range(): Curious about the minimum and maximum values of your numeric vector? range() has you covered.
- str(): Want to understand the structure of an R object? str() will give you a clear picture of what’s inside.
- ncol(): If you’re working with matrices or data frames, this function tells you how many columns you have.
- length(): This one returns the number of items in an R object, whether it’s a vector, a list, or a matrix.
Here’s a quick example to show you how easy it is to use these functions with a vector of numbers:
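A sketch consistent with the outputs shown next, assuming a vector named numbers (the name is an assumption; the values are the ones printed in the first output line):

```r
# Create a numeric vector
numbers <- c(1, 3, 0.2, 1.5, 1.7)

# Display the vector
print(numbers)

# Total of all elements
sum(numbers)

# Average of the elements
mean(numbers)

# Number of elements
length(numbers)
```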
## [1] 1.0 3.0 0.2 1.5 1.7
## [1] 7.4
## [1] 1.48
## [1] 5
As you can see, working with R’s built-in functions is straightforward and super helpful. Start experimenting with these functions and watch how they can simplify your coding experience!
Key Takeaways:
By working through this section, you’ve already tackled several important concepts in R! You now know:
- How to create a vector and use it for calculations.
- How to utilize built-in functions like sum(), max(), min(), mean(), and length().
- How to derive meaningful statistics from data using R’s built-in capabilities.
R has a wealth of resources on this topic, and as you gain more experience and knowledge, you’ll uncover even more advanced built-in functions that can simplify your programming tasks.
Practical Exercise
In this exercise, you are required to create a vector named numbers that contains the following values: 4, 8, 15, 16, 23, 42. After creating the vector, you will use various built-in functions to analyze it based on the instructions below:
- Use the sum() function to calculate the total of the numbers vector.
- Use the max() function to find the maximum value in the numbers vector.
- Use the min() function to find the minimum value in the numbers vector.
- Use the mean() function to calculate the average of the numbers vector.
- Use the length() function to find out how many elements are in your numbers vector.
Solution
In this exercise, you are required to create a vector named numbers that contains the following values: 4, 8, 15, 16, 23, 42. After creating the vector, you will use various built-in functions to analyze it based on the instructions below:
- Use the sum() function to calculate the total of the numbers vector.
## [1] 108
- Use the max() function to find the maximum value in the numbers vector.
## [1] 42
- Use the min() function to find the minimum value in the numbers vector.
## [1] 4
- Use the mean() function to calculate the average of the numbers vector.
## [1] 18
- Use the length() function to find out how many elements are in your numbers vector.
## [1] 6
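The five results above can be reproduced with a short script like the following sketch, which uses only the vector and the functions named in the exercise:

```r
# Create the vector
numbers <- c(4, 8, 15, 16, 23, 42)

# Total of the vector
sum(numbers)

# Largest and smallest values
max(numbers)
min(numbers)

# Average of the values
mean(numbers)

# Number of elements
length(numbers)
```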
________________________________________________________________________________
2.1.3 Package Functions
Just like we’ve learned about User-Defined and Built-in Functions, R also provides a vast number of additional functions through packages. These packages extend R’s capabilities and allow you to perform specific tasks, from data manipulation to machine learning, with ease.
What are R Package Functions?
Packages in R are collections of R functions, data, and compiled code that are stored in a well-defined format. While R comes with a set of built-in functions, packages allow you to go beyond the basic functionality. You can install and load packages based on the task you want to accomplish.
Think of package functions as tools in a toolbox: not everything is built-in, but by adding specific tools, you can perform new tasks easily.
Let’s explore how to get started using these functions:
- Installing and Loading Packages
To use functions from a package, you first need to install the package and then load it into your R session. Installation is done once with:
install.packages("package_name")
You must then load the package every time you start a new R session in which you want to use its functions. Load the package with:
library(package_name)
To put this into real-life action, let’s learn about the dplyr package, which is commonly used for data manipulation. It contains many useful functions to work with data frames or tibbles (a modern version of data frames).
Here’s an example of how to install and load dplyr, and use some of its core functions.
- Install the package
install.packages("dplyr") # Install it once
- Load the package
library(dplyr) # Load it in every new session
- Let’s explore a few package functions from dplyr:
  - select(): Chooses specific columns from a dataset.
  - filter(): Filters rows based on conditions.
  - mutate(): Adds new variables (columns) or modifies existing ones.
  - summarise(): Summarizes data, such as calculating the mean or total.
We will create a data frame to demonstrate how to use functions from the dplyr package.
# Create a data frame for demonstration
data <- data.frame(
  Name = c("John", "Jane", "David", "Anna"),
  Age = c(28, 34, 22, 19),
  Score = c(85, 90, 88, 92)
)
# 1. Select only the Name and Score columns
selected_data <- select(data, Name, Score)
selected_data
## Name Score
## 1 John 85
## 2 Jane 90
## 3 David 88
## 4 Anna 92
# 2. Filter rows where Score is greater than 88
filtered_data <- filter(data, Score > 88)
filtered_data
## Name Age Score
## 1 Jane 34 90
## 2 Anna 19 92
# 3. Add a new column that increases Score by 10
mutated_data <- mutate(data, New_Score = Score + 10)
mutated_data
## Name Age Score New_Score
## 1 John 28 85 95
## 2 Jane 34 90 100
## 3 David 22 88 98
## 4 Anna 19 92 102
# 4. Calculate the average age
summary_data <- summarise(data, Average_Age = mean(Age))
summary_data
## Average_Age
## 1 25.75
In this example, we used functions from the dplyr package to select columns, filter rows, modify data, and summarize it!
Key Takeaways:
By learning about R package functions, you’ve unlocked even more tools to work efficiently in R. Here’s what you’ve learned today:
- How to install and load R packages.
- How to use package functions like those in dplyr for data manipulation.
- How to perform tasks like selecting columns, filtering data, and summarizing values.
Packages in R allow you to extend the functionality of the base language for specific tasks. With packages, R becomes an even more powerful tool, allowing you to work with more advanced data sets and perform complex operations with ease!
Practical Exercise
In this exercise, you will use the functions from the dplyr package to manipulate the iris data set. Remember the dplyr package is installed by:
install.packages("dplyr")
and is loaded by:
library(dplyr)
The iris data set ships with R and can be loaded with data(iris). Calling head(iris) shows its first six rows:
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
Solve the following questions:
- Use the select function to select the Sepal.Length, Sepal.Width, and Species columns.
- Use the filter function to filter rows where Sepal.Length is greater than 5.
- Use the mutate function to create a new column Sepal.Ratio that divides Sepal.Length by Sepal.Width.
Solution
In this exercise, you will use the functions from the dplyr package to manipulate the iris data set. Remember the dplyr package is installed by:
install.packages("dplyr")
and is loaded by:
library(dplyr)
The iris data set ships with R and can be loaded with data(iris). Calling head(iris) shows its first six rows:
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 4.9 3.0 1.4 0.2 setosa
## 3 4.7 3.2 1.3 0.2 setosa
## 4 4.6 3.1 1.5 0.2 setosa
## 5 5.0 3.6 1.4 0.2 setosa
## 6 5.4 3.9 1.7 0.4 setosa
Solve the following questions:
- Use the select function to select the Sepal.Length, Sepal.Width, and Species columns.
## Sepal.Length Sepal.Width Species
## 1 5.1 3.5 setosa
## 2 4.9 3.0 setosa
## 3 4.7 3.2 setosa
## 4 4.6 3.1 setosa
## 5 5.0 3.6 setosa
## 6 5.4 3.9 setosa
- Use the filter function to filter rows where Sepal.Length is greater than 5.
## Sepal.Length Sepal.Width Petal.Length Petal.Width Species
## 1 5.1 3.5 1.4 0.2 setosa
## 2 5.4 3.9 1.7 0.4 setosa
## 3 5.4 3.7 1.5 0.2 setosa
## 4 5.8 4.0 1.2 0.2 setosa
## 5 5.7 4.4 1.5 0.4 setosa
## 6 5.4 3.9 1.3 0.4 setosa
- Use the mutate function to create a new column Sepal.Ratio that divides Sepal.Length by Sepal.Width.
## [1] 1.457143 1.633333 1.468750 1.483871 1.388889 1.384615
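The three outputs above can be reproduced with a sketch like the following. It assumes dplyr is installed; the object names (selected_iris and so on) are hypothetical, and head() is used on the assumption that only the first six rows were printed:

```r
library(dplyr)

# iris ships with base R (datasets package)
data(iris)

# 1. Keep only three columns
selected_iris <- select(iris, Sepal.Length, Sepal.Width, Species)
head(selected_iris)

# 2. Keep rows where Sepal.Length is greater than 5
filtered_iris <- filter(iris, Sepal.Length > 5)
head(filtered_iris)

# 3. Add the ratio of sepal length to sepal width as a new column
mutated_iris <- mutate(iris, Sepal.Ratio = Sepal.Length / Sepal.Width)
head(mutated_iris$Sepal.Ratio)
```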
________________________________________________________________________________
2.1.4 Types of Arguments in R Functions
Now that we’ve learned and explored different types of functions, let’s dive into function arguments to strengthen your understanding of writing functions. Arguments are essential components of any function. Although it’s possible to write a function without parameters, like the example below, most functions do require arguments to tell them what data to process.
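A minimal sketch of a function without parameters (the name say_hello is hypothetical):

```r
# A function with no arguments: the parentheses stay empty
say_hello <- function() {
  print("Hello!")
}

# Call it with empty parentheses
say_hello()
```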
Why Arguments Matter
Arguments are the input for functions. They allow us to give the function specific values to work with. If we want a function to handle different cases or data, arguments give us that flexibility.
When defining arguments, you include them inside the parentheses of the function definition, separated by commas. Generally, functions with more arguments tend to be more complex, but they also offer greater control over what the function does.
# Creating a function with arguments
my_function <- function(argument1, argument2) {
  # function body
}
Handling Missing Arguments
Whenever you create a function with parameters, you must provide the values for those parameters when calling the function. Otherwise, R raises an error as soon as the function tries to use the missing value. For example, if you forget to supply both numbers in a function to calculate their mean, the function won’t work.
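A short sketch of what happens when an argument is left out, using a hypothetical two-number mean function. tryCatch() is used here only so that the error message can be displayed without stopping the script:

```r
# Hypothetical function with two required arguments
calculate_mean <- function(x, y) {
  (x + y) / 2
}

# Calling it with only one value fails as soon as y is needed
result <- tryCatch(
  calculate_mean(10),
  error = function(e) conditionMessage(e)
)

# result now holds the text of the error message
print(result)
```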
But you can avoid this issue by using default arguments. These are preset values that the function will use if you don’t provide them during the call. Let’s modify our mean function to demonstrate:
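A sketch of the modified function, assuming the hypothetical names calculate_mean, x, and y, and the default of 30 for the second number mentioned in the text:

```r
# Mean of two numbers, where the second number defaults to 30
calculate_mean <- function(x, y = 30) {
  mean_value <- (x + y) / 2
  return(mean_value)
}
```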
In this version, if you only provide one value when calling the function, R will automatically use the default value for the second number (which is 30 in this case):
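Given the output shown next, the call presumably passed 10 as the only value. A sketch under that assumption (the definition is repeated in compact form so the snippet runs on its own):

```r
# Compact version of the function with a default value, repeated for self-containment
calculate_mean <- function(x, y = 30) (x + y) / 2

# Only the first number is supplied, so y falls back to 30
calculate_mean(10)
```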
## [1] 20
You now understand how arguments work and how default values make functions more flexible and less error-prone.
Practical Exercise
- Create a function greet that prints a simple message like "Hello, welcome to R programming!".
- Write a function multiply_numbers that takes two arguments, a and b, and returns the product of these numbers.
- Create a function calculate_total that accepts two arguments, price and tax_rate, and returns the price including tax. Set a default value of tax_rate = 0.15 (15%).
Solution
- Create a function greet that prints a simple message like "Hello, welcome to R programming!".
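One possible solution, sketched from the exercise text and the output shown next:

```r
# A function with no arguments that prints a fixed message
greet <- function() {
  print("Hello, welcome to R programming!")
}

# Call the function
greet()
```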
## [1] "Hello, welcome to R programming!"
- Write a function multiply_numbers that takes two arguments, a and b, and returns the product of these numbers.
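One possible solution. The inputs 6 and 8 are an assumption, chosen only because their product matches the output shown next:

```r
# Function to multiply two numbers
multiply_numbers <- function(a, b) {
  product <- a * b
  return(product)
}

# Illustrative inputs (assumed, not the original values)
multiply_numbers(6, 8)
```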
## [1] 48
- Create a function calculate_total that accepts two arguments, price and tax_rate, and returns the price including tax. Set a default value of tax_rate = 0.15 (15%).
calculate_total <- function(price, tax_rate = 0.15) {
  # Add the tax amount to the price
  total <- price + (price * tax_rate)
  return(total)
}
# Call the function
calculate_total(price = 160)
## [1] 184
________________________________________________________________________________
2.1.5 Understanding Return Values in R Functions
In many programming languages, functions take data as input and produce some result as output. Often, you must use a return statement to explicitly give back the result. Otherwise, the value might only be visible inside the function and not available to use later. But in R, the situation is a little more relaxed!
In R, a function always returns a value that can be stored in a variable, even without a return statement: the value of the last expression it evaluates. However, for clarity and good practice, it’s still helpful to include return() to show your intent.
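A small sketch of this behaviour, using two hypothetical functions that differ only in whether return() is written out:

```r
# Implicit return: the last evaluated expression becomes the result
square_implicit <- function(x) {
  x^2
}

# Explicit return: same result, with the intent spelled out
square_explicit <- function(x) {
  return(x^2)
}

# Both results can be stored in variables
a <- square_implicit(4)
b <- square_explicit(4)

# Check that the two values are the same
identical(a, b)
```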
Let’s walk through an example that returns two values at once by bundling them in a list:
mean_sum <- function(num_1, num_2) {
  mean <- (num_1 + num_2) / 2
  sum <- num_1 + num_2
  return(list(mean = mean, sum = sum))
}
Now, calling the function:
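Judging by the output shown next, the call passed 10 and 20. A sketch under that assumption (the function is repeated in compact form so the snippet runs on its own):

```r
# Compact version of mean_sum(), repeated for self-containment
mean_sum <- function(num_1, num_2) {
  list(mean = (num_1 + num_2) / 2, sum = num_1 + num_2)
}

# Returns a list with two named elements
mean_sum(10, 20)
```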
## $mean
## [1] 15
##
## $sum
## [1] 30
2.2 Calling the Functions
In previous sections, we’ve seen how to call functions with different arguments. Now, let’s dig a little deeper into how R works behind the scenes when you pass arguments to a function.
R allows two main ways of passing arguments:
- By position – The arguments are passed in the same order as the function definition.
- By name – You explicitly mention the argument name and its value.
You can also mix these two strategies! Let’s explore these options using an example.
Here’s a simple function that takes two arguments: name and surname.
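A sketch of such a function. The name hello is borrowed from the documentation example later in this chapter, and the greeting format is inferred from the outputs below:

```r
# Greet a person by name and surname
hello <- function(name, surname) {
  print(paste("Hello", name, surname))
}
```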
Let’s call the function using different strategies:
- By Position
You pass the arguments in the exact order the function expects.
## [1] "Hello Jane McCain"
- By Name
When using this method, the order doesn’t matter. You just specify the argument names.
## [1] "Hello Jane McCain"
- Mixing Position and Name
You can mix both approaches. Named arguments are matched first, then the remaining ones are matched by position.
## [1] "Hello Jane McCain"
This flexibility can make your code easier to read and maintain, especially when functions have many arguments!
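The three calling styles above can be sketched together as follows, assuming a function named hello with the arguments name and surname:

```r
# Assumed definition of the two-argument greeting function
hello <- function(name, surname) {
  print(paste("Hello", name, surname))
}

# By position: values are matched in the order of the definition
hello("Jane", "McCain")

# By name: the order no longer matters
hello(surname = "McCain", name = "Jane")

# Mixed: surname is matched by name first, then "Jane" fills name by position
hello(surname = "McCain", "Jane")
```

All three calls print the same greeting.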
2.3 Function Documentation
Finally, when writing functions, it’s always a good idea to provide documentation to guide users on how to use the function. This is especially important when dealing with complex functions or when the function is shared with others.
One simple way to add documentation is by including comments in the body of your function. These comments explain what each part of the function does. This is an informal method, but it helps both you and others quickly understand what’s happening in the function.
Here’s an example:
hello <- function(name, surname) {
  # Say hello to a person with their name and surname
  print(paste('Hello,', name, surname))
}
If you type the function’s name without parentheses (just hello), R prints the function’s source code instead of running it, comments included:
## function(name, surname) {
## # Say hello to a person with their name and surname
## print(paste('Hello,', name, surname))
## }
If your function is part of a larger package and you want it to be properly documented, you should write formal documentation in a separate .Rd file. These files store structured documentation, which you can access using ?function_name in R, similar to the help file you see for built-in functions like ?mean.
Formal documentation includes details such as:
- Function name and description.
- Arguments and their roles.
- Examples of how to use the function.
- Output that the function returns.
This approach ensures that users can easily understand and use your function, even in complex packages.
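In practice, .Rd files are often generated rather than written by hand. One common route, mentioned here as an assumption about workflow rather than something this chapter prescribes, is the roxygen2 package, which builds them from specially formatted #' comments placed above the function. A sketch for the hello function:

```r
#' Say hello to a person
#'
#' @param name Character string with the person's first name.
#' @param surname Character string with the person's surname.
#' @return The greeting as a character string (it is also printed).
#' @examples
#' hello("Jane", "McCain")
hello <- function(name, surname) {
  print(paste('Hello,', name, surname))
}
```

The #' lines are ordinary comments to R itself, so the function behaves exactly as before; they only become help pages once the function lives in a package that is processed with roxygen2.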
2.4 Hands-on Exercise
Attempt this hands-on exercise to confirm your understanding of functions. For at least one of the functions you create, add comments inside the function to explain what each part does.
- Create a User-Defined Function (UDF) named calculate_area that takes two arguments: length and width. The function should return the area of a rectangle.
- Create a vector named values with the numbers 4, 8, 15, 16, 23, 42. Use the built-in sum() function to calculate the total of the values vector and print the result.
- Write a function named greet that takes one argument, student_name, and prints a greeting. Modify the function to have a default argument that greets a "Student" if no name is provided.
- Create a function named mean_and_median that takes a numeric vector as an argument and returns both the mean and median of that vector as a list.
Solution
- Create a User-Defined Function (UDF) named calculate_area that takes two arguments: length and width. The function should return the area of a rectangle.
calculate_area <- function(length, width) {
  # Calculate the area by multiplying length and width
  area <- length * width
  # Return the calculated area
  return(area)
}
# Example usage of the calculate_area function
area_result <- calculate_area(5, 10) # Length: 5, Width: 10
print(paste("Area of rectangle:", area_result))
## [1] "Area of rectangle: 50"
- Create a vector named values with the numbers 4, 8, 15, 16, 23, 42. Use the built-in sum() function to calculate the total of the values vector and print the result.
values <- c(4, 8, 15, 16, 23, 42)
# Use the built-in sum() function to calculate the total of the values vector
total <- sum(values)
# Print the result
print(paste("Total of values vector:", total))
## [1] "Total of values vector: 108"
- Write a function named greet that takes one argument, student_name, and prints a greeting. Modify the function to have a default argument that greets a "Student" if no name is provided.
greet <- function(student_name = "Student") {
  # Print a greeting using the provided name or default to "Student"
  print(paste("Hello,", student_name))
}
# Example usage of the greet function with a provided name
greet("John") # Should print "Hello, John"
# Example usage without a name, which falls back to the default
greet() # Should print "Hello, Student"
## [1] "Hello, John"
## [1] "Hello, Student"
- Create a function named mean_and_median that takes a numeric vector as an argument and returns both the mean and median of that vector as a list.
mean_and_median <- function(num_vector) {
  # Calculate the mean of the vector
  mean_value <- mean(num_vector)
  # Calculate the median of the vector
  median_value <- median(num_vector)
  # Return both mean and median as a list
  return(list(mean = mean_value, median = median_value))
}
# Example usage of the mean_and_median function
results <- mean_and_median(c(12, 19, 21, 14, 9))
# Print the results
print(paste("Mean:", results$mean, ", Median:", results$median))
## [1] "Mean: 15 , Median: 14"
________________________________________________________________________________