# Functions & Libraries

A function is a collection of statements that work together to complete a certain task. A function can accept information in the form of arguments/parameters (= input). As a result, it can return data (= output).

R includes a vast variety of built-in functions, and users can write their own.

## Call a Function

You have already called functions in previous chapters, perhaps without noticing.
In the following, we will use the `sd()` function, which calculates the standard deviation, to demonstrate how to call a function.

Type the name of the function followed by parentheses, such as `sd()`, and place the arguments between the parentheses (multiple arguments are separated by commas):

```r
# Calculate the standard deviation of 2, 4, 5 and 8
sd(c(2, 4, 5, 8))
```

output:

```
[1] 2.5
```

Variables can also be used as function arguments:

```r
values <- c(2, 4, 5, 8)
sd(values)
```

output:

```
[1] 2.5
```

You may also assign the function's output to a variable if you wish to reuse it later:

```r
my_sd <- sd(values)
my_sd
```

output:

```
[1] 2.5
```

## Built-in Functions

Basic examples of built-in functions are `seq()`, `mean()`, `max()`, `sum()`, and `paste()`, among others.

### Function Information

With `help()` or `?`, you can get information about a function:

```r
# Find out information about sd()
help(sd)
?sd
```

With `args()`, you can display the arguments of a function:

```r
# Show the arguments of sd()
args(sd)
```

output:

```
function (x, na.rm = FALSE)
NULL
```
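If you would rather work with the argument list in code than read it on screen, base R's `formals()` returns the arguments as a named list. The following is a minimal sketch; the variable name `sd_args` is only chosen for illustration.

```r
# formals() returns the arguments of a function as a named list
sd_args <- formals(sd)

# Names of the arguments
names(sd_args)

# Default value of the na.rm argument
sd_args$na.rm
```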

### Argument Matching

The description for `sd()` shows that the function actually takes two arguments: `sd(x, na.rm = FALSE)`.

When you enter `sd(values)` into the R console, R understands that `values` is the argument `x` and not `na.rm`.
That is because of the arguments' positions (`values` comes first in the call, just as `x` comes first in the definition).

Another way to match arguments is by name, using the equal sign: `sd(x = values)`.
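The two ways of matching arguments can be compared directly. The sketch below reuses the `values` vector from earlier; the result names `by_position`, `by_name`, and `reordered` are only chosen for illustration.

```r
values <- c(2, 4, 5, 8)

# Positional matching: values is matched to x, the first argument
by_position <- sd(values)

# Matching by name: the argument is named explicitly
by_name <- sd(x = values)

# Named arguments may be given in any order
reordered <- sd(na.rm = FALSE, x = values)

# All three calls give the same result
identical(by_position, by_name)    # TRUE
identical(by_position, reordered)  # TRUE
```

Naming arguments takes a little more typing, but it makes a call easier to read when a function has many arguments.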

### Default Values

The second argument of the `sd()` function shows that `na.rm` is set to `FALSE` by default (that is, whenever you do not specify this argument yourself).
This means that missing values will not be removed before the calculation.
However, default values can be overridden, e.g. with `na.rm = TRUE`.

In contrast, `x` has no default value.
If you do not supply a value for an argument that has no default, R raises an error as soon as the function tries to use that argument.
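The effect of the default can be seen with a vector that contains a missing value. This is a minimal sketch; the vector `values_na` is made up for illustration, and `tryCatch()` is used only so that the error message can be captured instead of stopping the script.

```r
# Example vector with one missing value
values_na <- c(2, 4, NA, 8)

# Default na.rm = FALSE: the missing value propagates, the result is NA
sd(values_na)

# Overriding the default removes the NA before the calculation
sd(values_na, na.rm = TRUE)

# Omitting x raises an error; tryCatch() captures its message
msg <- tryCatch(sd(), error = function(e) conditionMessage(e))
msg
```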

## Nested Functions

R allows you to use functions within functions.

To get the mean absolute difference between two vectors, you can nest `abs()` inside `mean()`:

```r
# Example vectors, containing patient age data
gynaecology <- c(16, 49, 25, 20, 33, 56)
dermatology <- c(77, 56, 16, 28, 43, 64)

# Calculate the mean absolute difference
mean(abs(gynaecology - dermatology))
```

output:

```
[1] 17.16667
```
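R evaluates a nested call from the inside out. To make that order visible, the same calculation can be written step by step; the intermediate variable names below are only chosen for illustration.

```r
gynaecology <- c(16, 49, 25, 20, 33, 56)
dermatology <- c(77, 56, 16, 28, 43, 64)

# Step 1: element-wise differences (innermost expression)
differences <- gynaecology - dermatology

# Step 2: absolute values of the differences
abs_differences <- abs(differences)

# Step 3: mean of the absolute differences (outermost call)
stepwise <- mean(abs_differences)

# The nested call gives the same result
nested <- mean(abs(gynaecology - dermatology))
identical(stepwise, nested)  # TRUE
```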

## User-Defined Functions

### Create a Function

You can define custom functions via the following syntax:

```r
function_name <- function(arguments) {
  body
}
```

**Function name**: It should be short yet clear and meaningful, so that anyone who reads our code understands exactly what the function does.

**Function arguments**: We have already covered what arguments are. A function may also have no arguments, although this is rarely practical. You can have as many arguments as you like, and you may assign default values to them.

**Function body**: The function body is a collection of commands enclosed by curly braces that are executed in a preset sequence each time the function is called.

We want to create a function that squares a given number.

```r
# Build the function square()
square <- function(x) {
  x^2
}

# Calculate 4 squared
square(4)
```

output:

```
[1] 16
```
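The notes on function arguments above mention default values and functions without arguments. Both can be sketched as follows; `power()` and `say_hello()` are hypothetical functions made up for this illustration.

```r
# Hypothetical function power(): raises x to exponent, which defaults to 2
power <- function(x, exponent = 2) {
  x^exponent
}

power(4)                    # default exponent is used: 16
power(4, 3)                 # default is overridden: 64
power(exponent = 3, x = 2)  # arguments matched by name: 8

# Hypothetical function say_hello(): takes no arguments at all
say_hello <- function() {
  "Hello"
}

say_hello()
```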

### Return Values

By default, R returns the value of the function's final statement.
However, you may use the `return()` function to explicitly tell R what to return.

We now rewrite `square()` so that it returns its result explicitly.

```r
# Assign x squared to y and return y explicitly
square <- function(x) {
  y <- x^2
  return(y)
}

# Calculate 4 squared
square(4)
```

output:

```
[1] 16
```
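`return()` is most useful when a function should stop before reaching its final statement. The following is a minimal sketch; `safe_sqrt()` is a hypothetical function made up for this illustration.

```r
# Hypothetical function safe_sqrt(): returns NA for negative input
safe_sqrt <- function(x) {
  if (x < 0) {
    return(NA)  # leave the function immediately
  }
  sqrt(x)       # final statement, returned implicitly
}

safe_sqrt(16)  # 4
safe_sqrt(-4)  # NA
```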

## Function Scoping

Variables created inside an R function are local to that function: they are not accessible from outside it.

Consider our global R environment (our whole program) to be a room that contains all the objects, variables, functions, etc. that we have created. When we refer to a variable `x`, R will search around the room to get the value of `x`.

We may use `ls()` to see what's in our environment.

Each time we call a function, R sets up a fresh temporary environment for it. Imagine setting up a new room within our global R environment. The new room contains all the objects we create and modify within the function. But as soon as the function has finished running, the room disappears.

The next example will show you that the variable `txt` does not exist outside the function, only inside it.

```r
# Create a function with a local variable
R <- function() {
  txt <- "cool"
  paste("R is", txt)
}

# Print out R()
R()
```

output:

```
[1] "R is cool"
```

```r
# Call the variable txt
txt
```

output:

```
Error: object 'txt' not found
```

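You may use the super-assignment operator `<<-` to define a global variable inside a function. It looks for an existing variable of that name in the enclosing environments and, if it finds none, creates the variable in the global environment.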

```r
# Create a function with a global variable
R <- function() {
  txt <<- "cool"
  paste("R is", txt)
}

# Print out R()
R()
```

output:

```
[1] "R is cool"
```

```r
# Call the variable txt
txt
```

output:

```
[1] "cool"
```

The variable `txt` is now accessible outside the function.
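Scoping works in one direction only: a function can read variables from the environment it was defined in, but an ordinary assignment with `<-` inside the function never changes them. The sketch below illustrates this; the variable `counter` and the three functions are hypothetical names made up for this illustration.

```r
counter <- 10

# A function can read a variable from the enclosing (here: global) environment
read_counter <- function() {
  counter + 1
}

# <- inside a function creates a local variable; the outer one is untouched
change_local <- function() {
  counter <- 0
  counter
}

# <<- modifies the existing outer variable instead
change_global <- function() {
  counter <<- 0
  counter
}

read_result <- read_counter()   # 11
local_result <- change_local()  # 0
after_local <- counter          # still 10
change_global()
after_global <- counter         # now 0
```

Because `<<-` changes state outside the function, it can make code harder to follow; returning a value and assigning it at the call site is often the clearer choice.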

## R Packages

R packages are accessible collections of data, code, and documentation and are essentially additions or extensions to the R software.

R already offers built-in packages, such as the `base` package, which includes functions such as `mean()`, `list()`, and `sample()`, among others.
However, for more in-depth data analysis, you might want to use more than that.

### Install and Load Packages

You can use R's built-in `install.packages()` function to install packages on your computer's hard drive.
By default, this function connects to CRAN (Comprehensive R Archive Network), a repository containing thousands of packages, and downloads the requested package from there.

Afterwards, you have to load the package into memory by using `library()`.
This makes the package's functionality available throughout the current R session.
Therefore, at the start of each new R session, you must load all the packages you intend to use.
Alternatively, you could call the `require()` function, which works similarly; if the package is not installed, it returns `FALSE` with a warning instead of stopping with an error.

```r
# Install the ggplot2 package
install.packages("ggplot2")

# Load the ggplot2 package
library(ggplot2)
```
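Since installation is only needed once, it can be useful to check whether a package is already installed before loading it. The sketch below uses base R's `requireNamespace()` for this; the helper `is_installed()` is a hypothetical name made up for this illustration.

```r
# Hypothetical helper: TRUE if the package is installed and can be loaded
is_installed <- function(pkg) {
  requireNamespace(pkg, quietly = TRUE)
}

# The stats package ships with R, so this is TRUE
is_installed("stats")

# Load ggplot2 only if it is installed; otherwise print a hint
if (is_installed("ggplot2")) {
  library(ggplot2)
} else {
  message("ggplot2 is not installed; run install.packages(\"ggplot2\") first.")
}
```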

### Identify Loaded Packages

If we want to see which packages we have loaded, we may go to the Packages tab in RStudio's bottom-right pane. There we may also search for packages and load them by ticking the box next to them.

You could also enter `(.packages())` or `search()` into the console.
They will display all the packages that are currently loaded into memory.

```r
(.packages())
```

output:

```
[1] "ggplot2"   "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"   "base"
```

```r
search()
```

output:

```
 [1] ".GlobalEnv"        "package:ggplot2"   "tools:rstudio"     "package:stats"     "package:graphics"
 [6] "package:grDevices" "package:utils"     "package:datasets"  "package:methods"   "Autoloads"
[11] "package:base"
```
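The output of `search()` can also be checked in code, for example to test whether one particular package is attached. This is a minimal sketch; the helper `is_attached()` is a hypothetical name made up for this illustration.

```r
# Hypothetical helper: TRUE if the package appears on the search path
is_attached <- function(pkg) {
  paste0("package:", pkg) %in% search()
}

# The base package is always attached
is_attached("base")
```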

### Package Information

By clicking a package name in the Packages tab, we can view additional information about that package in the Help tab, for example by clicking the `ggplot2` package.

Alternatively, we may type `help(package = "ggplot2")` into the R console. This also opens the Help tab.
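Some package details can also be queried directly from the console. The sketch below uses the `stats` package, which ships with R, so that it runs without installing anything; the variable names are only chosen for illustration.

```r
# Installed version of a package
stats_version <- packageVersion("stats")
stats_version

# Title field from the package's description
stats_title <- packageDescription("stats")$Title
stats_title
```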