# Functions & Libraries

A function is a collection of statements that work together to complete a certain task. A function can accept information in the form of arguments/parameters (= input). As a result, it can return data (= output).

R includes a vast variety of built-in functions, and users can write their own.

## Call a Function

You have already called functions in previous chapters, perhaps without noticing. Here we use the `sd()` function, which calculates the standard deviation, to demonstrate how a function call works.

Type the name of the function followed by parentheses, such as `sd()`, and place the arguments between the parentheses, separated by commas:

```r
# Calculate the standard deviation of 2, 4, 5 and 8
sd(c(2, 4, 5, 8))
# output: 2.5
```

Variables can also be used as function arguments:

```r
values <- c(2, 4, 5, 8)
sd(values)
# output: 2.5
```

You may also assign the function's output to a variable if you wish to reuse it:

```r
my_sd <- sd(values)
my_sd
# output: 2.5
```

## Built-in Functions

Basic examples of built-in functions are `seq()`, `mean()`, `max()`, `sum()`, and `paste()`, among others.
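As a quick illustration of the functions named above, the following sketch calls each one on small made-up inputs. The vector `values` and the strings are assumptions chosen only for demonstration.

```r
# Small made-up input for demonstration
values <- c(2, 4, 5, 8)

# seq(): generate a regular sequence from 1 to 10 in steps of 3
steps <- seq(1, 10, by = 3)        # 1 4 7 10

# mean(), max(), sum(): summarise a numeric vector
avg   <- mean(values)              # 4.75
top   <- max(values)               # 8
total <- sum(values)               # 19

# paste(): join strings, separated by a space by default
label <- paste("R", "is", "cool")  # "R is cool"
```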

### Function Information

With `help()` or `?`, you can open the documentation for a function:

```r
# Find out information about sd()
help(sd)
?sd
```

With `args()`, you can display the arguments of a function:

```r
# Show the arguments of sd()
args(sd)
# output:
# function (x, na.rm = FALSE)
# NULL
```
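If you need the argument names as data rather than as printed text, base R's `formals()` is one option. The sketch below is an addition to the walkthrough; the variable names are chosen only for illustration.

```r
# formals() returns a function's arguments as a named list
sd_args <- formals(sd)

# Names of the arguments of sd()
arg_names <- names(sd_args)        # "x" "na.rm"

# Default value of the na.rm argument
na_rm_default <- sd_args$na.rm     # FALSE
```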

### Argument Matching

The documentation for `sd()` shows that the function actually takes two arguments: `sd(x, na.rm = FALSE)`.

When you enter `sd(values)` into the R console, R understands that `values` is the argument `x` and not `na.rm`. This is because arguments are matched by position: `values` comes first in the call, just as `x` comes first in the definition.

Another way to match arguments is by name, using the equal sign: `sd(x = values)`.
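The two matching styles can be compared side by side. This is a minimal sketch that reuses the `values` vector from earlier; the result variable names are made up for illustration.

```r
values <- c(2, 4, 5, 8)

# Matched by position: values is the first argument, so it becomes x
by_position <- sd(values)

# Matched by name: the equal sign states explicitly which argument is meant
by_name <- sd(x = values)

# Named arguments may be given in any order
reordered <- sd(na.rm = FALSE, x = values)
```

All three calls compute the same standard deviation; naming arguments mainly helps readability once a call has more than one or two of them.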

### Default Values

The second argument of `sd()` shows that `na.rm` is set to `FALSE` by default, even if you do not specify this argument yourself. This means that missing values (`NA`) are not removed before the calculation, so the result is `NA` whenever `x` contains a missing value. Default values can be overridden, e.g. with `na.rm = TRUE`.

In contrast, `x` has no default value. If you omit an argument that has no default and the function needs it, an error will occur.
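The effect of the default can be seen on a vector that contains a missing value. The vector `with_na` and the result names below are assumptions made for this sketch.

```r
# Made-up vector with one missing value
with_na <- c(2, 4, NA, 8)

# Default na.rm = FALSE: the missing value propagates to the result
default_result <- sd(with_na)                  # NA

# Overriding the default removes the NA before the calculation
cleaned_result <- sd(with_na, na.rm = TRUE)    # same as sd(c(2, 4, 8))

# Omitting x, which has no default, raises an error;
# tryCatch() captures the error message instead of stopping the script
missing_x <- tryCatch(sd(), error = function(e) conditionMessage(e))
```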

## Nested Functions

R allows you to use functions within functions.

To get the mean absolute difference of two vectors, you can nest `abs()` inside `mean()`:

```r
# Example vectors containing patient age data
gynaecology <- c(16, 49, 25, 20, 33, 56)
dermatology <- c(77, 56, 16, 28, 43, 64)

# Calculate the mean absolute difference
mean(abs(gynaecology - dermatology))
# output: 17.16667
```
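R evaluates a nested call from the inside out. The sketch below unrolls the same calculation into intermediate steps to make that order visible; the intermediate variable names are made up for illustration.

```r
gynaecology <- c(16, 49, 25, 20, 33, 56)
dermatology <- c(77, 56, 16, 28, 43, 64)

# Step 1: element-wise difference (innermost expression)
differences <- gynaecology - dermatology     # -61 -7 9 -8 -10 -8

# Step 2: absolute values
abs_differences <- abs(differences)          # 61 7 9 8 10 8

# Step 3: mean (outermost call)
step_by_step <- mean(abs_differences)

# The nested one-liner gives the same result
nested <- mean(abs(gynaecology - dermatology))
```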

## User-Defined Functions

### Create a Function

You can define custom functions via the following syntax:

```r
function_name <- function(arguments) {
    body
}
```

- Function name: It should be short yet clear and meaningful, so that anyone reading the code understands exactly what the function does.
- Function arguments: Arguments were covered above. A function may have no arguments at all, although this is rarely practical. You can define as many arguments as you like, and you may assign default values to them.
- Function body: The function body is a collection of commands enclosed in curly braces that are executed in order each time the function is called.

We want to create a function that squares a given number.

```r
# Build the function square()
square <- function(x) {
    x^2
}

# Calculate 4 squared
square(4)
# output: 16
```
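Default values work the same way in your own functions as in built-in ones. The following sketch defines `power()`, a hypothetical helper that is not part of R or of the original text, with a default for its second argument.

```r
# power() is a hypothetical helper; exponent defaults to 2
power <- function(x, exponent = 2) {
    x^exponent
}

# Uses the default exponent
squared <- power(4)                  # 16

# Overrides the default by name
cubed <- power(4, exponent = 3)      # 64
```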

### Return Values

By default, an R function returns the value of the last expression it evaluates. However, you may use the `return()` function to tell R explicitly what to return.

Here is the `square()` function again, this time with an explicit `return()`.

```r
# Store x squared in y, then return y explicitly
square <- function(x) {
    y <- x^2
    return(y)
}

# Calculate 4 squared
square(4)
# output: 16
```
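An explicit `return()` is most useful for leaving a function early. The sketch below uses `safe_sqrt()`, a hypothetical function invented for this illustration, to combine an early return with the implicit return of the last expression.

```r
# safe_sqrt() is a hypothetical helper written for this example
safe_sqrt <- function(x) {
    if (x < 0) {
        return(NA_real_)    # leave the function early for negative input
    }
    sqrt(x)                 # last expression, returned implicitly
}

negative_case <- safe_sqrt(-4)    # NA
positive_case <- safe_sqrt(16)    # 4
```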

## Function Scoping

Variables created inside an R function are local to that function: they are not accessible from outside it.

Think of our global R environment (our whole program) as a room that contains all the objects, variables, functions, etc. that we have created. When we refer to a variable `x`, R searches the room for the value of `x`.

We may use `ls()` to see what's in our environment.

Each time we call a function, R sets up a fresh temporary environment for it. Imagine a new room being set up inside our global R environment. The new room contains all the objects we create, modify, and use within the function. As soon as the function finishes running, the room disappears.

The following example shows that the variable `txt` exists only inside the function, not outside it.

```r
# Create a function with a local variable
R <- function() {
  txt <- "cool"
  paste("R is", txt)
}

# Print out R()
R()
# output: "R is cool"

# Call the variable txt
txt
# output: Error: object 'txt' not found
```

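You may use the global assignment operator `<<-` to define a global variable inside a function. It assigns to an existing variable in an enclosing environment or, if none exists, creates the variable in the global environment.

```r
# Create a function with a global variable
R <- function() {
  txt <<- "cool"
  paste("R is", txt)
}

# Print out R()
R()
# output: "R is cool"

# Call the variable txt
txt
# output: "cool"
```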

The variable `txt` is now accessible outside the function.
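To round off the scoping examples, the sketch below shows that a regular assignment inside a function leaves a same-named variable outside untouched, and that returning a value and reassigning it is a common alternative to `<<-`. The names `count` and `set_local()` are made up for this illustration.

```r
count <- 10

# A regular assignment inside the function creates a local 'count'
set_local <- function() {
    count <- 20
    count
}

inside_value  <- set_local()    # 20, the local value returned by the function
outside_value <- count          # still 10, the outer variable is unchanged

# Alternative to <<-: return the value and assign it at the call site
count <- set_local()            # count is now 20
```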

## R Packages

R packages are collections of code, data, and documentation that extend the R software.

R already ships with built-in packages, such as the `base` package, which includes functions such as `mean()`, `list()`, and `sample()`, among others. However, for more in-depth data analysis, you might want to use more than that.

You can use R's built-in `install.packages()` function to install packages on your computer's hard drive. By default, this function downloads them from CRAN (Comprehensive R Archive Network), a repository containing thousands of packages.

Afterwards, you have to load the package into memory by using `library()`. This makes the package's functionality available for the rest of the current R session. Therefore, at the start of each new R session, you must load all the packages you intend to use. Alternatively, you could call the `require()` function, which works similarly but returns `FALSE` with a warning instead of stopping with an error when the package is not installed.

```r
# Install the ggplot2 package
install.packages("ggplot2")

# Load the ggplot2 package
library(ggplot2)
```
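Reinstalling a package on every run is unnecessary, so scripts often check first. The sketch below is one possible approach, not the only one: `is_installed()` is a hypothetical helper built on base R's `requireNamespace()`, and the built-in `stats` package stands in for any package name.

```r
# is_installed() is a hypothetical helper built on requireNamespace()
is_installed <- function(pkg) {
    requireNamespace(pkg, quietly = TRUE)
}

# stats ships with R, so this check succeeds
stats_available <- is_installed("stats")

# Install only when missing (commented out because it needs internet access)
# if (!is_installed("ggplot2")) install.packages("ggplot2")

# package::function calls a function without attaching the whole package
spread <- stats::sd(c(2, 4, 5, 8))
```

The `package::function` form is handy when you need a single function from a package or want to make clear where a function comes from.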

If we want to see which packages are loaded, we can open the Packages tab in RStudio's bottom-right pane. There we can also search for packages and load them by ticking the box next to their name. Alternatively, enter `(.packages())` or `search()` into the console. Both display the packages that are currently attached to the session.

```r
(.packages())
# output:
# "ggplot2"   "stats"     "graphics"  "grDevices" "utils"     "datasets"  "methods"   "base"
```

```r
search()
# output:
# ".GlobalEnv"        "package:ggplot2"   "tools:rstudio"     "package:stats"
# "package:graphics"  "package:grDevices" "package:utils"     "package:datasets"
# "package:methods"   "Autoloads"         "package:base"
```

### Package Information

Clicking a package name in the Packages tab opens additional information about that package in the Help tab. Alternatively, we may type `help(package = "ggplot2")` into the R console, which also leads to the Help tab.