4  Module 1.1: Introduction to R - Your Breeding Data Analysis Tool

4.0.1 Introduction to R

R is a programming language for data manipulation, visualization, and statistical analysis.

  • What is R? Think of R as a powerful, specialized calculator combined with a programming language. It’s designed specifically for handling data, performing statistical analyses, and creating informative graphs.
  • Why R for Breeding?
    • Free & Open Source: Anyone can use it without cost.
    • Powerful for Data: Excellent at handling the types of large datasets we generate in breeding (phenotypes, genotypes).
    • Cutting-Edge Statistics: Many new statistical methods (like those for genomic selection or GWAS) are first available as R packages.
    • Great Graphics: Create publication-quality plots to visualize your results.
    • Large Community: Lots of help is available online, along with specialized packages for genetics and breeding (such as rrBLUP, which we might see later).
  • R vs. Excel: Excel is great for data entry and simple summaries, but R is much better for complex analysis, automation, reproducible research, and handling very large datasets.

Try these examples in the RStudio Console:

# Basic arithmetic
2 + 5
[1] 7
10 - 3
[1] 7
4 * 8
[1] 32
100 / 4
[1] 25
# Order of operations (like standard math)
5 + 2 * 3   # Multiplication first
[1] 11
(5 + 2) * 3 # Parentheses first
[1] 21
# Built-in mathematical functions
sqrt(16)    # Square root
[1] 4
log(10)     # Natural logarithm
[1] 2.302585
log10(100)  # Base-10 logarithm
[1] 2
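
A few more operators follow the same pattern as the examples above. This is a small sketch using only base R; the numbers are arbitrary and chosen so the results are easy to check by hand.

```r
# Exponentiation: 2 raised to the power 3
2^3
# [1] 8

# Integer division (how many whole times 5 fits into 17)
17 %/% 5
# [1] 3

# Remainder after integer division (modulo)
17 %% 5
# [1] 2

# Round a result to two decimal places
round(log(10), 2)
# [1] 2.3
```

Rounding is handy when reporting results, but it is usually safer to keep full precision in intermediate calculations and round only at the end.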

4.1 Variables: Storing Information

Variables store information in R; you can think of them as named containers for data. You create a variable with the assignment operator <-. The = operator also works for assignment, but <- is the more common convention in R code.

The examples below assign values to variables and then use those variables in calculations:

# Assign the value 5 to variable x
x <- 5

# Assign the result of 10 + 3 to variable y
y <- 10 + 3

# Print the value of x
x
[1] 5
# Use variables in calculations
z <- x + y
# Print the value of z
z
[1] 18
# Assign the name of a variety to a variable
best_variety <- "ICARDA_Gold" # Text needs quotes ""

# Print name
print(best_variety)
[1] "ICARDA_Gold"
# We can also concatenate text like this
print(paste("The best variety is", best_variety))
[1] "The best variety is ICARDA_Gold"
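
To tie variables and arithmetic together, here is a minimal sketch of a unit conversion that comes up often in field trials: plot yield in kilograms converted to tonnes per hectare. The variable names (plot_yield_kg, plot_area_m2, yield_t_ha) and the plot values are hypothetical, chosen only for illustration.

```r
# Variety name, repeated from the example above so this block runs on its own
best_variety <- "ICARDA_Gold"

# Hypothetical plot measurements (illustrative values, not real trial data)
plot_yield_kg <- 4.5   # grain harvested from one plot, in kg
plot_area_m2  <- 6     # plot area, in square metres

# Convert kg per square metre to tonnes per hectare:
# 1 ha = 10000 m^2 and 1 t = 1000 kg, so multiply by 10000 / 1000 = 10
yield_t_ha <- plot_yield_kg / plot_area_m2 * 10

# Print the result
yield_t_ha
# [1] 7.5

# Combine text and numbers in one message
print(paste("Yield of", best_variety, "is", yield_t_ha, "t/ha"))
# [1] "Yield of ICARDA_Gold is 7.5 t/ha"
```

Storing each measurement in its own variable means the calculation can be rerun for a different plot by changing only the first lines.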

4.2 Useful Shortcuts

Function              Windows           Mac
Run code              Ctrl + Enter      Cmd + Enter
Insert chunk          Ctrl + Alt + I    Cmd + Option + I
Run current chunk     Ctrl + Alt + C    Cmd + Option + C
Run all chunks above  Ctrl + Alt + P    Cmd + Option + P
Run all chunks        Ctrl + Alt + R    Cmd + Option + R