2  Welcome and Course Overview

2.1 Hello Baku Breeders!

Welcome to this introductory course on data analysis for plant genetics, a collaboration with ICARDA. We are excited to guide you through the essential tools and concepts needed to make sense of your valuable breeding data using R.

2.2 Course Objectives

  • Learn the fundamentals of the R programming language.
  • Understand core genomic concepts.
  • Perform basic genomic data loading, cleaning, and quality control.
  • Run population structure, genetic diversity and relatedness analyses.
  • Understand and run genome-wide association studies (GWAS).
  • Get an introduction to genomic selection (GS).

2.3 Course Structure

This course is divided into six modules, one for each of the objectives above. Each module includes explanations and practical R exercises.

No prior programming experience is required!

All course files and scripts are available on our GitHub repository.

2.4 Additional Resources

Learning R takes practice. Fortunately, many online resources can help you learn and master R so that you can use it in your own research.

  • swirl R Package: swirl is an interactive R package that teaches R through a range of exercises and lessons. Get started here
# Install package
install.packages("swirl")

# Load library
library(swirl)

# Start learning
swirl()
  • Base R Cheat Sheet: Basic commands and functions are available here

  • General RStudio Cheat Sheets: General commands and functions for different R packages are available here

  • R Markdown (.Rmd) Cheat Sheets: Information on how .Rmd files work and why they are useful is available here

  • ggplot2 Plot Templates and Information: You can find more information on all of ggplot2’s plotting options here. You can also find many ready-to-run plot templates here