scRNAseq_training

scRNA-seq Data Analysis Tutorial

In this tutorial, you will use the kallisto | bustools workflow to pseudoalign scRNA-seq reads to a reference transcriptome and generate count matrices. You will then analyze the count data in R.

Overview


Set up

Download workshop data

The data used in this workshop are all publicly available, and the download links are included throughout the document. To save time, you can download the data and code from Google Drive, here. Log into ondemand.hpcc.msu.edu to upload the scRNAseq_training directory into your HPCC space.

Install Anaconda

If you haven’t already, follow these instructions to install Anaconda.

Install kb-python if you haven’t

The commands below are taken from these instructions.

module purge
module load Conda/3
pip install kb-python
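
After the install finishes, it can help to confirm that the kb command is actually reachable from your shell, since pip sometimes places executables in a directory that is not on PATH. The following is a minimal sketch, not part of the official instructions; check_tool is a hypothetical helper name introduced here for illustration.

```shell
# check_tool is a hypothetical helper (not part of kb-python):
# it reports whether a command-line tool is reachable on the current PATH.
check_tool() {
  if command -v "$1" >/dev/null 2>&1; then
    echo "$1: found at $(command -v "$1")"
  else
    echo "$1: not found on PATH"
    return 1
  fi
}

# kb is the executable that kb-python provides.
check_tool kb || echo "Check pip's output for the install location and add it to PATH if needed."
```

If kb is reported as found, running kb --help should print the available subcommands.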

Install R packages

Open the R environment on HPCC

module purge
module load R-bundle-CRAN/2023.12-foss-2023a
R --vanilla

In the R environment:

# Install CRAN packages
install.packages(c("tidyverse", "Matrix", "patchwork",
                   "pheatmap", "RColorBrewer", "readxl"))


# Install Bioconductor packages (BiocManager is installed first if missing)
if (!require("BiocManager", quietly = TRUE))
    install.packages("BiocManager")

BiocManager::install(c("SingleCellExperiment", "scater",
                       "scran", "DropletUtils", "bluster",
                       "scDblFinder", "AUCell", "PCAtools"))

Part 1: Get a count matrix from FASTQ files


Part 2: Analyze the count data in R


Credits

Part 2 of this tutorial is based on the e-book Orchestrating Single-Cell Analysis with Bioconductor, a comprehensive guide to analyzing single-cell RNA sequencing (scRNA-seq) data with the Bioconductor ecosystem in R.