Differential RNA splicing analysis with single-cell/nucleus RNA-seq data

Before you start

Perform mapping of sc(sn)RNA-seq reads to the reference genome using STARsolo.
- You can download a test input file mapped by STARsolo on the mouse genome from here.
Download a gene annotation file of interest in GTF format.

Installation

# Install Shiba with conda
conda create -n shiba -c conda-forge -c bioconda shiba
# Activate the conda environment
conda activate shiba
# Install styleframe for generating outputs in Excel format (optional)
pip install styleframe==4.1

scShiba

1. Prepare inputs

experiment.tsv: A tab-separated text file listing each barcode file and the corresponding STARsolo raw output directory.

barcode SJ
/path/to/barcodes_run1.tsv /path/to/run1/Solo.out/SJ/raw
/path/to/barcodes_run2.tsv /path/to/run2/Solo.out/SJ/raw
/path/to/barcodes_run3.tsv /path/to/run3/Solo.out/SJ/raw
/path/to/barcodes_run4.tsv /path/to/run4/Solo.out/SJ/raw

barcodes.tsv: A tab-separated text file of cell barcodes and their group names, for example:

barcode group
TTTGTTGTCCACACCT Cluster-1
TCAAGACCACTACAGT Cluster-1
TATTTCGGTACAGTAA Cluster-1
ATCCTATGTTAATCGC Cluster-1
ATCGATGAGTTTCTTC Cluster-2
ATCGATGGTCTTGCTC Cluster-2
TATGTTCGTCAGGCAA Cluster-2
ATCGCCTAGACTCGAG Cluster-2
...
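Because the columns must be separated by tabs, one option is to generate the barcode file programmatically rather than by hand. The following is a minimal sketch using Python's standard `csv` module. The function name `write_barcodes_tsv`, the output filename, and the `assignments` dictionary are hypothetical; in practice the barcode-to-group mapping would come from your own clustering results.

```python
import csv


def write_barcodes_tsv(path, barcode_to_group):
    """Write a two-column, tab-separated barcode/group table with a header row."""
    with open(path, "w", newline="") as handle:
        writer = csv.writer(handle, delimiter="\t", lineterminator="\n")
        writer.writerow(["barcode", "group"])
        for barcode, group in barcode_to_group.items():
            writer.writerow([barcode, group])


# Hypothetical cluster assignments (barcode -> group name).
assignments = {
    "TTTGTTGTCCACACCT": "Cluster-1",
    "TCAAGACCACTACAGT": "Cluster-1",
    "ATCGATGAGTTTCTTC": "Cluster-2",
    "ATCGATGGTCTTGCTC": "Cluster-2",
}
write_barcodes_tsv("barcodes_run1.tsv", assignments)
```

Writing through `csv.writer` with `delimiter="\t"` guarantees a real tab character between the columns, regardless of how your editor handles whitespace.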

Make sure to use tabs

If you copy and paste the above examples, your files may contain spaces instead of tabs, which will cause an error when you run scShiba. Please make sure that the columns are separated by a tab character.
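A quick way to catch this before running scShiba is to count the tab-separated fields on every line. The sketch below is an illustrative check, not part of Shiba; the helper name `find_non_tab_lines` and the assumption of exactly two columns are introduced here for the example.

```python
def find_non_tab_lines(text, expected_columns=2):
    """Return 1-based line numbers whose tab-separated field count is wrong."""
    bad = []
    for number, line in enumerate(text.splitlines(), start=1):
        if not line.strip():
            continue  # ignore blank lines
        if len(line.split("\t")) != expected_columns:
            bad.append(number)
    return bad


# Columns separated by real tabs.
good = "barcode\tSJ\n/path/to/barcodes_run1.tsv\t/path/to/run1/Solo.out/SJ/raw\n"
# Columns separated by spaces, as can happen after copy and paste.
pasted = "barcode SJ\n/path/to/barcodes_run1.tsv /path/to/run1/Solo.out/SJ/raw\n"

print(find_non_tab_lines(good))    # []
print(find_non_tab_lines(pasted))  # [1, 2]
```

To check a file on disk, pass its contents, for example `find_non_tab_lines(open("experiment.tsv").read())`.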

config.yaml: A YAML configuration file.

workdir:
  /path/to/workdir # (1)!
gtf:
  /path/to/Mus_musculus.GRCm38.102.gtf # (2)!
experiment_table:
  /path/to/experiment.tsv # (3)!

# PSI calculation
only_psi:
  False # (4)!
fdr:
  0.05 # (5)!
delta_psi:
  0.1 # (6)!
reference_group:
  Cluster-1 # (7)!
alternative_group:
  Cluster-2 # (8)!
minimum_reads:
  10 # (9)!
excel:
  False # (10)!

(1) The working directory where the output files will be saved. Please make sure that you have write permission to this directory.
(2) The path to the gene annotation file in GTF format.
(3) The path to the experiment.tsv file.
(4) Set to True if you want to skip the differential analysis and only calculate PSI values for each sample.
(5) Significance threshold (false discovery rate) for differential splicing analysis.
(6) Minimum difference in PSI values between groups to be considered significant.
(7) Reference group for differential splicing analysis.
(8) Alternative group for differential splicing analysis.
(9) Minimum number of reads required to calculate PSI values.
(10) Set to True if you want to generate a file of splicing analysis results in Excel format.
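To make the roles of minimum_reads, fdr, and delta_psi more concrete, here is a minimal sketch based on the textbook definition of PSI (inclusion reads divided by inclusion plus exclusion reads). It is an illustration only and not scShiba's implementation: the function names, the read counts, and the q-value are hypothetical, and the exact way scShiba counts reads, applies the read filter, and compares values against the thresholds may differ.

```python
def psi(inclusion_reads, exclusion_reads, minimum_reads=10):
    """Return percent spliced in (0-1), or None when coverage is too low."""
    total = inclusion_reads + exclusion_reads
    if total < minimum_reads:
        return None  # assumption: the filter is applied to the total read count
    return inclusion_reads / total


def is_differential(psi_ref, psi_alt, q_value, fdr=0.05, delta_psi=0.1):
    """Apply both cut-offs: q-value below fdr and |dPSI| at least delta_psi."""
    if psi_ref is None or psi_alt is None:
        return False
    return q_value < fdr and abs(psi_alt - psi_ref) >= delta_psi


# Hypothetical pooled read counts for one event in two groups.
psi_ref = psi(30, 10)  # 30 / 40 = 0.75 in the reference group
psi_alt = psi(12, 28)  # 12 / 40 = 0.3 in the alternative group

print(is_differential(psi_ref, psi_alt, q_value=0.01))  # True
print(psi(3, 2))                                        # None (only 5 reads)
```

In this sketch an event is reported only when it passes both thresholds, so raising delta_psi or lowering fdr makes the selection stricter.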

2. Run

scshiba.py -p 4 config.yaml

This command uses 4 threads for parallelization. You can change the number of threads with the -p option.

Did you encounter any problems?

You can run scShiba with the --verbose option to print the debug log, which can help you locate the problem.

scshiba.py --verbose -p 4 config.yaml

If you continue to encounter issues, please don't hesitate to open an issue on GitHub. The community and developers are here to help!

SnakeScShiba

A Snakemake-based workflow for scShiba. This is useful for running scShiba on a cluster. Snakemake automatically parallelizes the jobs and manages the dependencies between them.

1. Prepare inputs

experiment.tsv: A tab-separated text file listing each barcode file and the corresponding STARsolo raw output directory. This is the same as the input for scShiba.

config.yaml: A YAML configuration file. This is the same as the input for scShiba, with the addition of the container field.

workdir:
  /path/to/workdir # (1)!
container: # This field is required for SnakeScShiba
  docker://naotokubota/shiba:v0.6.2 # (2)!
gtf:
  /path/to/Mus_musculus.GRCm38.102.gtf # (3)!
experiment_table:
  /path/to/experiment.tsv # (4)!

# PSI calculation
only_psi:
  False # (5)!
fdr:
  0.05 # (6)!
delta_psi:
  0.1 # (7)!
reference_group:
  Cluster-1 # (8)!
alternative_group:
  Cluster-2 # (9)!
minimum_reads:
  10 # (10)!
excel:
  False # (11)!

(1) The working directory where the output files will be saved. Please make sure that you have write permission to this directory.
(2) The Docker image of Shiba.
(3) The path to the gene annotation file in GTF format.
(4) The path to the experiment.tsv file.
(5) Set to True if you want to skip the differential analysis and only calculate PSI values for each sample.
(6) Significance threshold (false discovery rate) for differential splicing analysis.
(7) Minimum difference in PSI values between groups to be considered significant.
(8) Reference group for differential splicing analysis.
(9) Alternative group for differential splicing analysis.
(10) Minimum number of reads required to calculate PSI values.
(11) Set to True if you want to generate a file of splicing analysis results in Excel format.

2. Run

Please make sure that you have installed Snakemake and Singularity and cloned the Shiba repository on your system.

snakemake -s /path/to/Shiba/snakescshiba.smk \
--configfile config.yaml \
--cores 16 \
--use-singularity \
--rerun-incomplete
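Since the command above depends on both Snakemake and Singularity, a small pre-flight check can confirm that the executables are on your PATH before you submit a long job. This is a minimal sketch using only the Python standard library; the helper name `missing_tools` is hypothetical, and the executable names `snakemake` and `singularity` are assumed to match your installation.

```python
import shutil


def missing_tools(tools):
    """Return the names in `tools` that cannot be found on PATH."""
    return [name for name in tools if shutil.which(name) is None]


missing = missing_tools(["snakemake", "singularity"])
if missing:
    print("Not found on PATH: " + ", ".join(missing))
else:
    print("snakemake and singularity are both available.")
```

The check only verifies that the executables can be located; it does not confirm their versions or that the Shiba repository has been cloned.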