RNA-seq-2019nCov

RNA-seq pipeline

Introduction

The raw metatranscriptomic reads were processed with Fastp to filter out low-quality data and adapter contamination, generating clean reads for further analyses. Human-derived reads were identified in three steps: 1) identification of human ribosomal RNA (rRNA) by aligning clean reads to human rRNA sequences using BWA-MEM; 2) identification of human transcripts by mapping reads to the hg19 reference genome using the RNA-seq aligner HISAT2; and 3) a second round of human read identification by classifying the remaining reads against hg38 using Kraken 2. All human RNA reads were then removed to generate qualified non-human RNA-seq data.
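The stages described above can be sketched as a chain of command lines. This is only an illustration, not the rules shipped in this repository: the `build_commands` helper, all output file names, the thread count, and the use of samtools to extract unmapped read pairs are assumptions. The flags shown are standard options of fastp, BWA, samtools, HISAT2, and Kraken 2.

```python
import shlex


def build_commands(sample, fq1, fq2, kraken_db, threads=8):
    """Return illustrative shell commands for the read-cleaning stages.

    Output file names are hypothetical; adapt them to your own layout.
    """
    q = shlex.quote
    clean = [f"{sample}.clean.{i}.fq.gz" for i in (1, 2)]
    no_rrna = [f"{sample}.norRNA.{i}.fq.gz" for i in (1, 2)]
    return [
        # Quality and adapter filtering with fastp.
        f"fastp -w {threads} -i {q(fq1)} -I {q(fq2)} "
        f"-o {clean[0]} -O {clean[1]}",
        # Step 1: align to human rRNA with BWA-MEM, then keep pairs in which
        # both mates are unmapped (samtools is an assumption here).
        f"bwa mem -t {threads} Human_rRNA_NCBI.fa {clean[0]} {clean[1]} "
        f"| samtools fastq -f 12 -1 {no_rrna[0]} -2 {no_rrna[1]} -",
        # Step 2: map to hg19 with HISAT2; pairs that fail to align
        # concordantly are written to the --un-conc-gz files.
        f"hisat2 -p {threads} -x hg19 -1 {no_rrna[0]} -2 {no_rrna[1]} "
        f"--un-conc-gz {sample}.nohg19.%.fq.gz -S {sample}.hg19.sam",
        # Step 3: classify the remaining reads with Kraken 2; reads assigned
        # to human would then be dropped (not shown).
        f"kraken2 --db {q(kraken_db)} --threads {threads} --paired "
        f"--gzip-compressed --output {sample}.kraken "
        f"--report {sample}.kreport "
        f"{sample}.nohg19.1.fq.gz {sample}.nohg19.2.fq.gz",
    ]


cmds = build_commands("demo1", "demo1.1.fq.gz", "demo1.2.fq.gz", "./YourDBpath/")
for cmd in cmds:
    print(cmd)
```

The commands are only printed, so the sketch can be inspected without any of the tools installed.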

The remaining non-human, non-rRNA reads were processed with Kraken 2X (v2.0.8-beta). Non-viral microbial taxon assignment of these reads was performed using the clade-specific marker gene-based MetaPhlAn2 with default parameters, excluding viruses (--ignore_viruses).
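Kraken 2 can write a per-sample report (`--report`) whose standard form has six tab-separated columns: percentage of reads in the clade, reads in the clade, reads assigned directly to the taxon, rank code, NCBI taxonomy ID, and the indented scientific name. As a minimal sketch of how such a report could be summarised, the hypothetical `top_taxa` helper below lists the most abundant taxa at a given rank. The example rows are made up for illustration and are not results from this pipeline.

```python
def top_taxa(report_text, rank="S", n=5):
    """Return (name, clade_reads) for the n most abundant taxa at `rank`.

    Assumes the standard six-column Kraken 2 report format.
    """
    rows = []
    for line in report_text.splitlines():
        fields = line.split("\t")
        if len(fields) != 6:
            continue  # skip blank or non-standard lines
        _pct, clade_reads, _direct, rank_code, _taxid, name = fields
        if rank_code == rank:
            rows.append((name.strip(), int(clade_reads)))
    return sorted(rows, key=lambda row: row[1], reverse=True)[:n]


# Made-up rows for illustration only.
example = (
    " 60.00\t600\t600\tU\t0\tunclassified\n"
    " 40.00\t400\t0\tR\t1\troot\n"
    " 10.00\t100\t100\tS\t222\t        Species B\n"
    " 30.00\t300\t300\tS\t111\t        Species A\n"
)
print(top_taxa(example))
```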

Requirements:

python: v3+

Software for this pipeline:

Installation

git clone https://github.com/rusher321/RNA-seq-2019nCov.git

Notes: The dependent software listed above must be installed separately, following each tool's own instructions. After installation, edit the config.yaml file and change the software paths to your own.

Usage

1. Build the indexes for the databases

Build the human rRNA index for BWA

      bwa index Human_rRNA_NCBI.fa

Build the human genome index for HISAT2

      hisat2-build -p 6 hg19.fa hg19

Build the Kraken 2 database index

      # add the human genome to the database library
      kraken2-build --add-to-library hg38.fa --db ./YourDBpath/
      # add the HCoV-19 genome to the database library
      kraken2-build --add-to-library HCoV-19.fa --db ./YourDBpath/
      # build the database after the libraries have been added
      kraken2-build --build --threads 8 --db ./YourDBpath/

Here we used MiniKraken2_v2_8GB (5.5GB), an 8GB Kraken 2 database built from the RefSeq bacteria, archaea, and viral libraries and the GRCh38 human genome.

Build the Kraken 2X (protein) database index

     kraken2-build --build --protein --db $DBNAME

Edit the config.yaml file and change the database paths to your own.

2. Run the pipeline

Input requirements
Generate a sample information file like the one below:

id	fq1	fq2
demo1	demo1.1.fq.gz	demo1.2.fq.gz
demo2	demo2.1.fq.gz	demo2.2.fq.gz

The header must be: id fq1 fq2.
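Before running the pipeline, it can help to check the sample sheet against these requirements. The following is a minimal sketch, assuming the file is tab-separated as in the example above; the `read_samples` helper is hypothetical and is not part of this repository.

```python
import csv
import io

EXPECTED_HEADER = ["id", "fq1", "fq2"]


def read_samples(handle):
    """Parse a tab-separated sample sheet and return {id: (fq1, fq2)}."""
    reader = csv.reader(handle, delimiter="\t")
    header = next(reader, None)
    if header != EXPECTED_HEADER:
        raise ValueError(f"header must be {EXPECTED_HEADER}, got {header}")
    samples = {}
    for line_no, row in enumerate(reader, start=2):
        if not row:
            continue  # ignore blank lines
        if len(row) != 3:
            raise ValueError(f"line {line_no}: expected 3 columns, got {len(row)}")
        sample_id, fq1, fq2 = row
        if sample_id in samples:
            raise ValueError(f"line {line_no}: duplicate id {sample_id!r}")
        samples[sample_id] = (fq1, fq2)
    return samples


demo = (
    "id\tfq1\tfq2\n"
    "demo1\tdemo1.1.fq.gz\tdemo1.2.fq.gz\n"
    "demo2\tdemo2.1.fq.gz\tdemo2.2.fq.gz\n"
)
samples = read_samples(io.StringIO(demo))
print(samples)
```

To check a real file, pass an open handle instead, for example `read_samples(open("samples.tsv", newline=""))`.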

Init
cd to your working directory and run:

python /path/to/git/RNAseq init -d ./ -s samples.tsv

After that, the initial files will be generated in your working directory:

ls ./
  
assay
results
scripts
sources
study
config.yaml
cluster.yaml

Generate the command line and run it on a local computer:

python /path/to/your/git/RNAseq commandline -d ./ -u all

snakemake --snakefile /path/to/your/git/Snakefile --configfile config.yaml --until all

Or submit to a cluster using qsub:

snakemake --snakefile /path/to/git/Snakefile \
    --configfile ./config.yaml \
    --cluster-config ./cluster.yaml \
    --jobs 80 \
    --cluster "qsub -S /bin/bash -cwd \
               -q {cluster.queue} \
               -P {cluster.project} \
               -l vf={cluster.mem},p={cluster.cores} \
               -binding linear:{cluster.cores} \
               -o {cluster.output} \
               -e {cluster.error}" \
    --latency-wait 360 \
    -k \
    --until all
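In the `--cluster` string above, placeholders such as `{cluster.queue}` are filled in by Snakemake from cluster.yaml for each job. The sketch below only imitates that substitution with Python's `str.format` to show what a submitted command could look like. The values (`my.q`, `my_project`, `4G`, and the log paths) are hypothetical, and Snakemake's own formatting code is not used.

```python
from types import SimpleNamespace

# Same template as the --cluster argument, joined onto one line.
QSUB_TEMPLATE = (
    "qsub -S /bin/bash -cwd "
    "-q {cluster.queue} "
    "-P {cluster.project} "
    "-l vf={cluster.mem},p={cluster.cores} "
    "-binding linear:{cluster.cores} "
    "-o {cluster.output} "
    "-e {cluster.error}"
)

# Hypothetical per-job settings, standing in for one cluster.yaml entry.
cluster = SimpleNamespace(
    queue="my.q",
    project="my_project",
    mem="4G",
    cores=2,
    output="logs/job.o",
    error="logs/job.e",
)

command = QSUB_TEMPLATE.format(cluster=cluster)
print(command)
```

Each placeholder name (`queue`, `project`, `mem`, `cores`, `output`, `error`) must therefore have a matching key in cluster.yaml.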

Support & Bug Reports

Please log an issue on GitHub Issues.

Contributors

Huahui Ren -@rusher
Zhun Shi -@zhunshi

License

Released under the MIT license.
