MEDS5420 - Lectures 21, 22, and 23 - RNA-seq mapping, visualization, and transcript assembly
Michael Guertin
April 11, 13, and 18, 2022
Contents
0.1 Cleaning up data
1 Building genome with HISAT2
1.1 GTFs:
1.2 Using HISAT2 and related software on server:
2 Building your annotated genome for mapping with HISAT2.
3 Alignment with HISAT2
3.1 Note on --rna-strandness
3.2 Intermediate file conversions
4 StringTie for assembling transcripts
4.1 StringTie for merging transcripts
4.2 gffcompare to compare to annotations:
4.3 Calculate transcript abundances with stringtie for use with ballgown or other DE analysis software.
5 In class exercise 1:
6 Counting reads in transcripts with HTseq
6.1 Important update on HTseq-count
6.2 Strandedness with HTSeq
7 In class exercise 2:
8 Use Genome Coverage from bedtools to create a bedGraph:
8.1 Note on visualizing stranded data with genomeCoverageBed
9 In class exercise 3:
10 Code chunks and commentary not covered in class
11 Visualizing stranded PE data:
11.1 Mapping paired end and stranded data with HISAT2
12 Displaying splice junctions
12.1 Move header to a new file
12.2 Extract splice junction reads
12.3 Convert back to bam
12.4 Convert to bed12
13 Molecular biology of library preparations
14 Answers to in class exercise 1:
15 Answers to class exercise 2:
16 Answers to in class exercise 3: