Why do we sequence?
What are the next-gen sequencing technologies?
-Illumina
-Oxford Nanopore
-PacBio
How do you get high quality in Illumina?
The reads are short, but the volume of reads you can get through is very large, so each position is covered many times
What is the length of the reads in PacBio?
Shorter than Nanopore but longer than Illumina
How do you deal with high error rates in PacBio?
The raw reads have a very high error rate. To solve that, you sequence the same molecule multiple times; because the errors are random, you can align the passes and take the consensus, which gives high accuracy
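A minimal sketch of why repeated sequencing works, assuming the repeated reads are already aligned and have the same length (real pipelines also handle insertions and deletions, which this ignores). The function name and the example reads are made up for illustration.

```python
from collections import Counter


def consensus(aligned_passes):
    """Majority-vote consensus over equal-length, already-aligned reads."""
    if not aligned_passes:
        raise ValueError("need at least one pass")
    length = len(aligned_passes[0])
    if any(len(p) != length for p in aligned_passes):
        raise ValueError("passes must be aligned to the same length")
    # For each column, keep the most frequent base across all passes.
    return "".join(
        Counter(column).most_common(1)[0][0]
        for column in zip(*aligned_passes)
    )


# Hypothetical passes over the same molecule, each with at most one random error.
passes = [
    "ACGTTGCA",
    "ACGATGCA",  # error at position 3
    "ACGTTGCA",
    "TCGTTGCA",  # error at position 0
    "ACGTTGGA",  # error at position 6
]
print(consensus(passes))
```

Because the errors fall at different positions in different passes, each one is outvoted by the correct base in the other passes.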
What are quality scores particularly important for?
If you are trying to find SNPs, you need the quality score to tell whether a mismatch is a sequencing error or an actual variant
What do we need quality scores for?
Where does the quality deteriorate?
Quality deteriorates towards the ends of reads
What does high AT or GC content do?
High AT or GC content reduces complexity and can lead to higher error rates
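To spot reads or regions with extreme base composition, GC content can be computed as the fraction of G and C among the called bases. A small sketch; the function name is an assumption, and ambiguous bases such as N are simply skipped.

```python
def gc_fraction(seq):
    """Fraction of G/C among the A, C, G, T bases of a sequence."""
    called = [base for base in seq.upper() if base in "ACGT"]
    if not called:
        return 0.0  # no unambiguous bases to measure
    return sum(base in "GC" for base in called) / len(called)


# Hypothetical example sequences.
for seq in ("ATATATAT", "ACGTACGT", "GGCCGGCC"):
    print(seq, gc_fraction(seq))
```

Values close to 0 (AT-rich) or close to 1 (GC-rich) mark the low-complexity sequences where error rates tend to go up.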
What is the formula for QV?
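Assuming QV here means the standard Phred quality value, it is defined from the probability P that the base call is wrong as Q = -10 * log10(P). A short sketch of the conversion in both directions (function names are illustrative):

```python
import math


def error_prob_to_qv(p):
    """Phred quality value for a base-call error probability p (0 < p <= 1)."""
    return -10 * math.log10(p)


def qv_to_error_prob(q):
    """Error probability implied by a Phred quality value q."""
    return 10 ** (-q / 10)


for q in (10, 20, 30, 40):
    print(q, qv_to_error_prob(q))
```

So Q20 corresponds to 1 error in 100 base calls and Q30 to 1 error in 1000.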
What is base calling?
What is the chastity filter?
What is FASTQ format?
What do they use for quality scores in FASTQ?
They use ASCII characters for quality scores, so each base character has exactly one corresponding quality character
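A sketch of decoding the quality line of one FASTQ record. It assumes the Phred+33 encoding (quality = ASCII code minus 33), which is the common modern convention; older Illumina files used an offset of 64. The record itself is invented for illustration.

```python
def decode_qualities(quality_line, offset=33):
    """Turn a FASTQ quality string into a list of Phred scores."""
    return [ord(ch) - offset for ch in quality_line]


# Hypothetical four-line FASTQ record: header, bases, separator, qualities.
record = [
    "@read1",
    "GATTACA",
    "+",
    "IIIH5#!",
]
seq, qual = record[1], record[3]
scores = decode_qualities(qual)

# One quality character per base, so the two lines pair up one to one.
for base, ch, q in zip(seq, qual, scores):
    print(base, ch, q)
```

The one-to-one pairing is why the sequence line and the quality line must always have the same length.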
Describe the standard output
What is depth of coverage useful for?
Sequencing errors are eliminated by the depth of coverage of overlapping sequence fragments
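Average depth of coverage is usually estimated as (number of reads x read length) / genome size. The sketch below computes that, plus the per-position depth for a toy set of overlapping reads. All numbers and function names are illustrative assumptions, not values from any real run.

```python
def mean_coverage(num_reads, read_length, genome_size):
    """Expected average depth: total sequenced bases divided by genome size."""
    return num_reads * read_length / genome_size


def per_base_depth(read_starts, read_length, genome_size):
    """Count how many reads overlap each position (0-based starts)."""
    depth = [0] * genome_size
    for start in read_starts:
        for pos in range(start, min(start + read_length, genome_size)):
            depth[pos] += 1
    return depth


# Hypothetical run: 1 million reads of 150 bp over a 5 Mb genome.
print(mean_coverage(1_000_000, 150, 5_000_000))

# Toy example: three 4 bp reads tiling an 8 bp region.
print(per_base_depth([0, 2, 4], 4, 8))
```

Positions with higher depth have more independent reads voting on each base, so a random error in one read is easier to outvote.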
What was the depth of coverage in the Human Genome Project?
Describe paired end sequencing
What do we do with repeats in paired end sequencing?
Describe mate-pair sequencing
What do you need for scaffolding?
-contig
-scaffold
What are contigs?
What are scaffolds?
What is de novo sequencing?