What is DNA?
DNA is deoxyribonucleic acid
Macromolecule consisting of a linear strand of nucleotides
How does DNA differ from RNA?
deoxyribose does not have the 2’-hydroxyl group of ribose
How is DNA double stranded?
Single linear strands bind to complementary strands to form double-stranded DNA
Why does DNA have a negative charge?
The phosphate group is negatively charged. This gives nucleic acids their overall negative charge, as the sugars and bases are neutral
How are the carbons on the deoxyribose sugar identified?
The 5’ and 3’ carbons are indicated; numbering starts at the carbon attached to the base (the 1’ carbon)
Which direction is the DNA strand read?
Sequence is 5’->3’ by convention
Why is the DNA strand 5’-3’?
5’ (5 prime) and 3’ (3 prime) are numbered based on the carbon atoms of the sugar
The base is attached to the first (1’) carbon. The phosphate links the 3’ carbon of one sugar to the 5’ carbon of the adjacent sugar
How does DNA maintain its double stranded structure?
Hydrogen bonding occurs between base pairs
A-T (2 H bonds), G-C (3 H bonds)
What direction do the 2 DNA strands of the duplex run?
The 2 strands run antiparallel: one runs 5’-3’ and the other runs 3’-5’
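As a rough illustration of complementary, antiparallel pairing, the sketch below derives the partner strand of a sequence and counts the hydrogen bonds holding the duplex together, using the A-T = 2 and G-C = 3 rule. The function names and the example sequence are made up for illustration.

```python
# Watson-Crick pairing: A pairs with T, G pairs with C.
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}
# Hydrogen bonds contributed by each base pair (A-T = 2, G-C = 3).
H_BONDS = {"A": 2, "T": 2, "G": 3, "C": 3}


def reverse_complement(seq):
    """Return the partner strand, written 5'->3' by convention."""
    # The partner runs antiparallel, so complement each base and reverse.
    return "".join(COMPLEMENT[base] for base in reversed(seq.upper()))


def hydrogen_bonds(seq):
    """Count the hydrogen bonds between this strand and its complement."""
    return sum(H_BONDS[base] for base in seq.upper())


strand = "ATGCCG"  # hypothetical sequence, written 5'->3'
print(reverse_complement(strand))  # CGGCAT
print(hydrogen_bonds(strand))      # 16
```

Reversing as well as complementing is what keeps both strands written 5’->3’.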
Describe the structure of DNA in 3D
- Two antiparallel strands of DNA
- Bases “stacked”
- Two grooves
- Major
- Minor
How much length of DNA is found in our body?
There is ~2m of DNA in a nucleated cell
There are 37.2 trillion cells in your body
That is 7.44 x 10^13 metres of DNA
This equals ~250 journeys to the sun and back!
The average cell is 50µm in diameter
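The arithmetic behind these figures can be checked with a short sketch. It treats every cell as nucleated, as the notes do, and assumes a mean Earth-Sun distance of about 1.496 x 10^11 m, a value not given in the notes.

```python
DNA_PER_CELL_M = 2.0      # metres of DNA per nucleated cell (from the notes)
CELLS_IN_BODY = 37.2e12   # 37.2 trillion cells (from the notes)
EARTH_SUN_M = 1.496e11    # assumed mean Earth-Sun distance in metres

# Total DNA length if every cell carried 2 m of DNA.
total_dna_m = DNA_PER_CELL_M * CELLS_IN_BODY

# One journey "to the sun and back" is twice the Earth-Sun distance.
round_trips = total_dna_m / (2 * EARTH_SUN_M)

print(f"{total_dna_m:.3g} m")  # 7.44e+13 m
print(round(round_trips))      # 249, i.e. roughly 250
```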
How do we fit 2m of DNA in each 50µm cell?
DNA strand wound around histones to form nucleosomes.
Nucleosomes are wound further into chromatin and then into extended chromosomes. These are then looped and condensed into the full chromosome.
What are histones?
Basic proteins that bind DNA
Describe the histone structure DNA wraps around?
Eight histones form the nucleosome core
H2A, H2B, H3 and H4 (2 copies of each)
Histone H1 binds the linker DNA (the piece of DNA between nucleosomes)
How is the structure of chromosomes determined?
Chromosomes come in different shapes depending on where the centromere is found.
What are the 3 types of chromosome structure?
Metacentric
Submetacentric
Acrocentric
What is a metacentric chromosome?
X-shaped chromosomes, with the centromere in the middle; two arms of the chromosomes are almost equal
What is a submetacentric chromosome?
Centromere is off centre but near the middle; the chromosomal arms (i.e. p and q arms) are slightly unequal in length and may form an L-shape
What is an acrocentric chromosome?
Centromere is located very near one end of the chromosome, so the short (p) arms are very small
What is karyotyping?
process of pairing and ordering all the chromosomes of an organism, thus providing a genome-wide snapshot of an individual’s chromosomes
What is the exome?
The coding Genome
sum of all the gene sequences is called the exome
Some definitions just use the coding sequences (~37 Mbp – 1.2% of genome)
Some definitions use the whole gene sequences (~60Mbp – 2% of genome)
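The two percentages follow from dividing each exome size by a genome size of about 3Gbp. A minimal sketch of that calculation (the function name is just for illustration):

```python
GENOME_BP = 3e9  # human genome size of ~3 Gbp


def percent_of_genome(region_bp):
    """Return the size of a region as a percentage of the whole genome."""
    return 100 * region_bp / GENOME_BP


print(round(percent_of_genome(37e6), 1))  # coding sequences only: 1.2
print(round(percent_of_genome(60e6), 1))  # whole gene sequences: 2.0
```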
What does the primary DNA sequence include?
- Encodes all the gene products necessary for an organism
- Includes a large number of regulatory signals
- Much of the DNA sequence has no known function yet
What is a gene?
A gene is all the DNA transcribed into RNA as well as all the cis-linked (local) control regions
What is the role of cis-linked control regions?
Cis-linked (local) control regions are required to ensure quantitatively appropriate tissue-specific expression of the final protein
What is meant by cis-linked?
Cis-linked just means that these regions are physically close to the exons on the same DNA strand. Contrast this with trans-regulatory regions, which can be on different chromosomes
Explain how genome differs from cell to cell
All nucleated somatic cells contain the same genome
Describe the human genome
- Human genome size is 3 x 10^9 base pairs (3Gbp)
- Contains 19,000-20,000 genes.
- > 2% of DNA is gene
- Genome size isn’t strongly related to the complexity of organisms, e.g. the marbled lungfish genome is 130Gbp and the Paris japonica genome is 149Gbp
How big are genes in size?
Genes are often very different in size
e.g.
globin = 1.8kb
dystrophin = 2.4Mb
What is the role of intergenic regions?
Intergenic regions contain sequences of no known function,
e.g. repetitive DNA, endogenous retroviruses and pseudogenes
How are genes organised within the genome?
Genes often cluster in families – e.g. globin clusters
- allows for coordinate gene regulation
- may just reflect evolutionary history
What is required for transcription to occur?
Recognition sequences are required in DNA that lie outside the transcribed region
What is the role of promoters during transcription?
Promoters recruit RNA polymerase to a DNA template
What is the role of RNA Polymerase during transcription?
RNA polymerase binds asymmetrically and can only synthesise RNA 5’ to 3’ (it reads the template strand 3’ to 5’)
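To make the directionality concrete, here is a toy sketch in which the RNA is built 5’->3’ while the DNA template strand is read 3’->5’. The function name and the template sequence are hypothetical, and real transcription involves far more than base substitution.

```python
# DNA template base -> RNA base placed opposite it (U replaces T in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5):
    """Build the RNA 5'->3' from a template strand written 3'->5'."""
    # Reading the template left to right (3'->5') yields the RNA 5'->3'.
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


template = "TACGGC"  # hypothetical template strand, written 3'->5'
print(transcribe(template))  # AUGCCG
```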
Outline the structure of a gene from 5’- 3’ end
5’ end
Promoter
- CAAT box
- TATA box
Transcription initiation
5’ UTR
Translation initiation (ATG)
Exons + Introns
Translation termination
3’UTR
Transcription termination
3’ end
What two sequences form the gene promoter?
Regulatory CAAT box
Recruiting TATA box
What is the role of the CAAT box?
Regulatory element
regulates the recruitment of RNA Polymerase
What is the role of the TATA box?
Needed to recruit general Transcription factors and RNA polymerase
Explain what is involved in transcription?
DNA is transcribed using RNA polymerases
What 3 polymerases do eukaryotic cells use?
RNA polymerase I
- needed to transcribe rRNA genes
RNA polymerase II
- needed to transcribe mRNA
RNA polymerase III
- needed to transcribe tRNA and other small RNAs
What is the significance of transcription factors?
Eukaryotic RNA polymerases are unable to recognise promoters efficiently without help.
Outline the process of transcription
- RNA pol. recruited (closed complex)
- DNA helix locally unwound (open complex)
- RNA synthesis begins
- Elongation occurs
- Termination
- RNA polymerase dissociates
How are introns distributed among genes?
Vary in number – 0 to 311
Vary in size - 30bp to 1Mbp
Some introns contain other genes
How long does transcription take?
The average RNAP II elongation rate is ~60 bases per second, so transcribing some long introns takes many hours
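A quick estimate shows where "many hours" comes from. The sketch assumes a constant rate of 60 bases per second with no pausing, which is a simplification, and uses the gene and intron sizes quoted elsewhere in these notes.

```python
ELONGATION_RATE = 60  # bases per second (average RNAP II rate from the notes)


def transcription_seconds(length_bases):
    """Estimate time to transcribe a region at a constant elongation rate."""
    return length_bases / ELONGATION_RATE


print(transcription_seconds(1.8e3))                   # globin, 1.8kb: 30.0 seconds
print(round(transcription_seconds(1e6) / 3600, 1))    # 1Mbp intron: 4.6 hours
print(round(transcription_seconds(2.4e6) / 3600, 1))  # dystrophin, 2.4Mb: 11.1 hours
```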
What regulatory regions are required during transcription?
Enhancers
Silencers
Insulators
What is the role of enhancers?
upregulate gene expression – short sequences that can be in the gene or many kilobases distant
They are targets for transcription factors (activators)
What is the role of Silencers?
downregulate gene expression. They are also position-independent and are also targets for transcription factors (repressors)
What is the role of insulators in gene transcription?
short sequences that act to prevent enhancers/silencers influencing other genes
Explain how eukaryotic mRNA is modified after transcription
- Capped at 5’ end
- Polyadenylated at 3’ end
- Intervening sequences (introns) removed
What is the 5’ cap?
Soon after RNA polymerase begins transcription, a methylated cap (7-methylguanosine) is added to the 5’ end. This makes the RNA resistant to digestion by nucleases.
Transcription then continues to completion.
What modification structures are present at the 3’ end?
G/U is a Guanine/Uracil rich region
PAP is polyadenylate polymerase
Outline the modifications that occur at the 3’ end of the mRNA strand
Cleavage factors bind to the G/U region and the poly(A) signal sequence to signal that cleavage should occur.
The G/U region and everything after it is removed
Polyadenylate polymerase then adds many A bases (~250) to protect the 3’ end from degradation
What is the final modification to mRNA after transcription occurs?
RNA is transcribed, cap and poly A tail added, then introns are spliced out by the spliceosome.
What is the spliceosome?
large and complex molecular machine made up of ~150 proteins
removes introns from a transcribed pre-mRNA
How does the spliceosome remove introns?
The spliceosome brings the ends of the exons together, cleaves out the intron between them, and then joins the adjacent exons together.
The excised intron (a lariat structure) is then degraded
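The net effect of splicing on the sequence can be sketched as cutting out intron spans and joining what remains. This is only a string-level toy: the function name, the sequence and the intron coordinates are invented, and nothing here models the spliceosome itself.

```python
def splice(pre_mrna, introns):
    """Remove introns given as (start, end) index pairs, end exclusive."""
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron
    exons.append(pre_mrna[position:])           # keep the final exon
    return "".join(exons)                       # join adjacent exons together


# Hypothetical pre-mRNA: exon 1 + intron + exon 2.
pre_mrna = "AUGGCC" + "GUAAGUUUUCAG" + "UUUUAA"
print(splice(pre_mrna, [(6, 18)]))  # AUGGCCUUUUAA
```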
What is alternative splicing?
Exons can be skipped or added so variations of a protein (called isoforms) can be produced from the same gene
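As a simplified illustration of exon skipping, the sketch below lists every isoform that results when certain exons are optional. The exon names and the choice of which exons can be skipped are hypothetical.

```python
from itertools import combinations


def isoforms(exons, optional):
    """Return every exon combination in which optional exons may be skipped."""
    optional = sorted(optional)
    results = []
    for k in range(len(optional) + 1):
        # Choose k of the optional exon positions to skip.
        for skipped in combinations(optional, k):
            kept = [name for i, name in enumerate(exons) if i not in skipped]
            results.append("-".join(kept))
    return results


exons = ["E1", "E2", "E3", "E4"]
# Assume the exons at index 1 and 2 (E2 and E3) can each be skipped.
print(isoforms(exons, optional={1, 2}))
# ['E1-E2-E3-E4', 'E1-E3-E4', 'E1-E2-E4', 'E1-E4']
```

With n independently skippable exons this gives 2^n possible isoforms from one gene.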
What is the significance of splicing?
Splicing targets mRNAs for nuclear export
Further signalling complexes join once the strand has been spliced
What export signals bind to the spliced mRNA?
Exon Junction Complex
TREX export complex
These all signal export of the mRNA out of the nucleus and into the cytoplasm, ready for translation by ribosomes (free or on the rough ER)
What are pseudogenes?
These are genes that have been at least partially inactivated by the loss or gain of sequence that disrupts their correct transcription and/or translation
Give an example of pseudogenes
For example, glucocerebrosidase has an adjacent pseudogene that only differs in the coding region by one 55bp deletion and a few single base changes
Why do processed pseudogenes lack promoters and introns?
Processed pseudogenes have no promoter or introns as they are copied from mRNA by retrotransposition