
Biol 200 Notes

This document summarizes key concepts in biochemistry including: 1. It describes the processes of replication, transcription, and translation - the three main processes by which DNA is copied and expressed as proteins. 2. It outlines the 20 common amino acids and discusses primary, secondary, tertiary, and quaternary protein structure. 3. It provides examples of protein structure including influenza hemagglutinin subunits and discusses interactions between structures.

Uploaded by

sacharlebois7

BIOL 200

KH1-KH8

Topics
• Informational biopolymers
• Monomers: Asymmetric

• Polymerization
• Replication, Transcription, translation
| | Replication | Transcription | Translation |
|---|---|---|---|
| Prep | Helicase separates strands | Helicase separates strands | |
| Template | DNA (both strands) | DNA (1 strand) | mRNA |
| Start | Origin of replication | Promoter | Start codon AUG: Met (methionine) |
| Enzyme | DNA polymerase binds at origin and moves unidirectionally 3' to 5' on template | RNA polymerase binds at promoter and moves unidirectionally 3' to 5' on template | Peptidyl transferase activity of the large ribosomal subunit; three spots for tRNA |
| Energized monomers | dNTP | rNTP | Aminoacyl tRNA |
| Adding monomers | 3' OH on last DNA nucleotide of chain attacks alpha phosphate of incoming dNTP --> last two phosphates are dropped | 3' OH on last RNA nucleotide of chain attacks alpha phosphate of incoming rNTP --> last two phosphates are dropped. Note: rNTP diffuses randomly and is only linked if Watson-Crick pairing is respected | tRNAs carry 1 AA for every codon (3 nucleotides) |
| Direct or indirect contact with template | Direct | Direct | Indirect (tRNA is adapter with anticodons) |
| Growing strand | Nascent DNA strand is antiparallel to DNA template; never separates from template | Is antiparallel to DNA template; H-bonds between RNA strand and DNA template break as the two DNA strands remake the helix; RNA strand exits RNA polymerase 5' end first as it grows | Ribosome has 3 tRNA docking sites; middle tRNA holds the amino acid chain, then moves to the exit dock and transfers the AA chain so it is stacked onto the AA of the right tRNA, which moves to the middle dock (peptidyl transferase); the polypeptide chain is added onto the new tRNA and its AA |
| Stop | No stop | Specific DNA sequence destabilizes attachment and releases RNA polymerase (due to bad adhesion) and RNA strand | Stop codon: UAA, UAG, UGA |
| Resulting polymer | 2 identical DNA molecules (each has 1 template strand and 1 new strand) | 1 DNA double helix and 1 RNA chain complementary to template DNA | Protein (amino acid chain) |
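The transcription and translation columns above can be illustrated with a short sketch. It assumes the standard genetic code, a template strand written 3' to 5', and translation that starts at the first AUG and ends at the first in-frame stop codon. The names `transcribe`, `translate` and `CODON_TABLE` are made up for this illustration and do not come from the notes.

```python
from itertools import product

# Standard genetic code, codons enumerated with bases in the order U, C, A, G.
# "*" marks the three stop codons (UAA, UAG, UGA).
BASES = "UCAG"
AMINO = "FFLLSSSSYY**CC*WLLLLPPPPHHQQRRRRIIIMTTTTNNKKSSRRVVVVAAAADDEEGGGG"
CODON_TABLE = {"".join(c): aa for c, aa in zip(product(BASES, repeat=3), AMINO)}


def transcribe(template_3_to_5):
    """Build the mRNA (5'->3') complementary to a DNA template read 3'->5'."""
    pairing = {"A": "U", "T": "A", "G": "C", "C": "G"}
    return "".join(pairing[base] for base in template_3_to_5.upper())


def translate(mrna):
    """Translate from the first AUG until the first in-frame stop codon."""
    start = mrna.find("AUG")
    if start == -1:
        return ""  # no start codon, nothing is translated
    protein = []
    for i in range(start, len(mrna) - 2, 3):
        amino_acid = CODON_TABLE[mrna[i:i + 3]]
        if amino_acid == "*":  # stop codon: release the chain
            break
        protein.append(amino_acid)
    return "".join(protein)


mrna = transcribe("TACAAACCGATT")
print(mrna, translate(mrna))
```

The example template gives an mRNA that begins with AUG and ends with UAA, so the one-letter protein sequence starts with M (methionine) and stops before the stop codon.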
• Genetic code
• Amino acid side chains
o Classes:
• Hydrophobic: Water fearing
▪ Non-polar distribution
▪ No interaction with polar water
• Hydrophilic
▪ Polar electronic charge distribution(can have net charge or
be electrically neutral)
▪ Interact with polar water
• Oil drop model

• Hydrophobic AA side-chains are not exposed at the surface but buried in the core
o Hydrophobic effect:
• Coalescence of hydrophobic molecules is favored by weak non-covalent van der Waals intermolecular interactions
| | Amino acid | Primary structure | Secondary structure: alpha helix | Secondary structure: beta pleated sheets | Tertiary structure | Quaternary structure | Supramolecular complex |
|---|---|---|---|---|---|---|---|
| Characteristics | | Peptide chain backbone: specific sequence of AA | Local conformation of peptide chain backbone; about 60% of polypeptide chain segments are alpha helix or beta sheet | Local conformation of peptide backbone | Overall conformation of a polypeptide: spatial organization of multiple secondary structures. Tertiary structure shows ancient evolutionary relationships: leghemoglobin, hemoglobin and myoglobin evolved in AA sequence but their structures remain similar | Multimeric: contains any number of identical or different polypeptides | Large molecular machines made of multiple distinct proteins |
| Shape | Monomer | Twisted, snaking random coils | Alpha-helix | Beta pleated-sheets | | | |
| Structure | | | Alpha-helix | Beta pleated-sheets; parallel or antiparallel | | | |
| R groups | 20 kinds | | | Protrude above and below sheet | | | |
| Building process | | Translation and peptidyl transferase | | | | | |
| Interaction within structure | Covalent bonds | | H-bond between peptide-bond carbonyl O on one AA and H on amino group of a different AA; amino acid n H-bonds with amino acid n+4; periodicity = 3.6 residues per turn | H-bond between peptide-bond carbonyl O on one AA and H on amino group of a different AA; H-bonds link adjacent strands | Amino acid side chains; intrachain disulphide bridges | | |
| Interaction between structures | Peptide bonds between amino group and carboxyl group | | Depend on side chains | Depend on side chains | | | |
| Example | | | | | Influenza virus hemagglutinin subunits: HA2 fibrous domain and HA1 globular domain | Three tertiary hemagglutinin heterodimer subunits interact together | Transcription initiation complex |

o Disulfide bonds:
• Intrachain: contribute to tertiary structure
• Interchain: Contribute to quaternary structure
| | Motifs | Domains |
|---|---|---|
| Description | Combination of secondary structures forming a distinct local 3D structure | Characteristic 3D structures |
| Level of dependence | Structurally dependent: few structure-maintaining bonds; rest of protein contributes to stability | Structurally independent: contain a sufficient number of bonds to hold the domain together when independent from the rest of the protein |
| Size, shape and structure | Small local structures; characteristic AA sequences | Large; compactly folded |
| Composition | Secondary structures | Made of various motifs |
Types

Intrinsically disordered proteins

Exist as random coils under physiological conditions
May adopt a specific secondary/tertiary structure upon binding to a well-structured partner protein

• Macromolecular assemblies
• Protein folding and misfolding
o Native and denatured proteins;
• Proof 3D structure is determined by AA sequence
▪ Native conformation -- add urea --> denatured -- remove urea --> renatured through spontaneous refolding along folding pathways (but sometimes misfolded)
▪ Conclusion: 3D structure is determined by AA sequence
o folding pathways,
• Synthesized from N-terminus to C-terminus
• N-terminus begins to fold before C-terminus is synthesized
o Chaperones: proteins that help guide protein folding along productive
pathways by permitting partially misfolded proteins to return to the
proper folding pathways
• Recognize misfolded proteins by hydrophobic patches exposed
• Functions:
▪ Fold newly made proteins into functional conformations
▪ Refold misfolded proteins
▪ Refold unfolded proteins into functional configuration
▪ Disassemble toxic protein aggregates
▪ Assemble and dismantle large multiprotein complexes
▪ Mediate transformations between inactive and active
forms of proteins
• Types:
▪ Molecular chaperones: operate as single molecules
• HSP70: helps newly-synthesized proteins follow
correct folding pathway
• HSP: heat shock protein: because up
regulated under conditions that denature
proteins = high heat
• Mechanism:
• Bind to exposed hydrophobic
residues of nascent polypeptides
• Protect from aggregation until
properly folded

▪ Chaperonins: form multisubunit "refolding" chamber


• Upregulated when misfolded proteins accumulate
| | Molecular chaperones | Group 1 chaperonins (bacteria/mitochondria) | Group 2 chaperonins (eukaryotic cytoplasm) |
|---|---|---|---|
| Heat shock protein | HSP70 | HSP60 | HSP60 |
| Structure | Blob: substrate binding site; nucleotide-binding domain; open-close configuration | Supramolecular; enclosed chamber: 2 independent cylindrical chambers; GroEL: 7 identical inward-facing protein binding subunits; 2 caps; GroES: 7 subunits | Supramolecular; enclosed hetero-octameric folding chamber; CCT complex of TRiC |
| Function | Helps newly synthesized proteins follow the correct protein folding pathway; protects exposed hydrophobic residues of nascent polypeptides from aggregating until properly refolded | Reciprocal relationship between chambers: one closes to fold proteins while the other opens to release them | |
| Recognition | Binds to exposed hydrophobic residues of nascent polypeptide | Recognize hydrophobic patches | Recognize hydrophobic patches |
| Mechanism | Cycle of client protein binding and conformational change associated with ATP binding and hydrolysis. ATP bound to nucleotide binding site: open configuration, so misfolded protein binds to SBS. ATP hydrolysis (DNAJ/HSP40) --> closed configuration. ADP bound: protein folding. ADP release, ATP binding (GrpE/BAG1) --> open configuration. ATP bound: open configuration, protein released, continues folding | Undergoes concerted ATP-binding/hydrolysis and conformational changes. Unfolded protein binds in chamber. ATP + cap -> closed conformation, protein folds within the chamber. ATP hydrolysis: rate determining step (causes GroES lid to bind to other chamber). ADP + GroES lid leave -> properly folded (or incompletely folded) protein released | No cap, lid is integrated into chaperonin. Twists to open and close. Open configuration: misfolded protein enters through open lid and binds inside. ATP binding and hydrolysis -> closed conformation, client protein folding. Release of ADP and P -> open conformation, release protein, reset chamber |
| Up-regulation | When misfolded/denatured proteins aggregate (high temp) | When misfolded/denatured proteins aggregate (high temp) | When misfolded/denatured proteins aggregate (high temp) |
o Medical relevance of protein misfolding
• Protein degradation
o Ubiquitin/proteasome system
• Step 1: Poly ubiquitin "tag" damaged or misfolded proteins for degradation
▪ E1(ubiquitin-activating enzyme) activates Ub
▪ E2(ubiquitin-conjugating enzyme) replaces E1
▪ E3(ubiquitin ligase) recognizes hydrophobic patches and binds ubiquitin
to target protein
• Target protein:
• Misfolded
• Normal protein: degraded for regulatory purposes
• Oxidized amino acids
• Mutations in E3: can cause Parkinson's disease
▪ Ubiquitin tags accumulate on misfolded protein(repeat)
• 76-residue protein that can be covalently linked to lysine residue
on target proteins
• Step 2: Ubiquitin-tagged proteins are fed into proteasome
▪ Proteasome: (independently evolved similar structure to chaperonins,
but completely different function)
• Mechanism:
• Proteins in cap recognize and bind polyubiquitin
• Remove ubiquitin by hydrolysis
• Unfold target protein using ATP
• Feed target proteins to central chamber of 20S core
• 20S core: subunits form inward facing protease.
Degrades proteins to AA or short oligopeptides
• Shape:
• Three inner chambers
• Inward facing protease
• Minimize the danger of protease(enzyme that destroys
proteins)
o Failure of multiubiquitination/proteasome mechanism
• Accumulation of insoluble protein aggregates
▪ Amyloid: accumulation of misfolded proteins
• Formation:
• Amyloid precursor
(cleavage)
• Alpha helix
• Beta-sheets
• Aggregation into filaments resistant to proteolysis
• Neurodegenerative diseases:
• Alzheimer's disease
• Parkinson's disease
• Mad cow disease
• Lesions:
• Plaques
• Tangles
• Protein function and regulation
o Ligand-binding
• Specificity:
▪ The ability of a protein to bind only one particular ligand
even in the presence of a vast excess of irrelevant
molecules
• Affinity:
▪ Tightness or strength of binding
• Complementary molecular surfaces have many weak non-covalent interactions
▪ Expressed as dissociation constant Kd
▪ The stronger the interaction, the lower the Kd
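The link between Kd and binding strength can be made concrete with a minimal sketch. It assumes simple 1:1 binding at equilibrium with free ligand in excess, where Kd = [P][L]/[PL], so the fraction of protein occupied is [L] / (Kd + [L]). The function name `fraction_bound` is made up for this illustration.

```python
def fraction_bound(ligand_conc, kd):
    """Fraction of protein binding sites occupied at equilibrium (1:1 binding).

    ligand_conc and kd must be in the same concentration units.
    """
    if ligand_conc < 0 or kd <= 0:
        raise ValueError("ligand_conc must be >= 0 and kd must be > 0")
    return ligand_conc / (kd + ligand_conc)


# At [L] = Kd, half of the sites are occupied.
print(fraction_bound(1e-9, 1e-9))
# A lower Kd (tighter binding) gives higher occupancy at the same [L].
print(fraction_bound(1e-9, 1e-10), fraction_bound(1e-9, 1e-8))
```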
• Antibodies
▪ Characteristics:
• Bind with high specificity and affinity
▪ Structure:
• 2 identical heavy chains
• 2 identical light chains
• Regions:
• CDR: complementarity-determining region/antigen-binding surface
• Protein loops from both heavy and
light chains
• Highly variable amino acid sequence
among antibody encoding genes
• Astronomical repertoire of possible
CDRs
• Enzymes: extremely diverse class of catalytically active proteins
whose ligands include the substrates of the reactions they
catalyze
▪ Enzyme kinetics: Michaelis-Menten enzyme kinetics
• Vmax: maximum rate of catalysis given a saturating
amount of substrate
• Depends on:
• Amount of enzyme
• How fast enzyme works
• Turnover number: enzymatic cycles
per second at top speed
• Km: substrate concentration that supports a rate of
catalysis equal to 1/2 of the Vmax
• Depends on/measure of:
• Chemical properties: Affinity of
enzyme: substrate binding
• Does not depend on:
• Changes in Vmax for the same
enzyme: binding affinity is
independent of concentration
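The definitions of Vmax and Km above correspond to the Michaelis-Menten rate law, v = Vmax[S] / (Km + [S]), with Vmax = turnover number x total enzyme. The sketch below assumes that standard form; the function names and the numbers used are illustrative only, not values from the notes.

```python
def vmax_from_turnover(turnover_number, enzyme_total):
    """Vmax depends on how fast the enzyme works and how much enzyme is present."""
    return turnover_number * enzyme_total


def mm_rate(substrate_conc, vmax, km):
    """Michaelis-Menten rate of catalysis at a given substrate concentration."""
    return vmax * substrate_conc / (km + substrate_conc)


km = 2.0                                   # hypothetical Km
vmax_1x = vmax_from_turnover(100.0, 0.1)   # hypothetical enzyme amount
vmax_2x = vmax_from_turnover(100.0, 0.2)   # twice as much enzyme

# At [S] = Km the rate is half of Vmax, whatever the amount of enzyme.
print(mm_rate(km, vmax_1x, km), vmax_1x / 2)
print(mm_rate(km, vmax_2x, km), vmax_2x / 2)
```

Doubling the enzyme doubles Vmax, but the substrate concentration that gives half-maximal rate stays at Km, which is why Km reflects the chemistry of enzyme-substrate binding and not the enzyme concentration.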
▪ Relation to pH
• Enzymes exhibit pH optima which reflect:
• Active site acid-base chemistry
• Sensitivity of overall protein conformation to charge distribution
• Chymotrypsin, like trypsin, for example, has a pH optimum at 7
• Lysosomal hydrolases have evolved to have a pH optimum at 4.5
▪ Enzyme catalysis;
• Occurs at active site: some AA in the enzyme
contribute to SBS and others to CS
• Substrate binding site: specificity
• Interact at complementary
molecular surfaces
• Catalytic site
• Exposure of key amino acids on
catalytic site
• Distant on polypeptide, but brought near each other during folding to make up catalytic site
• Examples:
• Serine protease: hydrolyzes bonds in
polypeptides
• Serine on catalytic site
• Trypsin: hydrolyzes peptide bonds adjacent to arginine and lysine(large/basic
side chain)
• Proper substrate binding: when AA fits into negatively charged pocket
within the SBS
• Cleavage of peptide bond with formation of covalent substrate-enzyme
complex
• Hydrolysis of acyl enzyme complex

**Note: both reactions depend on pH (best at 7) for His-57's ability to bind and release a proton
• Chymotrypsin
• Elastase:
• SBS pocket obstructed by bulky valine side chains
• Cleaving at catalytic site only happens near small side-chains(alanine and
glycine)
▪ multifunctional enzymes;
▪ Scaffolds
• Enzymes in common pathway are often physically associated with one
another by
• Direct interactions

• Binding to a common scaffold protein


▪ Allosteric effect: binding of a ligand to one site on a protein --conformational change--> affects ligand binding at another site
• Example: HSP70 molecular chaperone ADP vs ATP binding affects
conformation and thus interaction with misfolded protein
▪ Common regulators:
• Allosteric conformational switches in regulatory proteins in response to ligand binding or post-translational modification (important in signal transduction pathways)
• Ca2+ binding to calmodulin: regulate structure and activity
• Change conformation of calmodulin
• Target peptide of other protein can now bind

• G-protein
• "on" GTP bound
• Switch "on" to "off" with GAPs
• "off": GDP bound
• Switch "off" to "on" with GEFs
• Phosphorylation/dephosphorylation: (important in cellular signal transduction
pathways)

• Phosphorylation of amino acids: post translational modifications


• Rapidly reversible
• Covalent modification of protein structure
• Enormous importance in cellular regulatory events
• Example: cellular signal transduction pathway
• Kinases(>500 encoded in human genome)
• Specific for 1 or few target proteins
• Cascade effect: target proteins are often kinases or phosphatases
• Amplification of signal
• Fine-tuning
• Protein purification, detection and characterization
o Physical and chemical properties:
• Mass, size, shape
• Density
• Electrical charge
• Binding affinity
o Techniques
• Centrifugation
▪ Mechanism: generate centrifugal force (unit of earth
gravity, g)
• Acts on particles suspended within a liquid
medium(usually aq)
• Particles less dense than suspending medium -> float
• Particles of the same density as suspending medium -> stay steady
• Particles denser than suspending medium -> move down in tube
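Centrifugal force expressed in units of g can be estimated from rotor speed and radius. The sketch below computes it from first principles (angular velocity squared times radius, divided by standard gravity). The function name and the example rotor values are assumptions for illustration, not values from the notes.

```python
import math

STANDARD_GRAVITY = 9.80665  # m/s^2


def relative_centrifugal_force(rpm, radius_cm):
    """Return the centrifugal acceleration as a multiple of earth gravity (x g)."""
    omega = 2 * math.pi * rpm / 60.0      # angular velocity in rad/s
    radius_m = radius_cm / 100.0
    return omega ** 2 * radius_m / STANDARD_GRAVITY


# Hypothetical rotor: 10 cm radius at 10 000 rpm.
print(round(relative_centrifugal_force(10_000, 10)))
```

Because the force grows with the square of the speed, doubling the rpm gives four times the g-force, which is why pelleting smaller particles calls for the higher-speed steps.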
▪ Methods:
• Differential centrifugation:

• Separating cell content by size/mass


• Low speed: pellet nucleus, leave mitochondria in
supernatant
• Mass nucleus > mass of mitochondria
• Higher speed: pellet mitochondria
• Rate-zonal Centrifugation
• Create density gradient
• Smoothly mix high and low density sucrose solutions to
fill centrifuge tube
• Equilibrium density gradient centrifugation:
• Create density gradient
• Smoothly mix high and low density sucrose solutions to fill
centrifuge tube
• Separate DNA and lipoproteins
• Apply resuspended pellet to the top of sucrose gradient
• Centrifuge at high speed
• Content of pellet moves down tube until it reaches
density of sucrose equal to its own density
• Example: covid
Differential centrifugation

• 1st: pellet and discard nuclei and mitochondria


• Medium speed
• 2nd: pellet and keep virus particles
• High speed
Equilibrium density gradient centrifugation
• 3rd: resuspend pellet
• Sucrose concentration gradient
• High speed
• Electrophoresis: using charge-mass ratio
▪ Free matrix vs gel
• Free matrix
• All SDS-protein complexes have same electrophoretic
mobility
• Gel: larger molecules impeded and slowed
• Larger SDS-protein complex are slowed
▪ Mechanism
• Applied electric field
• Migration of charged molecules through gel
• Direction: determined by net charge
• Speed: determined by charge/mass ratio
▪ Methods
• SDS-PAGE
• SDS: sodium dodecyl sulfate(anionic detergent)
• Disrupt oil-drop model: denatures individual
polypeptides and separate chains of multimeric
proteins into individual denatured polypeptides
• Interaction of SDS hydrophobic tail with
hydrophobic AA side chains of proteins
• Interaction of SDS hydrophobic tail with
itself
• Coat polypeptide chain with uniform layer
of SDS
• Negative charges repel each other
• Further unfolding of protein
• Polyacrylamide gel electrophoresis (PAGE)

• Gel matrix impedes larger molecules


• Migration rate is inversely related to protein size
• Effect of post-translational modifications
• Shift in protein mobility in SDS page by
phosphorylation of proteins by protein kinase
• Inaccurate apparent molecular weight
• Phosphate group locally interfering with
SDS binding
• Isoelectric focusing:
• Mechanism:
• Isoelectric point: pH at which sum of all charges = 0
• Depends on AA composition of each protein
• pH below isoelectric point: positive charge
• pH above isoelectric point: negative charge
• Proteins separated by their isoelectric point in a pH
gradient
• Method:
• Establish pH gradient:
• Special buffer(ampholytes) mobilized in
acrylamide gel
• Subject proteins to electrical field -> proteins migrate
• Proteins stop migrating when they reach isoelectric
point because they are now electrically neutral

• Two-dimensional:
• Mechanism: isoelectric focusing followed by SDS PAGE
• There is no relationship between isoelectric point and molecular weight
• Separating by SDS PAGE lets us see the different proteins that
may have had the same charge in isoelectric focusing
• Method:
• Isoelectric focusing: separate in first dimension by
charge(isoelectric point)
• Place vertical isoelectric strip parallel to 2D gel
• SDS PAGE: separate in second dimension by size
• Mass spectrometry
▪ Mechanism: High precision analytical (non-preparative) method of determining charge to mass ratio of ionized molecules
• Each amino acid and oligopeptide has a characteristic molecular
weight
• Molecules carrying single charge have molecular weight equal to
m/z
▪ Method:
• Electrospray ionization: produce dispersed gas-phase ions
• Fragment by high-energy collision with an inert gas
• Separated in the mass analyzer into separate populations
of differing m/z
• Measure acceleration of ions in an electric or magnetic field
• Acceleration depends on charge-to-mass ratio
▪ Tandem MS method:
• Method:
• First step: gives m/z of all fragments of polypeptide
• ions(carry electric current) accelerated through
electric or magnetic field
• Ions collide with a detector
• Acceleration determined
• m/z determined
• Second step: gives m/z of smaller fragments of target part
of original polypeptide
• Consider m/z of fragments determined in step 1,
• Decide what particles get destroyed based on m/z,
• Decide what fragments you are interested in based
on m/z
• Redirect beam of interest by altering
electric field to isolate it by its m/z
• Destroy the rest
• Isolated target fragment is destroyed on
collision
• Determine charge-to-mass of its sub
fragments
• Product ion spectrum
analyzed computationally
with respect to known
protein sequences(based on
computer translations of
genome DNA sequences)
• Identify amino acid sequence of the
peptide ion
• Characteristics:
• Fragmentation is partial and random: one or few peptide
bonds per molecule break
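The statement that each amino acid and oligopeptide has a characteristic molecular weight can be turned into a small m/z calculator. This sketch assumes positive-mode ions carrying z extra protons and uses standard monoisotopic residue masses; these values and the function names are not from the notes. For a singly charged ion, the m/z is the peptide mass plus one proton, so it is very close to the molecular weight.

```python
# Monoisotopic residue masses in daltons (amino acid minus one water).
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # added once per peptide (free N- and C-termini)
PROTON = 1.00728   # mass of each charge-carrying proton


def peptide_mass(sequence):
    """Monoisotopic mass of a neutral, unmodified peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence.upper()) + WATER


def mass_to_charge(sequence, charge=1):
    """m/z of the peptide ion carrying `charge` extra protons."""
    if charge < 1:
        raise ValueError("charge must be a positive integer")
    return (peptide_mass(sequence) + charge * PROTON) / charge


print(round(mass_to_charge("G", 1), 4))
print(round(mass_to_charge("GG", 2), 4))
```

Leucine and isoleucine have the same residue mass in this table, so fragments that differ only by L versus I cannot be told apart by mass alone.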
• Liquid chromatography,
▪ Function:
• Separation of components based on their differential interactions
with an immobile(solid beads) material
▪ Structure:
• Mobile phase (proteins in aqueous buffer) moves continuously past the solid phase
• Proteins molecules
• Usually done in columns
▪ Steps:
• Separation based on differential interactions with immobile phase
• Mobile phase(usually aqueous buffer) moves past solid phase
• Proteins moved along in mobile phase
▪ Different rates depending on protein interactions with
solid phase
▪ Types:
• Gel filtration: separate by size

▪ Solid phase beads with pores


▪ Mobile phase goes through and around beads
• Small proteins go through pores and slow down
• Larger proteins don’t fit in pores and pass around
beads quickly
▪ Large proteins collected first
• Ion exchange: separate by charge

▪ Beads are given a charge


▪ Charged ions flow through
• Ions of opposite charge to beads stick to them
• Ions of identical charge to beads flow through
▪ Collect ions of same charge as beads first
▪ Electrostatically bound ions washed off with NaCl
• (ion-exchange)Depending on charge of proteins,
Na+ or Cl- will pull proteins off the beads
• Antibody-affinity:
▪ Antibodies: highly specific
• Characteristics:
• Recognize epitopes on antigen surface
• Only that specific one
• Can be raised against almost any kind of
chemical agents, including proteins
• Structure:
• CDR: varies among antibodies within a
species
• Region of primary antibody that binds to
secondary antibody from another species-
• Constant among antibodies within a
species,
• Different from species to species
• Levels:
• Primary: recognizes the epitope of interest
• Secondary: recognizes the primary antibody
by long T region
• Universal
▪ Method:
• Covalently bind antibodies for target antigen to
beads of solid phase
• Flow complex mixture of proteins through the
column
• Antigen proteins with target epitope will be
retained
• All other non-target proteins flow through
and out of column
• Retained target protein antigens washed out with
low pH buffer
• Released from antibodies
• Co-immunoprecipitation: isolating protein complex by presenting
antibody for one of the proteins
▪ Precipitate the entire complex along with target protein
and antibody
• Western blot(immunoblot): protein viewing method
▪ Method:
• Use antibodies to recognize individual protein species in a
complex mixture of proteins separated by SDS PAGE
▪ Indirect immunodetection: antigen-primary antibody-
secondary antibody(probed)
• Tag antibodies with colours, not necessarily fluorescent
compounds
▪ Types:
• On lysed cells to detect intracellular receptors
▪ Solubilize cells with SDS
• Reveal presence of target receptors
• On natural, non-denatured proteins
▪ Non-SDS cell homogenate
▪ Immunoprecipitated with anti-(target receptor) antibody
▪ Add SDS
▪ Do western blot looking for when complex is stable
• subcellular localization:
▪ GFP:
• Nature:
▪ Jelly fish protein that fluoresces
▪ Single polypeptide chain contains enzymic activity that
modifies some of its own AA chains to generate the
fluorochrome
• Types:
▪ GFP fusion constructs:
• Functional protein tagged with GFP
• Shows protein function and where it travels to
▪ GFP reporter: replaces the gene and turns green when expression of that gene is activated
• Non-functional protein
• Only shows where and when the gene is expressed
▪ Immunofluorescence:
• Method:
▪ Incubate sample with primary antibody
▪ Wash
▪ Incubate sample with secondary antibody conjugated to
fluorochrome
▪ Wash
▪ Mount and view
• Double layer fluorescence: Two-colour immunofluorescence
• Coloured Antibody 1: from species B raised against
antibodies of species A for protein 1
• Coloured Antibody 2: from species D raised against
antibodies of species C for protein 2
PL1-PL6

Topics:
DNA replication
• Polymerization:
o Alpha phosphate of incoming deoxy nucleotide triphosphate(dNTP)
reacts with the 3' hydroxyl group(OH) in the growing DNA strand
• Primer and primase: DNA polymerase cannot initiate synthesis of a new strand,
only elongate it. Primase makes a primer complementary to the DNA sequence
from which DNA polymerase can begin to elongate the strand, adding DNA
nucleotides
• Replication protein:
o RFC loads PCNA clamp onto DNA template and DNA polymerase
o PCNA is a homotrimer protein that acts as a clamp, preventing DNA
polymerase epsilon or delta from separating from the template
o Large T-antigen(looks like a tangerine): is a hexamer helicase, 6 slices,
encoded by the viral genome and serves to unwind DNA helix at the
replication fork
o Primase/polymerase alpha: DNA polymerase alpha extends the primer
made by primase with DNA nucleotide before being switched out for
polymerase delta/RFC/PCNA complex to hold polymerase to template
strand as it adds nucleotides
o Ribonuclease H and FEN-1 displace RNA at 5' end of Okazaki fragment
• Polymerase delta replace RNA with DNA
• Fragments are ligated with DNA ligase
• Replication origins are rich in A and T
• Sequence of replication:
o Unwind helix: helicase
o Leading strand primer synthesis: Primase/DNA polymerase alpha
o Extension of primer: Polymerase epsilon/RFC/PCNA replace polymerase
alpha/primer
o Further unwinding: large T proteins helicase
• Bind RPA to single stranded regions
Constant DNA damage

DNA repair and recombination


• mutation,
• Proofreading
o Proofreading by DNA polymerases in cells decreases errors from 1 in 10 000 to 1 in 1 000 000
o Polymerases delta and epsilon have 3' to 5' exonuclease (proofreading) activity
o A wrong base incorporated causes the polymerase to pause
o 3' end of new strand moves to the 3'-5' exonuclease site and the mispaired base is removed
• Base excision repair
o Problem: deamination of methyl cytosine into thymine or cytosine into
uracil
• Error is present before replication: template
• In T-G mismatches, the T is wrong
o Solution: base excision repair before replication to remove only the
wrong base
• DNA glycosylase break bond between T and sugar-phosphate
backbone
• APE1 endonuclease cuts the DNA backbone at the site of the removed base
• AP lyase activity of DNA polymerase β removes the deoxyribose phosphate
• DNA polymerase β inserts the correct C base and ligase ligates the backbone
• Mismatch excision repair

o Problem: base-pair mismatches, insertions or deletions


• Errors in replication
• New strand is wrong
o Solution: mismatch excision repair after DNA replication to remove several nucleotides near the mismatch
• MSH2 and MSH6 recognize mismatch- + identify daughter strand
• MLH1 endonuclease dimerized with PMS2 cuts new strand
• DNA helicase unwinds DNA
• DNA exonuclease digests some nucleotides of new strand
• DNA polymerase delta adds missing nucleotides based on
template
• Nucleotide excision repair
o Problem: chemically modified bases distort double helix
• UV radiation: Thymine-thymine dimer (two adjacent thymines
covalently bond)
• Carcinogens can distort helix
o Solution: nucleotide excision repair removes 24-32 bases on either side of
thymine-thymine dimer
• XP-C and 23B recognize distorted helix
• TFIIH(helicase), XP-G and RPA unwind helix to make bubble of 25
nucleotides
• XP-F and XP-G cut damaged strand
• DNA polymerase fills missing nucleotides using other strand as
template
o Unsolved problem: thymine-thymine dimer enter replication fork
• Replication DNA polymerases stall when they meet dimer
• Translesion polymerase can read through the thymine-thymine
dimer but cannot proofread
• Regions near thymine-thymine dimers will likely have mutations caused by replication errors
• Translesion polymerase η (eta) is then replaced by the normal replicative DNA polymerase
o Extra problem:
• Xeroderma pigmentosum: XP-n, genetic disorder high disposition
to UV-induced cancer. Mutations in XP cause diseases
• Double-strand break repair by end joining

o Problem: double strand break


• Radiation (x-rays, γ-rays)
• Anti-cancer drugs
o Solution: NHEJ Rejoins broken chromosomes( But base pair deletions
occur in the process)
• Ku and DNA-PK bind ends of double strand break
• Recruit nuclease to remove bases when two bound DSB come
together
• Double strand molecules are ligated together
▪ Regardless of whether they were originally adjacent in the
chromosome
▪ Results in chromosomal rearrangement
o Unsolved problem: double strand break not repaired
• Part of chromosome distal to break is lost at the next cell division
• lethal
• Double-strand break repair by homologous recombination
o Problem: double strand break
• Collapsed replication fork
o Solution: homologous recombination to replace damaged DNA with a copy of the undamaged sequence of the homologous chromosome
• Replication stops when break is encountered in upper template
• 5' exonuclease acts on broken end of upper template
• Upper template distal to break is ligated to bottom daughter
strand moving away from break
• RecA or Rad51 mediate strand invasion:
▪ Newly ligated strand anneals to upper daughter strand
▪ Upper template strand crosses over and migrates to lower
template strand
• Strands are cut at cross-over
• Ligate ends
• Rebuild replication fork -> continue replication
o Extra problem:
• BRCA1 and BRCA2 gene mutations -> breast cancer
▪ These genes encode proteins important in recombination
repair
• post-replication mutations,
• Holliday structure can generate the original chromosome by rebinding or a
recombinant chromosome by recombining strands containing DNA from both
parents

Polymerase chain reaction (PCR):


Amplifies a specific known DNA sequence, which can be part of a complex mixture.
Must know the nucleotide sequences at the ends of the region to be amplified
• Uses:
o Sequencing
o DNA cloning by isolating a particular gene
o Detection of pathogens(SARS-CoV-2)
o Gene editing
• Function:
o Exponential amplification: very sensitive detection of known DNA
sequence in a sample
o Enable purification of a specific DNA sequence
• Set up
o Thermocycler: rapidly changes temp
o DNA template: complex with target sequence within
o DNA polymerase stable at high temperature
• Taq polymerase: originates from Thermophilic bacterium Thermus
aquaticus
o Primers complementary to each end of the region to be
amplified(oligonucleotide)
• Designed by computer
o dNTP: monomers for DNA to be built

• Procedure: 20-40 repeats


o Denaturation: increase heat
• Temp for denaturation: 98 degrees C
• Temp: depends on amount of C-G base-pairing
▪ C-G have 3 H-bonds, A-T have 2 H-bonds
o Annealing of DNA primers (20 nucleotides long)
• Temp for annealing: decrease temp to 48-72 degrees C
• DNA primers (oligonucleotides): designed by computer to be complementary to the ends of the DNA sequence to be amplified
o Extension of DNA primer by DNA polymerase
• Temp for extension: increase to 68-72 degree C keeps DNA from
reannealing
• Taq polymerase:
▪ Originates from thermophilic bacterium Thermus
aquaticus
▪ No proofreading exonuclease activity, so good for
amplifying short fragments
▪ Adds nucleotides
• Other enzymes with better fidelity(accuracy)
• Restrictions:
o Some primers will anneal where you don’t want them to
o More cycles= more off target products getting amplified
• Separation/viewing:
o Gel electrophoresis:
(no need to denature DNA to be conducted)
• Molecules move through pores in the gel at a rate inversely related to their
length
▪ long DNA sequences migrate less,
▪ short target sequences produced through PCR will migrate further
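
The exponential amplification and the dependence of temperature on G-C content described above can be sketched numerically. This is a minimal illustration, not part of the course material: the function names are hypothetical, perfect doubling per cycle is an idealization, and the melting-temperature estimate uses the common rule of thumb for short primers (2 degrees C per A/T, 4 degrees C per G/C), which is only a rough guide.

```python
def pcr_copies(start_copies, cycles, efficiency=1.0):
    """Copies of target after `cycles` cycles.

    efficiency=1.0 means perfect doubling every cycle (an idealization);
    real reactions fall below this, especially in late cycles.
    """
    return start_copies * (1.0 + efficiency) ** cycles


def rule_of_thumb_tm(primer):
    """Rough melting temperature (degrees C) of a short primer.

    Uses 2*(A+T) + 4*(G+C): G-C pairs (3 H-bonds) count more than
    A-T pairs (2 H-bonds).
    """
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


if __name__ == "__main__":
    for n in (20, 30, 40):
        print(n, "cycles ->", pcr_copies(1, n), "copies from one template")
    print("Tm estimate:", rule_of_thumb_tm("ATGCGCTAGCTAGGCTAGCA"), "degrees C")
```

Annealing is usually run a few degrees below the estimated melting temperature of the primers, which is why G-C rich primers tolerate higher annealing temperatures.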

DNA sequencing
• Dideoxy Chain-Termination Method of DNA sequencing (classical Sanger
sequencing)
o Function: verification of individual results
o Setup:
• 4 tubes all containing
▪ DNA polymerase
▪ Oligonucleotides primer: that anneals to one end of
fragment to be synthesized
▪ DNA template
▪ dNTPs (100 µM)
▪ Chain terminator: 1 ddNTP( ddATP, ddGTP, ddTTP, ddCTP)
• No 3' OH, just H
• Synthesis stops because there is no 3' OH to attack the 5' phosphate
of the next dNTP, so the chain stops growing

o Limitations:
• Polymerase only runs for 300-500 nucleotide sequences, so gels only
resolve that much. Do multiple short sequencing reactions to sequence a
large region
• Difficult to differentiate 300-500 by size
• Rate of sequence production is limited by total reactions that can be
performed at one time
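
The four-tube chain-termination setup can be sketched as a small simulation. This is an illustrative model under simplifying assumptions (hypothetical function names, every possible termination product is present, primer length ignored by default): each tube yields fragments ending at every position where its ddNTP was incorporated, and reading fragment lengths from shortest to longest recovers the newly synthesized strand.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def sanger_fragments(template_3to5, primer_len=0):
    """Fragment lengths produced in each ddNTP tube.

    template_3to5: template strand written 3'->5', so the new strand
    (written 5'->3') is its complement in the same left-to-right order.
    """
    new_strand = "".join(COMPLEMENT[b] for b in template_3to5.upper())
    tubes = {base: [] for base in "ACGT"}
    for position, base in enumerate(new_strand, start=1):
        # A ddNTP of this base can terminate the chain at this position.
        tubes[base].append(primer_len + position)
    return tubes


def read_gel(tubes):
    """Read the sequence by ordering fragments from shortest to longest."""
    by_length = {n: base for base, lengths in tubes.items() for n in lengths}
    return "".join(by_length[n] for n in sorted(by_length))


if __name__ == "__main__":
    tubes = sanger_fragments("TACGGA")
    print(tubes)
    print(read_gel(tubes))
```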
• Next-Generation sequencing (NGS)
o Function: allow single sequencing instruments to carry out millions of
sequencing reactions simultaneously
o Short read sequence: few nucleotides at once
o Procedure: 1 day long
• Ligating the same linkers to a mixture of DNA fragments
• Denature DNA
• Anneal DNA to complementary primers anchored to a solid
support
• PCR: amplifying the DNA fragments in a fixed spatial arrangement
• Cut one strand of double DNA:

• Sequence the left over strand with fluorescently labeled dNTPs(different colours
per base)
• Imaging and removal of fluorophore after each cycle to avoid mixing of colours
• Nanopore technology

o Procedure:
• Single stranded DNA binds to motor protein
• Motor protein pulls DNA strand through it
• DNA goes down into pore
• Electrical current goes through the pore
▪ Causes changes in current
▪ Current changes are recorded
o Advantages:
• Sequencing single molecules makes studying new biological
questions possible
• Long reads mean less need for mapping of repetitive sequences
• Portable: small like a laptop
• Sequencing Genome:
o Assembly of whole genome sequences: challenge
• For complete clone map, repeat 30 times
• Create aligned library
• Align properly

DNA cloning, experimental gene expression, intro to genomes


• Recombinant DNA
o Overview of Recombinant DNA technology
• Vector + DNA fragment
• Recombinant DNA
• Replication of recombinant DNA within host cells
• Isolation, sequencing, and manipulation of purified DNA fragment
o Method for synthesis of recombinant DNA
• Plasmids
▪ Characteristics:
• The most common vector used in recombinant
DNA technology.
• They are circular, double-stranded
extrachromosomal DNA strands
• Found in bacteria and lower eukaryotes
• Replication occurs before cell division
• Restriction enzymes
▪ Function:
• Cut phosphodiester bonds in a symmetrical, staggered fashion
• Staggered cut at specific region -> making sticky ends (one strand
is cut shorter than the other)
▪ Action:
• Make staggered cut at restriction site on Vector DNA
• Make staggered cut at restriction site on chromosomal DNA

• Recombinant DNA
▪ Insert genomic DNA at sticky ends
▪ Ligate into circular plasmid with T4 ligase and 2 ATP
▪ replication
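
A staggered cut at a restriction site can be sketched in code. The example assumes EcoRI (recognition sequence GAATTC, cut between G and A on each strand) and a linear DNA written as its top strand only; the function names are hypothetical and the input sequence is made up for illustration.

```python
SITE = "GAATTC"   # EcoRI recognition sequence
CUT_OFFSET = 1    # EcoRI cuts after the G: G^AATTC

PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def reverse_complement(seq):
    return "".join(PAIRS[b] for b in reversed(seq.upper()))


def is_palindromic(site):
    """A symmetrical site reads the same 5'->3' on both strands."""
    return site.upper() == reverse_complement(site)


def digest(seq, site=SITE, offset=CUT_OFFSET):
    """Cut a linear top strand at every occurrence of the site."""
    seq = seq.upper()
    fragments, start = [], 0
    pos = seq.find(site)
    while pos != -1:
        fragments.append(seq[start:pos + offset])
        start = pos + offset
        pos = seq.find(site, pos + 1)
    fragments.append(seq[start:])
    return fragments


if __name__ == "__main__":
    print(is_palindromic(SITE))
    print(digest("AAGAATTCTT"))
```

Because the site is palindromic, the bottom strand is cut at the mirror position, leaving a 5'-AATT overhang on each fragment. Vector and insert cut with the same enzyme therefore carry complementary sticky ends.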

o Uses of recombinant DNA


• Analysis methods using recombinant DNA
▪ Methods:
• In Situ hybridization
Function: reveals spatial distribution of an RNA
Example: sonic hedgehog mRNA in 10 day old mouse
embryo

• Microarray

• Cluster analysis: identify coordinately regulated genes


▪ Yield:
• mRNA expression
• Co-regulation
• Localization
• Regulated expression of exogenous genes
• Production of proteins in prokaryotic and eukaryotic cells
▪ In E. coli with the inducible lac promoter
• Transform cells in the presence of CaCl2 and a heat pulse
• Place on antibiotic nutrient plate: cells that die had no plasmid,
cells that live carry the plasmid (with antibiotic resistance gene) and
are considered transformed
o Expression of cloned genes
• In Cells:
▪ Transient

▪ Stable

• In Genomes:
▪ Retroviral vectors
• Vector Plasmid
• Packaging Plasmid
• Viral coat Plasmid
• DNA libraries: permanent collections of genes
o Genomic DNA: chromosomal DNA
o cDNA: reverse transcribed from mRNA
• Synthesis and cloning
• Genome: The entirety of an organism's hereditary information
o Composition: Mostly DNA(some viruses have RNA)
• Measured in base pairs (bp, kb, Mb)
• Larger genomes are often because of more transposable elements
o Biological complexity
• Unrelated to content of genome
• Molecular cloning by dilution and transformation;
• Recombinant protein expression;
• Replication origins,
• Antibiotic resistance,
• Inserts and vectors;
Genes and genomes, transposable elements
o Genes: can be considered as transcription units
• Exons: coding region or open reading frame
• Control regions: promoter and cis-regulatory factors
• Introns: separate exons and are spliced out during mRNA processing
o Transposable (mobile DNA): move within genomes
• DNA transposons:
▪ Increasing copy number of DNA transposons:
• Move transposons from region that has been replicated to
region about to be replicated to give extra transposon to
one daughter chromosome
• Can carry unrelated flanking sequences with them
• Retrotransposons: more common than DNA transposons in humans
▪ Have an RNA intermediate -> reverse transcription
▪ LTR retrotransposons: long terminal repeats flank a protein-coding region encoding
reverse transcriptase, integrase and more
• LINEs: nonviral (non-LTR) retrotransposons
▪ AT rich region, protein coding region and target site direct repeat
▪ Propagation of LINE:
• DNA is cut at specific site
• LINE RNA with complementary bases binds and DNA grows
complementary to the LINE RNA
o BLAST: finding nucleic acid and protein sequence similarities
• Proteins with similar functions often have similar AA sequences
o DNA content and gene number in different species
• DNA varies more than proteins among species
• Differences in genome sizes are mostly due to different numbers of non-
coding regions and transposable elements
• Greater gene density in lower eukaryotes than in more complex
eukaryotes
o evolutionary homology and sequence similarity, Evolutionary relationships:
• Orthologs: same protein in different species
• Paralogs: closely related proteins in the same species
o gene families
• Related genes formed by the duplication of an original single-copy gene
make up a gene family
o Solitary or single-copy genes: represented once in the genome
o Single-sequence tandem array: DNA fingerprints
o Simple-sequence repeat: used for paternity tests or identification of criminals
because they are unique to individuals
• Microsatellite DNA:
▪ Found in transcription units
▪ Expansion: several neuromuscular diseases (myotonic dystrophy,
spinocerebellar ataxia)
▪ Short repeated sequences can generate backward slippage during
replication
• Minisatellite DNA:
▪ Often in centromeres and telomeres
o Long tandem arrays of repeated sequences: non-coding sequences in
multicellular organisms
o repetitive DNA elements;
o mobile elements
o From Integrated retroviral genomic DNA to retroviral genomic RNA
Chromosomes
• Chromatin loops; histone proteins
• origins, centromeres, telomeres; required for replication and stable inheritance
of linear chromosomes
o ARS: origin of replication of yeast
• If absent: no plasmid replication
o LEU: leucine biosynthesis gene, lets cells carrying the plasmid grow without leucine
o CEN: DNA sequence for chromosome centromere
• Without it, mitotic segregation is faulty
o Yeast chromosome must be linear: add telomeres so a linear plasmid can be maintained
• Centromeres connecting to spindles: CENP-A -> CBF3 -> Ndc80
• telomerase

RR1-RR16

ANALYSIS OF NUCLEIC ACIDS


Biological techniques advancing our analytical capabilities
Qualitative analysis:
o Nature of molecule in question
o Size
o Nucleotide composition
o Conformation/configuration
o Structure

Quantitative analysis:
o Determine the levels of gene products (tumour markers, p53, BRCA 1-2)
• Diagnostics context

Molecular probes:
Labelled oligonucleotides using polynucleotide kinase
o Require a known sequence corresponding to a gene product of interest

o Synthesize an oligonucleotide with the reverse complementary sequence

o Phosphorylate 5' end of the synthetic oligonucleotide's free hydroxyl


• Polynucleotide Kinase
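
Designing the probe sequence is a reverse-complement operation, which can be sketched as follows. The function name is hypothetical and the target sequence in the usage line is made up; the target may be given as DNA or RNA, written 5'->3'.

```python
# Map each target base (DNA or RNA) to the DNA base that pairs with it.
PAIRING = str.maketrans("ACGTU", "TGCAA")


def design_probe(target_5to3):
    """Return a DNA oligonucleotide (5'->3') that base-pairs with the target."""
    complement = target_5to3.upper().translate(PAIRING)
    # Strands pair antiparallel, so reverse to write the probe 5'->3'.
    return complement[::-1]


if __name__ == "__main__":
    print(design_probe("AUGGCC"))
```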

Making labelled DNA probes


o Amplify DNA with PCR
o Incorporate dNTPs carrying radiolabels on alpha-phosphate
o Remove unincorporated radioactive dNTP substrates
*** Radiolabelled PCR product must be single stranded before it is utilized***

Methods of Analysis and Detection


Analysis of DNA
o Electrophoresis in Agarose gel- diagnostics
• Cut DNA with a restriction enzyme
• Run through the gel:
▪ Separate DNA by size
o Transfer to a solid state membrane -> Permanent
• Method:
▪ Southern blot( permanent): DNA
• Denature all DNA in the fragments into single strands
• In alkaline solution
• Transfer to solid membrane
• By capillary action transfer
• From gel to nitrocellulose (or nylon)

• Make permanent record:


• Dry blot or crosslink with UV
• Results in record of:
• Levels: abundance
• Position: size
• Hybridizing the blot
• Probes complementary to sequence of interest
• Probes Watson-Crick base pair with target sequence
• Wash
• Remove non-specific signals
• Complementary sequences are tagged with probes
• Autoradiography or fluorescence
• Visualize target sequence by probes
• Functions:
▪ Distinguish Alleles in the family (always two per gene, one mom,
one dad)
▪ Used for relatedness or diagnostics
▪ Molecular identification of polymorphisms
• Variations in DNA sequences (alleles)
• Method:
• Consider a known DNA sequence
• PCR to amplify a sequence
• Add PCR primer for the sequence of interest
• If checking for abnormalities in restriction site:
• Present restriction enzyme for specific
restriction site
• See how many bands show up in Southern blot
• 1: restriction site isn't being cut ->
mutant allele
• 2: restriction site exhibits normal
function
• Example:
Normally functioning gene: EcoRI cuts a target sequence that contains
the EcoRI restriction site, resulting in two DNA fragments of different
sizes, which can be visualized by gel electrophoresis
Gene sequence with a polymorphism: EcoRI fails to cut the target
sequence at the EcoRI restriction site, suggesting there is a change at
the site and thus a polymorphism in the gene sequence. This results in
one large DNA fragment, which can be visualized by gel electrophoresis
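
The band-counting logic of the example can be sketched as below. This is a simplified model with hypothetical names and made-up sizes: it assumes a single restriction site in the amplified region and that every fragment is detected on the gel.

```python
def expected_bands(amplicon_len, site_pos, site_intact):
    """Fragment sizes (bp) for one allele after digestion."""
    if site_intact:
        # Enzyme cuts once: two fragments.
        return sorted([site_pos, amplicon_len - site_pos], reverse=True)
    # Site lost: the enzyme does not cut, one full-length fragment.
    return [amplicon_len]


def genotype_bands(amplicon_len, site_pos, alleles):
    """Bands seen for an individual; `alleles` holds one bool per allele."""
    bands = set()
    for site_intact in alleles:
        bands.update(expected_bands(amplicon_len, site_pos, site_intact))
    return sorted(bands, reverse=True)


if __name__ == "__main__":
    print(genotype_bands(1000, 300, (True, True)))    # both alleles cut
    print(genotype_bands(1000, 300, (True, False)))   # heterozygote
    print(genotype_bands(1000, 300, (False, False)))  # neither allele cut
```

A heterozygote shows the uncut band plus both cut fragments, which is how the two alleles inherited from each parent can be told apart.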

Analysis of mRNA
Northern blot: mRNA
1. Electrophoresis in Agarose gel- diagnostics
• Must denature before migration and maintain denaturing conditions
during migration, since RNA folds into stable secondary
structures
▪ Heat to eliminate secondary structures
▪ Denature in formaldehyde buffer integrated into the gel
• Run through gel:
▪ Separated by sizes corresponding to the various genes that
encode them
2. Transfer to solid state support

• Place gel in transfer buffer


▪ No need to denature like we do in DNA since the RNA was already
denatured before electrophoresis
• Transfer nucleic acids to solid state membrane
▪ By capillary action: Pulls molecules of gel up to the blot itself
▪ From gel to nitrocellulose (or nylon)
• Make permanent record:
▪ Dry blot or crosslink with UV
▪ Results in record of:
• Levels: abundance
• Position: size
• Hybridizing the blot
▪ Probes complementary to sequence of interest
▪ Probes Watson-Crick base pair with target sequence
• Wash
▪ Remove non-specific signals
▪ Complementary sequences are tagged with probes
• Autoradiography or fluorescence
▪ Visualize target sequence by probes
o Application: detecting RNA isoforms (variants of an RNA). Quantitative: detects
amounts; qualitative: detects types of RNA in a given sample

RT-qPCR: reverse transcriptase coupled with quantitative PCR


1. RT: Reverse transcribe mRNA into cDNA
• Nearly all mRNAs carry a poly A tail
• Prime mRNA with oligo dT primer complementary to the poly A tail
• Synthesize cDNA from dT primer to be complementary to mRNA
2. Conduct qPCR of cDNA
• With two specific sequence primers and cDNA
• Regular PCR mix + intercalating fluorescent dye
▪ Dye only fluoresces when bound to double-stranded DNA
▪ More fluorescence = more double-stranded PCR product
• Determine mRNA levels with qPCR

▪ Phases in PCR
• Ground phase
• Exponential phase
• Linear phase
• Plateau phase
• Greater mRNA in sample = greater starting cDNA = the
plateau is reached sooner
• Fewer cycles
• Lower mRNA in sample = lower starting cDNA = the
plateau is reached later
• More cycles
▪ Compare curve with standard curve of samples of known, varying concentration
• In the standard curve, every sample reaches the exponential phase
▪ Get a relatively accurate value of amount of cDNA in original sample, thus
how much mRNA was present in the specific sample
▪ Good to quantify the regulation of one specific transcript
• Not so much a global view
• More precise
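
Comparison against a standard curve can be sketched as a straight-line fit. This is an illustration under stated assumptions: the "threshold cycle" (the cycle at which fluorescence crosses a fixed level during the exponential phase) is a standard qPCR term that the notes do not name, the function names are hypothetical, and the standard-curve numbers in the usage lines are invented, not measured data.

```python
import math


def fit_standard_curve(quantities, threshold_cycles):
    """Least-squares line: threshold cycle = slope * log10(quantity) + intercept."""
    xs = [math.log10(q) for q in quantities]
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(threshold_cycles) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, threshold_cycles))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


def quantity_from_cycle(threshold_cycle, slope, intercept):
    """Invert the standard curve to estimate starting cDNA in a sample."""
    return 10 ** ((threshold_cycle - intercept) / slope)


if __name__ == "__main__":
    # Hypothetical dilution series: known starting copies and observed cycles.
    slope, intercept = fit_standard_curve([1e2, 1e3, 1e4, 1e5],
                                          [33.0, 29.5, 26.0, 22.5])
    print(slope, intercept)
    print(quantity_from_cycle(27.0, slope, intercept))
```

The slope is negative: each tenfold increase in starting cDNA lowers the number of cycles needed, matching the "more mRNA, fewer cycles" relationship above.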
cDNA libraries: DNA-based representation and abundances of mRNA present in
initial sample
1. Purify RNA from a given sample(mRNA has poly A tail)
2. Prime mRNA with single stranded poly/oligo dT primer
3. Reverse transcribe into cDNA by growing from the dT primer
**Single stranded cDNA strand for every RNA in the given sample**
4. Add alkali to remove RNA from cDNA
5. Add poly dG tail to cDNA 3' end using terminal transferase
6. Hybridize the 3' end of cDNA where the poly dG tail is with oligo-dC primer
7. Grow DNA from oligo-dC primer:
i. Generate a second strand of DNA--> double stranded DNA
**cDNA library gives permanent representation of all RNAs and their
abundances in the original sample**
8. DNA polymerase I progresses through any remaining hybrid regions and extends
the second strand

RNA-seq: RNA sequencing


Next-generation sequencing method coupled with cDNA libraries
1. Purification of RNA from specific tissue
• Affinity chromatography with oligo-dT columns
• Gives a poly A-enriched
• and size-selected fraction
2. Convert poly A enriched fraction into cDNA
• Reverse transcriptase
• RNase treatment
• Second strand synthesis
3. Make a cDNA library
4. Ligate each end of cDNA molecules with linkers(adaptors for NGS)
5. PCR amplification and sequencing
6. Genome alignment and quantification:
• Short reads
• Aligned to represent sequences
• Sequences aligned to represent all the genes in the genome
Gives understanding of most abundant RNAs
Not as accurate as RT-qPCR
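
The quantification step can be sketched as counting aligned reads per gene and normalizing. The normalization shown (reads per kilobase of transcript per million mapped reads, RPKM) is one common convention and is an assumption here, since the notes do not name a specific measure; the function names, gene names and numbers are hypothetical.

```python
from collections import Counter


def count_reads(aligned_genes):
    """aligned_genes: one gene name per read that aligned to that gene."""
    return Counter(aligned_genes)


def rpkm(counts, gene_lengths_bp):
    """Normalize counts for gene length and total sequencing depth."""
    total_reads = sum(counts.values())
    return {
        gene: count * 1e9 / (total_reads * gene_lengths_bp[gene])
        for gene, count in counts.items()
    }


if __name__ == "__main__":
    reads = ["geneA"] * 500 + ["geneB"] * 500
    counts = count_reads(reads)
    print(rpkm(counts, {"geneA": 1000, "geneB": 2000}))
```

With equal read counts, the shorter gene gets the higher value, because a longer transcript yields more fragments per molecule.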

TRANSCRIPTION

Overview of transcription
Conventions/components of template
o Transcription synthesizes pre-mRNA by having RNA polymerase II read a DNA
template strand from 3' to 5' while adding rNTPs to elongate a complementary
RNA strand. This complementary RNA strand is the same as the non-template
DNA strand but with Us instead of Ts, and is referred to as a pre-mRNA as it has
not yet been processed.
o Regulatory regions:
• Promoters:
▪ Can be upstream or downstream
▪ Comprise sequences that regulate efficiency of transcription
• Coding sequence:
▪ Flanked by 5' UTR and 3' UTR
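
The strand convention can be checked with a short sketch: reading the template 3'->5' and pairing each base gives an RNA that matches the non-template strand with U in place of T. The function names and the example sequence are hypothetical.

```python
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3to5):
    """RNA (5'->3') made by reading the template strand 3'->5'."""
    return "".join(RNA_PAIR[b] for b in template_3to5.upper())


def non_template_strand(template_3to5):
    """Non-template (coding) DNA strand, written 5'->3'."""
    return "".join(DNA_PAIR[b] for b in template_3to5.upper())


if __name__ == "__main__":
    template = "TACGGT"
    print(transcribe(template))
    print(non_template_strand(template))
```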
Transcription proteins: RNA polymerase

o RNA polymerase I
o RNA polymerase II
• Must be very faithful and not fall off midway through a 24 hr
transcription
• Reads the template strand that goes 3'-5'
• Advances at a rate of 1000-3000 nt/min
o RNA polymerase III

Stages of Transcription
o Initiation:
• Polymerase binds to the promoter sequence,
• locally denatures the DNA,
• catalyzes the first phosphodiester linkage
o Elongation:
• RNA polymerase advances 3' to 5' down the template strand, denaturing
the DNA and polymerizing the RNA
• Polymerization is favored:
▪ High-energy bond between alpha and beta phosphate is replaced
by a lower-energy phosphodiester bond
o Termination:
Prokaryotic transcription
Function/mechanisms
o Sigma factors
• Confer specificity to RNA polymerase
• Ensure efficient transcription rate
o DNA binding protein
• Regulate the rate of RNA synthesis
▪ Enhance RNA polymerase binding to promoter region
▪ Inhibit/impede RNA polymerase binding to promoter region
o Lots of allosteric regulation controlled by catabolites
▪ Means of adapting
o Polycistronic mRNA:
▪ One mRNA is transcribed from several adjacent genes (an operon)
▪ Several proteins are translated from that single RNA
Transcription proteins: RNA polymerase
• Types:
▪ Polymerase I:
▪ Sensitivity to toxin alpha-amanitin: not affected
▪ Polymerase II
▪ Sensitivity to toxin alpha-amanitin: highly sensitive
▪ Polymerase III
▪ Sensitivity to toxin alpha-amanitin: Slightly sensitive
• 3D structure

Eukaryotic Transcription
Function: Especially important in development
• Distinguish cell types that will differentiate and make us
• Mostly done during embryogenesis
• Monocistronic mode of transcription:
▪ make 1 RNA and result in one protein
Polymerases
• RNA polymerase I
• RNA polymerase II
• RNA Polymerase III
Structure: 3D RNA polymerase
• Exist in multimeric complexes
• Are similar to bacterial subunits

• CTD phosphorylation in vivo


• Puffs: genes in the puff are being actively transcribed

REGULATORY SEQUENCES OF THE RNA POLYMERASE


PROMOTER
Regulation of Transcription
Eukaryotic Transcription Regulation Through Control elements: A
spectrum of different elements that regulate genes from different distances
o Chromatin structure: take part in regulating transcription
• Condensed regions inhibit polymerase and transcription factor binding
• Puffed regions:
o Promoter Region: Determines the site of Transcription initiation for an RNA
polymerase
• Activity dictated by transcription factors binding to control elements
o Transcription factors: Transcription of a gene may be regulated by the binding of
multiple transcription factors to alternative control elements, directing
expression of the same gene in different types of cells at different times during
development
• General factors:
▪ Required for transcription of all genes, participates in formation of
the transcription-preinitiation complex near the transcription
start site
• TATA box: More conserved and only "cis-acting" element
• Function:
▪ Directs transcription at the promoter of
some protein coding genes
• Location:
▪ Upstream sequences around the same
place in regard to transcription start sites of
many genes
• Specific factors:
▪ Stimulate or inhibit transcription of particular genes by binding to their
regulatory sequences
• Activator proteins
• bind to transcription-control regions
• Near transcription start site
• Kilobases from the transcription start site:
Enhancers
• Upstream from the promoter
• Downstream from the promoter
• Function:
• Promote chromatin decondensation
• Promote binding of RNA polymerase to the
promoter
• Promote transcriptional elongation
• Repressor proteins
• Bind to alternative control elements
• Function:
• Cause the condensation of chromatin
• Inhibition of polymerase binding
• Inhibition of transcriptional elongation
o Control elements:
• Characteristics:
▪ Regulation of a gene can be done by multiple transcription control
regions

• A spectrum of different elements can regulate genes from different distances


• ** Notes: many yeast genes have a regulatory element called UAS(upstream
activating sequences) that work like an enhancer
• UAS: interacting with RNA polymerase II to enhance efficiency of RNA
interactions at a specific site
• Types:
▪ Promoter
▪ Enhancer: Cis-acting elements that control tissue-specific or stage-
specific transcription
• Conservation of chromosomal region:
• SALL1 gene responsible for limb development
• Take segment of DNA and drive a reporter gene

• Function:
• Act at a distance, sometimes kilobases away from their regulatory targets
• Chromosomal regions far away on the sequence but close upon
looping
• Loops are often associated with active transcription
• Enhancers may help to generate, stabilize and increase the rate of
transcription within loops, even if they are linearly far apart
• Examples:
▪ 3 promoters for the expression of Pax6 that function in different cell
types and at different times during embryonic development
o Identification of transcriptional regulatory regions
Method 1
Method 2
• Identification of cis-acting regulatory sites through linker scanning mutations
▪ Reporter genes: relative quantification of transcriptional efficacy
• Scramble the sequences of specific overlapping regions of one gene
and see which are likely to have important regulatory sites
• Examples:
• GFP
• β-galactosidase (lacZ)
• Thymidine kinase (tk)
• Luciferase (luc)
• Chloramphenicol acetyltransferase (CAT)

Prokaryotic control elements


o TFIIB
• TATA box: Initiator
• Downstream promoter element
• UAS(in many yeast genes): regulatory element that works like an
enhancer
• Promoter-proximal transcriptional regulatory regions
▪ Recombinant DNA technology for determining promoter-proximal
regulatory regions
• Transformation: bacteria
• Transfection: Mammalian cells

• Transgenics: Live animals/plants


TRANSCRIPTION
RNA polymerase II promoters and general transcription factors
Determining factors of protein synthesis
• Transcription initiation and elongation by RNA polymerase II are the most
regulated steps in gene expression and thus determine when and in which cells
specific proteins are synthesized

Transcription-control regions: multiple proteins binding eukaryotic protein-coding


genes and controlling their expression

Promoter: determine where transcription begins

o TATA box: ~26-31 bp upstream of TSS


• If the base pairs between the TATA box and TSS are deleted, a new TSS is made ~25
bp downstream from TATA box
• Single changes in TATA-box sequence can decrease transcription rate of
adjacent gene
o Initiators: ~2 bp upstream to ~4 bp downstream of TSS
• Sequence:
▪ Naturally occurring sequence: -1(Cytosine) +1(Adenine)
▪ Degenerate sequence: 5'-Y-Y-A+1-N-T/A-Y-Y-Y-3'
• A+1: transcription start site
• Y : pyrimidine (C or T)
• N: any nucleotide base
• T/A : either T or A at +3 position
• Types:
▪ BREs: TFIIB recognition element
• Strongest promoters containing optimal sequence for the
interaction with TFIIB
▪ DPEs: Downstream promoter elements
• Bound by some subunits of TFIID
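
Scanning a sequence for candidate initiator elements can be sketched with a regular expression. This assumes the commonly cited degenerate consensus 5'-Y-Y-A(+1)-N-(T/A)-Y-Y-Y-3' (Y = C or T, N = any base); the function name and test sequence are hypothetical, and a match only flags a candidate, not a confirmed start site.

```python
import re

# Lookahead so that overlapping candidates are all reported.
INITIATOR = re.compile(r"(?=([CT][CT]A[ACGT][AT][CT][CT][CT]))")


def find_initiators(seq):
    """Return (index of the A at +1, matched sequence) for each candidate.

    Indices are 0-based positions in `seq`.
    """
    hits = []
    for match in INITIATOR.finditer(seq.upper()):
        # The A at +1 is the third base of the 8-base match.
        hits.append((match.start(1) + 2, match.group(1)))
    return hits


if __name__ == "__main__":
    print(find_initiators("GGGCCATTCCCGG"))
```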
o CpG islands: promoter (or initiation) regions with a low rate of transcription that
drive housekeeping genes *Note: p = phosphate bond between C and G
• Characteristics:
▪ Regions of 100-1000 bp with high frequency of CG sequences
• 5'-CG-3' is normally rare in mammals,
• Most C followed by G are normally methylated at position 5
of the pyrimidine ring
• For recognition of parent strand (with
methylated C) from daughter strand
(without methylated C) by DNA repair
mechanism
• Spontaneous deamination of 5-methyl C makes
thymidine
• Evolutionarily lead to the conversion of CG
to TG
• but CpG islands have lots in a concentrated area (several
times within a few tens of bp)
• In CpG island promoters, G is as likely as any other
nucleotide to follow C. Cs in CpG islands are
unmethylated, so when they deaminate, C turns to
U which is recognized by DNA repair enzymes and
turned back into C
• CG-rich sequences don’t wrap too well
around histones since they require more
energy to bend the DNA into the small-
diameter loops around the histone octamer
to make a nucleosome
• General transcription factors can
more easily bind to DNA since it is
not inhibited by DNA interactions
with histone octamers
▪ Transcription begins at any of the several alternative sites within
CpG island
▪ For transcription of ~70% of genes
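
How enriched a region is for CG dinucleotides can be sketched with a simple ratio. The observed/expected calculation below (CG count times length, divided by C count times G count) is a widely used convention that the notes do not state, so treat it as an assumption; the function name and test sequences are hypothetical.

```python
def cpg_stats(seq):
    """Return (GC content, observed/expected CpG ratio) for a sequence.

    Expected CpG count assumes C and G are placed independently:
    expected = (number of C * number of G) / length.
    """
    s = seq.upper()
    n = len(s)
    c, g = s.count("C"), s.count("G")
    cpg = s.count("CG")
    gc_content = (c + g) / n if n else 0.0
    obs_exp = (cpg * n) / (c * g) if c and g else 0.0
    return gc_content, obs_exp


if __name__ == "__main__":
    print(cpg_stats("CGCGCGCG"))
    print(cpg_stats("CCGG"))
```

In most of a mammalian genome the ratio sits well below 1 because methylated CG is gradually lost to TG; within a CpG island it is close to what base composition alone would predict.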
• Divergent Transcription
▪ Function:
• Transcription from CpG islands can be initiated in both
directions
• Transcription of sense strand
• Results in mRNA
• Transcription in the antisense direction: stops 1-3 kb from
the start site
• Does not result in mRNA since RNA
processing destroys the RNA transcribed in
the wrong direction
▪ Discovery:
• RNA polymerase II clamp domain makes the elongation
complex stable when RNA-DNA hybrid is bound near the
active site
▪ Mechanism:
• Isolate nuclei from cultured human fibroblasts
• Remove RNA polymerase that isn't participating in
elongation by incubating nuclei in buffer containing salt and
mild detergent
• Elongating RNA polymerases aren't removed as they
are tightly bound to DNA
• Add NTPs, replacing UTP with bromo-UTP (bromine atom at position 5
of the pyrimidine ring)
• Incubate nuclei at 30 degrees C so that the RNA
polymerase II molecules that were still elongating when nuclei were
isolated resume transcription and incorporate bromo-U
• Isolate RNA, immunoprecipitate RNA containing bromo-U
using antibody for BrU-labeled RNA
• Massive parallel DNA sequencing of reverse
transcripts(cDNA)
• Mapping on human genome
• Result:
• Sense transcripts: polymerase II pauses in the +50
to +200 region before continuing elongation
• Antisense transcripts: polymerase II pauses at -250
to -500 relative to the major transcription start site
▪ Directionality:
• Strong promoters: Strong TATA box or initiator sequence
makes polymerase II transcribe in the sense direction
• Weak promoter: general transcription factors and RNA
polymerase II associate with promoters in both directions
• Half of polymerases transcribe in one direction, the
other half in the other
• Absence of promoter elements like BRE, TATA box,
Inr, DPE makes a weak promoter and leads to
divergent transcription since cues from the DNA
sequence aren't present to correctly orient the
preinitiation complex
▪ Chromatin Immunoprecipitation
• Determining multiple binding sites of specific proteins
along the entire genome within a resolution of 300 bp

• Method:
• Cross-linking proteins to other proteins or DNA in living cells by adding
formaldehyde to the media
• Isolate and fragment cross-linked chromatin into lengths of 2-3
nucleosomes(~300 bp)
• Immunoprecipitation of fragmented DNA using antibody specific to a
target protein
• DATA: number of times per million sequenced bases that a specific
sequence from a region of the genome was immunoprecipitated and identified
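
The per-million normalization of the sequencing data can be sketched as follows. This is a simplified illustration with hypothetical names and invented counts: it scales raw counts per region by the total number of mapped reads so that experiments of different sequencing depth can be compared.

```python
def reads_per_million(region_counts, total_mapped_reads):
    """Scale raw counts per region to counts per million mapped reads."""
    scale = 1_000_000 / total_mapped_reads
    return {region: count * scale for region, count in region_counts.items()}


def enriched_regions(normalized, threshold):
    """Regions whose normalized signal is at or above a chosen threshold."""
    return sorted(r for r, value in normalized.items() if value >= threshold)


if __name__ == "__main__":
    raw = {"promoter": 50, "gene_body": 5}
    signal = reads_per_million(raw, 2_000_000)
    print(signal)
    print(enriched_regions(signal, 10.0))
```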
Other transcription-control elements located near transcription start sites
Enhancers: located far from the gene they regulate
o Regulate cell-type specific transcription and how frequently specific genes are
transcribed

Transcription at DNA sequences corresponding to the 5' cap of


mRNA
• The capped nucleotide of eukaryotic mRNAs coincides with the transcription start
site

Preinitiation complex(PIC)
General transcription Factors
TFIID
o First protein to bind to a TATA box promoter
o Subunits:
• 1 TBP:

Highly conserved TATA-box binding protein that forms a saddle-like
structure through folding of the C-terminal domains, exhibiting dyad
symmetry and bending the DNA helix by binding in its minor
groove
▪ In CpG islands, where there is no TATA box, TBP binds to regions between transcription
start sites in the CpG island promoter
• 13 TAF:
TBP-associated factors initiate transcription from promoters that lack a
TATA box by binding to initiators and/or DPE promoter elements
TFIIA-TFIIB complex
TFIIA
• Heterodimer larger than TBP
• Associates with TBP and DNA upstream of TBP-TATA box complex
TFIIB
• Monomeric protein slightly smaller than TBP
• C-terminal domain of TFIIB clamps to C-terminal stirrup of saddle-shaped TBP
• Contact major groove of DNA on either side of TATA box
Polymerase II- TFIIF
TFIIF
• Heterodimer
• Formation of core PIC (preinitiation complex): associates with promoter
DNA-TFIIA-TFIIB complex
Polymerase II
• The extended N terminal domain inserted into RNA exit channel of RNA
polymerase II to stabilize the complex and help hold the DNA at the TSS(over the
cleft between RPB1 and RPB2 when the clamp is open)
TFIIE
o Heterodimer of two different subunits
o Bind next to TFIIF
o Completely enclose template DNA's TSS in protein channel
o Has docking site for TFIIH
TFIIH

o Multisubunit(10) factor of similar size to polymerase II


• Mutations: Defects in repair of DNA damaged by alkylation
▪ Base with covalently linked mutagen
▪ UV-induced thymidine dimer
• Results:
▪ Xeroderma pigmentosum
▪ Cockayne syndrome
o Closes PIC: completes the transcription preinitiation complex when it binds to it
o Helicase activity: Melt DNA
• ATP hydrolysis to unwind DNA duplex at the start site
• Polymerase II forms an open complex: DNA duplex around start site is
melted
• Template strand is now bound to polymerase II active site
o Contributes to transcription coupled DNA repair
• Heavily transcribed regions are repaired more efficiently since TFIIH is
present
• Polymerase stalls at damaged DNA, core TFIIH with helicases but no TFIIH
kinase recognizes stalled polymerase, associates with other proteins and
begin repair of damaged DNA
TFIIH kinase
o Three TFIIH subunits
o Phosphorylates the polymerase II CTD multiple times, on the fifth residue (Ser5) of
this repeated CTD segment: Tyr-Ser-Pro-Thr-Ser-Pro-Ser
• CTD phosphorylated at Ser5 becomes a docking site for enzymes forming
the cap structure on the 5' end of an RNA molecule transcribed by RNA
polymerase II

Initiated Transcription Complex


Initiations
General transcription factors fall off
Elongation factors
Paused RNA polymerase complex: inhibition of elongation
NELF
• Negative elongation factor
• Inhibits transcription elongation by:
▪ blocking the principal channel through which NTPs reach the active site
of the enzyme
▪ Inhibiting conformational changes in the RPB1 and RPB2 subunits of
polymerase II required for the translocation of the enzyme down the
template
• Binds to polymerase II with DSIF(the other elongation factor)
DSIF
• DRB sensitivity-inducing factor
In between
P-TEFb(Heterodimeric protein kinase CDK9-cyclin T) phosphorylates NELF,
DSIF and serine 2 of the polymerase II CTD
• Phosphorylated NELF dissociates from the complex so two additional elongation
factors can bind
• Phosphorylated DSIF binds to RPB1 clamp and RPB2 on the other side of the cleft
between RPB1 and RPB2
Active RNA polymerase complex: activation of elongation
PAF and SPT6
• Block binding of NELF allowing polymerase to continue elongating
DSIF
• Helps hold the clamp so polymerase can transcribe longer distances without
dissociating from the template
Example
Transcription of HIV(human immunodeficiency virus)
• Tat viral protein
▪ Sequence specific RNA-binding protein that acts as an Anti-pausing factor
▪ Allows RNA polymerase II to read through the transcriptional blocks
caused by NELF binding
▪ Bind to RNA copy of TAR sequence: forms stem-loop structure near the 5'
end of the HIV transcript
• TAR binds Cyclin T holding cyclin T-CDK9 complex close to the
polymerase
• Phosphorylation of substrates leads to release of NELF and
transcription elongation
▪ Results:
• Transcription of 30% of mammalian genes is regulated by the
control of cyclin T-CDK9 (P-TEFb) activity
• Due to interaction between P-TEFb and sequence specific
DNA-binding transcription factors
• Not due to RNA-binding protein as in HIV Tat

Separation of general transcription factors


• Separation of general transcription factors using liquid chromatography
o Labelled RNA products that were synthesized in the in vitro run-off
transcription reaction can be separated and quantified by acrylamide gel
electrophoresis followed by autoradiography or another means of
detection

ACTIVATORS - PROTEINS THAT REGULATE


TRANSCRIPTION
Regulation of Eukaryotic genes by transcriptional control
elements
TATA box
Transcription factors work together at the TATA box to regulate transcription in a
more efficient unidirectional manner compared to the divergent way of CpG
islands
CpG island
Transcription is divergent, can go in the correct or incorrect directions, a
characteristic that can be dulled down when TFIID helps RNA polymerase

Finding cis-acting regulator sites


Linker scanning mutations
• Identifying subregions of the promoter required for the activation of
transcription

DNA-binding activity
Identify when DNA binding occurs
Electrophoretic mobility shift assays (EMSA): identify DNA-binding activity (but
not the specific sequence that is bound by the proteins). Depend on the migration of
DNA through a gel and how it is affected by binding to proteins

o Probe:
• Radiolabelled dsDNA segment
▪ 5' end labelling with oligonucleotides corresponding to cis-acting
elements
Or knowing double stranded sequence,
• Label double stranded molecule formed in PCR = double stranded molecules
o Forming Protein:DNA complex
• Proteins interacting with sequence specific DNA
o Run mixture of protein and free DNA probes through non-denaturing
polyacrylamide gel (Positive control)

• Running free DNA with full nuclear extract proteins it may or may not
recognize
•Free probes go a given distance, but shift is different when it is carried
with a protein complex. Conformational change dependent on
interactions with proteins
o Chromatography to separate nuclear extract proteins into fraction
• Fractions have different compositions of proteins depending on the
nature of the chromatography
• Run the gel for each fraction of nuclear extract proteins
o Note: synthesis of oligonucleotides bound together to make double stranded
substrate
o Evaluate whether a protein can interact with a given DNA segment
• Test for the presence of a nuclear protein by taking a small volume and
mixing with DNA probe, then the associated protein should give rise to a
shift on the gel
• Where there are bands on the gel, a protein complex present in fraction
1, 7 and 8 are bound to a DNA probe
Cotransfection:
Test DNA binding transcription factors
When cotransfected, there should be strong expression of the genes driven by the promoter
Mutate the sequences so that the expressed protein no longer interacts properly
Function of the promoter is dependent on these factors

Recognizing specific DNA sequence motifs with Transcription factors


Transcription factors:
o Activators

Function
o Recognition Helix: Alpha-helix domain
• Recognize specific DNA bases within that DNA region
• Due to + AA in the region, interactions are favored through associations
with electronegative phosphate
o Non-covalent binding
o Interaction with the major groove of DNA
o Structural characteristics

Modular structure
o Most transcription factors have multiple domains that each perform distinct
functions
• Example: GAL4 transcription factor from yeast( critical for utilizing
Galactose)

Transcriptional activation depends on binding to UAS and beta-galactosidase


• Binding of UAS isn't enough to initiate transcription
GAL4(with a clear DNA binding domain and transcriptional control activation)
has a DNA binding domain that interacts with UAS which activates transcription
• Contains activation domain to stimulate transcription
• UAS can be integrated into reporter gene up stream of the TATA box
Reporter gene construct is a proxy to transcriptional …
NOTE: DNA binding domain and transcriptional activation domains are independent

Transcription factors are almost always associated with another domain which directly
impacts transcription
o Domains for:

• DNA binding
• Transcriptional activation
▪ Do not yet understand their structure
• Relatively unstructured
• Not sure how function is caused by structure
• Transcriptional repression
• Chromatin remodelling
• Nuclear import
• Protein interactions

Protein Motifs
o Homeodomain proteins: present in many transcription factors

• Confer positional orientation


• Initially described because they were mutated in flies
• Normally define where specific organs/parts grow; when mutated, parts grow in the wrong place
• When there is a mutation at a particular residue, transcription factors with
homeodomains will give rise to homeotic transformations
• Homeobox genes
o Zinc finger DNA binding domain transcription factors

• Function
▪ Domains can interact with DNA
• Types
▪ C2H2 types:
• Usually contain three or more finger units and bind to
DNA as monomers
▪ C4 types: 4 cysteines coordinated with zinc ions to give rise to
fingers
• Usually contain only two finger units and bind to DNA as
homo/heterodimers (Steroid receptors)
• Typical of glucocorticoid hormone receptors
▪ C6: 6 cysteines
• Zinc finger transcription factor: variation wherein six
cysteine metal ligands coordinately bind two Zn2+ ions

o Leucine zipper proteins


• Function:
▪ Bind DNA exclusively as homo/heterodimers
▪ Zipper proteins' extended alpha-helices bind to the major groove of DNA
• Characteristics:
▪ Hydrophobic region along one face of helix
▪ Leucine or other hydrophobic AA at every 7th position in the C terminal
region of the DNA domain… bZIP protein
• Doesn't need to be leucine, just able to form hydrophobic interactions
with different groups
▪ Hydrophobic residues form a coiled coil domain, which is required for
dimerization
• Transcription factors interact through hydrophobic interactions:
leucine on one interface interact with those on another interface
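
The "every 7th position" rule can be checked on a sequence with a short Python sketch. The helix sequences below are hypothetical toy examples, not real bZIP proteins, and the set of residues counted as hydrophobic is an assumption made for illustration.

```python
# Residues treated as hydrophobic here (an assumption for this sketch).
HYDROPHOBIC = set("LIVMFA")


def heptad_positions(sequence, start=0):
    """Return the residues found at every 7th position, beginning at `start`."""
    return [sequence[i] for i in range(start, len(sequence), 7)]


def looks_like_zipper(sequence, start=0, min_repeats=4):
    """True if every 7th residue is hydrophobic over at least `min_repeats` heptads."""
    residues = heptad_positions(sequence, start)
    return len(residues) >= min_repeats and all(r in HYDROPHOBIC for r in residues)


# Hypothetical helices: the first has a leucine at positions 1, 8, 15 and 22.
toy_helix = "LKDEAEK" + "LRSENAQ" + "LEKKVQE" + "LTRENSK"
control_helix = "AKDEAEK" + "GRSENAQ" + "SEKKVQE" + "DTRENSK"

print(heptad_positions(toy_helix))       # residues at every 7th position
print(looks_like_zipper(toy_helix))      # heptad pattern present
print(looks_like_zipper(control_helix))  # heptad pattern absent
```

A real analysis would also consider which face of the helix the residues fall on; this only tests the spacing.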
o Helix-loop-helix proteins (HLH)
• Look similar to leucine zippers, but with a little kink(loop) altering helix
▪ a:leucine zipper
▪ b: helix-loop-helix
• Contains hydrophobic AA spaced at intervals characteristic of an amphipathic
alpha-helix in the C-terminal region of the DNA binding domain
o Cooperative DNA binding

• Protein-protein interactions favour the formation and stability of the
ternary complex
▪ When they come together, binding is synergistic and cooperative:
the factors reinforce each other to enhance transcription
▪ Act cooperatively
▪ Greatly increases the diversity of the transcriptional output
• Combinatorial possibilities extend the potential for diversified gene regulation
▪ Combo of transcription factor binding sites next to each other in
promoters leads to a diversity of transcriptional
response(outputs)
• Becomes more complex when repressors are factored in
▪ Homo- and heterodimer formation is common among
transcription factors
• Three transcription factors that can homo- and heterodimerize -> 6
different possible combinations (3 homodimers + 3 heterodimers)
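
The count of six can be reproduced by enumerating unordered pairs with repetition. This is a minimal sketch; the factor names A, B and C are placeholders, and it assumes every pair is able to dimerize.

```python
from itertools import combinations_with_replacement


def possible_dimers(factors):
    """List every unordered pair of factors, including a factor paired with itself."""
    return list(combinations_with_replacement(factors, 2))


factors = ["A", "B", "C"]  # placeholder names for three transcription factors
dimers = possible_dimers(factors)

for first, second in dimers:
    kind = "homodimer" if first == second else "heterodimer"
    print(f"{first}-{second}: {kind}")

# For n factors the total is n * (n + 1) / 2.
print(len(dimers))
```

With four factors the same function gives ten dimers, which shows how quickly the regulatory possibilities grow.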

• Transcription factor-DNA binding


o CHIP-Chromatin Immunoprecipitation (CHIP-seq): figure out the
sequence
• Pull down all DNA genomic sequences that are occupied by that
DNA binding transcription factor and then you sequence them all
and use computer to analyze what the sequence might look like
• Crosslink macromolecules
• Shear the DNA into small fragments
• Immunoprecipitated with an antibody
• Next-generation sequencing (NGS) of bound DNA
• DNA will correspond to genes that are bound to transcription
factors
• DNA binding elements can be determined
o CDX2 is a non-Hox homeobox transcription factor
o Do chip seq with antibody against CDX2
o Identify peaks of protein or gene it is interacting with by DNA binding
activity
o Binding DNA of these genes in all these segments
o Result:
• Reasonable sequencing motif recognized by CDX2 called Hoxc
o Don’t have to do Chip seq: use protein to carry out just chip
immunoprecipitation, use primers to carry out PCR to see if that DNA
segment was present in that chip that you did.
• Result: identify occupancy of given transcription factor at a given
site
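
The "use computer to analyze what the sequence might look like" step can be sketched as building a consensus from aligned bound sites. The sites below are invented toy sequences, not real CDX2 ChIP-seq data, and real motif discovery uses more sophisticated statistics than a simple majority vote.

```python
from collections import Counter


def consensus(sites):
    """Return the most common base at each position of equally long, aligned sites."""
    length = len(sites[0])
    if any(len(site) != length for site in sites):
        raise ValueError("all sites must have the same length")
    result = []
    for column in zip(*sites):  # one column per position in the alignment
        base, _count = Counter(column).most_common(1)[0]
        result.append(base)
    return "".join(result)


# Hypothetical aligned sequences recovered under ChIP-seq peaks.
toy_sites = ["TTTATG", "TTTATG", "TTTACG", "ATTATG", "TTTATA"]
print(consensus(toy_sites))
```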
• Mediators
o BIG multi subunit protein complex interacts with DNA binding
transcription factors via specific subunits that bring RNA polymerase II
and the general transcription factors
o One per DNA-bound activators
o Function:
• Bridge vast sections of chromatin to enhance transcriptional
initiation
• Mediated through associations between various transcription
factor activation domains and specific mediator subunits
• Role is consistent with the topological hierarchy observed in
transcriptionally active looped-out chromatin
• Mediates the effects of enhancer elements and their binding
proteins on the basal/general transcription machinery (RNA Pol II)

The Mediator

Discovery
Researchers were wondering how DNA binding transcription factors could have
an effect on transcription with their intrinsically disordered, unstructured
activation domains. They knew that these DNA binding
transcription factors would bind to enhancers or proximal-promoter elements,
but weren't sure how this affected transcriptional efficiency.
The mediator was first discovered in yeast, but a homolog was found in humans
Function
Mediators are behind the efficiency of transcription when DNA binding
transcription factors bind to enhancers and proximal promoter elements. This
binding takes place through interactions between the DNA binding domain and
cis acting elements. It is important to note that the mediator complex is found in
a larger complex containing RNA polymerase which is referred to as a
holoenzyme.
The mediator complex has 31 sub units but can be separated into 3 major
domains, the middle, head and tail.

Middle and Head domain


The middle and head domains of the mediator are actually flexible in relation to
each other and can also undergo conformational changes to favor interaction
with RNA polymerase at a specific interface.

Tail domain
The tail region of the mediator interacts with transcription activation domain of
transcription activators.

Subunits of the mediator interact directly with DNA binding transcriptional
activators, which bind to DNA regions through the cis acting elements, and with
RNA pol II. The subunits are independent in the sense that a mutation in one doesn't
directly affect the rest of the mediator or overall transcription, but it might
affect or disable a specific transcriptional activation, whether through the binding of
transcription factors to promoter-proximal elements or to enhancers.

Head domain
The three subunits of the mediator are associated with various transcription
factor activation domains which help mediate the complex's function. Through
these interactions, the mediator complex bridges vast sections of chromatin to
enhance transcriptional initiation and also ensures that RNA polymerase 2 binds
optimally to initiate transcription.
The looping of chromatin allows regions normally far from each other in its linear
form to be closer together. The mediator will mediate the effects of enhancers
and their binding proteins on RNA polymerase II.

Transcriptional Activation/Initiation
Structure of DNA in transcription
We explain transcription in a linear manner to simplify it, but this is not the form in
which it takes place.
Highly transcribed Genes
What is the cause for more highly transcribed genes?
This question can be answered by looking into the dynamics of transcriptional
initiation and elongation

Methods of analysis of RNA levels


Steady state: RT-qPCR, RNA-seq and northern blotting give us the steady state level of
RNA, that is, the balance between how much RNA is being made and how much is
being destroyed. Evaluating the steady state of RNA doesn't tell us which
genes are being transcribed or at what efficiency
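
Why the steady state hides transcription efficiency can be shown with a simple model that is not part of the lecture: assume RNA is made at a constant rate and destroyed in proportion to its amount, so the steady state level is synthesis rate divided by degradation rate. The rates below are hypothetical numbers chosen for illustration.

```python
def steady_state_rna(synthesis_rate, degradation_rate):
    """Steady state level for dR/dt = synthesis_rate - degradation_rate * R."""
    if degradation_rate <= 0:
        raise ValueError("degradation_rate must be positive")
    return synthesis_rate / degradation_rate


# Gene A: transcribed quickly, RNA destroyed quickly (hypothetical rates).
gene_a = steady_state_rna(synthesis_rate=40.0, degradation_rate=0.5)
# Gene B: transcribed four times more slowly, RNA four times more stable.
gene_b = steady_state_rna(synthesis_rate=10.0, degradation_rate=0.125)

print(gene_a, gene_b)  # same steady state level despite different transcription rates
```

Both genes would give the same band on a northern blot, which is why a direct readout of transcription is needed.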

Transcription efficiency: identify the particular RNA and how much is being
produced by introducing a secondary structure such as a stem loop in the 5'
region of a trans gene X such that a protein tagged with GFP can recognize
and bind to it.
When gene X is transcribed and makes RNA, the stem loop structure will fold
up into its proper configuration and the GFP tagged proteins should be able to
bind to all the stem loops of all transcribed RNAs
Embryonic development in the fly (Drosophila)
Gastrulation: cells change in shape and this results in morphological changes
in the embryo

Transcription mechanism and Bursts


Mechanism
We cannot see it in these images, but each of the illustrations above shows how RNA
polymerase lights up when actively transcribing. Rather than remaining on and
turning off when transcription is over, it turns on and off in bursts. Transcription
is not a steady flux; it happens in waves.
All components of transcription are put together and the first little bit of RNA that
is produced increases the RNA polymerase II activity until the quantity peaks and
electrostatic interactions cause the dissociation of the condensate. The cycle
repeats as transcription isn't totally complete yet.

Transcriptional efficiency and frequency of burst

The relation between the effectiveness of the enhancer and the efficiency of
transcription is in how stronger enhancers lead to a greater frequency of bursts.
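
A back-of-the-envelope sketch of this relation, under the assumption that each burst produces the same number of transcripts and only the burst frequency changes with enhancer strength. All numbers are hypothetical.

```python
def mean_transcripts(bursts_per_hour, transcripts_per_burst, hours=1.0):
    """Average number of transcripts made in a time window under a simple burst model."""
    return bursts_per_hour * transcripts_per_burst * hours


# Hypothetical enhancers that differ only in how often they trigger a burst.
weak_enhancer = mean_transcripts(bursts_per_hour=2, transcripts_per_burst=10)
strong_enhancer = mean_transcripts(bursts_per_hour=8, transcripts_per_burst=10)

print(weak_enhancer, strong_enhancer)
```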

P granules- liquid-liquid condensate


Discovery
Tony Hyman and Cliff Brangwynne were studying embryonic determinants at the
very first stage of cell division in the development of C. elegans embryos, looking
at the P granules, which appeared to be migrating to the posterior end of the cell
that was dividing. We know that the posterior of dividing cells will later become
germ cells, so this mechanism seemed significant. Cliff and Tony realized these
granules were actually liquid-liquid condensates, droplets that don't mix within
their medium.

Mechanism

The droplets aren't actually just present in the posterior end; that is just where
they condense. In the anterior end, the droplet components are soluble, and this is
why they are not seen.

Formation of condensates
The MED1 subunit of the mediator has an intrinsically disordered region (IDR) that
contributes to its function. Researchers inserted a variant of MED1 with its
intrinsically disordered region, tagged with mCherry, into a cell. These MED1
subunits form puncta (aggregates) made of thousands to millions of
copies of MED1, which don't allow for the penetration of mEGFP. However,
when a bromodomain activator with an intrinsically disordered region,
called BRD4-IDR, tagged with mEGFP is added to the MED1-IDR tagged with
mCherry, an overlap can be seen. This overlap is due to co-localization, since
both MED1 and BRD4 have sticky intrinsically disordered domains.
Elsewhere, they would be soluble, but when they begin to aggregate
together due to their IDRs, they attract more and form larger and larger
liquid-liquid condensates

The formation of these condensates is largely dependent on the presence of
certain macromolecules, mainly DNA, RNA and protein, of which RNA's
electrostatic charge is of significant importance. The valency of the
components is also a determining factor in condensate formation:
electrostatic interactions, post-translational modifications like
phosphorylation, and intrinsically disordered proteins all bring
proteins and nucleic acids together to form the condensate.

Structure
Proteins that would normally come together congregate in these liquid-liquid
condensates so that they can preserve their function.

Dynamic Kissing model

Labelling the mediator and RNA polymerase II, the two would at times co-localize,
suggesting the activation of transcription of active genes, but not all the time.
Proteins forming in loops requiring mediator and RNA polymerase II activity to
activate the associated genes would form condensates along with other
macromolecules. Condensates form around these protein loops where all macromolecules
can concentrate, carry out their functions and dissociate, which illustrates how
mediator and RNA polymerase II separate. This is the basis of BURSTS.

Establish condensate based on mediator and trans acting factors that loop out the
chromatin, concentrating together to bring out RNA polymerase II and general
transcription factors until TFIIH melts DNA and fires RNA polymerase II.
Destruction of condensates

Generate a liquid-liquid condensate containing everything required for the
transcription reaction: DNA, general transcription factors, mediator, and RNA
polymerase II. RNA polymerase II generates RNA, which at first helps stick everything
together; once enough has been generated, electrostatic repulsion from the
negatively charged RNA causes the dissociation of the condensate

CHROMATIN, EPIGENETICS AND THE HISTONE CODE


Epigenetics
Epigenetics relates to the phenotype, that is, the way in which DNA sequences are read.
Epigenetic changes are not genetic changes, that is, changes in DNA sequence, but rather
alterations in the way sequences are read. All in all, epigenetics is
the study of heritable changes in the phenotype of a cell that occur without changing the DNA
sequence at all.
Epigenetic traits
Epigenetic traits are transmitted independently of the DNA sequence itself, that is,
they don't affect the sequence, but how each part of the sequence is read. Cells
must maintain their methylated status for these traits to be recognized.
Examples:
• Inactive X chromosome in females
• Developmental restrictions:
Effectuate epigenetic changes so structures or cells that would
differentiate to form those structures, aren't expressed. Wherein flies
don’t grow legs out of their antennae
• Imprinting via DNA methylation
The marking of DNA in a sex-dependent manner resulting in the
differential expression of a gene depending on its parent of origin. A
complex containing mSin3 will recognize cytosine methylation of DNA
mark

Epigenetic marks are reversible chemical modifications to DNA or histones that allow
genes to be expressed in different ways. These marks, like H3K4 or H3K9 methylation,
can activate or repress the expression of a given gene and are often inherited
following cell division.

Epigenetic Writers are enzymes, such as histone methyltransferase, that
introduce epigenetic marks or chemical modifications to histones

Epigenetic Readers are specialized proteins, such as complexes containing
mSin3, that can recognize epigenetic marks or modifications and the type of
proteins they are. They also ensure that every daughter cell contains the
appropriate modifications to allow them to fulfill their designated functions.

Epigenetic Erasers are what makes epigenetic marks reversible, that is, not
permanent

Assembly of DNA chromosomes


Histone modifications- Post-translational modifications
Histone Function
Histone proteins bind to and wind up DNA into higher order structures called
nucleosomes. These nucleosome structures are essential in packing the genomic DNA
into the nucleus in the form of chromosomes. Three types of histone proteins are of
importance in DNA winding: H2 (H2A and H2B), H3 and H4.
Modification Techniques
Histone tails extend in random coils from the chromatin fiber and can be modified by
acetylation, hypoacetylation, phosphorylation, methylation and ubiquitination to
regulate chromatin-based processes by affecting the binding of different protein
complexes
*It is important to note that modifications of one histone on one residue will not
necessarily have the same effect as the same modification of the same histone on
another residue. Each event is independent. Also note that a residue can undergo
different modification, that is, it can be both methylated and acetylated.
Acetylation
Histone acetylases and histone deacetylases compete, and the one with
the greatest activity decides whether or not the histone is acetylated. Acetylation usually
gives rise to the activation of transcription, as it neutralizes the interactions between
histone tails and the DNA backbone that lead to compaction. It occurs when DNA bound
activators transiently bind histone acetylase complexes. Generally, the acetylation of lysine
residues at specific points on histone tails acts to open up, or loosen, the chromatin.
Acetylated histone tails loosen their grip on the backbone of the DNA, largely
because acetylation neutralizes the electrostatic interaction between the positively
charged histone tails and the negatively charged phosphate backbone of the DNA. To
summarize, the acetylation of lysines in those histone tails is associated with an
opening up of the chromatin, and by virtue of that, an increase in the transcription
efficiency around those genes.
*Acetylation of histone lysines has a fast turnover rate.
Acetylation
H3, H4, H2A, H2B
Deacetylation
When repressors transiently bind histone deacetylase complexes.

Methylation
The lysine residue of a histone is methylated (Me1, Me2 or Me3) at the nitrogen atom
of the terminal epsilon group of its side chain. Whether mono-, di- or tri-methylated, the
lysine will hold a single positive charge. Methylation of a histone's lysine residue has a
slower turnover, which makes it better suited as a post-translational
modification for propagating epigenetic information.
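
The charge difference between acetylation and methylation can be illustrated with a small Python sketch. It uses the first ten residues of the histone H3 tail (ARTKQTARKS, so lysines sit at positions 4 and 9) and a deliberately crude charge model that is an assumption of this sketch: +1 per lysine or arginine, -1 per aspartate or glutamate, acetylated lysine counted as neutral, and the free N-terminus ignored.

```python
H3_TAIL = "ARTKQTARKS"  # residues 1-10 of histone H3; K4 and K9 are lysines


def tail_charge(sequence, modifications=None):
    """Approximate net charge of a tail; `modifications` maps position -> 'ac', 'me1', 'me2' or 'me3'."""
    modifications = modifications or {}
    charge = 0
    for position, residue in enumerate(sequence, start=1):
        if residue == "K" and modifications.get(position) == "ac":
            continue  # acetylation neutralizes the lysine's positive charge
        if residue in "KR":
            charge += 1  # methylated lysine keeps its single positive charge
        elif residue in "DE":
            charge -= 1
    return charge


print(tail_charge(H3_TAIL))              # unmodified tail
print(tail_charge(H3_TAIL, {9: "ac"}))   # H3K9 acetylated: one charge lost
print(tail_charge(H3_TAIL, {9: "me3"}))  # H3K9 tri-methylated: charge unchanged
```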
Methylation
H3K4 (Histone 3 lysine 4)
- Mono-Methylation in enhancer -> Activation
- Di
- Tri-Methylation in promoter region -> Activation

H3K9/K27 (Histone 3 lysine 9 or 27) More toward C terminal


- Mono-Methylation
- Di-Methylation -> Repression
- Tri-Methylation -> Repression

H3K36 (Histone 3 lysine 36) more toward C terminal


- Methylation in the transcribed region -> Activation
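
The marks listed above can be kept in a small lookup table for revision. This sketch only encodes what these notes state; the function name and the fallback message are made up for illustration, and entries the notes leave blank (for example H3K4 di-methylation) are deliberately not filled in.

```python
# (residue, methylation state) -> effect, as listed in these notes only.
HISTONE_CODE = {
    ("H3K4", "me1"): "activation (enhancer)",
    ("H3K4", "me3"): "activation (promoter region)",
    ("H3K9", "me2"): "repression",
    ("H3K9", "me3"): "repression",
    ("H3K27", "me2"): "repression",
    ("H3K27", "me3"): "repression",
    ("H3K36", "me"): "activation (transcribed region)",
}


def interpret(residue, state):
    """Look up the effect of a histone methylation mark, if the notes list one."""
    return HISTONE_CODE.get((residue, state), "not listed in these notes")


print(interpret("H3K4", "me3"))
print(interpret("H3K9", "me3"))
print(interpret("H3K4", "me2"))
```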

Demethylation
The methyl group(s) on a lysine residue can be removed using histone lysine
demethylase

Heterochromatin- Transcription Repression


Structure
• Condensed form of chromatin localizing at the nuclear envelope and near the
nuclear pore
• Some regions of the genome are always heterochromatin
• Some regions shift between heterochromatin to Euchromatin
Function
• Heterochromatin is transcriptionally inactive, in a compact form that keeps DNA
from being transcribed when transcription could be detrimental to the
cell/organism
Types of Transcription Repressions
Modification Site of modification
Hypoacetylated lysine
Methylated lysine

Ubiquitinylated lysine
Example: proper methylation of histone H3 lysine 9 during chromosome
replication(8.6)

The replication of a parent DNA molecule with H3K9 methylations will result in two
half methylated daughter chromosomes. This is where histone methyltransferase
HMT steps in as both an epigenetic reader and writer. HMT will recognize the
methylated H3K9s and identify the neighbouring naïve H3K9s with . Then, it will
catalyze the methylation of all of the unmethylated H3K9, ensuring that all
histones have the right tag.
The chromo domain of the repressor complex binds to H3K9me3, the tri-
methylated histone 3. The binding of the chromo domain leads to the recruitment
of corepressors. Chromatin condenses and heterochromatin is formed, repressing
transcription.
Repressor-directed Histone deacetylation complexes(HDAC)
The repressor Ume6's DNA binding domain (DBD) interacts with an upstream
control element called URS1. Its repression domain RD binds to Sin3 of the
multiprotein complex including the histone deacetylase Rpd3. The
deacetylation of the histone's N terminal tails on nucleosomes near the
Ume6(repressor) binding site(on URS1) will make the histones highly positive
and drawn to the opposite negative charge of the DNA backbone, closing
down the chromatin in those regions. The closed down chromatin will inhibit
the binding of general transcription factors at the TATA box and result in the
repression of gene expression.

Silencer sequences
In the silencer region, enzymes cannot access the DNA to interact with it,
suggesting there must be a physical barrier, likely something to do with
histone modifications.
RAP1: RAP1 is the first DNA binding protein that recognizes sequences in the
telomere and silencer regions and acts as a transcription factor. These
regions can be HML or HMR of the mating type loci, or telomeric sequences.
Once bound to this region, RAP1 recruits the SIR proteins through
protein:protein interactions, and SIR proteins 2, 3 and 4 are also drawn
to the hypoacetylated histones in the region surrounding RAP1.
* SIR: silent information regulator,
SIR1: the SIR1 works with RAP1 and is involved in binding the silencer region
to the loci to be silenced(The specific mating loci type in this case)
SIR2, 3, 4: SIR2, 3 and 4 join through recruitment by protein-protein
interactions with RAP1 at the silencer or telomeric region and make a complex
around the DNA to be silenced. These complexes bind to the N-terminal tails
of deacetylated histones H3 and H4.
SIR2, the histone deacetylase, will remove the acetyl groups to leave histone
tails bare and favor chromatin condensation. After SIR2 acts, SIR3 and SIR4 sit on
the telomeres to form a higher order complex.

Ex: Silencing Mating types


In Saccharomyces cerevisiae, there are three genetic loci on
chromosome III that control mating type. The two identity sequences,
HML-alpha and HMR-a sitting at opposite ends of the DNA will be
silenced at their location, but either one or the other will confer the
identity or mating type when expressed at the MAT locus. Between the
two identity loci is the MAT locus which can take on either alpha or a
identity.

EX: Telomeres
Identifying factors required for repression through in Situ
hybridization/immunofluorescence of the silent mating type loci. Figure
out where the telomeres are and carry out immunofluorescence, you
see SIR3 overlapping with the telomeres. Meaning there must be SIR3
at the telomeres, and they must have something to do with them being
silenced.

Euchromatin - Transcription Activation


Structure
• Loose and open form of chromatin localizing at the nuclear envelope and near
the nuclear pore
• Open enough to allow DNA binding transcription factors like general
transcription factors as well as RNA polymerase in to bind to DNA
• Some regions of the genome are always heterochromatin
• Some regions shift between heterochromatin to Euchromatin
Function
• Transcriptionally active
Types of Transcription Activations
Modification Site of modification
Acetylated lysine

Phosphorylated
serine/threonine

Methylated arginine

Methylated lysine

Ubiquitinylated lysine

Activator-directed histone hyperacetylation with Co-activators


*GCN4 and GAL4 work in a similar way and both interact with the SAGA complex
GCN4 is a DNA binding transcription factor in yeast that interacts with the
UAS (upstream activation sequence). It turns out that GCN5 is a histone
acetyltransferase that works as a coactivator with GCN4. When GCN4 binds
to its activator region, GCN5 acetylates the histones (opposite effect to SIN2
in deacetylation), adding an acetyl group which will simultaneously release
the histone tails from the electrostatic interaction with the DNA backbone
and loosen the chromatin, allowing for transcription complex formation

Decondensation of chromatin with activation domains and remodelers

a) Condensed chromatin: in this experiment Lac repressor elements or cis acting
elements which interact with DNA binding proteins in bacteria, along with LacI are
introduced into a cell and when visualized, the repressor elements appear in a
condensed ball, indicating that the chromatin itself was condensed and
inaccessible to transcription factors or complexes required for the initiation of
transcription.
b) Decondensed chromatin: in this experiment, the Lac repressor elements were
fused with a strong viral transcriptional activation domain VP16, and added, along
with LacI into a cell. When visualized, the Lac repressor fused with the activation
domain appeared more spread around indicating that the chromatin had opened
up. This is because the presence of an activation domain attracts both chromatin
remodelers and Co activators.
Chromatin remodelers manipulate the chromatin to change the accessibility of
specific regions, and Coactivators help change the histone acetylation in that
region.

Pioneer Transcription factors

Pioneer transcription factors are the first transcription factors on site and
have the ability to recognize their target sequences even in the compacted
state of chromatin. That is likely because these sequences are on outer
surface of the nucleosomes making them easily accessible. These pioneers
are of significant importance with regard to the gene expression taking place
during embryogenesis, activating the transcription of specific genes in charge
of differentiating between cell types. Their binding to the DNA leads to the
transcription factor cascade. They also recruit coactivators that with the use
of free energy, will modify histones and confer configurational changes. This
means the chromatin will loosen up and give space to the mediator complex.
The Mediator complex is recruited to the site of transcriptional initiation
The mediator recognizes and binds to the transcriptional activation
domains, loops the chromatin and recruits RNA polymerase II to perform
transcription during embryogenesis, a time when the chromatin is highly
compacted and transcription is at a minimum.

Replication of modified regions of DNA


ChIP: Chromatin Immunoprecipitation with Antibodies for epigenetic
marks

Identifying epigenetic modifications by conducting ChIP-seq with antibodies
against specific epigenetic marks. This lets us identify regions of the genome
affected by various histone modifications. ChIP-seq gives you the location of any
mark you want to locate.
1. Use reversible crosslinking agents to isolate chromatin bound proteins with
specific antibodies
2. PCR or NGS
1. Use a known primer to identify whether a specific gene is affected
2. Use NGS to analyze the entire genome and determine what regions are
being affected
3. Sequence the bound DNA

Diagram:
H3K4 mono-methylation is present a little bit throughout with
specific peaks. H3K4 mono methylation is associated generally
with enhancers. Very often, you can identify where specific
enhancers are in the genome by carrying out a chip seq
experiment with antibodies that recognize this histone
modification.
H3K4 di-methylation is associated both with enhancers and
active regions around proximal promoter elements and even the
start sites. The pattern for H3K4 di-methylation differs from
mono-methylation, with clear peaks that are shared in certain
circumstances, but with occasional peaks that reach significantly
higher.
H3K4 trimethylation identifies regions around active promoters
right around the start site in general. There's only 1 peak on this
chromosome where you have this presumably active transcription
going on.

Di-methylation patterns: some peaks coincide with mono-methylation


With CHIPseq with various antibodies
Peaks coincide by some sites recognized by antibodies
Looks like upstream are all enhancers
Gives an idea of what is happening genome-wide
Associated with repressor or activator genes
RNA Processing I
RNA Polymerases
The three Eukaryotic RNA polymerases are multimeric, the sub units of which are
essential to their function and seem to show some homology with bacterial RNA
Polymerase.
RNA Polymerase I
Making ribosomal RNA
RNA Polymerase II

Large subunit
The large subunit of RNA polymerase II is unique among the RNA polymerases as it
contains a carboxy terminal domain, CTD, an identifying factor of RNA polymerase
II. In humans, the unique CTD YSPTSPS has 52 repeats, and in yeast, approximately
26.
Note: YSPTSPS is a heptapeptide, the TFIIH protein kinase molecule
phosphorylates the serine 5 of the heptapeptides during initiation, and later on, a
second phosphorylation happens on serine 2.
The RNA polymerase will undergo initiation, pausing, in which capping enzymes
are added, and elongation, in which the mRNA strand grows. It transcribes about
100 nucleotides before pausing.
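
The repeat structure of the CTD and the positions of the two phosphorylated serines can be laid out in a short Python sketch. It assumes an idealized CTD built from perfect copies of the consensus heptad YSPTSPS, using the repeat numbers given above (52 in humans, about 26 in yeast); real CTDs contain repeats that deviate from the consensus.

```python
HEPTAD = "YSPTSPS"  # Y1 S2 P3 T4 S5 P6 S7


def build_ctd(repeats):
    """Idealized CTD made of `repeats` perfect consensus heptads."""
    return HEPTAD * repeats


def residue_positions(ctd, position_in_heptad):
    """1-based positions in the CTD of a given heptad position (e.g. 5 for Ser5)."""
    return [i + 1 for i in range(position_in_heptad - 1, len(ctd), len(HEPTAD))]


human_ctd = build_ctd(52)
ser5_sites = residue_positions(human_ctd, 5)  # phosphorylated by TFIIH during initiation
ser2_sites = residue_positions(human_ctd, 2)  # phosphorylated later on

print(len(human_ctd))   # total residues in the idealized human CTD
print(ser5_sites[:3])   # first few Ser5 positions
print(ser2_sites[:3])   # first few Ser2 positions
```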

The processing of the 5' end of the elongating RNA.


Serine 5 phosphorylation at the CTD (which is close to the emerging 5' end of the
elongating RNA) recruits the capping enzyme. RNA pol II is stalled, giving enough time
for the capping enzyme to act. The active capping enzyme recognizes the 5' end of
the growing RNA and adds the cap to the 5' end through an unusual 5'-5' linkage to
protect it from exonucleases. The cap is a critical and common component of all
mRNAs. At the same time, the 2' hydroxyl of the first and then the second
nucleotide gets methylated.
Serine 2 phosphorylation recruits splicing, polyadenylation and export factors as
well as factors required for interactions with the large domain of RNA polymerase
II. The stall allows for the conformational change of the CTD which is required to
identify the pre mRNA being transcribed and to verify that all required factors are
present.

Initiation

Pausing
The pausing of RNA polymerase is an essential step in transcription as it allows for
an exchange of the factors that block elongation, that is, NELF, for those that
enhance it, say DSIF, SPT6 and PAF.
But how does the polymerase actually stall?
This pause depends on the phosphorylation of the CTD, a process mediated by
CDK9/P-TEFb. It allows for the protection of the mRNA. The presence of the two
negative factors, DSIF and NELF, slows and pauses RNA polymerase II near the first
nucleosome. Note that phosphorylated DSIF causes the closing of the clamp.

Elongation

RNA Polymerase III


Making tRNA

mRNA
Pre-mRNA to mRNA

Pre-mRNAs are modified at their 5' end


o A 7-methylguanylate cap is added to the 5' terminal nucleotide through a 5'-5'
triphosphate linkage
o Methylation of
• In animal cells and higher plants: 2' hydroxyl of the ribose of the first
base
• In vertebrates: 2' hydroxyl of the ribose of the second base as well
The addition of the CAP helps protect the pre-mRNA as well as facilitate both
nuclear transport and its recognition by translation factors, overall, preserving it
and its function for efficient protein synthesis.

Overview of splicing

Eukaryotic genes contain both introns and exons, so pre-mRNAs, the primary
transcripts, are composed of both introns and exons. The transition to mature
mRNA involves splicing the introns out of the pre-mRNA, leaving a mature
mRNA that contains only exons. Although they are spliced out, introns do serve a
purpose, as they can encode regulatory information.

Introns

An RNA-DNA hybrid is formed when the mRNA of the adenovirus hexon gene
hybridizes to the DNA fragment containing the hexon gene.
Introns were discovered because of the difference in length between the DNA
sequence and the final mRNA sequence.
Notice that the mRNA pairs very strongly with certain DNA regions, those that
encode exons, while the intron DNA loops out freely, since the mRNA has no
sequence complementary to the DNA introns.

In pre-mRNA, the 5' end of the intron begins with a GU. The 3' end of the
intron, next to the 5' end of the adjacent exon, always ends with an AG, and
20-25 nucleotides upstream of it there is a branch point at an A nucleotide. The
GU and AG at the 5' and 3' splice sites, as well as the A at the branch point, are
all very highly conserved.
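
The conserved signals described above can be checked on a sequence with a short Python sketch. This is only an illustration, not something from the lecture: the function name, the toy intron and the use of the 20-25 nucleotide window as a hard cutoff are all assumptions.

```python
def check_intron_signals(intron, min_gap=20, max_gap=25):
    """Report the conserved splice signals in an intron written 5'->3'."""
    rna = intron.upper().replace("T", "U")   # accept DNA or RNA letters
    ag_index = len(rna) - 2                  # where the 3' AG should start
    first = max(0, ag_index - max_gap)       # farthest allowed branch position
    last = ag_index - min_gap                # closest allowed branch position
    # Any A sitting min_gap..max_gap nt upstream of the AG is a candidate
    branch_candidates = [i for i in range(first, last + 1) if rna[i] == "A"]
    return {
        "starts_with_GU": rna.startswith("GU"),
        "ends_with_AG": rna.endswith("AG"),
        "branch_A_candidates": branch_candidates,
    }

# Hypothetical toy intron: GU, filler, one A placed 22 nt before the AG, filler, AG
toy_intron = "GU" + "C" * 10 + "A" + "C" * 21 + "AG"
print(check_intron_signals(toy_intron))
```

Real branch points are recognized by U2 snRNA base pairing rather than by distance alone, so the window here is just a stand-in for that.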

Spliceosome
The spliceosome is composed of 5 snRNPs. snRNPs are small nuclear
ribonucleoprotein particles, each made up of one snRNA (small nuclear RNA: U1,
U2, U4, U5 or U6) and 6 to 10 proteins. snRNAs are essential to splicing and each
has its designated function. The snRNPs, pronounced "snurps", are named after
their snRNAs.

Spliceosome cycle
1) U1 and U2 snRNP

U1: contacts the intron's 5' border (next to the 3' end of the upstream exon)
U1 snRNA base pairs with the pre-mRNA at the exon 1/intron junction
It must be noted that snRNAs require interactions with other
RNAs for splicing to occur efficiently. RNA:RNA pairing is critical
for U1 function; therefore, when there is a mutation at the
splice site of the pre-mRNA, splicing is blocked. Splicing can be
restored by a compensatory mutation in U1 snRNA that lets it
base pair with the mutated splice site of the pre-mRNA again.

U2: contacts the branch point region
Base pairs with the sequence around the branch point adenosine, upstream
of the pyrimidine region, but never pairs with the adenosine itself,
letting it bulge out instead
State
U1 and U2 snRNAs are in contact with each other as well and are held in
place by surrounding proteins
2) U4, U6, U5 snRNP input

The snRNPs assemble sequentially on the intron and rearrange the
RNA-RNA interactions between pre-mRNA and snRNA, looping the pre-mRNA
at the region of the intron.

3) U1, U4 output

U1 and U4 exit the complex, leaving the active spliceosome behind with
U2, U5 and U6
4) Two transesterification reactions, with no net energy expenditure, follow
spliceosome formation
Reaction 1

The 2' hydroxyl group of the adenosine at the branch point attacks the 5'
phosphate group of the first intron residue (G), leading to the formation
of a lariat.

Reaction 2

The free 3' hydroxyl of exon 1 attacks the 5' phosphate of the first residue of
exon 2, resulting in the joining of the two exons through that phosphate
and the release of the intron lariat.
5) The lariat intron is unlooped by the debranching enzyme to form a linear
intron RNA
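
The two transesterification reactions can be followed on a string with a small Python sketch. It is a simplification made up for illustration (the `splice` function and the toy sequences are hypothetical), and it only tracks which pieces end up joined, not the chemistry.

```python
def splice(exon1, intron, exon2, branch_index):
    """Return (mature mRNA, lariat loop, lariat tail) for one intron."""
    if intron[branch_index] != "A":
        raise ValueError("branch point must be an adenosine")
    # Reaction 1: the branch-point A attacks the 5' end of the intron.
    # Exon 1 is released with a free 3' end; the intron forms a lariat
    # whose loop runs from the first intron residue to the branch A.
    lariat_loop = intron[: branch_index + 1]
    lariat_tail = intron[branch_index + 1 :]
    # Reaction 2: the free 3' end of exon 1 attacks the 5' end of exon 2,
    # joining the exons and releasing the lariat.
    mature_mrna = exon1 + exon2
    return mature_mrna, lariat_loop, lariat_tail

# Toy sequences (hypothetical); the branch A is at index 4 of the intron
print(splice("AAG", "GUCCACCAG", "GCC", branch_index=4))
```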

Viewing splicing reactions


Radiolabelled RNA substrates allow us to separate and quantify each
intermediate splicing product during an in vitro splicing reaction.

Self splicing introns(the exception not the rule)


Introns that do not require proteins to be spliced out of pre-mRNA, but may
require a cofactor such as Mg2+ or other ions.

The linear form of RNA shown here must be spliced to give rise to linear form
Even after the entire reaction is treated with substances that destroy the
proteins, the RNA can still splice itself as long as Mg2+ is added

2 groups:
Group II introns are self-splicing introns that form a structure very
similar to that of the spliceosome. Group II introns are only present in
mitochondrial and chloroplast genes, but they may be the
evolutionary predecessors of other introns.

RNA PROCESSING
RNA Binding Proteins
RNA binding proteins are made up of different RNA-binding domains that are
capable of binding to RNA through their shapes and opposite charges to contribute
to some of the RNA processing. RNA binding proteins are essential for example, in
setting up the splicing apparatus, deciding where it will sit on the transcript and
where it will begin to splice.

RRM domains are made up of beta-pleated sheets with positively charged residues
that will interact with the negatively charged RNA
The polypyrimidine tract binding protein contains RRM domains, allowing it to
interact with the conserved polypyrimidine tract in the introns slightly off the 3'
side of the branch point adenosine.

Types of RNA processing


Splicing
Class 2 transcripts are spliced by the spliceosome
Introns are spliced out, leaving the exons to bind together to make mature RNA.
Exons
Small segments around 150 bp
Introns
Long segments around 3500 bp- 500kb

RNA binding proteins are essential in directing the splicing apparatus to where it
will sit and begin splicing. The whole complex is referred to as the cross-exon
recognition complex made up of SR protein:protein/snRNP interactions.
U2AF
Small subunit of 35 kd interacts with the nucleotides around the 3' end
of introns including the AG dinucleotide.
Large subunit of 65 kd interacts with sequences around the
polypyrimidine region which will define at least the 3' end of the intron
to be spliced out
Overall, U2AF helps with splicing efficiency
SR proteins
SR proteins are RNA binding proteins with RRM domains and protein:
protein interaction domains that identify the location of exons by
interacting with exonic splicing enhancers within the exons themselves.
SR proteins are rich in serine and arginine which will help them bind
and cover exons. The covering of the exons by SR proteins helps U2AF
locate the AG dinucleotide at the 3' splice site, U1snRNP bind to 5'
splice site and U2snRNP bind to the A of the branch point.

U1 snRNP
Uses the information provided by SR proteins to locate the 5' end of the
intron so it can sit on the GU dinucleotide
Exonic splicing enhancers are sequences within the exon that promote exon
joining during splicing

Alternative splicing
Alternative splicing is a temporally regulated and tissue-specific form of splicing
that can produce different gene products, and so different proteins with different
properties. In alternative splicing, we start with the same gene and transcribe it to
get the pre-mRNA, but then, depending on the location and the desired characteristics
of the protein to be translated, the cell keeps or splices out certain parts of the RNA.

In alternative splicing, products can be traced back to the DNA that encodes them,
and nothing has changed, the sequences are consistent.

Example: Alternative splicing of Fibronectin gene

For example, the fibronectin gene is expressed in both fibroblasts and
hepatocytes. The EIIIB and EIIIA exons of the gene are kept in the mature
mRNA of fibroblasts, so that the fibronectin it encodes has the two sticky domains
that help with cell adhesion and structure. They are not kept in the mature mRNA of
hepatocytes, whose fibronectin must travel in the blood stream and shouldn't have
sticky domains, as these could inhibit proper circulation. The same pre-mRNA can be
spliced in different ways in different cells or organs so that the resulting proteins
have the proper characteristics to fulfill the functions required in that area.
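
As a rough sketch of the idea, the same exon list can give two mature mRNAs depending on which exons a cell type skips. The exon order and the flanking exon names E1-E3 below are hypothetical placeholders; only EIIIB and EIIIA come from the notes.

```python
# Simplified fibronectin pre-mRNA (flanking exon names are placeholders)
FIBRONECTIN_EXONS = ["E1", "EIIIB", "E2", "EIIIA", "E3"]

# Exons each cell type splices out along with the introns
SKIPPED_EXONS = {
    "fibroblast": set(),                 # keeps both sticky domains
    "hepatocyte": {"EIIIB", "EIIIA"},    # circulating form, no sticky domains
}

def mature_mrna(cell_type):
    """Return the exons kept in the mature mRNA of the given cell type."""
    skipped = SKIPPED_EXONS[cell_type]
    return [exon for exon in FIBRONECTIN_EXONS if exon not in skipped]

for cell in ("fibroblast", "hepatocyte"):
    print(cell, mature_mrna(cell))
```

Note that the gene itself (the exon list) never changes; only the choice of exons does.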

*Humans have about the same number of genes as worms, roughly 20 000

Controlling sex determination in drosophila


Males have only 1 X chromosome while females have two, so dosage compensation
must take place. Dosage compensation is carried out by chromosome counting
and the activation of expression of certain sex-determining genes.

Sex-lethal gene
Dosage compensation equalizes X-linked gene expression between the sexes (in
Drosophila this is done by raising transcription from the single male X,
rather than by inactivating one female X as in mammals). Male and female flies
are really quite different and have sexually dimorphic characteristics that
differentiate them from one another. These differences are controlled by a
cascade of RNA binding events that result in alternative splicing, starting,
in the case of sex determination in Drosophila, with the RNA binding protein
Sex-lethal.
Sex-lethal (Sxl) is under transcriptional control, that is, expressed only in
females in early embryogenesis. Later in development, the female specific
sex-lethal promoter is repressed and a late Sxl promoter is activated in both
sexes. The later pre-mRNA must undergo alternative splicing, which will only
be done appropriately and yield functional Sxl in the presence of the early Sxl
proteins.

Sex-lethal protein present in female embryos binds to a specific spot near
the 3' end of the intron between exons 2 and 3, blocking U2AF from binding
there, so exons 2 and 3 are not spliced together. Instead, while exon 3 is off
limits for U2AF due to Sxl, U2AF binds further downstream and exons 2 and 4 are
spliced together, making a mature mRNA that codes for a functioning Sxl protein.
In males, however, there is no Sxl protein, so nothing blocks U2AF and exons 2
and 3 are spliced together. This means the resulting mature mRNA includes exon 3,
which carries an in-frame stop that the ribosome recognizes as a signal to stop
translating. This early in-frame stop means no functional Sxl protein is
synthesized in male embryos.

Transformer Tra Gene


The Tra pre-mRNA requires interaction with Sxl protein to undergo proper splicing
and produce functional Tra protein. So, although the gene is expressed in
both males and females, only the females are able to produce the Tra
protein. The males do not produce Tra, as they don't have and cannot
produce the Sxl protein required for its production.
Females
• Sex-lethal interacts with the pre-mRNA near the 3' end of the intron
between exons 1 and 2, just upstream of exon 2, which carries an in-frame stop.
• U2AF cannot bind at that location, so its absence tells the splicing apparatus not
to include exon 2
• Exons 1 and 3 are spliced instead making the proper mature RNA coding for a
functional Tra protein
DOUBLE SEX GENE
• Tra protein produced in female flies interacts with the proteins RBP1 and Tra2,
making a ternary complex
• The protein complex decorates exon 4 of the double sex pre-mRNA. Double sex is a
transcription factor that activates genes defining sexual characteristics
• Exon 4 will be included in the mature mRNA of females, synthesizing a female
specific double sex transcription factor
• The female specific double sex transcription factor activates genes expressing
female characteristics and blocks male specific characteristics
Males
• No sex lethal, so U2AF can bind to the region before exon 2 telling the splicing
apparatus to splice exons 1, 2 and 3 together.
• The resulting mature mRNA is read by the ribosome which stops translation
when it reaches exon 2 with an inframe stop site
• The ribosome detaches and no functional Tra protein is made.
DOUBLE SEX GENE
• The absence of Tra, RBP 1 and Tra 2 means the protein complex will not bind to
exon 4 on the double sex gene
• Exon 4 will not be included in the mature double sex mRNA of males meaning
the male double sex transcription factor will differ from the female one
• Result is a male specific double sex transcription factor

Cascade
The activity of a single RNA binding protein, Sex-lethal, drives an RNA binding
protein cascade that makes sex-specific double sex transcription factors and
confers further sex-specific characteristics, leading to flies of two different
sexes.
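
The whole cascade can be written as a few lines of logic. This Python sketch is only a summary of the notes above; the function name is made up, and it assumes RBP1 and Tra2 are always available so that Tra alone decides whether exon 4 of double sex is included.

```python
def sex_cascade(early_sxl_present):
    """Follow the splicing cascade from early Sxl protein to double sex."""
    # Late Sxl pre-mRNA: early Sxl blocks U2AF, so exon 3 (in-frame stop) is skipped
    sxl_exons = [2, 4] if early_sxl_present else [2, 3, 4]
    sxl_functional = 3 not in sxl_exons

    # Tra pre-mRNA: functional Sxl blocks U2AF, so exon 2 (in-frame stop) is skipped
    tra_exons = [1, 3] if sxl_functional else [1, 2, 3]
    tra_functional = 2 not in tra_exons

    # Double sex pre-mRNA: the Tra/RBP1/Tra2 complex promotes inclusion of exon 4
    # (assumption: RBP1 and Tra2 are present in both sexes)
    dsx_form = "female" if tra_functional else "male"
    return {"sxl_exons": sxl_exons, "tra_exons": tra_exons, "dsx": dsx_form}

print("female embryo:", sex_cascade(True))
print("male embryo:  ", sex_cascade(False))
```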

Cleavage
Capping
Polyadenylation

All mRNAs except histone mRNAs are polyadenylated, and all mRNAs, except
histone mRNAs, that lack a poly(A) tail are rapidly degraded within the nucleus.
Histone mRNAs have unique secondary structures in their 3' UTRs and although
they lack poly(A) tails, will not be degraded.

Phase 1- Slow phase


Mediated by PAP (poly(A) polymerase), 12 A residues are added to the cleaved 3' end
• The poly A signal AAUAAA near the 3' end of the pre-mRNA is helped by a
second, GU-rich poly A signal further downstream in recruiting CPSF (cleavage and
polyadenylation specificity factor), CStF, CFI and CFII to cleave the mRNA at a
specific site.
• Right when cleavage happens, the enzyme poly A polymerase begins to add
approximately 12 adenosines, just to protect the 3' end for now.
• On its own, poly A polymerase is a slow, inefficient enzyme, so adding 12
adenosines takes a while
Phase 2- speedy
The structure is recognized by nuclear poly A binding protein (PABPN1)
• Nuclear poly A binding protein (from the nucleus) enters the complex and acts as
a cofactor, enhancing the function of poly A polymerase
• Poly A polymerase can now add up to 200 A residues efficiently

This polyadenylation protects these mRNAs from exoribonucleases, or at least
delays their degradation, and is one of the final steps of maturation before
pre-mRNA becomes mRNA. Now, although poly A tails are important in protecting mRNA,
histone mRNAs don't undergo polyadenylation. Instead, they have a stem loop in
their 3' untranslated region (UTR) that protects them from degradation.

Examples of RNA modifications


rRNA
Transcription
Synthesized by RNA polymerase I in the nucleolus.
Happens in the same way for all rRNAs from yeast to humans:
they are transcribed from repeated segments of ribosomal DNA.

Processing
Not spliced, but transcribed spacers are removed.
RNA polymerase I makes a long pre-rRNA which will be processed and cleaved
resulting in an 18 S RNA, 5.8 S RNA and a 28 S RNA

Transfection
If you take a portion of DNA encoding rRNA, that is, the repeat segments, and
introduce it into a Drosophila cell, you introduce a transgene that will integrate
into its genome. Once these segments are integrated, RNA polymerase I will
transcribe the rRNAs, which begin to form a liquid-liquid condensate that recruits
other macromolecules and grows to become a nucleolus.

tRNA
Transcription
Transcribed by RNA polymerase III

Processing

The 5' green region is removed from all pre-tRNAs and, in certain circumstances,
other segments are spliced out (but not always). Then, the short purple segment
on the 3' end is removed and replaced with the CCA, which will later take part in
aminoacylation of the tRNA. Lastly, the pre-tRNA undergoes extensive
modification of internal bases to acquire its mature tRNA form

RNA EDITING
In RNA editing, unlike alternative splicing, specific deaminase enzymes convert one RNA
nucleotide into another in a permanent fashion. The DNA is unaltered and is still
transcribed into the expected RNA, but the alteration of this RNA results in a protein
that isn't consistent with the DNA template.

RNA editing is widespread in the mitochondria and plastids of protozoans and
plants and is also observed, although very rarely, in the nuclear genomes of higher
eukaryotes.

Characteristic changes conferred by deaminases:


• Adenosine -> Inosine
• Cytosine -> Uracil

Example: Apolipoproteins
Apolipoprotein B is a major protein of LDL (low density lipoproteins),
which carry lipids around the body to cells with the right receptors.
Apolipoprotein B is synthesized in the liver and intestine. In the liver, the
apo-B protein is large, at 4536 amino acids long. In the intestine, however,
apo-B synthesis is cut short around halfway through by a premature stop
introduced when a cytosine is deaminated into uracil, turning a CAA codon into
a UAA stop codon. The intestinal apolipoprotein is thus only 2152 amino acids long.
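
The effect of this single C -> U edit can be reproduced with a toy Python sketch. Everything except the numbers 4536 and 2152 and the CAA -> UAA change is an assumption: the filler codons are placeholders, not the real apo-B sequence, and the helper names are made up.

```python
STOP_CODONS = {"UAA", "UAG", "UGA"}

def protein_length(codons):
    """Count codons translated before the first in-frame stop."""
    length = 0
    for codon in codons:
        if codon in STOP_CODONS:
            break
        length += 1
    return length

def deaminate_c_to_u(codons, codon_index):
    """Edit the first base of one codon from C to U (cytidine deamination)."""
    edited = list(codons)
    codon = edited[codon_index]
    if codon[0] != "C":
        raise ValueError("no cytosine at the first position of this codon")
    edited[codon_index] = "U" + codon[1:]
    return edited

# Toy apo-B mRNA: the CAA codon is codon number 2153, filler codons are placeholders
apo_b = ["AUG"] + ["GCU"] * 2151 + ["CAA"] + ["GCU"] * (4536 - 2153) + ["UAA"]

liver_length = protein_length(apo_b)                               # unedited
intestine_length = protein_length(deaminate_c_to_u(apo_b, 2152))   # edited
print(liver_length, intestine_length, round(intestine_length / liver_length, 2))
```

The ratio comes out a little under one half, which matches "cut short around halfway through".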

Deamination

Divergent transcription by RNA polymerase takes place at CpG island
promoters, but the antisense transcripts come out destabilized and the sense
transcripts stabilized.
The antisense transcripts tend to be degraded because they are rich in
polyadenylation signals and lack U1 snRNP binding sites, so they are
cleaved early instead of being processed like the sense transcripts.

NUCLEOCYTOPLASMIC TRANSPORT
Nuclear Pore Complex
Structure

Nuclear pores are dispersed all over the nuclear envelope and are formed by
structures called nuclear pore complexes (NPCs).
The nuclear pore complex has a mass of about 125 megadaltons, roughly 30x that of
a ribosome, and is composed of 50-100 different proteins (yeast vs vertebrates).
Proximal and cytoplasmic filaments project from the complex on the cytoplasmic
side, and these same proximal filaments, in addition to a nuclear basket, project
out on the nuclear side. Molecules smaller than 40-60 kDa can freely pass through
the NPCs, but larger molecules and multimolecular complexes such as RNPs must
be actively transported.

The pore itself is made up of proteins called nucleoporins, and the ones rich in
phenylalanine and glycine are referred to as FG nucleoporins made up of proteins
with FG repeats.

Mechanism
The FG repeats in the nucleoporins interact with one another through
hydrophobic interactions, forming a highly disordered gel-like interface.
This characteristic is essential to the regulation of movement through the pore, as
only proteins that can interact with the disordered domains of FG nucleoporins,
directly or through a transport receptor, are able to pass through the pore. Cargo
proteins are marked for entry by nuclear localization signals, or NLS, and in
their presence any protein can enter the nucleus through nuclear pore complexes.

Example: SV40 virus uses T antigen to disrupt a lot of nuclear processes,
but when certain amino acids in T antigen were mutated, the T antigen could no
longer enter the nucleus and its harmful effects were muted. It turns out a specific
stretch of amino acids containing lots of lysine and arginine had to be present for
it to be able to enter the nucleus.
Example: adding this lysine and arginine rich sequence to other proteins like
pyruvate kinase, which isn't a nuclear protein, suddenly allowed them to enter the
nucleus

Transport
Nuclear Protein Import
Required proteins
Two types of proteins are required in order to get proteins synthesized in the
cytoplasm into the nucleus via NLSs.
RAN: monomeric G-proteins that exist in two configurations
1. Bound to GTP
2. Bound to GDP

Nuclear transport receptors (importins): proteins that bind to NLS domains
present on cargo proteins to facilitate transport through the pore by
associating with FG repeats on the nucleoporins

Cargo protein with NLS

Mechanism
Note that Ran-GAP, a GTPase-activating protein, stimulates the hydrolysis of Ran-GTP
into Ran-GDP in the cytoplasm, while Ran-GEF, a guanine nucleotide exchange factor,
turns Ran-GDP into Ran-GTP in the nucleoplasm.
1. Importins in the cytoplasm recognize the NLS on a given cargo protein and form
a complex with it
2. By virtue of the FG repeats on the nucleoporins in the NPC (nuclear pore
complex), the cargo/importin complex travels through the pore and into
the nucleoplasm
3. Ran in its GTP bound form, Ran-GTP, binds the importin of the cargo/importin
complex in the nucleoplasm
4. A conformational change happens and importin releases the cargo into the
nucleoplasm, where it will do its job
5. Ran-GTP bound to importin makes its way out of the nucleus and into the
cytoplasm through the NPC, simply down its concentration gradient.
6. Ran-GAP meets the Ran-GTP/importin complex and triggers hydrolysis to Ran-GDP,
causing a conformational change that releases importin so that it can interact
with a new cargo protein
7. Ran-GDP then makes its way back into the nucleoplasm to be converted back
into Ran-GTP by Ran-GEF
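
To keep track of where everything ends up after one import cycle, here is a small Python sketch that walks through the seven steps. It is a bookkeeping aid only: the `ImportState` class and its field names are made up, and each step is reduced to a change of location or nucleotide state.

```python
from dataclasses import dataclass

@dataclass
class ImportState:
    cargo: str = "cytoplasm"
    importin: str = "cytoplasm"
    ran: str = "nucleus"
    ran_nucleotide: str = "GTP"

def run_import_cycle(cargo_has_nls):
    """Return the state after one full import cycle."""
    state = ImportState()
    if not cargo_has_nls:
        return state                      # importin has nothing to recognize
    # Steps 1-2: importin binds the NLS and carries the cargo through the NPC
    state.cargo = state.importin = "nucleus"
    # Steps 3-4: Ran-GTP binds importin, which releases the cargo in the nucleus
    # Step 5: the Ran-GTP/importin complex exits to the cytoplasm
    state.importin = state.ran = "cytoplasm"
    # Step 6: Ran-GAP triggers GTP hydrolysis, freeing importin for new cargo
    state.ran_nucleotide = "GDP"
    # Step 7: Ran-GDP returns to the nucleus and Ran-GEF reloads it with GTP
    state.ran = "nucleus"
    state.ran_nucleotide = "GTP"
    return state

print(run_import_cycle(cargo_has_nls=True))
print(run_import_cycle(cargo_has_nls=False))
```

The point to notice is that only the cargo changes compartment for good; importin and Ran both return to where they started.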

Nuclear Protein Export


Required Proteins
Two types of proteins are required in order to get proteins from the
nucleoplasm to the cytoplasm via NES.
RAN: monomeric G-proteins that exist in two configurations
1. Bound to GTP
2. Bound to GDP

Nuclear transport receptors (Exportin 1): exportins recognize their
nuclear cargo proteins via the NES (nuclear export signal)

Cargo protein with NES

Mechanism
1. Exportin 1 in the nucleoplasm recognize the NES on a given cargo protein and
form a complex
2. Now bound to the cargo protein, the exportin undergoes a conformational
change allowing it to recruit Ran-GTP forming a ternary complex
3. Cargo protein/exportin/RAN-GTP complex make their way out of the nucleus
and into the cytoplasm through the NPC.
4. Ran-GAP greets the complex, hydrolyzing Ran-GTP into RAN-GDP causing a
conformational change that releases the cargo protein and exportin
5. RAN-GDP then makes its way back into the nucleoplasm to be converted back
into Ran-GTP by Ran-GEF
6. Exportin also returns to the nucleoplasm to be used again in export

RNA Export
Exportin t- Ran dependent
Types of RNA
• tRNA: Exported into the cytoplasm to participate in protein synthesis with the
ribosome
• rRNA
• mRNA: some specific mRNA that associate with hnRNP proteins(HIV Rev) can be
exported through association with Ran

Mechanism
i. Exportin t in the nucleoplasm binds to fully processed tRNA and Ran-GTP
ii. tRNA/exportin t/RAN-GTP complex make their way out of the nucleus and into
the cytoplasm through the NPC.
iii. Ran-GAP greets the complex, hydrolyzing Ran-GTP into RAN-GDP causing a
conformational change that releases the tRNA and exportin t
iv. RAN-GDP then makes its way back into the nucleoplasm to be converted back
into Ran-GTP by Ran-GEF
v. Exportin also returns to the nucleoplasm to be used again in export

mRNA Exporter - Ran independent


Required proteins
• NXF1
• NXT1
• mRNP proteins: SR proteins

Mechanism

1. Mature mRNA with a poly A tail interact with NXF1 and NXT1 subunits of the
RNA exporter.
2. NXF1 and NXT1 interact cooperatively with specific mRNP proteins including SR
proteins that already decorate the mature mRNA
3. These protein interactions on mRNA will form a domain on RNA that will interact
with FG repeats in nucleoporins to go through NPC into the cytoplasm
4. Here, mRNAs can be translated
mRNP Packaging in Balbiani Rings
On one DNA template of the insect polytene chromosomes (Balbiani Rings),
both transcription and mRNP export can be imaged microscopically. These
insects transcribe a specific gene at high levels to synthesize a
protein that allows them to stick their eggs onto leaves.

The mRNA is transcribed together with the hnRNPs and released in the form of
little croissant-shaped mRNPs that will undergo transport across the
nuclear envelope through the NPC.

Transport Mechanism
1. mRNP reaches the NPC, where there are gatekeepers ensuring that the exported
mRNA is in fact mature.
2. mRNA begins to be threaded through the NPC 5' end first
3. The 5' end of the mRNA reaching the cytoplasmic side is immediately bound by ribosomes

Cytoplasmic remodelling
Once the RNA has been transported across the nuclear envelope, a helicase
unwinds the RNA and initiates the removal of nuclear factors from it and
the replacement of these factors by cytoplasmic proteins.

Example:
The nuclear cap binding complex that recognizes the 5' cap is replaced by
eIF4E
PABPN1, which binds the poly A tail in the nucleus, is replaced in the
cytoplasm by PABPC1
POST-TRANSCRIPTIONAL/TRANSLATIONAL REGULATION OF GENE EXPRESSION
mRNP Export Model

The first round of translation can be used as a secondary backup mechanism for
ribosomes to go through and knock off all the proteins still associated with the
mRNA.
In a perfect world, the RNA helicase would be sufficient to get rid of
all those proteins, and all the nuclear proteins would make their way back
into the nucleus. But cells are not perfect, and sometimes proteins get
through that check mechanism on the cytoplasmic side and don't get
knocked off. In those circumstances, translation is used as a secondary
backup mechanism, because the ribosome knocks off the proteins associated
with the mRNA as it moves along it. So the first round of translation is a
little different from all the translational rounds that follow it: the first
round eliminates the proteins that are still bound.

RNA surveillance and Quality Control


NMD: Nonsense mediated decay
Effects of premature stops
Sometimes errors are made in mRNAs, for whatever reason, and you end up
with an mRNA that has a premature in-frame stop. In-frame stops will often
give rise to truncated proteins, which 70% of the time wouldn't cause a
problem but, depending on the nature of the protein in its truncated form,
can cause grief for the cell.

Example: Dominant negative effect in a steroid hormone receptor
(transcription factor)
These are modular, so you'll have a ligand binding domain, a
DNA binding domain and a transcriptional activation
domain. Suppose an in-frame stop in a steroid hormone receptor
still permits the truncated protein to bind its steroid hormone.
The truncated proteins then compete with the real functional
proteins for active positions within the genome, and that can
give rise to a loss of function, or at least a reduction of
function, phenotype if you rely on activating all of those
steroid responsive genes
Example:
The same could be true for other types of receptors that require
various modules. Sometimes you'll get an extracellular domain of
a receptor that will interact with its ligand, but it can't carry out the
intracellular functions associated with that receptor. In the
end, you end up binding up all the ligand, but you can't carry out
the normal function of that protein. So these things are
recognized by the cell as being a little poisonous. You don't want
to have these truncated proteins around because they
can interfere with normal cellular functions that require the
full length protein to carry out those functions.
1. SR proteins define the exons so introns can be appropriately excised
2. Pre-mRNA is polyadenylated
3. Export factors are loaded onto the mRNA
4. Remaining factors must be removed during pioneering round of translation
Errors in an mRNA can lead to a premature stop and so to a truncated form
of the protein. For a hormone receptor, the truncated protein in its estrogen
bound form competes with the real receptors, leading to a loss or reduction of
function, especially for processes that require complete activation. These
truncated receptors just block up access: the proteins can bind, but they don't
have the associated function. This is toxic to the cells.

When ribosomes on an mRNA reach a premature in-frame stop, they
dissociate. If there are still proteins bound to the mRNA past the stop site
after the pioneer round of translation, the cell recognizes this as an mRNA
encoding a truncated protein and calls in exoribonucleases to eliminate that
mRNA, so truncated proteins don't float around in the cell and block important
receptors. It's very efficient and it's an important quality control mechanism.

mRNA stability
The stability of RNAs is under strict regulation and is critical for their
steady state concentration. In E. coli and other bacteria, transcription must
switch rapidly depending on the environment, so mRNAs are quite unstable, as you
might not want the RNAs used in one environment to be used in another. Because
of the way that a given organism must adapt, it will destabilize its mRNAs so
they are present for a limited time.

Examples of this are genes involved in regulating the cell cycle, so when
tissues begin to differentiate, the mRNAs should be destabilized so as not to
lead to cancer.

Example: stability of cytoplasmic mRNAs in different organisms


Transcription has to really rapidly switch depending on the
environment that bacteria are living in. In these situations, an RNA or
an mRNA that's made in one environment might be relatively
useless or problematic in another environment. As such, the mRNAs
themselves are very short lived. They're very unstable because
you might not want those mRNAs sticking around when the
environment changes, and that's shown here.
E coli
In E. coli, cell generation time is much shorter than in most other
organisms, and because of the way they have to adapt, they change
their transcription very quickly. To match, their mRNAs are unstable,
with a low average half life of about 3 to 5 minutes.
Saccharomyces cerevisiae (yeast)
Yeast are single celled eukaryotes, not bacteria, but in a
similar manner they have to change very rapidly when their growing
environment changes. The need for rapid adaptation to
environmental changes explains the low half life of about 22
minutes.
In mammals
In mammalian cells, the cell generation time is much longer and
the average half life of an mRNA is around 10 hours. So once you
make an mRNA, it sticks around for much longer, and that's a
reflection of the kind of environment that most cells experience in
our bodies. It's generally pretty stable: we maintain a temperature
and a constant flow of nutrients to ensure that cellular homeostasis
is relatively stable throughout our lives. But there are some mRNAs
that have to be purposely destabilized and are only supposed to be
present for a very short period of time. Some of those are mRNAs
linked to the cell cycle. Genes involved in activating the cell cycle
are really important, for example, during development or growth, but
you don't want their mRNAs around for very long. Once tissues
terminally differentiate, you want to make sure that all those cell
cycle mRNAs are destabilized, because a continuous cell cycle is one
of the hallmarks of tumor formation. The same goes for cytokines and
other mRNAs involved in immune responses that are radical and
dramatic. You want them to be activated very quickly, but you don't
want them to stick around for very long. They do the job quickly,
then you get rid of them and go back to stable cellular homeostasis.
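
To get a feel for what these half lives mean, the fraction of an mRNA pool left after time t is 0.5^(t / half life), assuming simple exponential decay and no new transcription. A quick Python sketch using the average half lives quoted above (5 minutes is taken as the E. coli value, an arbitrary pick from the 3 to 5 minute range):

```python
def fraction_remaining(minutes_elapsed, half_life_minutes):
    """Fraction of an mRNA pool left, assuming simple exponential decay."""
    return 0.5 ** (minutes_elapsed / half_life_minutes)

# Average half lives from the notes, converted to minutes
HALF_LIVES = {
    "E. coli": 5,          # upper end of the 3-5 minute range (assumption)
    "yeast": 22,
    "mammalian cell": 10 * 60,
}

for organism, half_life in HALF_LIVES.items():
    left = fraction_remaining(60, half_life)
    print(f"{organism}: {left:.4f} of the mRNA left after 1 hour")
```

After one hour, an average E. coli mRNA has gone through 12 half lives (1/4096 left), while an average mammalian mRNA has barely started to decay.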

mRNA Destabilization
Many of the short lived mRNAs have some elements rich in AUUUA sequences.

Example:
GMCSF or granulocyte macrophage colony stimulating factor (You don't
have to care) is very important for driving the proliferation of a number of
immune cells. There's a sequence in the mRNA of GMCSF that destabilizes it
so that it doesn't stay around for long after completing its job. Too many
white blood cells is a hallmark of leukemia or other immunological
disease. Within the 3' UTR of the mRNA that corresponds to
GMCSF, there are some elements that are rich in AUUUA sequences.

Adding the AUUUA sequence


The addition of AUUUA sequences to the 3' UTR of genes that do not naturally
have them (beta-globin) drastically shortens the half life of their mRNAs. In a
control with a scrambled variant of the AUUUA repeats, there is no effect on
the half life of the new variant mRNA, suggesting that it's the AUUUA
sequence itself that's very important. The sequence is presumably recognized
by critical proteins that recruit an assembly of exoribonucleases that
will destroy the mRNA.
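
Scanning a 3' UTR for these elements is straightforward. The Python sketch below counts AUUUA motifs, including overlapping ones; the function name and the example sequences are made up for illustration and are not real GMCSF or beta-globin sequences.

```python
def count_auuua(utr):
    """Count AUUUA motifs in a 3' UTR, allowing overlapping copies."""
    rna = utr.upper().replace("T", "U")   # accept DNA or RNA letters
    count = 0
    start = rna.find("AUUUA")
    while start != -1:
        count += 1
        start = rna.find("AUUUA", start + 1)   # step by 1 to catch overlaps
    return count

# Hypothetical UTR fragments
destabilizing_utr = "GCAUUUAUUUAUUUAGC"   # three overlapping AUUUA copies
scrambled_utr = "GCUAUAUUAUAUAUUGC"       # same kind of bases, motif broken up
print(count_auuua(destabilizing_utr), count_auuua(scrambled_utr))
```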

RNA Decay/Degradation
Location
Degradation mostly occurs in the P-bodies, devoted sectors of the cytoplasm.
These are membraneless organelles, or liquid-liquid condensates formed by
the recruitment of RNA and enzymes required for RNA degradation.

Mechanism
Deadenylation-dependent
1. The deadenylase complex shortens the poly-A tail from 3' to 5'
2. The exosome, an RNA degradation machine, chews up the RNA from 3' to 5'
At the very end of the exosome, there are two different
ribonucleases: an exoribonuclease that will chew up the RNA as it
comes within its proximity, and an endoribonuclease that cuts any
fragments that weren't fully degraded. So it's highly efficient.

3. Decapping enzymes remove the 7-methylguanosine cap from the 5' end
4. The exposed 5' end of the mRNA allows the XRN1 enzyme to degrade the RNA from 5' to 3'

Deadenylation-independent

Endonucleolytic cuts and mRNA degradation


Endonucleolytic cut in mRNA will destroy it very, very rapidly.
If an mRNA gets cut in the middle by an endoribonuclease, this is
the kiss of death for an mRNA because you activate 3' to
5' degradation on one of those products, 5' to 3' degradation
on the other of those products, in addition to the normal
degradation processes that were ongoing on that particular
mRNA. This is particularly important when talking about RNA
mediated interference.
Regulation

mRNAs are regulated by key proteins that affect their stability, and their
steady-state level within the cell depends on both their rate of transcription and their
stability.

Example: Transferrin receptors

Transferrin receptor is an essential protein for transporting iron into
cells. It does so when the cell needs more iron,
i.e. when the levels are starting to drop.
Transferrin receptor mRNAs also have really important secondary
structures in their 3' untranslated region (UTR), and
these contain specific elements called iron response elements, or
IREs. So under normal conditions transferrin receptor is going to
bring in iron if the cell needs it. There's an RNA binding protein called
iron response element binding protein, IREBP, that interacts with IREs,
these critical RNA sequences, in this case
present in stem loops in the 3' UTR. IREBP isn't in an
active conformation in high iron concentrations, but when the
concentration of intracellular iron drops it takes on an active
conformation, and in its active conformation it will interact with
its IRE binding sequences.
High Iron
But if the cellular levels of iron increase, IREBP is inactive and leaves the stem loops free. The stem loops in the
3' UTR that carry the IREs also have
AU rich elements that recruit proteins to destabilize the
mRNA. In conditions of high iron concentration within the cell
you don't want to bring more iron in, so you don't need the
transferrin receptor, and that's regulated by destabilizing the mRNA by
virtue of proteins interacting with these AU rich elements in the stem
loops of the 3' UTR. The mRNA gets rapidly degraded,
and under those circumstances you don't bring any more iron into the cell, which
maintains intracellular iron homeostasis.
Low Iron
In low iron conditions, IREBP is in its active
conformation. It will interact with its cognate binding elements, the IREs
present on these stem loops, and in doing so it protects those stem
loops from interaction with the proteins that would normally interact
with the AU rich elements to destabilize the mRNA. So in low iron situations
within the cell, the IREBP takes on its active conformation and protects the
stem loops, thereby stabilizing the mRNA and allowing it to be translated
at a higher rate, giving rise to transferrin
receptor protein that's capable of bringing more iron into the cell.

Remember: IRE-BP is only active in low iron conditions. In transferrin receptor mRNA
it binds IREs in the 3' UTR; in ferritin mRNA the IREs are in the 5' UTR.

Translational regulation
Example: When RNA levels are not consistent with the quantity of protein they
produce
Normally, the synthesis of polypeptides is under strict control and mRNA
abundance reflects protein levels, such that more mRNA equals more protein.
But in some cases, the relationship appears skewed, suggesting that protein
synthesis or the stability of the protein is regulated somewhere between
the mRNA (after transcription) and the translated protein.
Example:
Normal condition
Hunchback is an anterior-specific transcription factor, specifying the
structure that will give rise to the anterior end of the embryo
The mRNA is spread out all over the embryo
The associated protein is expressed only in the anterior end
Nanos, on the other hand, is responsible for specifying the structure that
will give rise to the posterior end
The mRNA is only in the posterior end
The associated protein is expressed only in the posterior end

Removing Nanos function

Removing nanos function results in no change to hunchback mRNA
localization but, rather than being restricted to the anterior end, the
Hunchback protein is expressed everywhere, suggesting that the
presence of Nanos stops hunchback mRNA from being translated in the
posterior region. Hunchback translation cannot take place near Nanos,
so, in its presence, hunchback mRNA isn't eliminated, but rather
blocked.

Example: Ferritin
There are two stem loops in the 5' end of the ferritin mRNA, as
well as iron response elements.

High Iron
Ferritin is required to sequester the excess intracellular iron
In high iron conditions, IREBP is inactive, so the scanning complex can
go right through them, translating the mRNA to give rise to ferritin
protein that will actively remove iron.

Low iron
In low iron conditions, IREBP is in its active form and interacts with IREs
in the 5' stem loop, blocking the scanning complex from reaching the coding region and thus the translation
of ferritin protein. This means less iron will be sequestered and more
will remain available in the iron-poor cell.
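
The transferrin receptor and ferritin examples follow the same switch with opposite outcomes, which can be summarized in a small sketch. This is a simplified on/off model that assumes iron is either "high" or "low"; the function and key names are made up for illustration.

```python
def iron_response(iron_is_high: bool) -> dict:
    """Simplified on/off model of IRE-BP control (hypothetical names)."""
    # IRE-BP only takes on its active conformation when iron is low.
    irebp_active = not iron_is_high
    return {
        "irebp_active": irebp_active,
        # 3' UTR IREs: bound IRE-BP shields the AU rich elements, so the
        # transferrin receptor mRNA is stable only when IRE-BP is active.
        "transferrin_receptor_mrna_stable": irebp_active,
        # 5' UTR IREs: bound IRE-BP blocks the scanning complex, so ferritin
        # is translated only when IRE-BP is inactive.
        "ferritin_translated": not irebp_active,
    }

for label, high in [("high iron", True), ("low iron", False)]:
    print(label, iron_response(high))
```

The same protein therefore increases iron import and decreases iron sequestration when iron is scarce, and does the reverse when iron is plentiful.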

Example: lin-4 mutation, the Peter Pan gene

Interested in understanding developmental regulation in a temporal
manner, Victor Ambros studied C. elegans, which normally goes through
many stages of development
Under normal circumstances, the antisense small non-coding lin-4
RNA interacts with sequences of limited homology in the
3' UTR of lin-14 mRNA and thereby ends up blocking
the translation of the LIN-14 protein. This takes place through two
mechanisms
1. It somehow affects the direct translatability of the
mRNA
2. It induces deadenylation of the mRNA and eventually the
loss of the transcript
When a mutation in lin-14 removes its 3' UTR, lin-4 has
nowhere to bind and lin-14 has no way to be repressed, so the
animal remains in the 1st larval stage

POST TRANSCRIPTIONAL GENE SILENCING


Consequences of transfection of modified small dsRNA(siRNA)
The introduction of siRNA or miRNA will result in the loss of function phenotypes
typical of genetic mutations in the target gene.

RISC, the RNA-induced silencing complex, is a multiprotein complex that
incorporates one strand of the small dsRNA (siRNA or miRNA) and uses it as a
template to recognize complementary mRNA and activate an RNase to cleave the
RNA.

siRNA
siRNAs are the double stranded small interfering RNAs that induce the
degradation of completely complementary target mRNAs. Because of their
complete complementarity, siRNAs eliminate all of the target mRNA through
endoribonucleolytic cleavage.
RNAi, or RNA interference, refers to the developmentally/physiologically
regulated process in which a small RNA strand binds to a longer RNA strand
coding for a known protein so as to interfere with translation and thus the
production of that protein. Argonaute proteins of the RISC and Dicer are both
involved in this pathway. Introducing a transgenic construct whose transcript
folds over on itself into a hairpin will trigger the RNAi pathway.

The Dicer enzyme, which functions a little like RNase III, acts as a dimer to
cleave dsRNA. The dsRNAs are cleaved into siRNAs, fragments of 21-23 nt,
which are then bound by an Argonaute protein within the RISC. Then, the helicase
activity of RISC, driven by ATP hydrolysis, unwinds the siRNA
so that one strand can be used to guide the complex to the
complementary mRNA. The complementary mRNA gets cut (the kiss of
death), and the resulting cleaved transcripts are then degraded by
cytoplasmic ribonucleases.

Example: Drosophila red -> white eyes


Wild type Drosophila have red eyes, but if you introduce a
transgene that makes one of these snapback double
stranded RNAs complementary to the gene that
gives flies red eyes (it's called white, go figure), you end up
getting a white eye phenotype.
• The double stranded RNAs generated from these kinds of transgenic
constructs end up eliminating all the mRNAs that are responsible for
telling the eyes to be red
• The result is a mutant phenotype, that is, a fly with white eyes

Example: Plants and Clavata 3


Clavata 3 is an important gene product that limits the stem
cell population in plants, and getting rid of Clavata 3
using RNA mediated interference (RNAi) results in an
increase in the stem cell population that would normally be
limited by Clavata 3 gene function.

Example: Tay Sachs


Tay-Sachs disease affects the correct functioning
of neurons. If you introduce specific double stranded RNAs
into a wild-type healthy mouse, you will convert a wild
type healthy looking mouse into a Tay-Sachs disease model,
because you eliminate the gene product that would
normally prevent the disease.

siRNA with double stranded RNAs


Example: chromosomal silencing Centromeric heterochromatin

The chromatin within centromeres must be silenced, as centromeres are
associated with the kinetochore, a structure that is very important for
accurate cell division. Having transcription factors and RNA polymerase
coming in and out would be disruptive to its intended function. The
siRNA nucleates a complex that involves several proteins that all
play a role in generating H3K9me3, a histone mark that inhibits transcription and keeps
the chromatin compact.

miRNA
Transcription
miRNA are transcribed by RNA polymerase II and thus capped in the process

Making mature miRNA


o Pri-miRNA folds up into a dsRNA hairpin
o Drosha digests pri-miRNA into pre-miRNA
o Exportin 5 takes pre-miRNA into the cytoplasm via the nuclear pore complex

Biogenesis

1. Dicer, the RNase III-like enzyme, cuts double stranded pre-miRNA into mature
miRNA
   a. Acts as a dimer to cleave dsRNA into 21-23 nt fragments
2. Dicer hands over the double stranded miRNA to RISC; we call this a miRISC
3. Argonaute proteins in the miRISC (silencer complex) bind to the miRNA and
use their ATP-driven helicase activity to unwind it into single strands
4. The single stranded miRNAs are used as guides to mRNA targets with antisense or
limited antisense homology
5. The miRNA guide pairs in antisense with the mRNA target at the 3' UTR to block
translation or destabilize the mRNA target through deadenylation

Structure
The miRNA pathway, like the siRNA pathway, is triggered by dsRNA
molecules. Unlike siRNAs, miRNAs are not completely complementary to their target
RNAs, forming many loop structures when binding to the 3' UTR
(untranslated region). Because of their incomplete complementarity,
miRNAs destabilize their targets or block their translation.
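
The siRNA versus miRNA distinction comes down to how completely the guide strand pairs with its target, which the sketch below tries to make concrete. The 21 nt target site is invented, G-U wobble pairs are ignored, and the 0.5 cutoff for "partial" pairing is an arbitrary assumption for illustration, not a biological constant.

```python
# Watson-Crick pairing for RNA (G-U wobble pairs are ignored in this sketch).
COMPLEMENT = str.maketrans("AUGC", "UACG")
# Swaps every base for a different one; used only to build mismatched guides.
MISMATCH = str.maketrans("AUGC", "CAUG")

def reverse_complement(rna: str) -> str:
    return rna.translate(COMPLEMENT)[::-1]

def fraction_paired(guide: str, target_site: str) -> float:
    """Fraction of guide positions that pair with the target (both 5'->3')."""
    if len(guide) != len(target_site):
        raise ValueError("guide and target site must be the same length")
    perfect_guide = reverse_complement(target_site)
    paired = sum(g == p for g, p in zip(guide, perfect_guide))
    return paired / len(guide)

def predicted_outcome(guide: str, target_site: str, partial_cutoff: float = 0.5) -> str:
    paired = fraction_paired(guide, target_site)
    if paired == 1.0:
        return "cleavage"      # siRNA-like: complete complementarity
    if paired >= partial_cutoff:
        return "repression"    # miRNA-like: translation block / deadenylation
    return "no targeting"

def with_mismatches(guide: str, positions: set) -> str:
    """Return the guide with the bases at the given positions changed."""
    return "".join(b.translate(MISMATCH) if i in positions else b
                   for i, b in enumerate(guide))

target_site = "AUGGCUAGCUAGGCUAGCUAA"  # hypothetical 21 nt site in a 3' UTR
sirna_guide = reverse_complement(target_site)
mirna_guide = with_mismatches(sirna_guide, {9, 10, 11, 12})  # central bulge

print(predicted_outcome(sirna_guide, target_site))
print(predicted_outcome(mirna_guide, target_site))
```

The central mismatches stand in for the loop structures mentioned above; a fully paired guide is treated as a cleavage signal, a partially paired one as a repression signal.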

Regulatory role - Developmental fine tuning through homology and bioinformatic analysis
~60 % of the predicted coding genes in our genome may be under miRNA-
mediated control
1. Metabolism
2. Tissue growth
3. Developmental timing
4. Stem cell biology/pluripotency
5. cancer

Titration of miRNAs to alleviate repression on mRNA targets


Example: titration of miRNA by novel cellular RNAs to alleviate repression on
mRNA targets
Cells use RNAi or siRNA based mechanisms to attack invaders, so
when double stranded viral RNA comes into the cell or gets generated,
the RNAi pathways are immediately activated and they'll chop it up
into little bits. Same with transposable elements. But some viruses
have evolved a very interesting means of circumventing that
defense.
So what I just described to you were the roles of siRNAs
in regulating invaders in an RNAi based system, and of miRNAs in
regulating developmental events. In the case of viral
infection, very often the host will use these small RNA based
mechanisms to get rid of RNA based products from the
virus.
But viruses have come up with a means of sequestering those
small RNAs by
1. Generating long non coding RNAs that have homologous or
complementary sequence to the critical miRNAs or siRNAs that could be
involved in interfering with its infection.
2. Using these circular RNAs or sponge RNAs so that they can titrate all of
the important micro RNAs that might be involved in inducing for
example, a very important immune response that would eliminate the
virus.
And so these are just two mechanisms through which this host
defense tug of war just seems to be propagating. But in this case the
viruses are using or attacking our small RNA based defenses.
piRNA
Role
piRNAs, or PIWI-interacting RNAs, are longer than siRNAs and miRNAs and differ in that
they do not depend on Dicer activity, while siRNAs and miRNAs do. They do,
however, share the requirement for an Argonaute protein, which in the case of
piRNA is called PIWI. In flies, piRNAs are transcribed from a DNA cluster made up of
disabled transposable elements. The piRNAs are modified and then bound to
their targets through antisense complementarity, while PIWI cleaves the
transposon RNA. This function helps protect the germ cells by eliminating
harmful RNAs, which is important as germ cells give rise to gametes.

RNA polymerase II goes through the clustered
transposable element sequences and makes a long piRNA precursor complementary
to sequences present in transposons. These piRNAs are then met by
PIWI, and they form a ribonucleoprotein complex whereby PIWI, the
Argonaute, and its associated piRNA will seek out RNAs that
are complementary to that sequence (these could be transposable elements that
actually got transcribed). mRNAs of transposable elements could cause
grief in the cell, as they could give rise to truncated proteins or other kinds of
weird translated products. So in order to limit that, the piRNA associated
with its PIWI Argonaute protein will pair with the transcript, and then PIWI will cut the
transcribed transposable element mRNA into two sections, and then one of
the sections can get reused to go back and be involved in an amplification
process to make more of these primary piRNA-PIWI complexes.
The whole idea is that you're protecting the germ cells from
these transcribed transposable element sequences that cause grief for
the cell. The piRNAs also have nuclear functions, so they can
assemble chromatin modifying complexes to shut down specific regions of
the genome. This is particularly important in distinguishing gene products that come
from self versus gene products that come from non-self, like viral invaders
or transposable elements.

piRNA and PIWI interactions are involved in many processes relating to gene
expression, from the regulation of mRNA stability to enhancing protein synthesis.
In simple terms, piRNA and PIWI form a ribonucleoprotein complex that chops up
transcripts of transposable elements, so they are no longer dangerous.

Dosage compensation: too much of a good thing


Females have two X chromosomes (XX), one of which remains inactivated in all cells of the
organism and is referred to as a Barr body.

Example: Alteration of inactivated state of second X chromosome

In calico/tortoiseshell cats, some regions of the coat have X
chromosome 1 active and others have X chromosome 2 active, rendering the cat's
coat colour a mix between the two different X chromosomes.
Staining with an anti-acetylated H4 antibody shows that the inactive X
chromosome is not acetylated like the others and is thus tightly wound
up and inactive.

Example: An extra X chromosome


Normally 1 X chromosome will be inactivated. Every different cell can decide
which chromosome to inactivate, but all subsequent divisions result in cells
with the same X chromosome inactivated. Regardless of the number of X
chromosomes present, only one can be active at once
XXX inactivates 2 X chromosomes
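
The counting rule stated above (only one X stays active, however many are present) is simple arithmetic. The function name below is hypothetical and the karyotypes are just examples.

```python
def barr_body_count(x_chromosomes: int) -> int:
    """Number of inactivated X chromosomes when exactly one X stays active."""
    if x_chromosomes < 1:
        raise ValueError("at least one X chromosome is required")
    return x_chromosomes - 1

for karyotype, n_x in [("XY", 1), ("XX", 2), ("XXX", 3)]:
    print(karyotype, "->", barr_body_count(n_x), "inactive X")
```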

lncRNAs
Regulation of gene expression- transcriptional interference

XIST and TSIX are antagonistic long non-coding RNAs that are both expressed in
early embryogenesis. When it comes time to decide whether to inactivate one of
the X chromosomes, the relative expression of XIST and TSIX decides
which process predominates. More TSIX than XIST leads to an active X chromosome,
while more XIST than TSIX results in inactivation of the X chromosome, seeing as
XIST coats and silences the chromosome and TSIX antagonizes XIST expression. Later in
embryonic development, the two generally become mutually exclusive.

XIST RNA
The XIST RNA is a long non-coding RNA (lncRNA) encoded by the XIST locus. The
lncRNA binds to discrete regions of the X chromosome and X-tinguishes gene
expression as it spreads down the X. Its mechanism and function aren't completely
understood yet.
I left off describing the effects of a long non-coding RNA that's critical in
the decision to inactivate a specific X chromosome during the dosage
compensation process. In mammalian female cells, the RNA gets
expressed, it's spliced, polyadenylated, and then coats the presumptive X
chromosome to be inactivated. And while coating it, somehow it's capable
of extinguishing gene expression from most of that X chromosome
in cis. We should be familiar with these terms now: where it's expressed, it's
actually acting on that chromosome. It's not working on the other
chromosome.

Mechanism
By coating the entire X chromosome in cis, XIST RNA recruits chromatin
modifying complexes. These complexes, including Polycomb complexes, carry out repressive
chromatin modifications, that is, histone deacetylation by HDAC3 and lysine
methylation of H3K27. These changes condense the X chromosome and
render it mostly inaccessible to transcription factors. In the regions where
XIST is covering the chromosome in cis, gene expression is largely X-tinguished
from the inactive X, a characteristic that is maintained through
epigenetic processes in every new daughter cell.

TSIX RNA
TSIX is a long non-coding RNA (lncRNA) whose expression biases against XIST
expression.

Systems Biology Approaches


Letters that make up the identities of individual nucleotides… how can the mysteries of biology be discovered when this forms nothing but a sequence? On its own it does not give us the info to better understand what makes life.

- How do they give rise to functions, cells, tissues, organs, systems, organisms (living things; animals, plants, fungi…)?
- This is not obvious from the sequence information; it requires deciphering = new age of biological investigation (how genes work, how they work together, how they give rise to the life we know)


Analysis of full genomes indicates that much characterisation remains to be done.

- What is the minimum essential toolkit we need? If you examine these pie charts, the predominant element (more than half of the genome) is genes we have no idea what they do (genes of unknown function); make comparisons across species.

- The basic cellular toolkit shared by each organism is strikingly conserved.


- Genes required for cellular metabolism make up a large proportion of the total number
of genes.
- Transcription and translation related genes are also present in significant number
- The vast majority of the genes identified are considered to be of unknown function

Modifying the yeast genome

- The disruption construct is introduced into diploid yeast cells to replace the appropriate region. Diploid cells carry 2 copies of each chromosome, so the gene is disrupted on one copy while the other stays wild type (yeast can live as diploids or haploids)
- Remove gene function systematically by replacing genes; yeast carry out homologous recombination events so long as the replacement used has sequences homologous to the endogenous gene in the chromosome
- These sequences will direct homologous recombination so that the new genes replace the endogenous genes; you have to know something about the sequences of all the genes you are working with
- This propensity to carry out homologous recombination lets you individually disrupt single genes using targeting/disrupting constructs = amplicons with a dominant selectable marker (can be a drug resistance gene, for example) flanked by sequences homologous to the yeast genome itself
- Carry out a PCR reaction with primers that contain these homologous regions
- Dominant selectable marker gene flanked by regions of 100% homology to the regions that flank your gene of interest = directs the homologous recombination event

- Presence of the dominant selectable marker confers drug resistance (G-418) so the cells
can grow on drug.
- When allowed to sporulate the haploid progeny (spores) will either have a wild type
chromosome or a recombinant chromosome.
- The effects of the gene replacement can then be assessed ie…viability or growth rate

- G-418 resistance = dominant selectable marker; only cells carrying it will grow in that environment (they have the resistance and can therefore be selected for if the marker inserted in the right place)
- If the disrupted gene is essential, the spores that inherit it will be non-viable = they cannot divide to form haploid colonies
- If you end up with 2 viable spores instead of 4, then it was an essential gene in basic cellular processes
- If viable cells result after sporulation, then whatever gene was knocked out is not involved in sporulation, but is rather responsible for something else
- Then once you have these isolated haploid cells, you put them under various conditions in the absence of the wild-type gene function to see how this gene plays a role in the normal physiology of this unicellular organism
- You HAVE to start with diploid yeast just in case the change is lethal; if the gene were actually essential for basic cellular processes, the cell would die… you need a backup wild-type chromosome that's not affected in case the gene eliminated is essential (you can only see that once sporulation occurs)
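
The spore-counting logic in the bullets above can be sketched as follows. It assumes an idealized tetrad from a heterozygous diploid (2 wild-type spores, 2 knockout spores) and ignores other causes of spore death; the function names are made up.

```python
def viable_spores(gene_is_essential: bool) -> list:
    """Spores expected to grow from one tetrad of a heterozygous knockout diploid."""
    # Meiosis gives 2 spores with the wild-type chromosome and 2 with the
    # recombinant (disrupted, G-418 resistant) chromosome.
    tetrad = ["wild type", "wild type", "knockout", "knockout"]
    return [spore for spore in tetrad
            if spore == "wild type" or not gene_is_essential]

def classify_gene(viable_spore_count: int) -> str:
    if viable_spore_count == 4:
        return "non-essential"
    if viable_spore_count == 2:
        return "essential"
    return "inconclusive"

for essential in (True, False):
    spores = viable_spores(essential)
    print(len(spores), "viable spores ->", classify_gene(len(spores)))
```

Starting from a diploid is what makes the essential case observable at all: the knockout only shows its lethality once the wild-type backup copy segregates away.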

Functional Genomics

- RNAi can affect all different kinds of cells in a full-grown, living animal; C. elegans is good for this because it shows the desired effects once the RNA is taken up by the organism = loss of function of a particular gene (RNA introduced into C. elegans goes through an amplification process and goes to all tissues except neurons)
- Come up with a genome-wide means of analyzing the function of every single predicted gene (19,000) in C. elegans = engineer plasmids so that each plasmid drives a dsRNA by having a T7 promoter that drives the expression of one RNA in one direction, and another T7 promoter driving the expression of RNA in the other direction; induce expression with IPTG (a way of activating the T7 promoters that will eventually make the dsRNA)
- Makes 19,000 gene constructs; each one of those constructs will make double stranded RNA that corresponds to a single predicted gene in C. elegans; then you transform 19,000 independent bacterial colonies, so that each bacterial strain makes double stranded RNA corresponding to one predicted gene; then you feed those bacteria to the animals, and each animal will show an RNAi effect typical of loss of function in that particular gene

- Identified a number of genes that fell into a few major categories.
- Genes whose knockdown gave rise to sterile animals or caused embryonic lethality, not surprisingly, fall into the major classes of genes involved in the basic cellular toolkit.
- DNA synthesis, RNA metabolism, translation, transcription: these are all very important for every single cell on the planet
- Genes that fell into the uncoordinated category, where the animals didn't move properly = generally genes that are involved in neuromuscular function, many of which are conserved up to us; just by looking at all these uncoordinated animals you can understand what genes are involved in all of these various functions. It might be synapses, it might be the way that cells send out their neurotransmitters, might be the way that they form neurotransmitters, but they all fall into this category
- Post-embryonic phenotypes tend to involve signaling, so that the organs get formed at the correct time, in the right place, and carry out their appropriate function as the animal develops (these tend to be a little bit more animal specific and not necessarily part of the basic cellular toolkit)
Transcription Factors are Modular
- You don't have to do functional genomic analysis in order to start putting labels on genes of unknown function. You can carry out more proteomic approaches
- Take advantage of the fact that transcription factors are modular: they have transcriptional activation domains and DNA binding domains. As long as these are put together, it doesn't matter if they come from different transcription factors; they always activate the downstream gene based on the sequence that the DNA binding domain interacts with
- Make fusion proteins with a protein of interest and a known DNA binding domain (ex: GAL4)
- If proteins A and B interact, the two fusion proteins (A-DNA Binding Domain and B-Transactivation Domain) may be brought into proximity to reconstitute a functional transcription factor.
- Can activate a selectable marker or a reporter gene (e.g. His, LacZ)
- 2 fusion proteins; if A and B interact, then they bring those 2 factors together such that the DNA binding domain of the bait brings the transcriptional activation domain of the prey down to the DNA
- The reconstituted TF can interact with the UAS. If, for example, a cell is compromised because it can't make histidine, then you drive a transgene: the TF activates the gene for the missing enzyme and makes it possible for the cell to make histidine, allowing it to grow


- If you put this particular construct in a cell together with a GAL4 DNA binding domain fused to your protein of interest, and in the same cell you introduce another protein that you think might interact with your protein, fused to a transcriptional activation domain, and they come together, then the GAL4 DNA binding domain will interact with the UAS and activate transcription of the downstream gene. If the downstream gene is a histidine synthesis gene, suddenly you make histidine in a cell that normally can't, and the cell grows.
- Can select very efficiently for all the cells where A and B interact
- Bait: a protein you're interested in, fused to a DNA binding domain
- Prey: a protein that you're going to query, fused to a transcriptional activation domain
- But B can be every single protein in the entire proteome, every single predicted gene product: you can make libraries that are all fused to a transcriptional activation domain, co-transform them with the bait, evaluate which cells grow, and then go back and figure out what gene product was in the prey = 2 hybrid system
- Problem: these proteins may not like going into the nucleus even though that's where they need to function in this assay…
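
The bait/prey selection described above amounts to a lookup: a cell grows only if its prey interacts with the bait. Below is a minimal sketch with invented protein names and a made-up interaction table; it ignores false positives, false negatives, and the nuclear localization problem just mentioned.

```python
def reporter_is_on(bait: str, prey: str, known_interactions: set) -> bool:
    """The reporter (e.g. a histidine synthesis gene) is transcribed only if
    bait and prey interact, bringing the DNA binding domain and the
    transcriptional activation domain together at the UAS."""
    return frozenset((bait, prey)) in known_interactions

def two_hybrid_screen(bait: str, prey_library: list, known_interactions: set) -> list:
    """Return the preys whose cells would grow without added histidine."""
    return [prey for prey in prey_library
            if reporter_is_on(bait, prey, known_interactions)]

# Hypothetical interaction table, for illustration only.
known_interactions = {
    frozenset(("protein_A", "protein_B")),
    frozenset(("protein_A", "protein_D")),
}
prey_library = ["protein_B", "protein_C", "protein_D"]

growing = two_hybrid_screen("protein_A", prey_library, known_interactions)
print(growing)
```

In a real screen the interaction table is the unknown: growth is the readout, and the prey plasmid in each growing colony is sequenced to identify the partner.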

Different Types of Protein Fragment Complementation (direct interaction)


- Reconstitute a protein that has been separated into two halves, the N-terminal and the C-terminal halves. If the bait and prey come together, this will reconstitute the protein and it may regain its actual function
- One of the most commonly used protein fragment complementation strategies uses a protein called dihydrofolate reductase that's really important for purine synthesis. If you don't have it, you don't grow; if protein X and protein Y (2 query proteins) come together, they reconstitute dihydrofolate reductase and suddenly cells are able to grow again
- You can even do it with GFP, engineered so that it's split in two, and the two parts of GFP on their own don't give rise to fluorescence. But if they're brought together by a protein-protein interaction between, in this case, protein X and protein Y, the halves of GFP will reform and reconstitute proper GFP protein that will fluoresce and that you can detect. All of these give you an idea of which proteins are interacting together in a cellular environment

This however does not work for membrane-bound proteins

BioID-using proximity labeling to tag the proteins in your neighborhood

- Proximity labeling = depends on labeling proteins that come within a very small distance of your target bait protein; not necessarily a direct protein-protein interaction
- Bait protein of interest fused to biotin ligase (enzyme)
- Biotin = vitamin B-like compound important for normal cellular processes; it has to be ligated to specific proteins on primary amines by a specific enzyme called a biotin ligase, which normally covalently modifies only a limited number of substrates within a cell; it only recognizes those substrates
- By mutagenesis, and by identifying the changes that give rise to mutant biotin ligases in bacteria, a very promiscuous biotin ligase was identified = BirA* = it will biotinylate (add a biotin molecule to) any protein with a primary amine
- Make an expression construct that will express this particular fusion protein = introduce it into cells and have it expressed correctly in those cells; theoretically the promiscuous biotin ligase will biotinylate any other protein that comes within a very close range of the protein that you're interested in = biotin tag

- Mass spectrometry and protein identification:


- can easily purify Biotin tagged proteins through affinity chromatography

- grind up those cells that were transfected with the BirA* fusion; all of the proteins that are biotinylated within the cell can be purified from a protein extract by running the extract over a streptavidin-sepharose or streptavidin-agarose column; you can wash the column to get rid of all the other garbage and then elute the biotinylated proteins from the column just by adding biotin. By competition, all the proteins will fall off and you'll have the collection of proteins that came within a specific range of your bait in the cell
- run those individual proteins through a mass spectrometer and you'll get identities
- Run it through a mass spectrometer = mass identities for each protein = helps you understand who was hanging out with your protein in the cell = important (what kind of neighbors it has; who it's interacting with to carry out its functions)
- can give you some important information about what the protein does; eventually you end up with an idea of a network of proteins that interact together

- Enhancing our understanding of protein interactions can provide clues to function
- Genome analysis
- Networks of gene products and how they interact
- Important for pluripotency
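
To make the BioID idea concrete: the promiscuous ligase fused to the bait tags whatever comes within a short distance, whether or not it touches the bait directly. The sketch below treats this as a distance cutoff. The coordinates, protein names, and the 10 nm labeling radius are all assumptions chosen for illustration.

```python
import math

def biotinylated_neighbours(bait_position, protein_positions, labeling_radius_nm):
    """Names of proteins close enough to the bait-ligase fusion to be tagged."""
    return sorted(
        name for name, position in protein_positions.items()
        if math.dist(bait_position, position) <= labeling_radius_nm
    )

# Hypothetical positions in nm relative to the bait, for illustration only.
protein_positions = {
    "protein_A": (3.0, 4.0, 0.0),   # 5 nm away: tagged
    "protein_B": (6.0, 6.0, 0.0),   # about 8.5 nm away: tagged
    "protein_C": (40.0, 0.0, 0.0),  # 40 nm away: not tagged
}

tagged = biotinylated_neighbours((0.0, 0.0, 0.0), protein_positions, 10.0)
print(tagged)
```

The tagged list corresponds to what would be pulled down on the streptavidin column and then identified by mass spectrometry.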
28. Molecular biology of gene targeting
Functions of genes:

- Function is best addressed through removal of gene activity and analysis of the resulting phenotype; abnormal phenotypes indicate that specific processes that rely on the activity of the affected gene have been disrupted
- When you examine mutant phenotypes (ex: homeobox mutations), the mutants were obtained through random mutagenesis followed by selection for mutant (defective) phenotypes
- Removal of gene activity
- Disrupt homeostasis based on random mutation -> forward genetic analysis (looking for mutants with a phenotype, without knowing which gene it corresponds to)
- Randomly mutate and look for phenotypes, then go back to find the gene: forward genetic analysis (finding mutants and then going to see which gene is affected)
- Disrupt the activity of a specific gene product to assess its function -> reverse genetic analysis (interested in what a gene does in an organism… analyze gene function and the phenotypes that arise once you remove the gene function of interest)
- To understand what a protein/gene does in the cell, go back and eliminate that gene function: reverse genetic analysis (start with the sequence, then go back and try to understand what that gene might be doing)

- Ex: Functional genomics


- C. elegans researchers “knocked down” every predicted transcription unit in the genome by using feeding RNAi… All the analyzed genes were assigned some function as determined by the visible RNAi phenotype
- RNAi in C. elegans lets us eliminate the function of every single predicted gene; what are the mutant phenotypes and how do they affect life

- GENE TARGETING in mice

Modifying the yeast genome

- The disruption construct is introduced into diploid yeast cells to replace the appropriate
region; eliminate gene functions using homology-directed replacement; as long as you have
flanking sequences with 100% homology, introduce the construct into the cell and those
sequences will direct the replacement
- Presence of the dominant selectable marker confers drug resistance (G-418) so the
cells can grow on drug.
- When allowed to sporulate, the haploid progeny (spores) will either have a wild type
chromosome or a recombinant chromosome.
- The effects of the gene replacement can then be assessed ie…viability or growth rate
- Use the same kind of properties (homology directed replacement), not only for yeast but
for other organisms
HR can also be performed in the pluripotent stem cells of mouse
- homologous recombination could also occur in pluripotent embryonic mouse cells (ES
cells).
- Use homology directed recombination to engineer/alter chromosomes such that you can replace
a section/whole gene with some dominant selectable marker that allows you to select for that
particular event
- Inner cell mass = embryonic stem cells are characteristic of the cells of this inner cell
mass; gives rise to every tissue in the body (pluripotent); can make an animal all on their
own; Embryonic stem cells = can contribute to every single tissue in a growing embryo
- Make a replacement construct with a dominant selectable marker (neomycin/G-418 resistance
gene) and flanking sequences on either side of that marker; you need very very large flanking
sequences with 100% homology (to direct a homologous recombination event and not a random
insertion event)
- G-418 resistance gene = 1st line of selection: construct integrated vs not


- Outside the flanking sequences of homology, you would like to introduce another gene =
Herpes simplex virus thymidine kinase (tk HSV); if that gene gets incorporated, the cell can
be killed when placed in an antiviral drug environment
- 2 potential results when incorporating the knockout mutation
- (1) replacement construct drives a homologous recombination event such that you replace the
gene correctly and do not include the tk HSV; these cells are thus resistant to the antiviral
drug environment
- (2) construct gets integrated randomly into sections that have nothing to do with the
targeted gene; in this case, everything will go in including the tk gene, meaning the cells
are not resistant to the antiviral drug environment

G-418 will select for all recombination events (homologous and random)

- 2nd selection: Ganciclovir is toxic in the presence of the herpes virus tk gene, so it will
negatively select against all non-homologous recombination events (homologous vs
non-homologous)
- Only ES cells that have undergone Homologous Recombination (HR) can survive the
two selection steps.
- Cells that retain tk, and are therefore probably not correct, will die in the presence of
the drug; select for those cells that went through a homology driven gene replacement event
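
The two selection steps can be sketched as a small truth table. This is only an illustrative model, not a lab protocol: the `ESCell` class and `survives_selection` function are hypothetical names, and each cell is reduced to two flags (whether it carries the G-418 resistance marker and whether it carries HSV tk).

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ESCell:
    """Hypothetical model of an ES cell after the construct is introduced."""
    has_neo: bool  # carries the neomycin/G-418 resistance marker
    has_tk: bool   # carries the HSV thymidine kinase gene


def survives_selection(cell: ESCell) -> bool:
    """Apply both selections described in the notes."""
    survives_g418 = cell.has_neo            # 1st selection: marker required
    survives_ganciclovir = not cell.has_tk  # 2nd selection: tk is lethal
    return survives_g418 and survives_ganciclovir


# The three outcomes discussed above (assumed simplification).
cells = {
    "no integration": ESCell(has_neo=False, has_tk=False),
    "random insertion": ESCell(has_neo=True, has_tk=True),
    "homologous recombination": ESCell(has_neo=True, has_tk=False),
}

survivors = [name for name, cell in cells.items() if survives_selection(cell)]
print(survivors)
```

Only the cell that gained the marker but left tk behind passes both steps, which is why tk is placed outside the homology arms.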

- ES cells are then used to populate the blastocyst of an acceptor mouse; This mouse
strain has to be another coat color that is recessive.
- Take cells out of culture, and introduce into mouse blastocyst; those cells will contribute
to the final formation of a mouse embryo
- The blastocyst is transferred to a surrogate mouse mother
- ES cells came from a mouse that actually had brown coat color (dominant color) = lets you
distinguish the genotype of the ES cells in a pool of inner cell mass cells

- In the blastocysts = black hair = wild type embryonic cells = not manipulated
- The progeny will be a mixture of both genotypes if the cells were viable and the process
worked properly.
- (1) black = unaffected by injecting manipulated cells
- (2) chimeric = brown segments come from initial ES cells that were manipulated
(mosaic of different genotypes put together)
- Hope that some of the affected cells ended up being germ cells; so you cross them…
eventually to get a homozygous pure animal that's been manipulated for that one particular
gene segment of interest; often homozygotes die early, or the gene loss is compensated for
and the gene is thus redundant

The population of pluripotent cells is therefore heterogeneous in the blastocyst

- Homogeneous/homozygous host embryos will give rise to black mice


- Embryos that are heterogeneous (have cells from the host (black fur) and the targeted ES
cells (brown fur)) will be CHIMERIC ie… coat color will be spotted/patched

- This means that the implanted cells contribute to various tissues. The hope is that one
of the tissues they contribute to is the germ line!
- 2 diff genotypes in one animal (manipulated ES cells AND non-manipulated cells from the
host); if the manipulated cells reach the germ line, the change is transmitted to the
progeny; cross these animals back and forth until you end up with a homozygous manipulated
gene
- Often mice are born with no detectable phenotype; often genes are redundant, such
that the loss of one gene product may be compensated by another

Transgenic mice are simpler to make and can provide important information

- Transgenes integrate randomly into the genome so positional effects may affect gene
expression
- Transgenic reporter genes are very important for understanding expression patterns of
genes
- Can express transgenes under endogenous promoter or heterologous control (inducible
promoters, ie heat shock or heavy metals)
- Can be used to edit the genome
- Get a fertilized oocyte, inject the DNA construct into one of the pronuclei before they
fuse; reimplant that injected zygote into a foster mother; gives rise to pups that at a very
surprising rate (10-30%) will have integrated that transgene into their chromosomes
- Induce a loss of function, overexpress to swamp systems… etc
- Use transgenic animals to edit the genome however we want!


CRISPR/Cas9-a Revolution in Biology: Bacterial acquired immune response

- Segments of bacteriophage DNA sequence are integrated into the genome of some species of
bacteria in Clustered Regularly Interspaced Short Palindromic Repeats (CRISPR). These regions
are transcribed into a primary RNA that is bound by tracrRNA (trRNA). Cas9 recognises
structures in the tracrRNA and is recruited to foreign DNA segments that are recognized by
the crRNA. Cas9 has been very well characterized in Streptococcus pyogenes. It possesses
both HNH and RuvC-like endonuclease activities

- Strange, repetitive sequences in one part of the E. coli chromosome = Clustered Regularly
Interspaced Short Palindromic Repeats; sequences within the cluster (between the repeats)
share homology with the bacteria’s worst enemy… the bacteriophage
- Bacteriophage DNA was somehow acquired by the bacteria, put into a cluster, and
interspaced
- Adaptive immune response based on acquiring chunks of DNA from the enemy, and using them
against it
- Trans-acting CRISPR RNA (tracrRNA) interacts with these interspersed regions through
complementarity
- Helps to mature the sections of this large RNA to give rise to individual CRISPR RNAs
- Hockey stick structures from diagram = stem loop structures; recognized by Cas9 = form
a complex
- Watson-Crick base pairing between the CRISPR RNA segment and the target DNA; Cas9 nuclease
will generate a dsDNA cut within the bacteriophage target, debilitating the genome, which is
then degraded by the bacterial host

- These clustered, regularly interspaced short palindromic repeats, now known as CRISPR, are in fact transcribed in the bacteria to make a
primary transcript that has all these repeat regions and RNAs that correspond to these bacteriophage genes. We'll call them CRISPR RNAs.
The bacteria also has a trans-activating CRISPR RNA (tracrRNA), shown here by this little hockey stick, that interacts through
complementarity with these interspersed regions, these repeats
- By interacting with those repeats, it will eventually help to mature that primary transcript into individual CRISPR RNAs
- The hockey stick is a specific RNA structure recognized by a protein in the bacteria = Cas9 (RNA binding protein that interacts
specifically with this stem loop)
- In doing so, Cas9 will use the CRISPR RNA to take it to a target DNA on an invading bacteriophage. And when that CRISPR RNA recognizes
the sequences that are complementary to it, Cas9 will carry out a killer double-stranded nucleolytic cleavage of the DNA; the bacteria
has this incredible adaptive immune response

Genomic Engineering/Editing
- It was found that you could actually generate an RNA such that you could eliminate the necessity of a separate trans-activating crRNA;
you could make a single RNA that had these same kinds of stem loops and you could add on an RNA sequence that recognizes almost any DNA
target and any substrate that you want to eliminate = single guide RNA
- Specificity that's conferred by the sequence relies on a protospacer adjacent motif (PAM); in order for this endonucleolytic cleavage
to work, you have to engineer your guide RNA segment such that it complements or base pairs with sequences just next to this
trinucleotide sequence, the PAM; this could be any nucleotide followed by GG
- Provided that the PAM sequence is positioned correctly, you'll get 2 endonucleolytic cleavages:
- One catalyzed by the RuvC domain; and the other by the HNH domain
- You need to introduce Cas9, which is not present in most of our cells, and you need to introduce this engineered single guide RNA with
this interesting stem loop that's going to bring Cas9 into the reaction. So you need at least two different transgenes here, one that's
going to make your engineered single guide RNA and one transgene that's going to make Cas9.
- The newly optimized Cas9 always has this NLS sequence, and then you can drive these things in any cells you want with promoters
that you've defined based on the cell type that you're interested in
- A combination of the crRNA and the tracrRNA can be expressed as a single guide RNA
(sgRNA) that will recruit the endonuclease Cas9 to a region of the genome. Specificity is
achieved through 20nt homology to a DNA gene target upstream of a Protospacer Adjacent
Motif (PAM) sequence (NGG). Cas9 will generate a DSB 3nt upstream (5’) of the PAM sequence
in the target DNA
- PAM = trinucleotide sequence with 2 Gs at the 3’ end (NGG)
- Single guide RNA will Watson-Crick base pair with one of the strands of the DNA
segment
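
The targeting rule above (20 nt of homology immediately 5’ of an NGG PAM, with the break 3 nt upstream of the PAM) can be sketched as a simple sequence scan. This is a minimal illustration under stated assumptions: `find_cas9_sites` is a hypothetical helper, it scans only the strand it is given, and it ignores everything a real guide design tool would check (the other strand, off-target matches, efficiency).

```python
import re


def find_cas9_sites(dna: str, guide_len: int = 20):
    """Return (protospacer, pam, cut_index) for every NGG PAM on this strand.

    cut_index marks the break between dna[cut_index - 1] and dna[cut_index],
    i.e. 3 nt upstream (5') of the PAM.
    """
    dna = dna.upper()
    sites = []
    # A lookahead is used so that overlapping PAMs are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        if pam_start < guide_len:
            continue  # not enough sequence upstream for a full guide
        protospacer = dna[pam_start - guide_len:pam_start]
        sites.append((protospacer, match.group(1), pam_start - 3))
    return sites


# Made-up target: 3 nt of padding, a 20-nt protospacer, an AGG PAM, padding.
example = "TTT" + "ACGT" * 5 + "AGG" + "TTT"
sites = find_cas9_sites(example)
for protospacer, pam, cut in sites:
    # The guide portion of the sgRNA has the protospacer sequence, as RNA.
    print(protospacer.replace("T", "U"), pam, cut)
```

The 20-nt protospacer becomes the variable part of the sgRNA; the PAM itself stays in the target DNA and is not part of the guide.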

A sgRNA can be expressed from a transgene using a suitable promoter:

- A separate transgene is required to express the Cas9 endonuclease so that both the
sgRNA and Cas9 are present in the same cell (nuclei). The RNA structure of the sgRNA
will be recognized and bound by Cas9 and the complex will be directed to the target
DNA site. Cas9 uses two separate endonuclease domains to cut the DNA strands at
regions that are directed by the PAM sequence.
- Insert 2 transgenes into the cell to manipulate it:
- (1) guide RNA expression: species specific promoter, target sequence, RNA scaffold
(palindromic repeats that allow it to fold onto itself and form these stem loops; Cas9
recognizes the stem loop on the RNA)
- (2) Cas9 expression: species specific promoter, Cas9 coding gene, NLS (so it can enter
the nucleus)
The double strand break causes a problem, so the cell recognizes it right away.

Nonhomologous end joining:

- The cell rejoins the broken ends quickly; in doing so it ends up making insertions and
deletions, giving rise to nonsense mutations or frameshifts
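
Why a small indel is so damaging can be shown with a toy reading frame. The sketch below is an assumption-laden illustration: the 15-nt sequence is made up, and `classify_indel`, `codons`, and `first_stop` are hypothetical helper names. It shows that a length change that is not a multiple of 3 shifts the frame, which can expose a premature stop codon.

```python
STOP_CODONS = {"TAA", "TAG", "TGA"}


def classify_indel(net_length_change: int) -> str:
    """Classify an indel by its net change in length (in nucleotides)."""
    if net_length_change == 0:
        return "no change in length"
    if net_length_change % 3 == 0:
        return "in-frame"
    return "frameshift"


def codons(seq: str):
    """Split a coding sequence into complete codons, dropping leftovers."""
    usable = len(seq) - len(seq) % 3
    return [seq[i:i + 3] for i in range(0, usable, 3)]


def first_stop(seq: str):
    """Index of the first stop codon in frame 0, or None if there is none."""
    for index, codon in enumerate(codons(seq)):
        if codon in STOP_CODONS:
            return index
    return None


wild_type = "ATGGCCTTGACGTAA"            # ATG GCC TTG ACG TAA (made up)
mutant = wild_type[:3] + wild_type[4:]   # 1-nt deletion, as NHEJ might leave

print(classify_indel(len(mutant) - len(wild_type)))
print(codons(wild_type), first_stop(wild_type))
print(codons(mutant), first_stop(mutant))
```

In this example the deletion moves the first stop codon from the fifth codon to the third, truncating the product.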

Introducing another transgene with a DNA template:

- Use the template to restore the integrity of that DNA segment; the template has to share
a great deal of homology with the region to be repaired
- Can repair the double-stranded break with a new DNA segment = homology driven repair
mechanism
- OR it can be used to change the DNA sequence; insert bits of DNA template that will be
taken up because of this recombination event (can alter the genes that encode proteins any
way you want; repair template technology)
- Can change just about anything, quick and efficient
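
Template-directed repair can be pictured as a string replacement guided by two homology arms. This is a deliberately simplified sketch: `homology_directed_repair` is a hypothetical function, the sequences are made up, and real homology arms are far longer than the few nucleotides used here. The same idea underlies the homology-directed gene replacement described earlier for yeast and ES cells.

```python
def homology_directed_repair(genome: str, left_arm: str, right_arm: str,
                             new_segment: str) -> str:
    """Replace whatever lies between the two homology arms with new_segment.

    If either arm has no exact match, the genome is returned unchanged,
    mirroring the requirement for homology between template and target.
    """
    start = genome.find(left_arm)
    if start == -1:
        return genome
    left_end = start + len(left_arm)
    right_start = genome.find(right_arm, left_end)
    if right_start == -1:
        return genome
    return genome[:left_end] + new_segment + genome[right_start:]


# Made-up locus: the "TTTT" between the arms is swapped for "ACGT".
genome = "GGGAAACCCTTTTGATTACACCGG"
edited = homology_directed_repair(genome, "AAACCC", "GATTACA", "ACGT")
print(edited)
```

Whatever sits between the arms in the template ends up in the genome, which is how the same mechanism can either restore the original sequence or introduce a designed change.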
