Biol 200 Notes
KH1-KH8
Topics
• Informational biopolymers
• Monomers: Asymmetric
• Polymerization
• Replication, Transcription, translation
Adding monomers: Replication vs. Transcription vs. Translation
• Replication: the 3' OH on the last DNA nucleotide of the chain attacks the alpha phosphate of the incoming dNTP --> the last two phosphates are dropped
• Transcription: the 3' OH on the last RNA nucleotide of the chain attacks the alpha phosphate of the incoming rNTP --> the last two phosphates are dropped
o Note: rNTPs diffuse in randomly and are only linked if Watson-Crick base pairing is respected
• Translation: tRNAs carry 1 AA for every codon (3 nucleotides)
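A minimal Python sketch of the translation entry above (one amino acid per 3-nucleotide codon), assuming an mRNA coding sequence written 5' to 3'. The function name split_codons and the example sequence are illustrative and do not come from the notes.

```python
def split_codons(mrna: str) -> list[str]:
    """Split an mRNA sequence into consecutive 3-nucleotide codons."""
    usable = len(mrna) - len(mrna) % 3  # drop a trailing partial codon
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


# Hypothetical 12-nucleotide message: 4 codons (the last one, UAA, is a stop codon)
codons = split_codons("AUGGCUUGGUAA")
print(codons)  # ['AUG', 'GCU', 'UGG', 'UAA']
```

Each sense codon is read by one tRNA carrying one amino acid, so the number of codons before the stop codon gives the length of the polypeptide.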
• Hydrophobic AA side-chains are not exposed at the surface but buried in the
core
o Hydrophobic effect:
• Coalescence of hydrophobic molecules is favored by weak non-covalent Van
der Waals intermolecular interactions
Levels of protein structure
• Amino acid
o Shape: monomer
o R groups: 20 kinds
• Primary structure
o Characteristics: peptide chain backbone with a specific sequence of AA
o Shape: twisted, snaking random coils
o Building process: translation and peptidyl transferase
o Interactions within structure: covalent bonds
o Interactions between structures: peptide bonds between amino group and carboxyl group
• Secondary structure: alpha helix
o Characteristics: local conformation of the peptide chain backbone; 60% of polypeptide chain segments are alpha helix or beta sheet
o Shape and structure: alpha helix
o Interactions within structure: H-bonds between the peptide-bond carbonyl O on one AA and the H on the amino group of a different AA; residue n H-bonds with residue n+4; periodicity = 3.6 residues per turn
o Interactions between structures: depend on side chains
• Secondary structure: beta pleated sheets
o Characteristics: local conformation of the peptide backbone
o Shape and structure: beta pleated sheets, parallel or antiparallel
o R groups: protrude above and below the sheet
o Interactions within structure: H-bonds between the peptide-bond carbonyl O on one AA and the H on the amino group of a different AA; H-bonds link adjacent strands
o Interactions between structures: depend on side chains
• Tertiary structure
o Characteristics: overall conformation of polypeptides: spatial organization of multiple secondary structures
o Tertiary structure preserves ancient evolutionary relationships: leghemoglobin has diverged in AA sequence, but its structure remains similar to hemoglobin and myoglobin
o Interactions within structure: amino acid side chains; intrachain disulphide bridges
o Example: influenza virus hemagglutinin subunits: HA2 fibrous domain and HA1 globular domain
• Quaternary structure
o Characteristics: multimeric proteins containing any number of identical or different polypeptides
o Example: three hemagglutinin heterodimer subunits (each with its own tertiary structure) interact together
• Supramolecular complex
o Characteristics: large molecular machines made of multiple distinct proteins
o Example: transcription initiation complex
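The alpha-helix figures above (residue n H-bonded to residue n+4, 3.6 residues per turn) can be sketched in a few lines of Python. This is only an illustration under the idealized assumption of a perfectly regular helix; the function names are made up for the example.

```python
RESIDUES_PER_TURN = 3.6  # periodicity of the alpha helix, from the notes


def helix_hbond_pairs(n_residues: int) -> list[tuple[int, int]]:
    """Return (n, n+4) residue pairs, 1-indexed, for an ideal helix."""
    return [(n, n + 4) for n in range(1, n_residues - 3)]


def helix_turns(n_residues: int) -> float:
    """Number of helical turns spanned by n_residues."""
    return n_residues / RESIDUES_PER_TURN


print(helix_hbond_pairs(10))  # [(1, 5), (2, 6), (3, 7), (4, 8), (5, 9), (6, 10)]
print(round(helix_turns(18), 2))
```

A 10-residue helix therefore has 6 backbone H-bonds of this kind, because the last four residues have no n+4 partner.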
o Disulfide bonds:
• Intrachain: contribute to tertiary structure
• Interchain: Contribute to quaternary structure
Motifs vs. Domains
• Motifs
o Description: combination of secondary structures forming a distinct local 3D structure
o Level of dependence: structurally dependent; few structure-maintaining bonds; the rest of the protein contributes to stability
o Size, shape and structure: small local structures; characteristic AA sequences
o Composition: secondary structures
• Domains
o Description: characteristic 3D structures
o Level of dependence: structurally independent; contain sufficient bonds to hold the domain together when independent from the rest of the protein
o Size, shape and structure: large; compactly folded
o Composition: made of various motifs
Types
• Macromolecular assemblies
• Protein folding and misfolding
o Native and denatured proteins;
• Proof 3D structure is determined by AA sequence
▪ Native conformation -- add urea --> denatured -- remove
urea --> renatured through spontaneous refolding along folding
pathways (but sometimes misfolded)
▪ Conclusion: 3D structure is determined by AA sequence
o folding pathways,
• Synthesized from N-terminus to C-terminus
• N-terminus begins to fold before C-terminus is synthesized
o Chaperones: proteins that help guide protein folding along productive
pathways by permitting partially misfolded proteins to return to the
proper folding pathways
• Recognize misfolded proteins by hydrophobic patches exposed
• Functions:
▪ Fold newly made proteins into functional conformations
▪ Refold misfolded proteins
▪ Refold unfolded proteins into functional configuration
▪ Disassemble toxic protein aggregates
▪ Assemble and dismantle large multiprotein complexes
▪ Mediate transformations between inactive and active
forms of proteins
• Types:
▪ Molecular chaperones: operate as single molecules
• HSP70: helps newly-synthesized proteins follow
correct folding pathway
• HSP: heat shock protein: because up
regulated under conditions that denature
proteins = high heat
• Mechanism:
• Bind to exposed hydrophobic
residues of nascent polypeptides
• Protect from aggregation until
properly folded
Chaperone and chaperonin mechanisms compared:
• HSP70 (molecular chaperone): cycle of client protein binding and conformational change associated with ATP binding and hydrolysis
o ATP bound to nucleotide binding site: open configuration, so misfolded protein binds to SBS
o ATP hydrolysis (DNAJ/HSP40) --> closed configuration
o ADP bound: protein folding
o ADP release, ATP binding (GrpE/BAG1) --> open configuration
• Chaperonin with GroES cap (lid): undergoes concerted ATP-binding/hydrolysis and conformational changes
o Unfolded protein binds in chamber
o ATP + cap -> closed conformation, protein folds within the chamber
o ATP hydrolysis: rate-determining step (causes GroES lid to bind to other chamber)
o ADP + GroES lid leave -> properly folded (or incompletely folded) protein released
• Chaperonin with no cap (lid is integrated into the chaperonin): twists to open and close
o Open configuration: misfolded protein enters through open lid, binds inside
o ATP binding and hydrolysis -> closed conformation, client protein folding
o Release of ADP and P -> open conformation, release protein, reset chamber
• G-protein
• "on": GTP bound
• Switch "on" to "off" with GAPs
• "off": GDP bound
• Switch "off" to "on" with GEFs
• Phosphorylation/dephosphorylation: (important in cellular signal transduction
pathways)
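The G-protein switch described above can be sketched as a two-state object in Python. This is a toy model, assuming only the rules in the notes (GTP bound = "on", GDP bound = "off", GEFs switch on, GAPs switch off); the class and method names are hypothetical.

```python
class GProtein:
    """Two-state molecular switch: on with GTP bound, off with GDP bound."""

    def __init__(self) -> None:
        self.bound = "GDP"  # start in the "off" state

    @property
    def is_on(self) -> bool:
        return self.bound == "GTP"

    def act_gef(self) -> None:
        """A GEF promotes exchange of GDP for GTP: off -> on."""
        self.bound = "GTP"

    def act_gap(self) -> None:
        """A GAP stimulates GTP hydrolysis to GDP: on -> off."""
        self.bound = "GDP"


switch = GProtein()
switch.act_gef()
print(switch.is_on)  # True
switch.act_gap()
print(switch.is_on)  # False
```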
• Two-dimensional:
• Mechanism: isoelectric focusing followed by SDS PAGE
• There is no relationship between isoelectric point and molecular
weight
• Separating by SDS PAGE lets us see the different proteins that
may have had the same charge in isoelectric focusing
• Method:
• Isoelectric focusing: separate in first dimension by
charge(isoelectric point)
• Place vertical isoelectric strip parallel to 2D gel
• SDS PAGE: separate in second dimension by size
• Mass spectrometry
▪ Mechanism: High precision analytical (non-preparative) method of
determining the mass-to-charge ratio (m/z) of ionized molecules
• Each amino acid and oligopeptide has a characteristic molecular
weight
• Molecules carrying single charge have molecular weight equal to
m/z
▪ Method:
• Electrospray ionization: produce dispersed gas-phase ions
• Fragment by high-energy collision with an inert gas
• Separated in the mass analyzer into separate populations
of differing m/z
• Measure acceleration of ions in an electric or magnetic field
• Acceleration depends on charge-to-mass ratio
▪ Tandem MS method:
• Method:
• First step: gives m/z of all fragments of polypeptide
• ions(carry electric current) accelerated through
electric or magnetic field
• Ions collide with a detector
• Acceleration determined
• m/z determined
• Second step: gives m/z of smaller fragments of target part
of original polypeptide
• Consider m/z of fragments determined in step 1,
• Decide what particles get destroyed based on m/z,
• Decide what fragments you are interested in based
on m/z
• Redirect beam of interest by altering
electric field to isolate it by its m/z
• Destroy the rest
• Isolated target fragment is destroyed on
collision
• Determine charge-to-mass of its sub
fragments
• Product ion spectrum
analyzed computationally
with respect to known
protein sequences(based on
computer translations of
genome DNA sequences)
• Identify amino acid sequence of the
peptide ion
• Characteristics:
• Fragmentation is partial and random: one or few peptide
bonds per molecule break
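As a rough illustration of the m/z idea above, the sketch below computes the m/z of a peptide ion from its sequence. It assumes positive-mode ions formed by adding protons, and it uses standard monoisotopic residue masses that are not given in the notes; the function names are illustrative.

```python
# Monoisotopic residue masses in daltons (standard values, not from the notes)
RESIDUE_MASS = {
    "G": 57.02146, "A": 71.03711, "S": 87.03203, "P": 97.05276,
    "V": 99.06841, "T": 101.04768, "C": 103.00919, "L": 113.08406,
    "I": 113.08406, "N": 114.04293, "D": 115.02694, "Q": 128.05858,
    "K": 128.09496, "E": 129.04259, "M": 131.04049, "H": 137.05891,
    "F": 147.06841, "R": 156.10111, "Y": 163.06333, "W": 186.07931,
}
WATER = 18.01056   # H2O added for the free N- and C-termini
PROTON = 1.00728   # mass of each charge-carrying proton


def peptide_mass(sequence: str) -> float:
    """Neutral monoisotopic mass of a linear peptide."""
    return sum(RESIDUE_MASS[aa] for aa in sequence) + WATER


def mz(sequence: str, charge: int = 1) -> float:
    """m/z of the peptide carrying `charge` protons."""
    return (peptide_mass(sequence) + charge * PROTON) / charge


print(round(mz("GA", 1), 4))
print(round(mz("GA", 2), 4))
```

With a single charge, m/z is close to the molecular weight (it differs only by the mass of one proton), which matches the statement in the notes; with two charges the same peptide appears at roughly half that value.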
• Liquid chromatography,
▪ Function:
• Separation of components based on their differential interactions
with an immobile(solid beads) material
▪ Structure:
• Mobile phase (aqueous buffer containing proteins) moves continuously
past the solid phase
• Protein molecules are carried along in the mobile phase
• Usually done in columns
▪ Steps:
• Separation based on differential interactions with immobile phase
• Mobile phase(usually aqueous buffer) moves past solid phase
• Proteins moved along in mobile phase
▪ Different rates depending on protein interactions with
solid phase
▪ Types:
• Gel filtration: separate by size
Topics:
DNA replication
• Polymerization:
o Alpha phosphate of incoming deoxy nucleotide triphosphate(dNTP)
reacts with the 3' hydroxyl group(OH) in the growing DNA strand
• Primer and primase: DNA polymerase cannot initiate synthesis of a new strand,
only elongate it. Primase makes a primer complementary to the DNA sequence
from which DNA polymerase can begin to elongate the strand, adding DNA
nucleotides
• Replication protein:
o RFC loads PCNA clamp onto DNA template and DNA polymerase
o PCNA is a homotrimer protein that acts as a clamp, preventing DNA
polymerase epsilon or delta from separating from the template
o Large T-antigen(looks like a tangerine): is a hexamer helicase, 6 slices,
encoded by the viral genome and serves to unwind DNA helix at the
replication fork
o Primase/polymerase alpha: DNA polymerase alpha extends the primer
made by primase with DNA nucleotide before being switched out for
polymerase delta/RFC/PCNA complex to hold polymerase to template
strand as it adds nucleotides
o Ribonuclease H and FEN-1 displace RNA at 5' end of Okazaki fragment
• Polymerase delta replace RNA with DNA
• Fragments are ligated with DNA ligase
• Replication origins are rich in A and T
• Sequence of replication:
o Unwind helix: helicase
o Leading strand primer synthesis: Primase/DNA polymerase alpha
o Extension of primer: Polymerase epsilon/RFC/PCNA replace polymerase
alpha/primase
o Further unwinding: large T proteins helicase
• Bind RPA to single stranded regions
Constant DNA damage
DNA sequencing
• Dideoxy Chain-Termination Method of DNA sequencing (classical Sanger
sequencing)
o Function: verification of individual results
o Setup:
• 4 tubes all containing
▪ DNA polymerase
▪ Oligonucleotides primer: that anneals to one end of
fragment to be synthesized
▪ DNA template
▪ dNTPs (100 mM)
▪ Chain terminator: 1 ddNTP( ddATP, ddGTP, ddTTP, ddCTP)
• No 3' OH, just H
• Synthesis stops because, without a 3' OH, the 5' phosphate of the next
nucleotide cannot be attached, so the chain stops growing
o Limitations:
• Polymerase only runs for 300-500 nucleotide sequences, so gels only
resolve that much. Do multiple short sequencing reactions to sequence a
large region
• Difficult to differentiate 300-500 by size
• Rate of sequence production is limited by total reactions that can be
performed at one time
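The logic of the four-tube dideoxy method above can be sketched in Python. This is an idealized simulation, assuming every possible terminated product is present and that product lengths are counted from the end of the primer; the function names and the short template are hypothetical.

```python
COMPLEMENT = {"A": "T", "T": "A", "G": "C", "C": "G"}


def new_strand(template_3_to_5: str) -> str:
    """Strand synthesized 5'->3' against a template written 3'->5'."""
    return "".join(COMPLEMENT[base] for base in template_3_to_5)


def terminated_lengths(template_3_to_5: str) -> dict[str, list[int]]:
    """For each ddNTP tube, the lengths at which synthesis can terminate."""
    tubes = {base: [] for base in "ACGT"}
    for length, base in enumerate(new_strand(template_3_to_5), start=1):
        tubes[base].append(length)  # a ddNTP of this base can end the chain here
    return tubes


def read_gel(tubes: dict[str, list[int]]) -> str:
    """Read the sequence by ordering all fragments from shortest to longest."""
    base_at = {n: base for base, lengths in tubes.items() for n in lengths}
    return "".join(base_at[n] for n in sorted(base_at))


tubes = terminated_lengths("TACGGT")
print(tubes)            # {'A': [1, 6], 'C': [4, 5], 'G': [3], 'T': [2]}
print(read_gel(tubes))  # ATGCCA
```

Reading the bands from shortest to longest across the four lanes recovers the newly synthesized strand, which is why the gel must resolve fragments that differ by a single nucleotide.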
• Next-Generation sequencing (NGS)
o Function: allow single sequencing instruments to carry out millions of
sequencing reactions simultaneously
o Short read sequence: few nucleotides at once
o Procedure: 1 day long
• Ligating the same linkers to a mixture of DNA fragments
• Denature DNA
• Anneal DNA to complementary primers anchored to a solid
support
• PCR: amplifying the DNA fragments in a fixed spatial arrangement
• Cut one strand of double DNA:
• Sequence the left over strand with fluorescently labeled dNTPs(different colours
per base)
• Imaging and removal of fluorophore after each cycle to avoid mixing of colours
• Nanopore technology
o Procedure:
• Single stranded DNA binds to motor protein
• Motor protein pulls DNA strand through it
• DNA goes down into pore
• Electrical current goes through the pore
▪ Causes changes in current
▪ Current changes recorded
o Advantages:
• Sequencing single molecules makes studying new biological
questions possible
• Long reads mean less need for mapping of repetitive sequences
• Portable: small like a laptop
• Sequencing Genome:
o Assembly of whole genome sequences: challenge
• For complete clone map, repeat 30 times
• Create aligned library
• Align properly
• Recombinant DNA
▪ Insert genomic DNA at sticky ends
▪ Ligate into circular plasmid with T4 ligase and 2 ATP
▪ replication
• Microarray
▪ Stable
• In Genomes:
▪ Retroviral vectors
• Vector Plasmid
• Packaging Plasmid
• Viral coat Plasmid
• DNA libraries: permanent collections of genes
o Genomic DNA: chromosomal DNA
o cDNA: reverse transcribed from mRNA
• Synthesis and cloning
• Genome: The entirety of an organisms hereditary information
o Composition: Mostly DNA(some viruses have RNA)
• Measured in base pairs (bp, kb, Mb)
• Larger genomes are often because of more transposable elements
o Biological complexity
• Unrelated to content of genome
• Molecular cloning by dilution and transformation;
• Recombinant protein expression;
• Replication origins,
• Antibiotic resistance,
• Inserts and vectors;
Genes and genomes, transposable elements
o Genes: can be considered as transcription units
• Exons: coding region or open reading frame
• Control regions: promoter and cis-regulatory factors
• Introns: separate exons and are spliced out during mRNA processing
o Transposable (mobile DNA): move within genomes
• DNA transposons:
▪ Increasing copy number of DNA transposons:
• Move transposons from region that has been replicated to
region about to be replicated to give extra transposon to
one daughter chromosome
• Can carry unrelated flanking sequences with them
• Retrotransposons are more common than DNA transposons in humans
▪ Have an RNA intermediate -> reverse transcription
▪ LTR: long terminal repeats: flank protein coding regions encoding
reverse transcriptase, integrase and more
• LINEs: non-viral (non-LTR) retrotransposons
▪ AT rich region, protein coding region and target site direct repeat
▪ Propagation of LINE:
• DNA is cut at specific site
• LINE RNA with complementary bases binds and DNA grows
complementary to the LINE
o BLAST: finding nucleic acid and protein sequence similarities
• Proteins with similar functions often have similar AA sequences
o DNA content and gene number in different species
• DNA varies more than proteins among a species
• Differences in genome sizes are mostly due to different numbers of non-
coding regions and transposable elements
• Greater gene density in lower eukaryotes than in more complex
eukaryotes
o evolutionary homology and sequence similarity, Evolutionary relationships:
• Orthologs: same protein in different species
• Paralogs: closely related proteins in the same species
o gene families
• Related genes formed by the duplication of an original single-copy gene
make up a gene family
o Solitary or single-copy genes: represented once in the genome
o Single-sequence tandem array: DNA fingerprints
o Simple-sequence repeat: used for paternity tests or identification of criminals
because they are unique to individuals
• Microsatellite DNA:
▪ Found in transcription units
▪ Expansion: several neuromuscular diseases (myotonic dystrophy,
spinocerebellar ataxia)
▪ Short repeated sequences can generate backward slippage during
replication
• Minisatellite DNA:
▪ Often in centromeres and telomeres
o Long tandem arrays of repeated sequences: non-coding sequences in
multicellular organisms
o repetitive DNA elements;
o mobile elements
o From Integrated retroviral genomic DNA to retroviral genomic RNA
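The BLAST item above rests on comparing sequences for similarity. The sketch below computes percent identity between two already-aligned sequences of equal length. It is a deliberately simplified stand-in, not BLAST's actual algorithm (which searches for and scores local alignments), and the two example sequences are made up.

```python
def percent_identity(seq_a: str, seq_b: str) -> float:
    """Percentage of positions with identical residues in two aligned sequences."""
    if not seq_a or len(seq_a) != len(seq_b):
        raise ValueError("sequences must be non-empty and of equal length")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return 100.0 * matches / len(seq_a)


# Hypothetical 8-residue protein fragments differing at one position
print(percent_identity("MKTAYIAK", "MKTAHIAK"))  # 87.5
```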
Chromosomes
• Chromatin loops; histone proteins
• origins, centromeres, telomeres; required for replication and stable inheritance
of linear chromosomes
o ARS: origin of replication of yeast
• If absent: no plasmid replication
o LEU: leu gene, allows growth without leucine
o CEN: DNA sequence for chromosome centromere
• Without it, mitotic segregation is faulty
o For the yeast chromosome to be linear: add telomeres so it can be
replicated and stably inherited
• Centromeres connecting to spindles: CENP-A -> CBF3 -> Ndc80
• telomerase
RR1-RR16
Quantitative analysis:
o Determine the levels of gene products (tumour markers, p53, BRCA 1-2)
• Diagnostics context
Molecular probes:
Labelled oligonucleotides using polynucleotide kinase
o Require a known sequence corresponding to a gene product of interest
Analysis of mRNA
Northern blot: mRNA
1. Electrophoresis in Agarose gel- diagnostics
• Must denature before migration and maintain denaturation conditions
during migration since RNA twists and folds into weird energetic
structures
▪ Heat to eliminate secondary structures
▪ Denature in formaldehyde buffer integrated into the gel
• Run through gel:
▪ Separated by sizes corresponding to the various genes that
encode them
2. Transfer to solid state support
▪ Phases in PCR
• Ground phase
• Exponential phase
• Linear phase
• Plateau phase
• Greater mRNA in sample = greater starting cDNA = the sooner the
plateau is reached
• Fewer cycles
• Lower mRNA in sample = lower starting cDNA = the later the
plateau is reached
• More cycles
▪ Compare curve with standard curve of varying conc samples
• Standard curve has all sample go exponential
▪ Get a relatively accurate value of amount of cDNA in original sample, thus
how much mRNA was present in the specific sample
▪ Good to quantify the regulation of one specific transcript
• Not so much a global view
• More precise
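The relationship above between starting cDNA and number of cycles can be sketched numerically. This assumes idealized exponential amplification (perfect doubling each cycle by default) and a fixed detection threshold; the threshold is an assumption introduced for the example, since the notes speak of reaching the plateau, and the copy numbers are hypothetical.

```python
import math


def cycles_to_threshold(start_copies: float, threshold_copies: float,
                        efficiency: float = 1.0) -> float:
    """Cycles needed for start_copies to grow to threshold_copies.

    efficiency = 1.0 means the product doubles every cycle.
    """
    return math.log(threshold_copies / start_copies) / math.log(1.0 + efficiency)


high = cycles_to_threshold(start_copies=1_000, threshold_copies=1_024_000)
low = cycles_to_threshold(start_copies=100, threshold_copies=1_024_000)
print(round(high, 2), round(low, 2))
```

A sample with ten times less starting cDNA needs about log2(10), roughly 3.3, extra cycles under these assumptions, which is the basis for comparing a sample against a standard curve.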
cDNA libraries: DNA based representation and abundancies of mRNA present in
initial sample
1. Purify RNA from a given sample(mRNA has poly A tail)
2. Prime mRNA with single stranded poly/oligo dT primer
3. Reverse transcribe into cDNA by growing from the dT primer
**Single stranded cDNA strand for every RNA in the given sample**
4. Add alkali to remove RNA from cDNA
5. Add poly dG tail to cDNA 3' end using terminal transferase
6. Hybridize the 3' end of cDNA where the polyG tail is with oligo-dC primer
7. Grow DNA from poly dC primer:
i. Generate a second strand of DNA--> double stranded DNA
**cDNA library gives permanent representation of all RNAs and their
abundancies in the original sample**
8. DNA polymerase I progresses through any remaining hybrid regions and extends
the second strand
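The first- and second-strand steps above amount to taking reverse complements. The sketch below is a simplified illustration that ignores the oligo-dT primer chemistry and the added poly dG/oligo-dC tail; the function names and the short mRNA are hypothetical.

```python
RNA_TO_DNA_PAIR = {"A": "T", "U": "A", "G": "C", "C": "G"}
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def first_strand_cdna(mrna_5_to_3: str) -> str:
    """cDNA (written 5'->3') made by reverse transcribing the mRNA."""
    return "".join(RNA_TO_DNA_PAIR[base] for base in reversed(mrna_5_to_3))


def second_strand(cdna_5_to_3: str) -> str:
    """Second DNA strand (written 5'->3') copied from the first-strand cDNA."""
    return "".join(DNA_PAIR[base] for base in reversed(cdna_5_to_3))


mrna = "AUGGCCAAAA"                 # toy message ending in a short poly A tail
cdna = first_strand_cdna(mrna)
print(cdna)                         # TTTTGGCCAT
print(second_strand(cdna))          # ATGGCCAAAA
```

The first strand begins with the T's laid down from the oligo-dT primer, and the second strand is a DNA copy of the original mRNA sequence.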
TRANSCRIPTION
Overview of transcription
Conventions/components of template
o Transcription synthesizes pre-mRNA: RNA polymerase II reads a DNA
template strand from 3' to 5' while adding rNTPs to elongate a complementary
RNA strand. This RNA strand has the same sequence as the non-template
DNA strand but with Us instead of Ts, and is referred to as pre-mRNA because it has
not yet been processed.
o Regulator regions:
• Promoters:
▪ Can be upstream or downstream
▪ Comprises sequences that regulate efficiency of transcription
• Coding sequence:
▪ 5' UTR and 3' UTR
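The template convention described above can be checked with a short Python sketch: the RNA is complementary to the template strand and matches the non-template strand with U in place of T. The function names and the six-base template are illustrative only.

```python
RNA_PAIR = {"A": "U", "T": "A", "G": "C", "C": "G"}
DNA_PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """RNA (5'->3') built while reading the template strand 3'->5'."""
    return "".join(RNA_PAIR[base] for base in template_3_to_5)


def non_template_strand(template_3_to_5: str) -> str:
    """Non-template DNA strand (5'->3') paired with the template."""
    return "".join(DNA_PAIR[base] for base in template_3_to_5)


template = "TACGGT"  # written 3'->5'
print(transcribe(template))           # AUGCCA
print(non_template_strand(template))  # ATGCCA
```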
Transcription proteins: RNA polymerase
o RNA polymerase I
o RNA polymerase II
• Must be very faithful and not fall off midway through a 24 hr
transcription
• Reads the template strand that goes 3'-5'
• Advances at a rate of 1000-3000 nt/min
o RNA polymerase III
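The elongation rate quoted above (1000-3000 nt/min) can be turned into transcription times with simple arithmetic. The gene lengths used below are hypothetical, chosen only to show the scale implied by a 24 hr transcription.

```python
def transcription_hours(gene_length_nt: int, rate_nt_per_min: float) -> float:
    """Hours needed to transcribe a gene at a constant elongation rate."""
    return gene_length_nt / rate_nt_per_min / 60.0


# Lengths that would take a full day at the slow and fast ends of the range
print(transcription_hours(1_440_000, 1000))  # 24.0
print(transcription_hours(4_320_000, 3000))  # 24.0
```

Under these rates, a 24 hr transcription corresponds to a transcription unit of roughly 1.4 to 4.3 million nucleotides, which is why the polymerase must stay attached for so long.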
Stages of Transcription
o Initiation:
• Polymerase binds to the promoter sequence,
• locally denatures the DNA,
• catalyzes the first phosphodiester linkage
o Elongation:
• RNA polymerase advances 3' to 5' down the template strand, denaturing
the DNA and polymerizing the RNA
• Polymerization is favored:
▪ High energy bond between alpha and beta phosphate is replaced
by a lower-energy phosphodiester bond
o Termination:
Prokaryotic transcription
Function/mechanisms
o Sigma factors
• Confer specificity to RNA polymerase
• Ensure efficient transcription rate
o DNA binding protein
• Regulate the rate of RNA synthesis
▪ Enhance RNA polymerase binding to promoter region
▪ Inhibit/impede RNA polymerase binding to promoter region
o Lots of allosteric regulation controlled by catabolites
▪ Means of adapting
o Polycistronic mRNA:
▪ One mRNA is transcribed from several adjacent genes
▪ Multiple proteins are translated from that single mRNA
Transcription proteins: RNA polymerase
• Types:
▪ Polymerase I:
▪ Sensitivity to toxin alpha-amanitin: not affected
▪ Polymerase II
▪ Sensitivity to toxin alpha-amanitin: highly sensitive
▪ Polymerase III
▪ Sensitivity to toxin alpha-amanitin: Slightly sensitive
• 3D structure
Eukaryotic Transcription
Function: Especially important in development
• Distinguish cell types that will differentiate and make us
• Mostly done during embryogenesis
• Monocistronic mode of transcription:
▪ make 1 RNA that results in one protein
Polymerases
• RNA polymerase I
• RNA polymerase II
• RNA Polymerase III
Structure: 3D RNA polymerase
• Exist in multimeric complexes
• Are similar to bacterial sub units
• Function:
• Act at a distance, sometimes kilobases away from their regulatory targets
• Chromosomal regions far away on the sequence but close upon
looping
• Loops are often associated with active transcription
• Enhancers may help to generate, stabilize and increase the rate of
transcription within loops, even if they are linearly far apart
• Examples:
▪ 3 promoters for the expression of Pax6 that function in different cell
types and at different times during embryonic development
o Identification of transcriptional regulatory regions
Method 1
Method 2
• Identification of cis-acting regulatory sites through linker scanning mutations
▪ Reporter genes: relative quantification of transcriptional efficacy
• Scatter the sequences of specific overlapping regions of one gene
and see which are likely to have important regulatory sites
• Examples:
• GFP
• B galactosidase (lacZ)
• Thymidine kinase (tk)
• Luciferase (luc)
• Chloramphenicol acetyltransferase (CAT)
• Method:
• Cross-linking proteins to other proteins or DNA in living cells by adding
formaldehyde to the media
• Isolate and fragment cross-linked chromatin into lengths of 2-3
nucleosomes(~300 bp)
• Immunoprecipitation of fragmented DNA using antibody specific to a
target protein
• DATA: number of times per million bases sequenced that a specific
sequence from a region of the genome was identified among the immunoprecipitated fragments
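The normalization in the DATA step above can be sketched as follows. This is a minimal illustration, assuming the raw inputs are a hit count for one region and the total number of bases sequenced; the function name and the numbers are hypothetical.

```python
def hits_per_million(region_hits: int, total_bases_sequenced: int) -> float:
    """Times a region's sequence was identified per million bases sequenced."""
    if total_bases_sequenced <= 0:
        raise ValueError("total_bases_sequenced must be positive")
    return region_hits * 1_000_000 / total_bases_sequenced


# Hypothetical experiment: 50 hits for one region out of 2 million bases sequenced
print(hits_per_million(50, 2_000_000))  # 25.0
```

Normalizing this way lets regions be compared across experiments that sequenced different total amounts of immunoprecipitated DNA.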
Other transcription-control elements located near transcription start sites
Enhancers: located far from the gene they regulate
o Regulate cell-type specific transcription and how frequently specific genes are
transcribed
Preinitiation complex(PIC)
General transcription Factors
TFIID
o First protein to bind to a TATA box promoter
o Subunits:
• 1 TBP:
DNA-binding activity
Identify when DNA binding occurs
Electrophoretic mobility shift assays (EMSA): identify DNA binding activity (but
not the specific sequence that is bound by the proteins). Depend on the migration of
DNA through a gel and how it is affected by binding to proteins
o Probe:
• Radiolabelled dsDNA segment
▪ 5' end labelling with oligonucleotides corresponding to cis-acting
elements
Or knowing double stranded sequence,
• Label double stranded molecule formed in PCR = double stranded molecules
o Forming Protein:DNA complex
• Proteins interacting with sequence specific DNA
o Run mixture of protein and free DNA probes through non-denaturing
polyacrylamide gel (Positive control)
• Running free DNA with full nuclear extract proteins it may or may not
recognize
•Free probes go a given distance, but shift is different when it is carried
with a protein complex. Conformational change dependent on
interactions with proteins
o Chromatography to separate nuclear extract proteins into fraction
• Fractions have different compositions of proteins depending on the
nature of the chromatography
• Run the gel for each nuclear extract protein fraction
o Note: synthesis of oligonucleotides bound together to make double stranded
substrate
o Evaluate whether a protein can interact with a given DNA segment
• Test for the presence of a nuclear protein by taking a small volume and
mixing with DNA probe, then the associated protein should give rise to a
shift on the gel
• Where there are bands on the gel, a protein complex present in fraction
1, 7 and 8 are bound to a DNA probe
Cotransfection:
Test DNA binding transcription factors
When co transfected, there should be strong expression of the promoter genes
Mutation sequences so expressed protein no longer interacts properly
Function of promoter is dependent off factors
Function
o Recognition Helix: Alpha-helix domain
• Recognize specific DNA bases within that DNA region
• Due to + AA in the region, interactions are favored through associations
with electronegative phosphate
o Non-covalent binding
o Interaction with the major groove of DNA
o Structural characteristics
Modular structure
o Most transcription factors have multiple domains that each perform distinct
functions
• Example: GAL4 transcription factor from yeast( critical for utilizing
Galactose)
Transcription factors are almost always associated with another domain, which directly
impacts transcription
o Domains for:
• DNA binding
• Transcriptional activation
▪ Do not yet understand their structure
• Relatively unstructured
• Not sure how function is caused by structure
• Transcriptional repression
• Chromatin remodelling
• Nuclear import
• Protein interactions
Protein Motifs
o Homeodomain proteins: present in many transcription factors
• Function
▪ Domains can interact with DNA
• Types
▪ C2H2 types:
• Usually contain three or more finger units and bind to
DNA as monomers
▪ C4 types: 4 cysteines coordinated with zinc ions to give rise to
fingers
• Usually contain only two finger units and bind to DNA as
homo/heterodimers (Steroid receptors)
• Typical of glucocorticoid hormone receptors
▪ C6: 6 cysteines
• Zinc finger transcription factor: variation wherein six
cysteine metal ligands coordinately bind two Zn2+ ions
The Mediator
Discovery
Researchers were wondering how DNA binding transcription factors could have
an effect on transcription with their intrinsically disordered (unstructured)
activation domains. They knew that these DNA binding
transcription factors would bind to enhancers or proximal-promoter elements,
but weren't sure how this affected transcriptional efficiency.
The mediator was first discovered in yeast, but a homolog was found in humans
Function
Mediators are behind the efficiency of transcription when DNA binding
transcription factors bind to enhancers and proximal promoter elements. This
binding takes place through interactions between the DNA binding domain and
cis acting elements. It is important to note that the mediator complex is found in
a larger complex containing RNA polymerase which is referred to as a
holoenzyme.
The mediator complex has 31 sub units but can be separated into 3 major
domains, the middle, head and tail.
Tail domain
The tail region of the mediator interacts with transcription activation domain of
transcription activators.
Subunits of the mediator interact directly with DNA binding transcriptional
activators, which bind to DNA regions through the cis acting elements, along with
RNA pol II. The subunits are independent in the sense that a mutation in one doesn't
directly affect the rest of the mediator or overall transcription, but might
impair or disable activation by specific transcription factors, whether they
bind promoter-proximal elements or enhancers.
Head domain
The three subunits of the mediator are associated with various transcription
factor activation domains which help mediate the complex's function. Through
these interactions, the mediator complex bridges vast sections of chromatin to
enhance transcriptional initiation and also ensures that RNA polymerase II binds
optimally to initiate transcription.
The looping of chromatin allows regions normally far from each other in its linear
form to be closer together. The mediator will mediate the effects of enhancers
and their binding proteins on RNA polymerase II.
Transcriptional Activation/Initiation
Structure of DNA in transcription
We explain transcription in a linear manner to simplify it, but this is not the form in
which it takes place.
Highly transcribed Genes
What is the cause for more highly transcribed genes?
This question can be answered by looking into the dynamics of transcriptional
initiation and elongation
Transcription efficiency: identify the particular RNA and how much is being
produced by introducing a secondary structure such as a stem loop in the 5'
region of a trans gene X such that a protein tagged with GFP can recognize
and bind to it.
When gene X is transcribed and makes RNA, the stem loop structure will fold
up into its proper configuration and the GFP tagged proteins should be able to
bind to all the stem loops of all transcribed RNAs
Embryonic development in the fly (Drosophila)
Gastrulation: cells change in shape and this results in morphological changes
in the embryo
The relation between the effectiveness of the enhancer and the efficiency of
transcription is in how stronger enhancers lead to a greater frequency of bursts.
Mechanism
The droplets aren't actually just present in the posterior end, that is just where
they condensate. Rather, in the anterior end, the droplets are soluble, and this is
why they are not seen.
Formation of condensates
The MED1 subunit of the mediator has an intrinsically disordered region that
contributes to its function. Researchers inserted a variant of MED1 with its
intrinsically disordered region along with mCherry into a cell. These MED1
subunits form puncta (aggregate together) made of thousands to millions of
copies of MED1, which don't allow for the penetration of mEGFP. However,
when a bromodomain activator with an intrinsically disordered region
called BRD4-IDR tagged with mEGFP is added to the MED1-IDR tagged with
mCherry, an overlap can be seen. This overlap is due to co-localization since
both MED1 and BRD4 have sticky intrinsically disordered domains.
Elsewhere, they would be soluble, but when they begin to aggregate
together due to their IDR, they attract more and form larger and larger
liquid-liquid condensates
Structure
Proteins that would normally come together congregate in these liquid-liquid
condensates so that they can preserve their function.
When the mediator and RNA polymerase II were labelled, the two would at times co-localize,
suggesting the activation of transcription of active genes, but not all the time.
Proteins forming in loops requiring mediator and RNA polymerase II activity to
activate the associated genes would form condensates along with other
macromolecules. Condensates form around these protein loops where all macros
can concentrate, carry out their functions and dissociate, which illustrates how
mediator and RNA polymerase II separate. This is the basis of BURSTS.
Establish condensate based on mediator and trans acting factors that loop out the
chromatin, concentrating together to bring out RNA polymerase II and general
transcription factors until TFIIH melts DNA and fires RNA polymerase II.
Destruction of condensates
Epigenetic marks are reversible chemical modifications to DNA or histones that allow genes
to be expressed in different ways. These marks, like H3K4 or H3K9 methylation,
can activate or repress the expression of a given gene and are often inherited
following cell division.
Epigenetic Erasers are what makes epigenetic marks reversible, that is, not
permanent
Methylation
The lysine residue of a histone is methylated, Me 1, Me2 or Me3, at the nitrogen atom
of the terminal epsilon group of its side chain. Whether mono-, di- or tri-methylated, the
lysine will hold a single positive charge. Methylation of histone's lysine residue has a
slower turnover which means it is more beneficial to do so as a post-translational
modification so as to propagate epigenetic information.
Methylation
H3K4 (Histone 3 lysine 4)
- Mono-Methylation in enhancer -> Activation
- Di
- Tri-Methylation in promoter region -> Activation
Demethylation
The methyl group(s) on a lysine residue can be removed using histone lysine
demethylase
Ubiquitinylated lysine
Example: proper methylation of histone H3 lysine 9 during chromosome
replication(8.6)
The replication of a parent DNA molecule with H3K9 methylations will result in two
half methylated daughter chromosomes. This is where histone methyltransferase
HMT steps in as both an epigenetic reader and writer. HMT will recognize the
methylated H3K9s and identify the neighbouring naïve H3K9s with . Then, it will
catalyze the methylation of all of the unmethylated H3K9, ensuring that all
histones have the right tag.
The chromo domain of the repressor complex binds to H3K9me3, the tri-methylated
lysine 9 of histone H3. The binding of the chromo domain leads to the recruitment
of corepressors. Chromatin condenses and heterochromatin is formed, repressing
transcription.
Repressor-directed histone deacetylation complexes (HDAC)
The DNA-binding domain (DBD) of the repressor Ume6 interacts with an upstream
control element called URS1. Its repression domain (RD) binds to Sin3, part of a
multiprotein complex that includes the histone deacetylase Rpd3. The
deacetylation of the histones' N-terminal tails on nucleosomes near the
Ume6 (repressor) binding site (on URS1) leaves the histone tails highly positive
and drawn to the opposite negative charge of the DNA backbone, closing
down the chromatin in those regions. The closed-down chromatin will inhibit
the binding of general transcription factors at the TATA box and result in the
repression of gene expression.
Silencer sequences
In the silencer region, enzymes cannot access the DNA to interact with it,
suggesting there must be a physical barrier, likely something to do with
histone modifications.
RAP1: RAP1 is the first DNA-binding protein that recognizes sequences in the
telomere and silencer regions and acts as a transcription factor. These
regions can be HML or HMR of the mating-type loci, or telomeric sequences.
Once bound to this region, RAP1 recruits the SIR proteins through
protein:protein interactions, and SIR proteins 2, 3 and 4 are also drawn
to the hypoacetylated histones in the region surrounding RAP1.
* SIR: silent information regulator
SIR1: SIR1 works with RAP1 and is involved in binding the silencer region
to the loci to be silenced (the silent mating-type loci in this case).
SIR2, 3, 4: SIR2, 3 and 4 join through protein-protein interactions with
RAP1 at the silencer or telomeric region and form a complex
around the DNA to be silenced. These complexes bind to the N-terminal
tails of deacetylated histones H3 and H4.
SIR2, the histone deacetylase, will remove the acetyl groups to leave histone
tails bare and favor chromatin condensation. After SIR2 acts, SIR3 and SIR4 sit on
the telomeres to form a higher-order complex.
EX: Telomeres
Identifying factors required for repression through in situ
hybridization/immunofluorescence of the silent mating-type loci. Figure
out where the telomeres are and carry out immunofluorescence: you
see SIR3 overlapping with the telomeres. This means SIR3 must be
present at the telomeres and must have something to do with them being
silenced.
[Figure labels, histone tail modifications: phosphorylated serine/threonine,
methylated arginine, methylated lysine, ubiquitinylated lysine]
Pioneer transcription factors are the first transcription factors on site and
have the ability to recognize their target sequences even in the compacted
state of chromatin. That is likely because these sequences lie on the outer
surface of the nucleosomes, making them easily accessible. These pioneers
are of significant importance for the gene expression taking place
during embryogenesis, activating the transcription of specific genes in charge
of differentiating between cell types. Their binding to the DNA leads to the
transcription factor cascade. They also recruit coactivators that, with the use
of free energy, will modify histones and confer conformational changes. This
means the chromatin will loosen up and give space to the Mediator complex.
The Mediator complex is recruited to the site of transcriptional initiation
The mediator recognizes and binds to the transcriptional activation
domains, loops the chromatin and recruits RNA polymerase II to perform
transcription during embryogenesis, a time when the chromatin is highly
compacted and transcription is at a minimum.
Diagram:
H3K4 mono-methylation is present a little bit throughout with
specific peaks. H3K4 mono methylation is associated generally
with enhancers. Very often, you can identify where specific
enhancers are in the genome by carrying out a ChIP-seq
experiment with antibodies that recognize this histone
modification.
H3K4 di-methylation is associated both with enhancers and with
active regions around proximal promoter elements and even the
start sites. The pattern for H3K4 di-methylation differs from
mono-methylation: some peaks are shared in certain
circumstances, but occasional peaks reach significantly
higher.
H3K4 trimethylation identifies regions around active promoters
right around the start site in general. There's only 1 peak on this
chromosome where you have this presumably active transcription
going on.
Large subunit
The large subunit of RNA polymerase II is unique among the RNA polymerases as it
contains a carboxy-terminal domain (CTD), an identifying feature of RNA polymerase
II. In humans, the CTD heptapeptide YSPTSPS is repeated 52 times, and in yeast,
approximately 26 times.
Note: YSPTSPS is a heptapeptide. The TFIIH protein kinase
phosphorylates serine 5 of the heptapeptides during initiation, and later on, a
second phosphorylation happens on serine 2.
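As a small worked example of the repeat arithmetic, the sketch below builds an idealized CTD and lists where serine 2 and serine 5 fall. It assumes every repeat is a perfect YSPTSPS copy (real CTDs contain some variant repeats), and the function names are hypothetical.

```python
HEPTAD = "YSPTSPS"  # Tyr-Ser-Pro-Thr-Ser-Pro-Ser

def build_ctd(repeats):
    """Return an idealized CTD made of perfect heptad repeats."""
    return HEPTAD * repeats

def serine_positions(repeats, serine_in_heptad):
    """1-based CTD positions of a given heptad serine (2, 5 or 7) in every repeat."""
    if HEPTAD[serine_in_heptad - 1] != "S":
        raise ValueError("that heptad position is not a serine")
    return [repeat * len(HEPTAD) + serine_in_heptad for repeat in range(repeats)]

human_ctd = build_ctd(52)             # 52 repeats x 7 residues
ser5_sites = serine_positions(52, 5)  # phosphorylated by TFIIH at initiation
ser2_sites = serine_positions(52, 2)  # phosphorylated later on
```

With 52 repeats the idealized human CTD is 364 residues long and offers 52 Ser5 and 52 Ser2 sites; with 26 repeats (yeast) the numbers are halved.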
The RNA polymerase will undergo initiation, pausing, in which capping enzymes
are added, and elongation, in which the mRNA strand grows. It transcribes about
100 nucleotides before pausing.
Initiation
Pausing
The pausing of RNA polymerase is an essential step in transcription as it allows
the factors that block elongation, that is, NELF, to be exchanged for those that
enhance it, say DSIF, SPT6 and PAF.
But how does the polymerase actually stall?
Two negative factors, DSIF and NELF, slow and pause RNA polymerase II near the
first nucleosome. The pause allows for the protection of the mRNA. Release from
the pause depends on the phosphorylation of the CTD, a process mediated by
CDK9/P-TEFb. Note that phosphorylated DSIF causes the closing of the clamp.
Elongation
mRNA
Pre-mRNA to mRNA
Overview of splicing
Eukaryotic genes code for both introns and exons, so pre-mRNAs, the primary
transcripts, are composed of both introns and exons. The transition to mature
mRNA involves splicing the introns out of the pre-mRNA to leave a mature
mRNA containing only exons. Although they are spliced out, introns do serve a
purpose as they can encode regulatory information.
Introns
In pre-mRNA, the 5' boundary of the intron is a GU. The 3' boundary of the
intron, next to the 5' end of the adjacent exon, always has an
AG and, 20-25 nucleotides upstream, a branch point at an A nucleotide. The GU
and AG at the 5' and 3' splice sites, as well as the A at the branch point, are
all highly conserved.
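The three conserved elements can be checked on a candidate intron with a few lines of code. This is a minimal sketch: the distance window of 20-25 nucleotides is taken from the notes, the distance is measured from the A to the first base of the 3' AG (an assumption), and `check_intron` and the toy sequence are made up for illustration.

```python
def check_intron(intron, min_dist=20, max_dist=25):
    """Report the conserved splice signals of an intron written 5' -> 3' in RNA letters."""
    ag_start = len(intron) - 2  # index of the A in the terminal AG
    branch_candidates = [
        i for i, base in enumerate(intron)
        if base == "A" and min_dist <= ag_start - i <= max_dist
    ]
    return {
        "starts_with_GU": intron.startswith("GU"),
        "ends_with_AG": intron.endswith("AG"),
        "branch_A_candidates": branch_candidates,
    }

# Toy intron: GU, filler, a single branch-point A 20 nt upstream of the AG, filler, AG
toy_intron = "GU" + "C" * 10 + "A" + "C" * 19 + "AG"
report = check_intron(toy_intron)
```

A mutation of the 5' GU (for example to GC) flips `starts_with_GU` to False, which corresponds to the kind of splice-site mutation that blocks splicing.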
Spliceosome
The spliceosome is composed of 5 snRNPs and 6 to 10 proteins. snRNPs are small
nuclear ribonucleoprotein particles built around snRNAs (small nuclear RNAs): U1,
U2, U4, U5 or U6. snRNAs are essential to splicing and each has its designated
function. The snRNPs, pronounced "snurps", are named for their snRNAs.
Spliceosome cycle
1) U1 and U2 snRNP
U1: contacts the intron's 5' border (3' end of upstream exon)
The U1 snRNA base-pairs with the pre-mRNA at the exon 1/intron boundary
It must be noted that snRNAs require interactions with other
RNAs for splicing to occur efficiently. RNA:RNA pairing is critical
for U1 function; therefore, when there is a mutation at the 5'
splice site of the pre-mRNA, splicing is blocked. Splicing can be
restored by a compensatory mutation in the U1 snRNA that lets it
base-pair with the mutated pre-mRNA splice site again.
3) U1, U4 output
U1 and U4 exit the complex leaving the active spliceosome behind with
U2, U5 and U6
4) Transesterification reactions with no net expenditure of energy following
spliceosome formation
Reaction 1
The 2' hydroxyl group of the A residue at the branch point attacks the 5'
phosphate group of the first intron residue (G), leading to the formation
of a lariat.
Reaction 2
The free 3' hydroxyl of exon 1 attacks the 5' phosphate of the first residue of
exon 2, resulting in the joining of the two exons by a phosphodiester bond
and the release of the intron lariat.
5) Unlooping the lariat intron with debranching enzyme to form a linear intron
RNA
The linear form of RNA shown here must be spliced to give rise to linear form
Even when the entire reaction is treated with substances that digest the proteins,
the RNA can splice itself as long as Mg2+ is added
2 groups:
Group II introns are self-splicing introns that form a structure very
similar to that of the spliceosome. Group II introns are present in
mitochondrial and chloroplast genes, and they may be the
evolutionary predecessors of other introns.
RNA PROCESSING
RNA Binding Proteins
RNA binding proteins are made up of different RNA-binding domains that
bind RNA through their shapes and opposite charges and so contribute
to RNA processing. RNA binding proteins are essential, for example, in
setting up the splicing apparatus, deciding where it will sit on the transcript and
where it will begin to splice.
RRM domains are made up of beta-pleated sheets with positively charged residues
that will interact with the negatively charged RNA
The polypyrimidine tract binding protein contains RRM domains, allowing it to
interact with the conserved polypyrimidine tract in the introns slightly off the 3'
side of the branch point adenosine.
RNA binding proteins are essential in directing the splicing apparatus to where it
will sit and begin splicing. The whole complex is referred to as the cross-exon
recognition complex made up of SR protein:protein/snRNP interactions.
U2AF
Small subunit of 35 kd interacts with the nucleotides around the 3' end
of introns including the AG dinucleotide.
Large subunit of 65 kd interacts with sequences around the
polypyrimidine region which will define at least the 3' end of the intron
to be spliced out
Overall, U2AF helps with splicing efficiency
SR proteins
SR proteins are RNA binding proteins with RRM domains and protein:
protein interaction domains that identify the location of exons by
interacting with exonic splicing enhancers within the exons themselves.
SR proteins are rich in serine and arginine, which helps them bind
and cover exons. The covering of the exons by SR proteins helps U2AF
locate the AG dinucleotide at the 3' splice site, U1 snRNP bind to the 5'
splice site and U2 snRNP bind to the A of the branch point.
U1 snRNP
Uses the information provided by SR proteins to locate the 5' end of the
intron so it can sit on the GU dinucleotide
Exonic splicing enhancers are sequences within the exon that promote exon
joining during splicing
Alternative splicing
Alternative splicing is a temporal and tissue-specific splicing method that can result
in different gene products and different proteins with different properties. In
alternative splicing, we start with the same gene and transcribe it to get pre-mRNA,
but then, depending on the location and the desired characteristics of the protein
to be translated, the cell keeps or splices out certain parts of the RNA.
In alternative splicing, products can be traced back to the DNA that encodes them:
nothing has changed, and the sequences are consistent.
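A toy illustration of this point: the same set of exons can give different mature mRNAs, yet every piece of each isoform is still found in the original gene. The exon names and sequences below are invented for the example.

```python
def splice(exons, keep):
    """Join the selected exons in their original 5' -> 3' order."""
    return "".join(seq for name, seq in exons.items() if name in keep)

# Hypothetical three-exon gene (exons listed in genomic order)
exons = {"E1": "AUGGCU", "E2": "GAAUUC", "E3": "UGGUAA"}

full_length = splice(exons, {"E1", "E2", "E3"})  # all exons kept
skipped = splice(exons, {"E1", "E3"})            # exon 2 spliced out
```

Here `full_length` is `AUGGCUGAAUUCUGGUAA` and `skipped` is `AUGGCUUGGUAA`: two different products, both fully consistent with the encoding sequence.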
Sex-lethal gene
The need for dosage compensation is why females inactivate their second X
chromosome. Male and female flies are really quite different and have
sexually dimorphic characteristics that differentiate them from one another.
These differences in sexually dimorphic characteristics are controlled by a
cascade of RNA binding events that result in alternative splicing, driven in
the case of sex determination in Drosophila by the RNA binding protein
Sex-lethal.
Sex-lethal (Sxl) is under transcriptional control, that is, expressed only in
females in early embryogenesis. Later in development, the female specific
sex-lethal promoter is repressed and a late Sxl promoter is activated in both
sexes. The later pre-mRNA must undergo alternative splicing, which will only
be done appropriately and yield functional Sxl in the presence of the early Sxl
proteins.
Cascade
The activity of a single RNA binding protein, Sex-lethal, drives an RNA binding
protein cascade that makes sex-specific Doublesex transcription factors,
conferring more sex-specific characteristics and leading to flies of two different
sexes.
Cleavage
Capping
Polyadenylation
All mRNAs except histone mRNAs are polyadenylated, and all mRNAs, except
histone mRNAs, that lack a poly(A) tail are rapidly degraded within the nucleus.
Histone mRNAs have unique secondary structures in their 3' UTRs and although
they lack poly(A) tails, will not be degraded.
Processing
Not spliced, but transcribed spacers are removed.
RNA polymerase I makes a long pre-rRNA which will be processed and cleaved
resulting in an 18 S RNA, 5.8 S RNA and a 28 S RNA
Transfection
If you take a portion of DNA encoding the rRNA, that is, the repeat segments, and
introduce it into a Drosophila cell, you introduce a transgene that will integrate
into its genome. Once these segments are integrated, RNA polymerase I will
transcribe the rRNAs and a liquid-liquid condensate begins to form, recruiting other
macromolecules and growing to become a nucleolus.
tRNA
Transcription
Transcribed by RNA polymerase III
Processing
The 5' green region is removed from all pre-tRNAs and, in certain circumstances,
other segments are spliced out (but not always). Then, the short purple segment
on the 3' end is removed and replaced with CCA, which will later take part in
aminoacyl-tRNA synthesis. Lastly, the pre-tRNA will undergo extensive
modification of internal bases to acquire its mature tRNA form.
RNA EDITING
In RNA editing, unlike alternative splicing, specific deaminase enzymes convert 1 RNA
nucleotide to another in a permanent fashion. The DNA is unaltered and still transcribes
the proper RNA, but the alteration of this RNA results in a protein that isn't consistent
with the DNA template.
Example: Apolipoproteins
Apolipoprotein B is a major protein component of LDL (low-density lipoproteins),
which carry lipids around the body to cells with receptors.
Apolipoprotein B is synthesized in the liver and intestine. In the liver the apo-B
protein is large, at 4536 amino acids long; in the intestine, however,
apo-B synthesis is cut short around halfway through due to a premature stop
introduced by the deamination of a cytosine into uracil, turning a CAA codon
into a UAA stop codon. The intestinal apolipoprotein is thus only 2152 amino acids long.
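The effect of a single C-to-U edit can be reproduced on a toy transcript. This is a sketch only: the 18-nt mRNA is invented, the codon table holds just the codons it needs, and the helper names are hypothetical. The real edit converts a CAA (glutamine) codon into UAA.

```python
# Minimal codon table: only the codons used by the toy mRNA (None = stop codon)
CODONS = {"AUG": "M", "GCU": "A", "CAA": "Q", "GAA": "E", "UGG": "W",
          "UAA": None, "UAG": None, "UGA": None}

def translate(mrna):
    """Translate codon by codon from position 0 until a stop codon."""
    protein = []
    for i in range(0, len(mrna) - 2, 3):
        amino_acid = CODONS[mrna[i:i + 3]]
        if amino_acid is None:
            break
        protein.append(amino_acid)
    return "".join(protein)

def deaminate_c_to_u(mrna, position):
    """Edit one cytosine (0-based position) into uracil; the DNA is never touched."""
    if mrna[position] != "C":
        raise ValueError("only a cytosine can be deaminated to uracil")
    return mrna[:position] + "U" + mrna[position + 1:]

unedited = "AUGGCUCAAGAAUGGUAA"           # "liver" form of the toy transcript
edited = deaminate_c_to_u(unedited, 6)    # CAA -> UAA, "intestinal" form
liver_protein = translate(unedited)
intestine_protein = translate(edited)
```

The unedited transcript gives `MAQEW`, the edited one only `MA`. For the real protein, 2152 of 4536 amino acids is about 47% of the full length, which is the "around halfway" in the notes.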
Deamination
NUCLEOCYTOPLASMIC TRANSPORT
Nuclear Pore Complex
Structure
These nuclear pores are dispersed all over the nuclear envelope and are held up by
structures called nuclear pore complexes
The nuclear pore complex is about 125 megadaltons, that is, 30x bigger than a
ribosome, and is composed of 50-100 different proteins (yeast vs vertebrates).
Proximal and cytoplasmic filaments project from the complex on the cytoplasmic
side, and the same kind of proximal filaments, in addition to a nuclear basket,
project out on the nuclear side. Molecules smaller than 40-60 kDa can freely pass through
the NPCs, but larger molecules and multimolecular complexes such as RNPs must
be transported.
The pore itself is made up of proteins called nucleoporins, and the ones rich in
phenylalanine-glycine (FG) repeats are referred to as FG nucleoporins.
Mechanism
The FG repeats in the nucleoporins interact with one another through
hydrophobic interactions, forming a highly disordered gel-like interface.
This characteristic is essential to the regulation of movement through the pore, as
only proteins with domains that can interact with the disordered domains of FG
nucleoporins (the nuclear transport receptors) are able to pass through the pore.
Cargo proteins are marked for import by nuclear localization signals, or NLS, and in
their presence any protein can enter the nucleus through nuclear pore complexes.
Example: the SV40 virus uses its T antigen to disrupt many nuclear processes,
but when certain amino acids in the T antigen were mutated, the T antigen could no longer
enter the nucleus and its harmful effects were muted. It turns out a specific stretch
of amino acids containing lots of lysine and arginine had to be present for it to be
able to enter the nucleus.
Example: adding this lysine- and arginine-rich sequence to other proteins like
pyruvate kinase, which isn't a nuclear protein, suddenly allowed them to enter the
nucleus
Transport
Nuclear Protein Import
Required proteins
Two types of proteins are required in order to get proteins synthesized in the
cytoplasm into the nucleus via NLSs.
Importins: transport receptors that recognize the NLS on the cargo protein
RAN: monomeric G-protein that exists in two configurations
1. Bound to GTP
2. Bound to GDP
Mechanism
Note that Ran-GAP, a GTPase-activating protein, stimulates Ran to hydrolyze its
GTP, converting Ran-GTP into Ran-GDP in the cytoplasm, while Ran-GEF, a guanine
nucleotide exchange factor, turns Ran-GDP into Ran-GTP in the nucleoplasm.
1. Importins in the cytoplasm recognize the NLS on a given cargo protein and form
a complex with it
2. By virtue of the FG repeats on the nucleoporins in the NPC (nuclear pore
complex), the cargo/importin complex will travel through the pore and into
the nucleoplasm
3. Ran in its GTP-bound form (Ran-GTP) meets the cargo/importin complex in the
nucleoplasm
4. A conformational change happens and importin releases the cargo into the
nucleoplasm where it will do its job
5. Ran-GTP bound to importin makes its way out of the nucleus and into the
cytoplasm through the NPC, simply down its concentration gradient.
6. Ran-GAP meets the Ran-GTP/importin complex and stimulates hydrolysis of
Ran-GTP into Ran-GDP, causing a conformational change that releases importin
so that it can interact with a new cargo protein
7. Ran-GDP then makes its way back into the nucleoplasm to be converted back
into Ran-GTP by Ran-GEF
Nuclear Protein Export
Mechanism
1. Exportin 1 in the nucleoplasm recognizes the NES on a given cargo protein and
forms a complex
2. Now bound to the cargo protein, the exportin undergoes a conformational
change allowing it to recruit Ran-GTP, forming a ternary complex
3. The cargo protein/exportin/Ran-GTP complex makes its way out of the nucleus
and into the cytoplasm through the NPC.
4. Ran-GAP meets the complex and stimulates hydrolysis of Ran-GTP into Ran-GDP,
causing a conformational change that releases the cargo protein and exportin
5. Ran-GDP then makes its way back into the nucleoplasm to be converted back
into Ran-GTP by Ran-GEF
6. Exportin also returns to the nucleoplasm to be used again in export
RNA Export
Exportin t- Ran dependent
Types of RNA
• tRNA: Exported into the cytoplasm to participate in protein synthesis with the
ribosome
• rRNA
• mRNA: some specific mRNA that associate with hnRNP proteins(HIV Rev) can be
exported through association with Ran
Mechanism
i. Exportin-t in the nucleoplasm binds to fully processed tRNA and Ran-GTP
ii. The tRNA/exportin-t/Ran-GTP complex makes its way out of the nucleus and into
the cytoplasm through the NPC.
iii. Ran-GAP meets the complex and stimulates hydrolysis of Ran-GTP into Ran-GDP,
causing a conformational change that releases the tRNA and exportin-t
iv. Ran-GDP then makes its way back into the nucleoplasm to be converted back
into Ran-GTP by Ran-GEF
v. Exportin-t also returns to the nucleoplasm to be used again in export
Mechanism
1. Mature mRNA with a poly(A) tail interacts with the NXF1 and NXT1 subunits of the
RNA exporter.
2. NXF1 and NXT1 interact cooperatively with specific mRNP proteins, including SR
proteins, that already decorate the mature mRNA
3. These protein interactions on the mRNA form a domain that interacts
with FG repeats in nucleoporins to go through the NPC into the cytoplasm
4. Here, mRNAs can be translated
mRNP Packaging in Balbiani Rings
On one DNA template of the insect polytene chromosomes (Balbiani rings),
both transcription and mRNP export can be imaged microscopically. These
insects transcribe a specific gene to synthesize a
protein that allows them to stick their eggs onto leaves.
The mRNA is transcribed, packaged with hnRNPs and released in the form of
little croissant-shaped mRNPs that will undergo transport across the
nuclear envelope through the NPC.
Transport Mechanism
1. mRNP reaches the NPC, where there are gatekeepers ensuring that the exported
mRNA is in fact mature.
2. mRNA begins to be threaded through the NPC 5' end first
3. The 5' mRNA reaching the cytoplasmic side is immediately bound by ribosomes
Cytoplasmic remodelling
A helicase unwinds the RNA and initiates the removal of nuclear factors from
the RNA once it has been transported across the nuclear envelope, and the
replacement of these factors by cytoplasmic proteins.
Example:
The nuclear cap-binding complex that recognizes the 5' end is replaced by
eIF4E
PABPN1, which interacts with the poly(A) tail in the nucleus, is replaced in
the cytoplasm by PABPC1
POST-TRANSCRIPTIONAL/TRANSLATIONAL
REGULATION OF GENE EXPRESSION
mRNP Export Model
The first round of translation can be used as a secondary backup mechanism for
ribosomes to go through and knock off all the proteins still associated with the
mRNA.
In a perfect world, the RNA helicase would be sufficient to get rid of
all those proteins, and all the nuclear proteins would make their way back
into the nucleus. But cells are not perfect, and sometimes proteins get
through that check mechanism on the cytoplasmic side and don't get
knocked off. In those circumstances, translation is used as a secondary
backup mechanism to ensure that all of these proteins are
eliminated: the ribosome goes through and knocks off all the proteins
associated with the mRNA. So the first round of translation is a little
different from all the other translational rounds that follow it. The first
round eliminates the proteins that are still bound.
mRNA stability
Stability of RNAs is under strict regulation and is critical for their steady
state concentration
In E. coli and
other bacteria, transcription must switch rapidly depending on the environment, so
mRNAs are quite unstable, as you might not want the RNAs used in one
environment to be used in another. Because of the way that a given organism
must adapt, it will destabilize its mRNAs so they are present for a limited time.
Examples of this are genes involved in regulating the cell cycle: when
tissues begin to differentiate, these mRNAs should be destabilized so as not to
lead to cancer.
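The link between stability and steady-state concentration can be made concrete with a simple worked example. The model used here is an assumption that is not in the notes: constant synthesis and first-order decay, dM/dt = k_s - k_d * M, so that at steady state M = k_s / k_d, with k_d = ln(2) / half-life. The rate values are arbitrary.

```python
import math

def decay_constant(half_life):
    """First-order decay constant from an mRNA half-life (same time units)."""
    return math.log(2) / half_life

def steady_state_level(synthesis_rate, half_life):
    """Steady-state mRNA level when synthesis balances first-order decay."""
    return synthesis_rate / decay_constant(half_life)

# Same transcription rate, different stability (arbitrary units, minutes)
stable = steady_state_level(synthesis_rate=10.0, half_life=60.0)
unstable = steady_state_level(synthesis_rate=10.0, half_life=5.0)
```

Under this model the steady-state level scales directly with half-life, so an mRNA that is 12 times less stable settles at a level 12 times lower for the same transcription rate, and it also disappears faster once transcription is switched off.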
mRNA Destabilization
Many of the short lived mRNAs have some elements rich in AUUUA sequences.
Example:
GM-CSF, or granulocyte-macrophage colony-stimulating factor (You don't
have to care), is very important for driving the proliferation of a number of
immune cells. There's a sequence in the mRNA of GM-CSF that destabilizes it
so as to avoid keeping it around for long after it has completed its job. Too many
white blood cells is a hallmark of leukemia or other immunological
disease. Within the 3' UTR of the GM-CSF mRNA, there are elements that are
rich in AUUUA sequences.
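Scanning a 3' UTR for AUUUA motifs is a one-line pattern search. The sketch below counts overlapping occurrences; the example sequences are invented, not the real GM-CSF UTR.

```python
import re

def count_auuua(utr):
    """Count AUUUA pentamers in an RNA sequence, allowing overlaps
    (AUUUAUUUA contains two)."""
    return len(re.findall(r"(?=AUUUA)", utr))

rich = count_auuua("GGAUUUAUUUAUUUACC")  # three overlapping motifs
poor = count_auuua("GGCCGGCC")           # none
```

A lookahead is used so that motifs sharing an A, as in tandem AUUUAUUUA repeats, are each counted.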
RNA Decay/Degradation
Location
Degradation mostly occurs in the P-bodies, devoted sectors of the cytoplasm.
These are membraneless organelles, or liquid-liquid condensates formed by
the recruitment of RNA and enzymes required for RNA degradation.
Mechanism
Deadenylation-dependent
1. The deadenylase complex deadenylates poly-A tails from 3' to 5'
2. Exosome RNA degradation machine chews up RNA from 3' to 5'
At the very end of the exosome, there are two different
ribonucleases, an exoribonuclease that will chew up the RNA as it
comes within its proximity, and an endoribonuclease if any
fragments weren't fully degraded. So it's highly efficient.
3. Decapping enzymes remove the 7-methylguanosine cap from the 5' end
4. Exposed 5' end of mRNA allows XRN1 enzyme to degrade RNA from 5' to 3'
Deadenylation-independent
mRNAs are regulated by key proteins that affect their stability, and their
steady state within the cell is dependent both on transcription and their
stability.
Remember IRE-BP is only active in low iron conditions, interacting with IREs in
the 5' UTR of mRNA
Translational regulation
Example: When RNA levels are consistent with the quantity of protein they
produce
Normally, the synthesis of polypeptides is under strict control and mRNA
abundance reflects protein levels such that more mRNA equals more protein.
But in some cases, the relationship appears skewed, suggesting that protein
synthesis or the stability of the protein is regulated. Somewhere between
mRNA(after transcription) and the translation of proteins.
Example:
Normal condition
Hunchback is an anterior-specific transcription factor, specifying the
structure that will give rise to the anterior end of the embryo
The mRNA is spread out all over the embryo
The associated proteins are expressed only in the anterior end
Nanos, on the other hand, is responsible for specifying the structure that
will give rise to the posterior end
The mRNA is only in the posterior end
The associated proteins are expressed only in the posterior end
Example: Ferritin
There are two stem loops in the 5' end of the ferritin mRNA, as
well as iron response elements (IREs).
High Iron
Ferritin is required to sequester the excess intracellular iron
In high iron conditions, IREBP is inactive, so the scanning complex can
go right through the stem loops, translating the mRNA to give rise to ferritin
protein that will actively sequester iron.
Low iron
In low iron conditions, IREBP is in its active form and interacts with IREs
in the 5' stem loop, blocking the scanning complex before it reaches the coding
region and thus blocking translation into ferritin protein. This means less iron
will be sequestered and more will remain available in the cell.
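The ferritin switch is a two-step piece of logic that is easy to get backwards, so here it is written out. This is only a truth-table sketch with a hypothetical function name, treating iron as simply high or low.

```python
def ferritin_is_translated(iron_is_high):
    """Follow the IRE-BP logic for ferritin mRNA (5' UTR IREs)."""
    irebp_active = not iron_is_high   # IRE-BP is active only when iron is low
    scanning_blocked = irebp_active   # active IRE-BP sits on the 5' IRE stem loop
    return not scanning_blocked       # ferritin is made only if scanning proceeds

high_iron = ferritin_is_translated(True)   # ferritin made, iron sequestered
low_iron = ferritin_is_translated(False)   # ferritin blocked, iron left free
```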
siRNA
siRNAs are double-stranded small interfering RNAs that induce the
degradation of completely complementary target mRNAs. Because of their
complete complementarity, siRNAs eliminate all of the target mRNA through
endoribonucleolytic cleavage.
RNAi, or RNA interference, refers to the developmentally/physiologically
regulated process in which a small RNA strand binds to a longer RNA strand
coding for a known protein so as to interfere with translation and thus the
production of that protein. Argonaute proteins of the RISC and Dicer are both
present in this pathway. The RNAi pathway can be triggered by introducing a
transgenic construct whose transcript folds over itself in a hairpin.
The Dicer enzyme, which functions a little like RNase III, acts as a dimer to
cleave dsRNA. The dsRNAs are cleaved into siRNAs, fragments of 21-23 nt,
and then bound by an Argonaute protein within the RISC. Then, the helicase
activity of RISC, driven by ATP hydrolysis, unwinds the siRNA
so that one strand can be used to target the complex to the
complementary mRNA. The complementary mRNA is cleaved with a "kiss of
death", and the resulting cleaved transcripts are then degraded by
cytoplasmic ribonucleases.
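The Dicer step can be pictured as cutting a long duplex into fixed-size pieces. The sketch below is a deliberate simplification: it works on one strand only, ignores the 2-nt 3' overhangs that real Dicer products carry, and discards any leftover shorter than the fragment size. The function name and input sequence are invented.

```python
def dicer_fragments(strand, size=21):
    """Cut one strand of a long dsRNA into consecutive siRNA-sized pieces."""
    if not 21 <= size <= 23:
        raise ValueError("Dicer products are 21-23 nt long")
    return [strand[i:i + size] for i in range(0, len(strand) - size + 1, size)]

long_dsrna_strand = "AUGC" * 16            # 64-nt toy strand
sirnas = dicer_fragments(long_dsrna_strand)
```

For the 64-nt toy strand this yields three 21-nt fragments, each of which could then be loaded into RISC as a potential guide.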
miRNA
Transcription
miRNA are transcribed by RNA polymerase II and thus capped in the process
Biogenesis
1. Dicer, the RNase III-like enzyme, cuts double-stranded pre-miRNA into mature
miRNA
i. Acts as a dimer to cleave dsRNA into 21-23 nt fragments
2. Dicer hands over the double-stranded miRNA to RISC; we call this a miRISC
3. Argonaute proteins in the miRISC (silencing complex) bind to the miRNA and
use their ATP-driven helicase activity to unwind it into single strands
4. The single-stranded miRNAs are used as guides to mRNA targets with antisense or
limited antisense homology
5. The miRNA guide pairs in antisense with the mRNA target at the 3' UTR to block
translation or destabilize the mRNA target through deadenylation
Structure
The miRNA pathway, like the siRNA pathway, is triggered by dsRNA
molecules. Unlike siRNAs, miRNAs are not completely complementary to their
target RNAs, forming many loop structures when binding to the 3'
UTR (untranslated region). Because of their incomplete complementarity,
miRNAs destabilize the mRNA or block translation.
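The siRNA/miRNA distinction comes down to how completely the guide pairs with its target. The sketch below scores antisense pairing and applies the rule from these notes (complete pairing gives cleavage, incomplete pairing gives repression). It is a simplification: it assumes equal-length guide and target site, counts only Watson-Crick pairs, and uses made-up sequences and function names.

```python
PAIR = {"A": "U", "U": "A", "G": "C", "C": "G"}

def reverse_complement(rna):
    """Antisense strand of an RNA sequence, written 5' -> 3'."""
    return "".join(PAIR[base] for base in reversed(rna))

def paired_fraction(guide, target_site):
    """Fraction of guide positions that Watson-Crick pair with the target site."""
    perfect_guide = reverse_complement(target_site)
    matches = sum(g == p for g, p in zip(guide, perfect_guide))
    return matches / len(guide)

def outcome(guide, target_site):
    """Complete pairing -> cleavage (siRNA-like); otherwise repression (miRNA-like)."""
    return "cleave" if paired_fraction(guide, target_site) == 1.0 else "repress"

target_site = "AUGGCUCAAGAAUGGUACGUA"           # 21-nt toy target
sirna_guide = reverse_complement(target_site)    # fully complementary guide
swap = "A" if sirna_guide[10] != "A" else "C"
mirna_guide = sirna_guide[:10] + swap + sirna_guide[11:]  # one central mismatch
```

In this toy rule a single mismatch is enough to switch the outcome from `cleave` to `repress`; real miRNA targeting depends on where the pairing occurs, which is not modelled here.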
piRNA and PIWI interactions are involved in many processes relating to gene
expression, from the regulation of mRNA stability to enhancing protein synthesis.
In simple terms, piRNA and PIWI form a ribonucleoprotein complex that chops up
the transcripts of transposable elements, so they are no longer dangerous.
lncRNAs
Regulation of gene expression- transcriptional interference
XIST and TSIX are antagonistic long non-coding RNAs that are both expressed in
early embryogenesis. When it comes time to decide whether to inactivate one of
the X chromosomes, the relative amounts/expression of XIST and TSIX decide
which process predominates. More TSIX than XIST leads to an active X chromosome,
while more XIST than TSIX results in inactivation of the X chromosome, seeing as
XIST coats and silences the chromosome and TSIX antagonizes XIST expression. Later in
embryonic development, the two generally become mutually exclusive.
XIST RNA
The XIST RNA is a long non-coding RNA lncRNA encoded by the XIST locus. The
lncRNA binds to discrete regions of the X chromosome and X-tinguishes gene
expression as it spreads down the X. Its mechanism and function isn't completely
understood yet.
This long non-coding RNA is critical in
the decision to inactivate a specific X chromosome during the dosage
compensation process. In mammalian female cells, the RNA gets
expressed, spliced and polyadenylated, and then coats the presumptive X
chromosome to be inactivated. While coating it, it is somehow capable
of extinguishing gene expression from most of that X chromosome
in cis. We should be familiar with this term: the RNA acts on the
chromosome from which it is expressed and does not work on the other
chromosome.
Mechanism
By coating the entire X chromosome in cis, XIST RNA recruits chromatin
modifying complexes. These Polycomb complexes partake in repressive
chromatin modifications, that is, histone deacetylation by HDAC3 and lysine
methylation of H3K27. These changes condense the X chromosome and
render it mostly inaccessible to transcription factors. In these regions where
XIST is covering the chromosome in cis, gene expression is largely
X-tinguished from the inactive X, a characteristic that is maintained through
epigenetic processes in every new daughter cell.
TSIX RNA
TSIX is a long non-coding RNA lncRNA whose expression biases against XIST
expression.
be discovered as this forms nothing but a sequence; does not give us info to better understand
- How do they give rise to functions, cells, tissues, organs, systems, organisms (living
things; animals, plants, fungi…)?
- This is not obvious from the sequence information; requires deciphering = new age of
biological investigation (how genes work, how they work together how they give rise to
- What is the minimum essential toolkit we need? If you examine these pie charts, the
predominant element (more than half of the genome is genes we have no idea what
- The disruption construct is introduced into diploid yeast cells to replace the
- Remove gene function systematically by replacing genes; yeast carry out homologous
- Will direct homologous recombination so that those new genes will replace these
endogenous genes; have to know something about the sequences of all the genes you
selectable markers (Can be a drug resistance gene for example) flanked by sequences
itself
- Carry out a PCR reaction with primers that contain these homologous regions
- Dominant selectable marker gene flanked by regions of 100% homology to regions that
- Presence of the dominant selectable marker confers drug resistance (G-418) so the cells
can grow on drug.
- When allowed to sporulate the haploid progeny (spores) will either have a wild type
chromosome or a recombinant chromosome.
- The effects of the gene replacement can then be assessed ie…viability or growth rate
- G-418 resistance = dominant selectable marker; only this one will grow in that
environment (will have that resistance therefore can be selected for if inserted in the
right place);
- If the disrupted gene is essential, these spores will be non-viable = would not be able to
duplicate the yeast cell to form haploid cells;
- If you end up with 2 cells instead of 4, then it was an essential gene in basic cellular
processes
- therefore whatever gene was knocked out, is not involved in sporulation, but is rather
- Then once you have these isolated haploid cells, you put them under various conditions
in the absence of this wild-type gene function to see how this gene plays a role in the
- you HAVE to start with diploid yeast just in case this change is lethal; if that gene were
actually essential for basic cellular processes, the cell would die…need to have a
backup wild-type chromosome that's not affected just in case the gene eliminated is
essential
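The PCR step for building the disruption construct can be sketched in a few lines. This is only a rough illustration, assuming each primer carries a stretch of gene-flanking homology followed by a short stretch that anneals to the marker. The function names and the default lengths (40 nt of homology, 20 nt of annealing sequence) are hypothetical choices for the example and do not come from the notes.

```python
def reverse_complement(seq: str) -> str:
    """Return the reverse complement of a DNA sequence (A/C/G/T only)."""
    pairs = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pairs[base] for base in reversed(seq.upper()))


def knockout_primers(upstream: str, downstream: str, marker: str,
                     homology_len: int = 40, anneal_len: int = 20):
    """Build the two PCR primers for a marker-swap disruption construct.

    upstream / downstream: genomic sequence immediately 5' and 3' of the
    gene to be replaced (top strand, written 5'->3').
    marker: the dominant selectable marker cassette (top strand, 5'->3').
    homology_len and anneal_len are illustrative assumptions.
    """
    if len(upstream) < homology_len or len(downstream) < homology_len:
        raise ValueError("flanking sequence is shorter than homology_len")
    if len(marker) < 2 * anneal_len:
        raise ValueError("marker is too short for the chosen anneal_len")
    # Forward primer: last bases before the gene, then the start of the marker.
    forward = upstream[-homology_len:] + marker[:anneal_len]
    # Reverse primer is read off the bottom strand, so reverse-complement
    # (end of the marker + first bases after the gene).
    reverse = reverse_complement(marker[-anneal_len:] + downstream[:homology_len])
    return forward, reverse


def disruption_construct(upstream: str, downstream: str, marker: str,
                         homology_len: int = 40) -> str:
    """Expected PCR product: marker flanked by the two homology arms."""
    return upstream[-homology_len:] + marker + downstream[:homology_len]
```

The product has the marker sitting between two arms that match the genome exactly, which is what lets homologous recombination swap it in for the endogenous gene.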
Functional Genomics
- RNAi can affect all different kinds of cells in a full-grown, living animal; C. elegans are good for
this because they express the desired effects once RNA is taken up by the organism =
- Come up with a genome-wide means of analyzing the function of every single gene (19,000) in
C. elegans for all predicted genes = engineer plasmids so that each plasmid would drive
a dsRNA by having a T7 promoter that drives the expression of one RNA in one direction,
and another T7 promoter driving the expression of RNA in the other direction; induce
expression with IPTG (way of activating the T7 promoters that will eventually make the
dsRNA)
- Makes 19,000 gene constructs; each one of those constructs will make double-stranded RNA that corresponds to a
single predicted gene in C. elegans; then you transform 19,000 independent bacterial colonies, so that each bacterial
strain makes the double-stranded RNA for one predicted gene; then feed each strain to the
animals and each one of them will show an RNAi effect typical of loss of function in that particular gene
- Genes whose knockdown gave rise to sterile animals or caused embryonic lethality, not surprisingly, fall into the major classes of those
- DNA synthesis, RNA metabolism, translation, transcription, these are all very important for every single cell on the planet
- Things that fell into the uncoordinated category that made the animal so they didn't move properly = generally genes that are involved in
neuromuscular function, many of which are conserved up to us; can understand just by looking at all these uncoordinated animals what
genes are involved in all of these various functions. It might be synapses, it might be the way that cells send out their neurotransmitters,
might be the way that they form neurotransmitters, but they all fall into this category
- post embryonic phenotypes tend to be involved in signaling so that the organs get formed in the correct time, the right place and carry out
their appropriate function as the animal (tend to be a little bit more animal specific and not necessarily part of the basic cellular toolkit)
Transcription Factors are Modular
- don't have to do functional genomic analysis in order to start to put labels on genes that have unknown function. You can carry out
- take advantage of the fact that transcription factors are modular: they have transcriptional activation domains and DNA binding
domains. As long as these things are put together, it doesn't matter if they come from different transcription factors; they always
activate the downstream gene based on the sequence that the DNA binding domain will interact with
- Make fusion proteins with the protein of interest and a known DNA binding domain (ex: GAL4)
- If proteins A and B interact the two fusion proteins (A-DNA Binding Domain and B-
transcription factor.
the DNA binding domain will bring down the transcriptional activation domain of the
prey,
- Can reconstitute TF to interact with UAS; if, for example, a cell is compromised because
it can't make histidine, then you drive a transgene; it will activate that enzyme and will
another protein that you think might be interacting with your protein, but it's bound to a transcriptional activation function. And they
come together well. That GAL4 DNA binding domain will interact with the UAS, and when it interacts with UAS, it activates transcription of
the downstream gene. If it activates transcription of the downstream gene and it's a histidine synthesis gene, suddenly you make histidine in a
- Can select very efficiently for all the cells where A and B interact
- Bait: a protein you're interested in bound to a DNA binding domain
- Prey: a protein that you're going to query that's bound or fused to a transcriptional activation function
- But B can be every single protein in the entire proteome, every single predicted gene product, and you can make libraries that are all fused
to transcriptional activation; then you can co-transform these things and evaluate which cells grow, and then you can go back and figure
out what the gene product was that was in the prey = 2 hybrid system
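The selection logic of the 2 hybrid screen can be written out as a toy model. This is a minimal sketch, assuming the only thing that matters is whether bait and prey bind; the function name, the protein labels, and the `interactions` set are hypothetical and stand in for what the real screen discovers experimentally.

```python
def two_hybrid_screen(bait, prey_library, interactions):
    """Return the preys whose cells would grow without histidine.

    bait: protein fused to the DNA binding domain (e.g. of GAL4).
    prey_library: proteins fused to a transcriptional activation domain.
    interactions: set of frozenset pairs of proteins assumed to bind
                  (hypothetical ground truth for this toy model).
    """
    growing = []
    for prey in prey_library:
        # The histidine synthesis reporter downstream of the UAS is only
        # transcribed when the bait-prey interaction brings the activation
        # domain to the DNA binding domain sitting on the UAS.
        reporter_on = frozenset((bait, prey)) in interactions
        if reporter_on:
            growing.append(prey)
    return growing


if __name__ == "__main__":
    known = {frozenset(("A", "B")), frozenset(("C", "D"))}
    print(two_hybrid_screen("A", ["B", "C", "D"], known))
```

Only the cells that grow need to be followed up, which is why the screen scales to a library covering every predicted gene product.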
- Problem: these proteins may not like going into the nucleus even though that's where their function is…
come together, this will reconstitute a protein and it might have an actual function
- one of the most commonly used protein fragment complementation strategies uses a protein called dihydrofolate reductase, which is
really important for purine synthesis. If you don't have it, you don't grow; if protein X and protein Y (2 query proteins) come together, it'll
reconstitute dihydrofolate reductase and suddenly cells will be able to grow again
- You can even do it with GFP, so that GFP is engineered where it's split in two, and the two parts of GFP themselves don't give rise to
fluorescence. But if they're brought together by a protein-protein interaction between, in this case, protein X and protein Y, the protein
halves of GFP will reform and reconstitute proper GFP protein that will fluoresce and you can detect all of which give you an idea of which
- proximity labeling = dependent on labeling proteins that come within a very small
distance with your target bait protein; not necessarily direct protein protein interaction
ligated to specific proteins on the primary amines by a specific enzyme called a biotin ligase; used to covalently
affect a limited number of substrates within a cell; only recognizes those substrates
- by mutagenesis and by identifying the changes that give rise to mutant biotin ligases in bacteria, a very promiscuous biotin ligase was
identified = BirA* = it will biotinylate (add a biotin molecule to) any protein with a primary amine
- Expression construct that will express this particular fusion protein = introduce it into cells and have it expressed correctly in
those cells; theoretically the promiscuous biotin ligase will biotinylate any other protein that comes within a very close
range to the protein that you're interested in = biotin tag
- grind up those cells that were transfected with BirA; all of the proteins that are biotinylated within the cell can be purified
from a protein extract by running the protein extract over a streptavidin-sepharose or streptavidin-agarose column; can wash the column
and get rid of all the other garbage and then elute all those biotinylated proteins from the column just by adding biotin. By
competition, all the proteins will fall off and you'll have the collection of proteins that came within a specific range in the cell
- run those individual proteins through a mass spectrometer and you'll get identities
- Run it through a mass spectrometer = mass identities for each protein = help
you understand who was hanging out with your protein in the cell = important
(what kind of neighbors it has; who it's interacting with to carry out its functions)
- can give you some important information about what the protein does; eventually you end up with an idea, a network of
- Function is best addressed through removal of gene activity and analysis of the resulting
phenotype; Abnormal phenotypes indicate specific processes have been disrupted that
- When you examine mutant phenotypes (ex homeobox mutations); mutants obtained
through random mutagenesis and selection for mutant phenotypes thereafter (defective
phenotype)
- Removal of gene activity
- Disrupt homeostasis based on random mutation -> forward genetic analysis (looking
for mutants with a phenotype, but don't know what the gene is that corresponds)
- Randomly mutate and randomly look for phenotypes, then go back to find gene: forward
- Disrupt the activity of specific gene product to assess its function -> reverse genetic
analysis (interested in what a gene does in an organism…analyze gene function and the
phenotypes that arise once you remove that gene function of interest)
- Understand what that protein/gene does in cell, and go back to eliminate that gene
function: reverse genetic analysis (start with sequence, go back and try to understand
by using feeding RNAi… All the analyzed genes were assigned some function as
- RNAi in C. elegans where we can eliminate the function of every single predicted
gene; what are the mutant genotypes and how they affect life
- The disruption construct is introduced into diploid yeast cells to replace the
As long as you have flanking sequence with 100% homology, introduce into that cell and
- Presence of the dominant selectable marker confers drug resistance (G-418) so the
cells can grow on drug.
- When allowed to sporulate, the haploid progeny (spores) will either have a wild type
chromosome or a recombinant chromosome.
- The effects of the gene replacement can then be assessed, i.e. viability or growth rate
- Use the same kind of properties (homology directed replacement), not only for yeast but
for other organisms
HR can also be performed in the pluripotent stem cells of mouse
- homologous recombination could also occur in pluripotent embryonic mouse cells (ES
cells).
- Use homology directed recombination to engineer/alter chromosomes such that you can
replace a section/whole gene with some dominant selectable marker that allows you to
- Inner cell mass = embryonic cells are characteristic of the cells of this inner cell mass;
gives rise to every tissue in the body (pluripotent); can make an animal all on their own;
Embryonic stem cells = can contribute to every single tissue in a growing embryo;
resistance gene) and flanking sequence that binds that dominant selectable marker; you
need very very large flanking sequences with 100% homology (direct homologous
the gene correctly, and do not include the tk HSV thus are resistant to the anti-drug
environment
- (2) construct gets integrated randomly into sections that have nothing to do with the
targeted gene; in this case, everything will go in including the tk gene meaning the cells
- 2nd selection: Ganciclovir is toxic in the presence of the herpes virus tk gene, so it will
non-homologous)
- Only ES cells that have undergone Homologous Recombination (HR) can survive the
two selection steps.
- Cells that retain tk, and are therefore probably not correct, will die in the presence of the
drug; select for those cells that went through a homology-driven gene replacement event
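The two selection steps boil down to a simple rule, shown below as a small sketch. The class and field names are hypothetical; the model assumes only what the notes state, namely that the first drug kills cells lacking the dominant selectable marker and that ganciclovir kills cells carrying the HSV tk gene.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ESCell:
    label: str
    has_marker: bool  # dominant selectable (drug resistance) marker integrated
    has_tk: bool      # HSV tk gene integrated along with the construct


def survives_double_selection(cell: ESCell) -> bool:
    """True only for cells that pass both selection steps."""
    # 1st selection: cells without the resistance marker die on the drug.
    survives_positive = cell.has_marker
    # 2nd selection: ganciclovir is toxic when the tk gene is present.
    survives_negative = not cell.has_tk
    return survives_positive and survives_negative


if __name__ == "__main__":
    cells = [
        ESCell("no integration", has_marker=False, has_tk=False),
        ESCell("homologous recombination", has_marker=True, has_tk=False),
        ESCell("random integration", has_marker=True, has_tk=True),
    ]
    for cell in cells:
        print(cell.label, survives_double_selection(cell))
```

Because tk sits outside the homology arms, it is only carried along when the whole construct lands somewhere at random, so it marks exactly the cells that should be discarded.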
- ES cells are then used to populate the blastocyst of an acceptor mouse; This mouse
strain has to be another coat color that is recessive.
- Take cells out of culture, and introduce into mouse blastocyst; those cells will contribute
to the final formation of a mouse embryo
- The blastocyst is transferred to a surrogate mouse mother
- ES cells came from a mouse that actually had brown coat color (dominant color) =
- In the blastocysts = black hair = wild type embryonic cells = not manipulated
- The progeny will be a mixture of both genotypes if the cells were viable and the process
worked properly.
- (1) black = unaffected by injecting manipulated cells
- (2) chimeric = brown segments come from initial ES cells that were manipulated
(mosaic of different genotypes put together)
- Hope that some of the affected cells ended up being germ cells; so you cross them…
eventually to get a homozygous pure animal that's been manipulated for that one
particular gene segment of interest; often homozygous die early or the gene loss is
- This means that the implanted cells contribute to various tissues. The hope is that one
of the tissues they contribute to is the germ line!
- 2 diff phenotypes (manipulated AND non-manipulated cells from the mom); This will
then be transmitted to their germ cells; Cross these animals back and forth until you
- Often mice are born with no detectable phenotype; often genes are redundant, such
that the loss of one gene product may be compensated by another
Transgenic mice are simpler to make and can provide important information
- Transgenes integrate randomly into the genome, so positional effects may affect gene
expression
- Transgenic reporter genes are very important for understanding expression patterns of
genes
- Can express transgenes under endogenous promoter or heterologous control (inducible
promoters, i.e. heat shock or heavy metals)
- Can be used to edit the genome
- Get a fertilized oocyte, inject DNA construct into one of the pronuclei before they fuse;
reimplant that injected zygote into a foster mother; gives rise to pups that at a very
surprising rate will integrate that transgene into their chromosome (10-30%)
- Induce
- Segments of bacteriophage DNA sequence are integrated into the genome of some
Regions are transcribed into primary RNA that is bound by tracrRNA. Cas9 recognises
structures in the tracrRNA and is recruited to foreign DNA segments that are recognized
by the crRNA. Cas9 has been very well characterized in Streptococcus pyogenes. It
Regularly Interspaced Short Palindromic Repeats; sequences within the cluster share
- Bacteriophage DNA was somehow acquired by the bacteria, put into a cluster, and
interspaced
- Adaptive immune response based on acquiring chunks of DNA from enemy, and using it
against it
- Trans acting crispr RNA interacts with these interspersed regions through
complementarity
- Helps to mature these sections of this large RNA to give rise to individual CRISPR RNAs
- Hockey stick structures from diagram = stem loop structures; recognized by cas9 = form
a complex
- Watson-Crick base pairing between the segment of RNA and the target DNA; Cas9 nuclease will
generate a dsDNA cut within the bacteriophage target, debilitating the genome,
- These clustered regularly interspaced short palindromic repeats, now known as CRISPR, are in fact transcribed in the bacteria to make a
primary transcript that has all these repeat regions and RNAs that correspond to these bacteriophage genes; we'll call them CRISPR RNAs.
The bacteria also has a trans-activating CRISPR RNA (tracrRNA), shown here by this little hockey stick, that interacts through complementarity
- by interacting with those repeats, it will eventually help to mature that primary transcript into individual CRISPR RNAs
- The hockey stick is a specific RNA structure recognized by a protein in the bacteria = Cas9 (RNA binding protein that interacts
- in doing so, it will use the CRISPR RNA to take it to a target DNA on an invading bacteriophage. And when that CRISPR RNA recognizes the
sequences that are complementary to it, Cas9 will carry out a killer double-stranded nucleolytic cleavage of the DNA; bacteria has this
Genomic Engineering/Editing
- it was found that you could actually generate an RNA such that you could eliminate the necessity of a trans-activating crRNA; you could make
a single RNA that had these same kinds of stem loops and you could add on an RNA sequence that recognizes almost any DNA target and
- specificity that's conferred by the sequence relies on a protospacer adjacent motif (PAM); in order for this endonucleolytic
cleavage to work, you have to engineer your guide RNA segment such that it complements or base pairs with sequences just next to this
- provided that the PAM sequence is positioned correctly, you'll get 2 endonucleolytic cleavages:
- One catalyzed by the RuvC domain; and the other by the HNH domain
- you need to introduce CAS 9, which is not present in most of our cells, and you need to introduce this engineered single guide RNA with
this interesting stem loop that's going to bring CAS 9 into the reaction. So you need at least two different transgenes here, one that's going
to make your engineered single guide RNA and one transgene that's going to make CAS 9.
- The newly optimized Cas9 always has this NLS sequence and then you can drive these things in any cells you want with the promoters
that you've defined based on that cell type that you're interested in
- A combination of the crRNA and the tracrRNA can be expressed as a single guide RNA
(sgRNA) that will recruit the endonuclease Cas9 to a region of the genome. Specificity is
(NGG). Cas9 will generate a DSB 3nt upstream (5’) of the PAM sequence in the target
DNA
- A separate transgene is required to express the Cas9 endonuclease so that both the
sgRNA and Cas9 are present in the same cell (nuclei). The RNA structure of the sgRNA
will be recognized and bound by Cas9 and the complex will be directed to the target
DNA site. Cas9 uses two separate endonuclease domains to cut the DNA strands at
(palindromic repeats that allows it to fold onto itself and form these stem loops; cas9
- (2) Cas9 expression: species specific promoter, Cas9 coding gene, NLS (so it can enter
the nucleus)
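The targeting rule for the sgRNA (a target sequence sitting directly 5' of an NGG PAM, with the DSB 3 nt upstream of the PAM) can be sketched as a small site finder. This is an illustration only: the function name is hypothetical, the 20-nt guide length is an assumption not stated in the notes, and only the top strand is scanned.

```python
import re


def find_cas9_sites(dna: str, guide_len: int = 20):
    """List candidate Cas9 target sites on the top strand of `dna`.

    A site is `guide_len` bases (assumed 20) immediately followed by an
    NGG PAM. `cut_index` marks the double-strand break 3 nt 5' of the
    PAM: the cut falls between dna[cut_index - 1] and dna[cut_index].
    """
    dna = dna.upper()
    sites = []
    # Lookahead pattern so that overlapping PAMs are all reported.
    for match in re.finditer(r"(?=([ACGT]GG))", dna):
        pam_start = match.start()
        target_start = pam_start - guide_len
        if target_start < 0:
            continue  # not enough sequence 5' of the PAM for a full guide
        sites.append({
            "target": dna[target_start:pam_start],
            "pam": match.group(1),
            "cut_index": pam_start - 3,
        })
    return sites


if __name__ == "__main__":
    example = "ACGTACGTACGTACGTACGT" + "TGG" + "AAAA"
    for site in find_cas9_sites(example):
        print(site)
```

The guide portion of the sgRNA would carry the same sequence as `target` (with U in place of T), while the PAM itself is read on the DNA and is not part of the guide.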
the double strand break causes a problem. So the cell recognizes it right away.
- it ends up making insertions and deletions giving rise to nonsense mutations or frame
shifts (to re-anneal it quickly)
- use the template to restore the integrity of that DNA segment; template used has to
- can repair double stranded break with a new DNA segment = homology driven repair
mechanism
- OR it can be used to change DNA homology; insert bits of DNA template that will be
taken up because of this recombination event (can alter the genes that encode proteins