Module 1: Review of DNA, RNA, and Protein Structure and Function
COMPONENTS OF A DNA STRUCTURE
A gene encodes a protein. It is a stretch of a DNA molecule whose sequence of building blocks specifies the
sequence of amino acids in a particular protein. The activity of the protein imparts the phenotype. The different
building blocks combine to form nucleic acids, and their sequence enables nucleic acids to carry information.
A single building block of DNA is called a nucleotide. Each nucleotide consists of:
One deoxyribose sugar
One phosphate group
One nitrogenous base
o Purines – bases adenine and guanine with two-ring structure
o Pyrimidines – bases cytosine and thymine with single-ring structure
o Information-containing part
o DNA sequences are measured in numbers of base pairs (kilobase and megabase)
NUCLEOTIDES
Nucleotides join into long polynucleotide chains when strong attachments called phosphodiester bonds form
between the deoxyribose sugars and the phosphates, creating a continuous sugar-phosphate backbone. Two
such polynucleotide chains align head-to-toe. The carbons of deoxyribose are numbered 1 to 5 starting with
the carbon found by moving clockwise from the oxygen. Imagine one chain that runs from the number 5 carbon at
the top to the number 3 carbon at the bottom. The chain that aligns with it runs in the opposite direction, from the number 3 to the number 5
carbon. These ends are labelled 5’ and 3’, pronounced 5 prime and 3 prime respectively.
Complementary base pairing is responsible for the symmetrical form of the DNA double helix. Nucleotides
that contain A pair with those containing T, and nucleotides containing G pair with
those carrying C. The width of the double helix is the same throughout because each base pair joins a
two-ring purine with a single-ring pyrimidine. The chemical attractions that hold the DNA base
pairs together are hydrogen bonds. Although each hydrogen bond is individually weak, together they
impart great strength over the many bases of a DNA molecule. DNA forms a double helix when the antiparallel,
base-paired strands twist about one another in a regular fashion.
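The pairing rules and the antiparallel arrangement described above can be sketched in a few lines of Python. This is only an illustration: the function name `complementary_strand` and the example sequence are assumptions, not part of the module.

```python
# Watson-Crick pairing rules from the text: A pairs with T, G pairs with C.
PAIRS = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complementary_strand(strand_5_to_3: str) -> str:
    """Return the partner strand, also written 5' to 3'.

    The partner runs antiparallel, so after pairing each base
    the result is reversed to read it from its own 5' end.
    """
    paired = [PAIRS[base] for base in strand_5_to_3.upper()]
    return "".join(reversed(paired))


print(complementary_strand("ATGCCG"))  # CGGCAT
```

Applying the function twice returns the original strand, which reflects the fact that each strand fully specifies the other.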
DNA molecules are extremely long. Several types of proteins are present to compress DNA without damaging
or tangling it. Scaffold proteins form frameworks that guide DNA strands. Histones are proteins around which DNA
coils, forming structures that resemble beads on a string. Each such bead is called a nucleosome. DNA
wraps at several levels until it is compacted into a chromatid (a chromosome consisting of one double helix in
the unreplicated form).
A nucleosome is composed of eight histone proteins (a pair of each of four types) and 147 nucleotides of DNA
entwined around them. A fifth type of histone acts as an anchor protein that attaches the nucleosomes to linker
regions of DNA. Chemical modification of the histones controls when particular DNA sequences unwind and
become accessible for the cell to use to guide protein synthesis.
Loop formation is not random. The genome parts drawn together include genes that function together.
Chromatin loops rarely overlap and affect swaths of DNA sequence. Researchers are investigating chromatin
loops and folds that function as hidden switches that might trigger cancer and other diseases.
DNA REPLICATION
DNA must be replicated, or copied, so that the information it holds can be maintained and passed to future cell
generations. Watson and Crick envisioned the two strands of DNA unwinding and separating. This would expose
unpaired bases, which would attract their complements from free, unattached nucleotides
available in the cell from nutrients. This route is called semi-conservative because each new DNA double helix
conserves half of the original. In 1957, Matthew Meselson and Franklin Stahl demonstrated the
semi-conservative mechanism of DNA replication with a “density shift” experiment. They labelled replicating DNA
from bacteria with a dense form of nitrogen and traced the distribution of the nitrogen along the strands of
each daughter double helix.
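The logic of the density shift experiment can be followed with a small simulation. The sketch below assumes a starting double helix whose two strands both carry the dense nitrogen label and assumes every newly made strand is unlabelled; the labels "heavy" and "light" and the function names are illustrative only.

```python
from collections import Counter


def replicate(duplexes, new_label="light"):
    """Semi-conservative copying: each parental strand stays intact
    and is paired with one newly built strand."""
    daughters = []
    for strand_a, strand_b in duplexes:
        daughters.append((strand_a, new_label))
        daughters.append((strand_b, new_label))
    return daughters


def density_classes(duplexes):
    """Count double helices by the labels of their two strands."""
    return Counter("/".join(sorted(duplex)) for duplex in duplexes)


population = [("heavy", "heavy")]  # assumed starting point: fully labelled DNA
for generation in range(1, 4):
    population = replicate(population)
    print(generation, dict(density_classes(population)))
```

After one round every double helix is a heavy/light hybrid, and in later rounds the number of hybrids stays constant while fully light helices accumulate. This is the pattern expected if each daughter conserves half of the original.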
Steps of DNA replication include the following:
■ During S phase of the cell cycle, DNA first unwinds and then locally separates due to the protein
helicase. The hydrogen bonds holding the base pairs together break.
■ Two identical nucleotide chains are built from one as the new bases form new complementary pairs.
The site where DNA is locally opened is called a replication fork.
■ Helicase opens up a localized area, enabling other enzymes to guide assembly of a new DNA strand.
During this phase, binding proteins hold the two strands apart.
■ Primase then attracts complementary RNA nucleotides to build a short piece of RNA, called an RNA
primer, at the start of each segment of DNA to be replicated.
■ The RNA primer then attracts DNA polymerase, which brings in DNA nucleotides complementary to the
exposed bases on the parental strand (which acts as a template).
■ New bases are added one at a time starting at the RNA primer.
■ The new DNA strand grows as hydrogen bonds form between the complementary bases and the
sugar-phosphate backbone links the newly incorporated nucleotides into a strong chain.
■ Once replication is under way, the RNA primer is removed and replaced with the correct DNA bases.
DNA polymerase adds nucleotides to the exposed 3’ end of the sugar in the growing strand. Replication adds
nucleotides in a 5’ to 3’ direction because this is the only chemical configuration in which DNA polymerase can function.
On one of the two strands, replication is therefore discontinuous, which means that it is accomplished in small pieces from the inner part of the
replication fork outward. These pieces are called Okazaki fragments. Ligase then seals the sugar-phosphate
backbones of the pieces, building a continuous new strand. DNA polymerase also proofreads by excising mismatched
bases and inserting corrected ones. Other enzymes also help in DNA replication, such as annealing helicase, which
rewinds unwound DNA sections. Human DNA is reported to replicate at about 50 bases per second.
RNA TRANSCRIPTION
Access to genetic information is selective and efficient. Transcription factors
regulate which genes are transcribed in a particular cell type under particular conditions. These factors
respond to signals from outside the cell, such as hormones and growth factors, or to external triggers such as lack of
oxygen.
Steps of RNA transcription include the following:
Initiation
o Transcription factors and RNA polymerase are attracted to a promoter which is a sequence that signals the
start of a gene.
o The transcription factor TATA binding protein is chemically attracted to a DNA sequence called the TATA box.
Once bound, it attracts other transcription factors in groups, and finally RNA polymerase.
Elongation
o Enzymes unwind the DNA double helix, and free RNA nucleotides bond with exposed complementary bases
on the DNA template strand. Note that the resulting RNA sequence is the same as that of the DNA coding strand, except with
uracil in place of thymine.
o RNA polymerase adds the RNA nucleotides in the order the DNA specifies. It reads the template
strand in a 3’ to 5’ direction, so the RNA molecule is synthesized in a 5’ to 3’ direction.
Termination
o The RNA polymerase complex also recognizes the end of the gene. Termination occurs in two modes depending on the
presence of the protein factor ρ (rho), and is thus classified as ρ-dependent or ρ-independent.
o The ρ protein destabilizes the elongation complex, leading to the release of both the template DNA and the
completed RNA chain.
RNA Processing
Before mRNA is translated into protein, it undergoes several steps of RNA processing. After
transcription, a short sequence of modified nucleotides called a cap is added to the 5’ end of the molecule. The
cap consists of a backwardly inserted guanine, which attracts an enzyme that adds methyl groups. At the 3’
end, a special polymerase adds about 200 adenines, forming a poly A tail, which is necessary for protein synthesis to
begin. Certain transcribed parts of the mRNA called introns (intervening sequences) are
also removed. Small RNAs associate with certain proteins to form small nuclear ribonucleoproteins (snRNPs), or snurps. Four snurps
come together to form a spliceosome that cuts the introns out. The ends of the remaining RNA are spliced, and the remaining segments
are called exons. Snurps also attach the exons to each other to form a mature mRNA that exits the nucleus.
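The three processing steps (capping, intron removal, and tailing) can be pictured as string operations. The sketch below is a simplification under stated assumptions: intron positions are supplied by hand as (start, end) slices rather than recognized by snurps, the cap is shown as the text marker "cap-", and the function name and example sequence are made up for illustration.

```python
def process_pre_mrna(pre_mrna: str, introns, tail_length: int = 200) -> str:
    """Splice out introns, then add a cap marker and a poly A tail.

    introns: list of (start, end) positions, end exclusive (assumed known).
    """
    exons = []
    position = 0
    for start, end in sorted(introns):
        exons.append(pre_mrna[position:start])  # keep the exon before this intron
        position = end                          # skip over the intron itself
    exons.append(pre_mrna[position:])           # keep the final exon
    spliced = "".join(exons)
    return "cap-" + spliced + "A" * tail_length


# Hypothetical transcript: exon (6 bases), intron (10 bases), exon (6 bases).
pre_mrna = "AUGGCC" + "GUAAGUCCAG" + "UUUUAA"
print(process_pre_mrna(pre_mrna, [(6, 16)], tail_length=5))
# cap-AUGGCCUUUUAAAAAAA
```

A short tail is used in the example only to keep the output readable; the default follows the figure of about 200 adenines given above.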
RNA STRUCTURES
RNA functions as a bridge between gene and protein. The bases of an RNA sequence are complementary to
those of one strand of a double helix, which is called the template strand. RNA polymerase builds an RNA
molecule from the template strand. The other non-template strand of the DNA double helix is called the coding
strand.
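The relationship between the template strand, the coding strand, and the RNA can be checked with a short sketch. The function name `transcribe` and the example sequences are assumptions chosen for illustration; the template is written 3' to 5' so that the RNA comes out 5' to 3'.

```python
# RNA base that pairs with each DNA template base (U replaces T in RNA).
DNA_TO_RNA = {"A": "U", "T": "A", "G": "C", "C": "G"}


def transcribe(template_3_to_5: str) -> str:
    """Build the RNA complementary to a template strand read 3' to 5'."""
    return "".join(DNA_TO_RNA[base] for base in template_3_to_5.upper())


template = "TACGGGTTC"   # template strand, written 3' to 5'
coding = "ATGCCCAAG"     # coding strand, written 5' to 3'

mrna = transcribe(template)
print(mrna)                               # AUGCCCAAG
print(mrna == coding.replace("T", "U"))   # True
```

The second line of output shows that the RNA carries the same sequence as the coding strand, with uracil in place of thymine.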
Both DNA and RNA are nucleic acids composed of sequences of nitrogen-containing bases joined by
sugar-phosphate backbones. However, RNA is single-stranded whereas DNA is double-stranded. Sometimes,
RNA molecules fold and loop upon themselves to take on a double-stranded character. RNA can also pair with
complementary strands of DNA or RNA to form a double helix. Another difference is that the
pyrimidine base thymine of DNA is not present in RNA and is replaced by uracil. DNA has deoxyribose as its
sugar while RNA has ribose. Their functions also differ. DNA stores and maintains genetic information while
RNA controls how the information is used. Unlike DNA, RNA can function as an enzyme.
There are three major types of RNA, each with a distinctive conformation.
Messenger RNA (mRNA) carries the information that specifies a particular protein. Each set of three
consecutive mRNA bases forms a genetic code word, or codon, that specifies a particular amino acid.
Differentiated cells can carry out specialized functions because they express certain subsets of
genes that produce mRNA molecules, or transcripts.
Ribosomal RNA (rRNA) associates with certain proteins to form a ribosome. This
organelle functions like a machine that assembles amino acids into proteins.
Transfer RNA (tRNA) has a cloverleaf conformation with three loops. One loop contains three
bases that form the anticodon, which is complementary to an mRNA codon. The end of the tRNA opposite the
anticodon bonds to a specific type of amino acid. For example, a tRNA with the anticodon sequence GAA
always picks up the amino acid phenylalanine with the help of binding enzymes.
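The codon that a given anticodon recognizes follows from complementary, antiparallel pairing. The sketch below assumes the anticodon is written 5' to 3' and assumes strict base pairing with no wobble at the third position; the function name is illustrative.

```python
# RNA-RNA pairing rules: A pairs with U, G pairs with C.
RNA_PAIRS = {"A": "U", "U": "A", "G": "C", "C": "G"}


def codon_for_anticodon(anticodon_5_to_3: str) -> str:
    """Return the mRNA codon (5' to 3') that pairs with the anticodon."""
    # Pairing is antiparallel, so read the anticodon from its 3' end.
    return "".join(RNA_PAIRS[base] for base in reversed(anticodon_5_to_3.upper()))


print(codon_for_anticodon("GAA"))  # UUC
```

UUC is one of the codons for phenylalanine, which agrees with the GAA example in the text.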
Other types of RNA
Beyond the primary role of RNA in protein synthesis, several varieties of RNA exist that are involved in
post-transcriptional modification, DNA replication, and gene regulation. Some forms of RNA are only found in
particular forms of life, such as in eukaryotes or bacteria.
Small nuclear RNA (snRNA)
snRNAs are involved in the processing of pre-messenger RNA (pre-mRNA) into mature mRNA. They are
short, with an average length of about 150 nucleotides.
Regulatory RNAs
A number of types of RNA are involved in regulation of gene expression, including micro RNA (miRNA), small
interfering RNA (siRNA) and antisense RNA (aRNA).
1. miRNA is found in eukaryotes, and acts through RNA interference (RNAi). miRNA can break down
mRNA that it is complementary to, with the aid of enzymes. This can block the mRNA from being
translated, or accelerate its degradation.
2. siRNA are often produced by breakdown of viral RNA, though there are also endogenous sources of
siRNAs. They act similarly to miRNA. An mRNA may contain regulatory elements itself, such as
riboswitches, in the 5' untranslated region or 3' untranslated region; these cis-regulatory elements
regulate the activity of that mRNA.
3. Antisense RNA (aRNA) is a single-stranded RNA that binds to a messenger RNA (mRNA) to block its
translation into protein. This process is called hybridization.
Transfer-messenger RNA (tmRNA)
tmRNA is found in many bacteria and plastids. It tags the proteins encoded by mRNAs that lack stop codons for
degradation, and it prevents the ribosome from stalling because of the missing stop codon.
Ribozymes (RNA enzymes)
RNAs are now known to adopt complex tertiary structures and act as biological catalysts. Such RNA enzymes
are known as ribozymes, and they exhibit many of the features of a classical enzyme, such as an active site, a
binding site for a substrate and a binding site for a cofactor, such as a metal ion.
One of the first ribozymes to be discovered was RNase P, a ribonuclease that is involved in generating tRNA
molecules from larger, precursor RNAs. RNase P is composed of both RNA and protein; however, the RNA
moiety alone is the catalyst.
Double-stranded RNA (dsRNA)
This type of RNA has two strands bound together, as with double-stranded DNA. dsRNA forms the genetic
material of some viruses.
PROTEIN TRANSLATION
Amino acids are building blocks of proteins. Each amino acid has characteristic biochemical properties
determined by the nature of its amino acid side chain. Amino acids are grouped according to their polarity as
follows:
● Nonpolar
● Polar
● Negatively-charged
● Positively-charged
Translating mRNA into the amino acids it encodes relies on several properties of the genetic code.
● Francis Crick and his colleagues showed that a codon consists of three nitrogenous bases that
encode a certain amino acid.
○ Ex: GCC is the codon that encodes the amino acid alanine.
● The genetic code does not overlap.
○ Ex: AUGCCCAAG
■ This sequence specifies three amino acids, encoded by the codons AUG, CCC, and
AAG. The code does not overlap, so it does not encode seven amino acids (AUG, UGC, GCC,
CCC, CCA, CAA and AAG).
● The genetic code includes controls.
○ Ex: The start codon AUG, which codes for methionine, signals the start of translation.
○ The stop codons UGA, UAA, and UAG signal the end of translation.
● All species use the same mRNA codons to specify the same amino acids, and therefore share the same
genetic code. This universality is evidence that life evolved from a common ancestor. Different
codons that specify the same amino acid are called synonymous codons.
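The difference between a non-overlapping and an overlapping reading of the example AUGCCCAAG can be made concrete. The two function names below are illustrative assumptions; only the first corresponds to how the genetic code is actually read.

```python
def split_codons(mrna: str):
    """Read the sequence three bases at a time, with no overlap."""
    usable = len(mrna) - len(mrna) % 3  # ignore an incomplete final triplet
    return [mrna[i:i + 3] for i in range(0, usable, 3)]


def overlapping_triplets(mrna: str):
    """Hypothetical overlapping reading: shift by one base each time."""
    return [mrna[i:i + 3] for i in range(len(mrna) - 2)]


mrna = "AUGCCCAAG"
print(split_codons(mrna))          # ['AUG', 'CCC', 'AAG']
print(overlapping_triplets(mrna))  # ['AUG', 'UGC', 'GCC', 'CCC', 'CCA', 'CAA', 'AAG']
```

The first reading gives the three codons named in the text, and the second gives the seven triplets that an overlapping code would produce.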
Steps of protein translation include the following:
● The mRNA leader sequence forms hydrogen bonds with a short sequence of rRNA in a small ribosomal
subunit. The first mRNA codon to specify an amino acid is always AUG, which attracts a tRNA carrying
methionine. Together these form the initiation complex.
● A large ribosomal subunit bonds to the initiation complex. The codon adjacent to the start codon (AUG)
then bonds to its complementary anticodon. The part of the ribosome that holds the mRNA and tRNA
together can be described as having two sites. The P (peptide) site holds the growing amino acid chain
and the A (acceptor) site next to it holds the next amino acid to be added to the chain.
● The amino acids link through a peptide bond with the help of rRNA that acts as a ribozyme.
● The first tRNA is released to pick up another amino acid of the same type and be used again.
● The ribosome moves down the mRNA by one codon and another tRNA brings in its amino acid.
● Elongation of the polypeptide halts when the A site of the ribosome encounters a stop codon (UGA, UAG or UAA).
● A protein release factor starts to free the polypeptide.
● The last tRNA leaves the ribosome, the ribosomal subunits separate and the new polypeptide is released.
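The steps above can be sketched as a loop that starts at the first AUG, moves one codon at a time, and stops at a stop codon. This is a minimal illustration, not a model of the ribosome: the codon table is deliberately partial (only a handful of well-established assignments), unknown codons are shown with the placeholder "Xaa", and the example mRNA is made up.

```python
# Partial codon table, for illustration only.
CODON_TABLE = {"AUG": "Met", "CCC": "Pro", "AAG": "Lys", "GCC": "Ala", "UUC": "Phe"}
STOP_CODONS = {"UGA", "UAG", "UAA"}


def translate(mrna: str):
    """Translate from the first AUG until a stop codon or the end of the mRNA."""
    start = mrna.find("AUG")          # initiation at the first start codon
    if start == -1:
        return []
    polypeptide = []
    for i in range(start, len(mrna) - 2, 3):   # move down one codon at a time
        codon = mrna[i:i + 3]
        if codon in STOP_CODONS:               # termination
            break
        polypeptide.append(CODON_TABLE.get(codon, "Xaa"))
    return polypeptide


# Hypothetical mRNA: 2-base leader, start codon, two codons, stop codon, trailer.
print(translate("GGAUGCCCAAGUAAGCC"))  # ['Met', 'Pro', 'Lys']
```

Bases before the start codon and after the stop codon are not translated, which mirrors the roles of the leader sequence and the stop codon in the steps listed above.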
After translation, several things may happen to the polypeptide. Some polypeptides are cleaved, have sugars added, or
aggregate. The cell either uses or secretes the finished protein. To function, the protein must fold into a specific
conformation, which is described at four levels:
● Primary structure – amino acid sequence
● Secondary structure – local folding of the polypeptide backbone into helices, pleated sheets, or
random conformations
● Tertiary structure – three-dimensional structure of the polypeptide
● Quaternary structure – non-covalent association of discrete polypeptide subunits into a multi-subunit
protein.
Chaperone proteins help the correct conformation arise. Ubiquitin attaches to misfolded proteins and enables
them to refold or escorts them to proteasomes for dismantling. This quality control keeps misfolded proteins from accumulating, which may lead
to the development of disease.
ENZYMES THAT METABOLIZE DNA AND RNA
Once DNA is polymerized, it is not static. It must be tapped selectively to make RNA, and it must be protected from
damage. In prokaryotes, protective systems guard the DNA against bacteriophages and
other viruses. In sexually reproducing organisms, the mixing of parental DNA generates genetic diversity in
the offspring. Here are some important enzymes that metabolize DNA:
Deoxyribonucleases or endonucleases
● Break the sugar-phosphate backbone of DNA
● Most are derived from bacteria, where they function as part of a primitive immune system that
cleaves foreign DNA entering the bacterial cell.
● Examples are BamHI from Bacillus amyloliquefaciens, HindIII from Haemophilus influenzae, SmaI from
Serratia marcescens
● Four general types
Type 1 has both nuclease and methylase activity
Type 2 cleaves the DNA directly at the binding site and is the most commonly used
Type 3 also has nuclease and methylase activity, as well as helicase ability
Type 4 has cutting and methyltransferase functions
■ Example is BseMII from Bacillus stearothermophilus
■ Cuts one strand of DNA 10 bp following the recognition sequence
■ Also adds methyl groups to both of the adenine residues in the target sequence
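How a type 2 enzyme cuts directly at its binding site can be sketched for the three enzymes named above. The recognition sequences and cut positions are not given in these notes; they are the commonly cited ones (BamHI G^GATCC, HindIII A^AGCTT, SmaI CCC^GGG). The sketch also assumes linear DNA and models only the top strand; the function name and test sequence are made up.

```python
# Enzyme: (recognition sequence, number of bases into the site where the top strand is cut)
ENZYMES = {
    "BamHI": ("GGATCC", 1),    # G^GATCC
    "HindIII": ("AAGCTT", 1),  # A^AGCTT
    "SmaI": ("CCCGGG", 3),     # CCC^GGG, a blunt cut in the middle of the site
}


def digest(dna: str, enzyme: str):
    """Cut a linear DNA sequence at every recognition site of the enzyme."""
    site, offset = ENZYMES[enzyme]
    fragments = []
    previous_cut = 0
    position = dna.find(site)
    while position != -1:
        cut = position + offset
        fragments.append(dna[previous_cut:cut])
        previous_cut = cut
        position = dna.find(site, position + 1)  # look for the next site
    fragments.append(dna[previous_cut:])
    return fragments


dna = "AAGGATCCTTTCCCGGGAA"  # hypothetical sequence with one BamHI and one SmaI site
print(digest(dna, "BamHI"))    # ['AAG', 'GATCCTTTCCCGGGAA']
print(digest(dna, "SmaI"))     # ['AAGGATCCTTTCCC', 'GGGAA']
print(digest(dna, "HindIII"))  # ['AAGGATCCTTTCCCGGGAA'] (no site, so no cut)
```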
DNA ligase
● Catalyzes the formation of a phosphodiester bond between adjacent 3’-hydroxyl and 5’-phosphoryl
nucleotide ends
Exonucleases
● Degrade DNA from its free ends (3’-hydroxyl or 5’-phosphate)
● Used in making stepwise deletions in linearized DNA or to modify DNA ends after cutting with
restriction enzymes
● Exonuclease I from E. coli degrades single-stranded DNA from the 3’ end into mononucleotides. It can
be used to remove single-stranded excess primers from double-stranded reaction products of DNA
copying or amplification procedures.
● Exonuclease III from E. coli removes 5’ mononucleotides from the 3’ end of double-stranded DNA in the
presence of manganese and magnesium. It is used in research setting to create nested deletions in
double stranded DNA or to produce single-stranded DNA for dideoxy sequencing.
● Exonuclease VII from E. coli digests single-stranded DNA from either the 5’ or 3’ end.
● Nuclease Bal31 from Alteromonas espejiana can degrade single- and double-stranded DNA from both
ends.
● Mung bean nuclease from mung bean sprouts also digests single-stranded DNA and RNA. It is used to
remove overhangs left by restriction enzymes to produce blunt ends for cloning. It is also used to resolve
hairpins or folds in RNA.
● S1 nuclease from Aspergillus oryzae and certain Neurospora species hydrolyzes single-stranded DNA
or RNA into 5’ mononucleotides.
● Other important nucleases are micrococcal nuclease, DNase I from bovine pancreas, and DNA pol I
from E. coli.
Helicases
● They are responsible for smooth transcription, replication and recombination without tangling of DNA.
● The nicking and re-closing of DNA relieves topological stress in highly coiled DNA.
● The topoisomerase type relaxes supertwisted DNA.
● The gyrase type untangles DNA through double-strand breaks and separates linked rings of DNA.
Methyltransferases
● Catalyze the addition of methyl groups to nitrogenous bases (usually adenine and cytosine)
RNA, on the other hand, is degraded by enzymes in much the same fashion as DNA. Here are some
important enzymes that metabolize RNA:
Ribonucleases
● RNase H digests the RNA strand in a DNA-RNA hybrid.
● RNase I cleaves single-stranded RNA.
● RNase III digests double-stranded RNA.
● RNase P removes precursor sequences from tRNA molecules.
● RNase A, RNase T1 and RNase T2 cleave single-stranded RNA at specific residues.
RNA Helicases
● Catalyze the unwinding of RNA and remove proteins from RNA-protein complexes.
MUTATIONS
A mutation is a change in a gene’s base sequence that is rare in a population and can cause mutant
phenotypes. It disrupts the function or abundance of a protein or introduces a new function. Most
loss-of-function mutations are recessive and most gain-of-function mutations are dominant.
Examples of common disorders caused by mutation are sickle cell disease and beta thalassemia. The former results from a single
base mutation in the beta globin gene and the latter from too few beta globin chains. Many medical
conditions result from mutations in collagen genes because of the precise conformation of the protein and its
pervasiveness in the body.
A spontaneous mutation arises due to chemical damage or to an error in DNA replication. The spontaneous
mutation rate is characteristic of a gene, and mutation is more likely to occur in regions of DNA sequence
repeats. In gonadal mosaicism, only some gametes have a spontaneous mutation. Mutagens are agents that
delete, substitute, or add DNA bases causing mutation.
The following are types of mutations:
● Point mutation
○ Alters a single DNA base
○ Could be a transition (purine to purine/ pyrimidine to pyrimidine) or a transversion (purine to
pyrimidine or vice versa)
● Missense mutation
○ Substitutes one amino acid for another
● Nonsense mutation
○ Substitutes a stop codon for a codon that specifies an amino acid leading to shortening of
encoded protein
○ The effect of this type of mutation is limited by nonsense-mediated decay, which destroys mRNAs that have
premature stop codons.
● Splice-site mutations
○ Add or delete amino acids and can shift the reading frame
● Deletion mutation
○ Removes genetic material
● Insertion mutation
○ Adds genetic material
● Frameshift mutation
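○ Adds or deletes a number of bases that is not a multiple of three, which shifts the reading frame and alters the sequence of amino acids that follows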
● Tandem Duplication
○ Addition of a copy of a gene next to the original
● Pseudogenes
○ A duplicate of a gene that has mutated; it can disrupt chromosome pairing
● Transposons
○ “Jumping genes” that disrupt the site they jump from, shut off transcription of the gene they
jump into, or alter the reading frame if their length is not a multiple of three bases.
● Expanding repeats
○ Lengthen the DNA and add stretches of the same amino acid to the encoded protein
○ Expanding DNA repeats attract each other, which causes replication errors
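Several of the categories in this list can be told apart by simple rules, sketched below. The function names are illustrative, and the small DNA codon table is a partial one chosen only for the examples. The GAG to GTG change (glutamate to valine) is the well-known sickle cell substitution in the beta globin gene.

```python
PURINES = {"A", "G"}
PYRIMIDINES = {"C", "T"}

# Partial DNA codon table (coding strand), for illustration only.
CODONS = {"GAG": "Glu", "GAA": "Glu", "GTG": "Val", "TAG": "Stop"}


def substitution_type(old_base: str, new_base: str) -> str:
    """Transition: purine to purine or pyrimidine to pyrimidine. Otherwise a transversion."""
    if old_base == new_base:
        raise ValueError("no base change")
    same_class = {old_base, new_base} <= PURINES or {old_base, new_base} <= PYRIMIDINES
    return "transition" if same_class else "transversion"


def codon_effect(old_codon: str, new_codon: str) -> str:
    """Classify a codon change by its effect on the encoded amino acid."""
    old_aa, new_aa = CODONS[old_codon], CODONS[new_codon]
    if new_aa == old_aa:
        return "synonymous"
    return "nonsense" if new_aa == "Stop" else "missense"


def indel_effect(bases_changed: int) -> str:
    """Insertions or deletions shift the frame unless a multiple of three."""
    return "in-frame" if bases_changed % 3 == 0 else "frameshift"


print(substitution_type("A", "T"), codon_effect("GAG", "GTG"))  # transversion missense
print(substitution_type("G", "T"), codon_effect("GAG", "TAG"))  # transversion nonsense
print(substitution_type("G", "A"), codon_effect("GAG", "GAA"))  # transition synonymous
print(indel_effect(1), indel_effect(3))                         # frameshift in-frame
```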
%Mol G/C Content
(and more recently as low as 20% in the Carsonella genome)
The formula: