Biology 102 - General Biology
Biomolecules - Part 3
The Central Dogma of Molecular Biology
DNA makes RNA makes Protein

The relationship of DNA, RNA, and Proteins
DNA (deoxyribonucleic acid) is the genetic material found in
all living cells (and many viruses). In cells, DNA is found
as a double helix. In prokaryotic cells (and mitochondria and chloroplasts),
there is typically a single circular DNA molecule. In eukaryotic cells there are
several pairs of chromosomes and although they are more complex, each
chromosome contains a single, very long molecule of DNA. The genes are
within the DNA molecules.
Mendel, the father of genetics, knew nothing of chromosomes or DNA. He
deduced the laws of inheritance purely from observations of the progeny
of his pea plants. The discovery that DNA is the genetic material came
in the middle of the 20th century. Experiments with bacteria
and viruses showed that DNA was the genetic material and not protein, as
many scientists had believed. At the time, people reasoned that
proteins, with 20 different kinds of subunits, were more complex than nucleic
acids, with only 4 different subunits, and so were better suited to carry
genetic information. (They forgot that the Morse code,
which consists only of dots and dashes, can code for all 26 letters of our
alphabet.) Scientists were able to show that with bacterial viruses, the
viral DNA entered the host cell and was able to direct the synthesis of
complete viral particles, both the DNA and the protein capsid, while the
viral protein remained outside the host cell.
The structure of DNA was deduced in 1953 by Watson and Crick using the
accumulated biochemical knowledge of DNA and the X-ray diffraction pictures
of Rosalind Franklin. It was already known that DNA was composed of four
subunits called nucleotides which were composed of phosphate, sugar and
one of four different bases. These bases were adenine, thymine, cytosine
and guanine, which are referred to as A, T, C, and G. It was also known
that in every DNA molecule, the number of A's always equaled the number
of T's, and the number of C's always equaled the number of G's.
Part of Watson and Crick's success was due to their thinking as biologists.
They reasoned out the structure from their knowledge of biochemistry,
physical chemistry and the role of DNA in the cell. They knew the structure
of DNA would have to be able to explain how the molecule replicated and
how it coded for proteins. (It had previously been shown by biochemical
geneticists that genes code for proteins.) They proposed that
A = T and C = G because these bases form complementary base pairs between
the two strands of a double helix. The bases formed the rungs of the ladder-like
molecule. The outside strands of the ladder were repetitions of sugar
and phosphate, sugar and phosphate, one after the other.
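The pairing rule can be checked with a few lines of Python. This is only a sketch: the sequence is made up for illustration, the function names are not from any library, and the direction in which each strand runs is ignored. It builds the partner strand from the A-T and C-G rule and then counts the bases across both strands, which shows why the number of A's must equal the number of T's and the number of C's must equal the number of G's.

```python
# Pairing rules for DNA: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base partner of one DNA strand."""
    return "".join(PAIRS[base] for base in strand)


def base_counts(strand_1: str, strand_2: str) -> dict:
    """Count each base across both strands of the double helix."""
    both = strand_1 + strand_2
    return {base: both.count(base) for base in "ATCG"}


strand = "ATGCCGTAAT"            # made-up example sequence
partner = complement(strand)     # the other side of the ladder
counts = base_counts(strand, partner)

print(partner)
print(counts)
```

Because every A on one strand faces a T on the other (and every C faces a G), the totals for A and T, and for C and G, come out equal no matter what sequence you start with.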
It was further proposed that replication was accomplished by the separation
of the two parent strands, each acting as a template to attract the complementary
bases of new nucleotides to form a new half of the molecule. Thus, DNA
molecules relied on complementary base pairing for replication with each
new double stranded molecule having one parent strand and one newly assembled
strand. This is called "semi-conservative" replication since one of the
parent strands is "conserved" in each new DNA molecule. The bonds between
the sugars (deoxyribose) and phosphates are strong covalent bonds; however, the
hydrogen bonds between the bases are relatively weak (individually) and
can be broken to open the DNA molecule for replication and transcription.
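Semi-conservative replication can be sketched the same way. The example below is illustrative only: the sequences are invented, the function name is hypothetical, and the enzymes and strand directions involved in real replication are left out. Each parent strand is used as a template for a new partner, so each daughter molecule holds one old strand and one new strand.

```python
# Pairing rules for DNA: A pairs with T, and C pairs with G.
PAIRS = {"A": "T", "T": "A", "C": "G", "G": "C"}


def replicate(double_helix):
    """Separate the two parent strands and build a new partner for each.

    Returns two daughter molecules, each written as
    (parent strand, newly assembled strand).
    """
    parent_1, parent_2 = double_helix
    new_1 = "".join(PAIRS[base] for base in parent_1)
    new_2 = "".join(PAIRS[base] for base in parent_2)
    return (parent_1, new_1), (parent_2, new_2)


parent = ("ATGCCG", "TACGGC")    # made-up double-stranded molecule
daughter_a, daughter_b = replicate(parent)

print(daughter_a)
print(daughter_b)
```

Both daughters carry the same pair of sequences as the parent, and each one keeps exactly one of the original strands.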
Watson and Crick proposed that the genetic code could be found in the
sequence of bases in one of the two strands. (For any given gene, only one
strand is "read," that is, transcribed into mRNA.)
A little later, the "code was broken" and found to be a three-letter non-overlapping
code. The three "letters" are the bases (nucleotides) and each sequence
of three bases is called a codon. Codons code for amino acids. (In tables,
codons are written as RNA codons, using the complementary sequence in
RNA nucleotides.) If the code is read with 3 bases at a time, there are
4 X 4 X 4 = 64 combinations of three bases. There are only 20 amino acids,
so this means the code is "redundant" since more than one codon can code
for the same amino acid. The codon, AUG, signals the "start" of translation
and three different codons signal the termination of translation (UAA,
UGA, UAG). [One speculation is that the original code was a two-letter code with 4 X
4 = 16 different doublet codons (a number closer to 20) but when a few
"fancier" amino acids were added to the cell's repertoire, more codons
were needed. When you look at the table of codons you will see that the
"third" base is less important: many amino acid codons can use any of the
bases in the third place, so only the first two bases matter in these
cases.] The linear sequence of bases in the DNA (gene) codes for the
linear sequence of amino acids in the corresponding polypeptide (protein).
Since proteins usually contain hundreds of amino acids, genes
can be very long stretches of DNA (many kb, or kilobases).
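The codon arithmetic above is easy to verify by listing every combination. The short Python sketch below (variable names are just for illustration) writes the codons in RNA letters, as the codon tables do, and counts the doublets, the triplets, and the triplets left over for amino acids once the three stop codons are set aside.

```python
from itertools import product

BASES = "UCAG"                            # RNA bases, as used in codon tables
STOP_CODONS = {"UAA", "UGA", "UAG"}       # the three termination codons

# Every possible two-letter and three-letter combination of the four bases.
doublets = ["".join(combo) for combo in product(BASES, repeat=2)]
triplets = ["".join(combo) for combo in product(BASES, repeat=3)]

# Codons that actually specify an amino acid (everything except the stops).
sense_codons = [codon for codon in triplets if codon not in STOP_CODONS]

print(len(doublets))       # 4 X 4
print(len(triplets))       # 4 X 4 X 4
print(len(sense_codons))   # triplets minus the three stop codons
```

Sixteen doublets would not be enough for 20 amino acids, while the 61 triplets that code for amino acids are more than enough, which is why several codons can stand for the same amino acid.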
Many enzymes are devoted to the process of replication and the repair
of mistakes in the structure of DNA. If these "mistakes" in replication
or damage to DNA caused by chemical and physical agents (e.g., UV light,
X-rays, tobacco, and other mutagens) are not repaired, the result is a
mutation either in your gametes (sex cells) or in your body cells. If
a mutation occurs in your body cells, it can result in cancer. Most carcinogens
(agents that cause cancer) are, in fact, mutagens.
In eukaryotic cells, replication and transcription occur in the nucleus
on DNA templates and translation occurs in the cytosol on ribosomes (either
free or on the RER). In prokaryotic cells all three occur in the cytosol.
Proteins, often composed of more than one polypeptide chain, are the
workhorses of the cells. Each distinct polypeptide chain is coded for by its own
gene. Not all genes in each cell are "turned on." The cells of the tissues
and organs of your body make only the proteins required for their function.
Therefore, not all the DNA is transcribed in every cell. There are special
regulatory proteins that have the job of controlling which genes (stretches
of DNA) will be transcribed in the various tissues.
In eukaryotic cells, DNA never leaves the nucleus so the genetic messages
the DNA contains must be "transcribed" for export to the ribosomes where
they will be translated into proteins (polypeptides). Transcription is
rather like replication. It also occurs on a DNA template and like replication,
the DNA opens up but only one of the two strands is copied into a messenger
RNA molecule. The copying process uses complementary base pairing just
as replication uses complementary base pairing. However, the resulting
RNA molecule is single stranded and uses uracil (U) instead of T to pair
with A. It is released into the cytoplasm.
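Transcription can be sketched in Python as well. The template sequence here is invented and the function name is hypothetical; the sketch also ignores strand direction and any processing of the RNA. It copies one DNA strand into RNA by complementary base pairing, using U wherever the DNA template has an A.

```python
# Pairing rules for copying a DNA template into RNA.
# RNA uses uracil (U) instead of thymine (T) to pair with A.
DNA_TO_RNA = {"A": "U", "T": "A", "C": "G", "G": "C"}


def transcribe(template_strand: str) -> str:
    """Build the single-stranded mRNA that pairs with a DNA template strand."""
    return "".join(DNA_TO_RNA[base] for base in template_strand)


template = "TACGGCATT"           # made-up DNA template strand
mrna = transcribe(template)

print(mrna)
```

Note that the result contains no T at all, and that only one of the two DNA strands was needed to make it.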
The messenger RNA (mRNA) is thus a copy of a gene. It leaves the nucleus
through nuclear pores and it travels to the ribosomes in the cytoplasm.
The ribosomes are composed of ribosomal RNAs (synthesized in the nucleolus
in eukaryotic cells) and ribosomal proteins. They are rather complex structures
composed of a small and a large subunit. Ribosomal RNA is transcribed
from rRNA genes. Another type of RNA is transfer RNA (tRNA). Transfer RNAs,
too, are transcribed from their own (tRNA) genes. They are small molecules
compared to rRNAs and mRNAs. They have the very important function of
reading the codons of the mRNA and of bringing the correct amino acid,
corresponding to that specific codon, into alignment on the ribosome.
There are enzymes that specifically attach the correct amino acid to the
correct tRNA with the appropriate anticodon. The tRNAs are all the same
dimension from head to foot. They read the codon with their "head" end
and align the amino acids at their "foot" end. The ribosome then joins the
amino acids together into the polypeptide chain.
Nucleic Acids and Nucleotides
To understand the nucleic acids and proteins, it is important
to understand how they are related. The Central Dogma of molecular biology
states that: DNA makes DNA (replication); DNA makes RNA (transcription)
and RNA directs the synthesis of proteins (translation). DNA is the genetic
material in all cells and in eukaryotic cells it stays in the nucleus
in the form of chromosomes. RNA molecules are copies of genes, much like
a blueprint of a house. The RNA that codes for a protein is appropriately
called messenger RNA (mRNA). Messenger RNA goes out to the cytoplasm of
the cell where the sequence of nucleotides is translated into a sequence
of amino acids to form a unique protein.
DNA is a double stranded molecule made of four subunits
called nucleotides. These nucleotides are composed of a nitrogenous base
(A = adenine, T = thymine, C = cytosine, G = guanine) attached to a sugar
called deoxyribose and the sugar is attached to a phosphate group which
is negatively charged. The double stranded DNA helix is like a twisted
ladder. The sides of the ladder are repeating sugar-phosphate and the
rungs of the ladder are the bases, A, T, C, G. The rungs always are A
matched to T and C matched to G. This specific complementary base pairing
is a chemical necessity and is responsible for the faithfulness of DNA
replication each time a cell divides. The double strandedness also gives
the molecule stability but the bonds between the bases are weak bonds
and can be broken for replication and transcription to occur. The genetic
code is found in the sequence of bases, and for any given gene only one of
the two strands (one side of the ladder) is read.
RNAs are single stranded molecules also composed of four
different nucleotides. However, the nucleotides in RNA contain the sugar
ribose instead of deoxyribose and they contain uracil instead of thymine.
The RNA is synthesized on one strand of the DNA molecule (the template strand),
so its sequence matches that of the other ("sense") strand, with U in place of T.
There are three main kinds of RNA in the cell: messenger RNA (mRNA), transfer RNA
(tRNA) and ribosomal RNA (rRNA). These will be the subject of a future
lecture.
ATP (adenosine triphosphate) is a nucleotide and also
an "energy" coenzyme. It is a very important molecule in energy
metabolism and works with a large variety of enzymes. It is the common
currency used by all cells to store and transfer energy in cellular
processes.
THE CENTRAL DOGMA
DNA (THE GENES) CODE FOR MESSENGER RNA WHICH GETS TRANSLATED
INTO PROTEINS
This figure shows the steps in the process of decoding the
genes: the DNA code of a gene (top) is transcribed and processed into
a messenger RNA (mRNA) (2nd and 3rd row) and finally translated into a
protein (bottom red line).
Proteins and Amino Acids
Polypeptide is the name given to a chain of amino acids
synthesized from one mRNA. The term polypeptide refers to the fact that
amino acids are linked by what is called a peptide bond and, of course,
poly means many. Proteins are often composed of more than one polypeptide
chain. Each polypeptide chain is coded for by a gene. The sequence of
nucleotides (ATCG) in the gene determines the sequence of amino acids
in the protein for which it codes.
Proteins have a variety of functions. The sequence of the
amino acids in each individual protein determines its function. A very
large number are enzymes. Enzymes catalyze reactions and make them go
much, much faster than they could if the enzyme were not around. The sugar
on your table would eventually break down to CO2 and H2O
but it would take a very, very long time. When you put sugar into your
body, the sugar is broken down within minutes to provide ATP and to release
CO2 and H2O. Enzymes are responsible for nearly all the
metabolic reactions that occur in all the cells of our bodies. Other proteins
act as carriers; an example is hemoglobin, which has four polypeptide chains.
Some proteins are cell membrane receptors for protein hormones. Others
are antibodies, the molecules that are made by special white blood cells
(lymphocytes) to fight off foreign organisms and molecules. Antibodies
are proteins that contain four polypeptide chains. All antibodies have
a similar structure except for the regions which bind to the foreign cell,
virus, or other molecule. Each antibody-producing cell produces only one
kind of antibody and once stimulated it "remembers" and will produce its
antibodies whenever it meets the same invader. That is why we get booster
shots for vaccinations: to help our cells remember to make antibodies
to the organism or molecule against which we were inoculated. The sequence
of amino acids in each antibody molecule determines what antigen (foreign
molecule) it will bind to. DNA-binding proteins turn genes on and
off; in that way they regulate what each cell is producing. Not all cells
make the same proteins and even the same cell may make different proteins
at different times. Some proteins like collagen, found in connective tissues,
and keratin, found in hair, are purely structural proteins. These examples
are not exhaustive of the many functions served by proteins.
Proteins are composed of 20 different amino acids. They
are the "letters" of the protein "words" much as our
alphabet is used to form the words of our language. However, protein "words"
are much, much longer than the words we use. Proteins contain hundreds of
amino acids while the words of our language seldom contain more than 10 letters.
The 20 amino acids all have individual names and individual
properties and scientists use single letter abbreviations for each of
the 20 different amino acids.
The sequence of amino acids in a protein is called its primary
structure (the bonds between the amino acids are called peptide
bonds) and the sequence is determined by the genetic code. The final shape
of any particular protein depends on the sequence of amino acids it contains.
Except for the purely structural proteins, most proteins are globular
but they often contain regions of alpha helices and sometimes, beta sheets
within them. The alpha helix and beta sheet are referred to as secondary
structure (the bonds that maintain the secondary structure are
hydrogen bonds). The 3-D shape of a protein is the result of it folding
on itself. This is referred to as tertiary structure
(the bonds involved include hydrogen bonds, electrostatic bonds, van der
Waals forces, and covalent disulfide bonds). As stated earlier, many
functional proteins have more than one polypeptide chain. The association
of more than one polypeptide chain to form the functional protein is referred
to as quaternary structure (the bonds are the same as
in the tertiary structure).
When the bonds involved in its structure are disrupted the
protein is said to be denatured. When you fry an egg you denature the
proteins with heat; when you whip the egg with an egg beater you also
denature them. Another alternative is to put the egg in orange juice, which
is acidic, and stir. Each of these methods breaks the weaker bonds of secondary,
tertiary, and quaternary structure but does not break the strong peptide
bonds between the amino acids. However, you cannot "renature"
the egg's proteins; they cannot assume their original shape again. The enzymes
in your digestive tract will break down the bonds between the amino acids
so they can be absorbed into your blood stream and sent to your cells
to build your proteins.
An interesting example of denaturing/renaturing proteins
is making curly hair from straight or straight hair from curly. Hair is
made of proteins and heat (curling iron) can break the weaker bonds. When
the hair cools it takes the shape of your curling or straightening device.
When hair is permed, the first chemical agent breaks bonds that hold the
hair proteins in shape (chiefly the disulfide bonds), and the hair is then
held in the desired shape (curly or straight) until the second chemical
agent is added, which causes the bonds to reform in the desired pattern.
Although proteins are macromolecules, they are still quite
small compared to the size of a cell. A cell will contain many thousands
of different proteins that carry out its work.

An Amino Acid (not to scale; proteins below)
Primary, secondary, tertiary and quaternary structure
of proteins.
ABCDEF are several different proteins
containing alpha helices and beta sheets.
Their tertiary structure gives them a globular shape
(not to scale; remember, proteins are each made of
hundreds of amino acids)