As Many Exceptions As Rules: The Language Of Our DNA

Biology concepts – nitrogenous base, nucleoside, nucleotide, DNA, RNA, second messengers, G protein coupled receptors, cAMP, cGMP, cyclic dinucleotides

Grammar isn’t easy. Small changes can lead to large differences in meaning. It is like this with the terminology in molecular biology as well. TK is a tyrosine kinase, while TK1 is a thymidine kinase. Thymine is a nitrogenous base of DNA while thiamine is a vitamin. Not knowing the difference can keep you from that PhD you’ve been wanting.

It may be that English grammar is the only subject that can approach the number of exceptions one finds in biology. When do you use "who" instead of "whom;" “its” has only got an apostrophe when it isn’t possessive; I before E except after C; plural nouns add “s” but you take away the “s” to make a plural verb; their vs. there vs. they’re. It’s exasperating – English grammar should be taught only to those over 25 years of age, when one is mature enough to handle the stress.

Today we are going to talk about the building blocks of DNA and RNA – they can be as confusing as grammar. Terms and structures will look and sound similar, but their functions are very different. We’ll try to minimize the confusing details and maximize the amazing differences.

The basic building block of a nucleic acid is the nucleotide. This is a complex molecule made up of one or more phosphate groups, a ribose or deoxyribose sugar, and one of five nitrogenous bases (A, C, G, T, or U – those are the 5 – for now). Already it's a little confusing, but we can add more complexity; if you have just the base and sugar, it is called a nucleoside, not a nucleotide. Let’s use one base as an example.

Adenine (A) is the name of one nitrogenous base. If it is bound to a ribose, it is called adenosine (A); if it is bound to deoxyribose, it is called deoxyadenosine (dA). If you add a phosphate, you get the nucleotide, but the name depends on how many phosphates: one phosphate = adenosine monophosphate (AMP) or deoxyadenosine monophosphate (dAMP), 2 phosphates = adenosine or deoxyadenosine diphosphate (ADP or dADP), 3 phosphates = the triphosphate (ATP or dATP).

The other nitrogenous bases use the same system – mostly. Cytosine (C) and guanine (G) form cytidine or guanosine nucleosides or nucleotides. The exceptions are thymine (T) and uracil (U). T is formed from dUMP by adding a methyl (-CH₃) group, but not from UMP. Therefore, you don’t really find thymidine, only deoxythymidine. Since they know it only comes in one form, scientists go ahead and call it thymidine - thanks a lot.
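The naming rules above are regular enough to write down as a few lines of Python. This is only a sketch of the convention described in this post; the function name `name_molecule` and its parameters are made up for illustration, and it only handles zero to three phosphates.

```python
# Ribonucleoside names (base + ribose) for the bases that follow the regular rule.
RIBONUCLEOSIDE = {
    "A": "adenosine",
    "C": "cytidine",
    "G": "guanosine",
    "U": "uridine",
}
PHOSPHATE_PREFIX = {1: "mono", 2: "di", 3: "tri"}


def name_molecule(base, deoxy=False, phosphates=0):
    """Return (full name, abbreviation) for base + sugar + 0-3 phosphates.

    Hypothetical helper: it only encodes the naming convention from the text.
    """
    if phosphates not in (0, 1, 2, 3):
        raise ValueError("this sketch only handles 0 to 3 phosphates")
    if base == "T":
        # The exception: T is made on deoxyribose only, and by convention
        # the "deoxy" is dropped from the name.
        if not deoxy:
            raise ValueError("this sketch treats T as deoxyribose-only")
        name = "thymidine"
    elif base in RIBONUCLEOSIDE:
        name = ("deoxy" if deoxy else "") + RIBONUCLEOSIDE[base]
    else:
        raise ValueError(f"unknown base: {base}")
    abbrev = ("d" if deoxy else "") + base
    if phosphates:
        name += " " + PHOSPHATE_PREFIX[phosphates] + "phosphate"
        abbrev += "MDT"[phosphates - 1] + "P"  # M, D, or T, then P
    return name, abbrev


print(name_molecule("A"))                            # nucleoside
print(name_molecule("A", deoxy=True, phosphates=3))  # nucleotide
print(name_molecule("T", deoxy=True, phosphates=1))  # the exception
```

Zero phosphates gives you a nucleoside; anything more gives you a nucleotide. The only special case the sketch needs is thymine.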

As alluded to in the text, base plus sugar equals nucleoside. Add a phosphate, or two, or three and you have nucleotides. The sugar can be ribose or deoxyribose, the difference being the OH at the second carbon position. On the right are the possible bases; the purines have two rings, the pyrimidines have one. Notice how adding a methyl group to uracil makes thymine or how taking away an amine group from cytosine makes uracil. These will be important later.

In addition to the modification of U to make T, there is the removal of the 2’-OH to make deoxyribose out of ribose. This removal happens after the nucleotide is formed. Together, the modification of U to dU and the modification of dU to dT are strong evidence that RNA predates DNA, and they support the RNA world hypothesis that we talked about two weeks ago.

We said above that nucleotides are the building blocks of DNA and RNA. Specifically, it's the triphosphate nucleotides (NTP or dNTP, where N means any of the bases) that are used for incorporation into the growing chains of RNA and DNA. The energy for the bond comes from releasing two of the phosphates, so the nucleotides in DNA and RNA are bonded through one phosphate linkage.

The building of nucleic acids comes from pools of NTPs and dNTPs in the cell. Evidence shows that the pool of dNTPs is about 1/10 that of NTPs. This means that there are only enough dNTPs in the cell to support DNA replication for about 30 seconds. This implies that it's the rate of turning NDPs into dNDPs (then to dNTPs) that controls things like cell cycle and cell division; no replication of DNA, no division.

Ribonucleotide reductase turns NDPs into dNDPs. It is well controlled. The catalytic site is where the reaction takes place, so the NDP goes there. The activity site requires an ATP to activate or a dATP to inactivate the enzyme (this keeps the dNTP levels in check). The specificity site says which NDP can be acted on. When dATP or ATP is bound at the specificity site, the enzyme accepts UDP and CDP into the catalytic site; if dGTP is bound, ADP can be acted on; if dTTP is bound in the specificity site, GDP enters the catalytic site.
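The control rules in this caption amount to two small lookup tables, so here they are as a Python sketch. It encodes only the rules stated above; the names `SPECIFICITY`, `ACTIVITY`, and `accepted_substrates` are invented for illustration, and a real enzyme responds to concentrations rather than flipping a switch.

```python
# Effector bound at the specificity site -> NDPs allowed into the catalytic site.
SPECIFICITY = {
    "ATP": ("CDP", "UDP"),
    "dATP": ("CDP", "UDP"),
    "dGTP": ("ADP",),
    "dTTP": ("GDP",),
}

# Effector bound at the activity site -> is the enzyme switched on?
ACTIVITY = {"ATP": True, "dATP": False}


def accepted_substrates(activity_effector, specificity_effector):
    """Return the NDPs the enzyme will reduce under the caption's rules."""
    if activity_effector not in ACTIVITY:
        raise ValueError(f"unknown activity-site effector: {activity_effector}")
    if specificity_effector not in SPECIFICITY:
        raise ValueError(f"unknown specificity-site effector: {specificity_effector}")
    if not ACTIVITY[activity_effector]:
        return ()  # dATP at the activity site shuts the enzyme down
    return SPECIFICITY[specificity_effector]


print(accepted_substrates("ATP", "dGTP"))   # enzyme on, ADP accepted
print(accepted_substrates("dATP", "dTTP"))  # enzyme off, nothing accepted
```

Reading the table row by row shows the logic: each finished product nudges the enzyme toward making a different one, and plenty of dATP turns the whole thing off.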

The concentration of dT is especially important, since it only comes from modifying dU. If you add some extra thymidine to cells, they will think that they have enough dNTPs. This turns off the enzyme (ribonucleotide reductase) that converts NDPs to dNDPs. As a result, you won’t have enough dNTPs to make DNA and the cell will just stop.

Uses for nucleotides A, G, and C besides inclusion in DNA or RNA are more apparent (nature hates unitaskers). ATP should be near and dear to all our hearts - all our organs for that matter. ATP is the energy currency of the cell. The energy released when two phosphates are lost to incorporate a nucleotide into a growing nucleic acid is the same energy released when ATP is hydrolyzed to ADP during an enzyme reaction or relaxation of a muscle.

An adenosine variant called cyclic AMP (cAMP) is just as crucial as any other biomolecule you can name. An uncountable number (O.K., I’m sure someone knows) of cellular reactions are regulated by the levels of cAMP in the cell.

Cyclic GMP is a signaling compound similar to cAMP. Each controls a variety of regulatory pathways, acting as a second messenger to convey information in the cell. There are also cyclic dinucleotides. Bacteria use c-di-AMP and c-di-GMP as second messengers. This has been known for some time, but a new study shows that these cyclic dinucleotides stimulate specific inflammation in a mammalian host by triggering production of the proinflammatory molecule IL-1beta. This stimulation occurs via a completely new pathway. These are most definitely important molecules outside the nucleic acids.

cAMP and cGMP are single nucleotides in which the phosphate group binds to the sugar at two points – it circularizes. Just because they aren’t shown here, don’t think that cUMP or cCMP don’t exist – they do, and they are second messengers too. In the case of the cyclic dinucleotides, the phosphate of each nucleotide is joined to two different sugar molecules. It is still circular, but in a way that involves both nucleotides. The cGMP and cAMP are used in higher organisms; the c-di-GMP and c-di-AMP are used by bacteria for various operations, everything from gene regulation to virulence.

Cyclic di-GMP may be important for secondary signaling, but GTP and GDP also get into the game. G protein coupled receptors start many of the second messenger systems. There are many types of G protein coupled receptors, but that will have to wait for another day.

CTP can act as an enzyme cofactor, especially in the production of one of the phospholipids that is most important in biological membranes (phosphatidylcholine). A similar reaction using CTP as a cofactor is the focus of a new study because the product of the reaction is important in the life cycle of the parasite that causes malaria (P. falciparum). The new study shows that the levels of CTP and CDP will regulate the efficiency of the enzyme using CTP, so manipulating these levels might be a target for anti-malarial drugs.

Lastly, uridine (U) is important outside of nucleic acids as well. When combined with an adenosine and four (yes, 4) phosphates, it is called uridine adenosine tetraphosphate (Up₄A). This dinucleotide has recently been identified as an important controlling molecule in vascular endothelium physiology. It causes a contraction in several types of muscle cells in vessel walls, thereby regulating the tension of the walls, called vascular tone. In this way, Up₄A helps manage pressure and its dysfunction is important in many vascular diseases.

As we discussed a couple of weeks ago, DNA is double stranded and the bases are paired - A with T and G with C. Chargaff first showed that the levels of dG and dC and of dA and dT were always the same in a cell. Donohue then showed that they could base pair by hydrogen bonds.
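If the pairing rule holds, Chargaff's observation falls out automatically, and a few lines of Python can show it. This is a toy sketch: the sequence is made up, and the names `PAIR`, `complement_strand`, and `duplex_counts` are invented here for illustration.

```python
from collections import Counter

# Watson-Crick pairing partners in DNA.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G"}


def complement_strand(strand):
    """Return the partner strand, read in its own 5'-to-3' direction."""
    return "".join(PAIR[base] for base in reversed(strand))


def duplex_counts(strand):
    """Count every base in the double-stranded molecule built from `strand`."""
    return Counter(strand + complement_strand(strand))


strand = "GATTACA"  # toy sequence, not from any real genome
print(complement_strand(strand))
counts = duplex_counts(strand)
print(counts["A"] == counts["T"], counts["G"] == counts["C"])
```

Whatever single strand you start with, the duplex always ends up with as many A's as T's and as many G's as C's, because every base brings its partner along.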

Different amounts of G+C vs. A+T in regions of DNA lead to different staining of the chromosome regions. GC regions are more dense, so some stains are excluded and they show up unstained. This difference in GC content has functional consequences as well. High GC areas are more gene dense, and have regulatory regions as well. A new study shows that in chickens, high GC regions are associated with regulatory regions of genes – the higher the GC content, the more expression from that gene.

If you know how much dG is in a cell, then you know how much dC is there. But this doesn’t mean that G+C = A+T. The %GC content is different in different species. P. falciparum is a very low GC organism; only about 20% of the nucleotides of its DNA are G or C, while some prokaryotes are up to 78% GC. See the picture caption for more on this subject.
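%GC is simple arithmetic: count the G's and C's and divide by the total. Here is a minimal Python sketch; the function name `gc_fraction` and the short test sequences are made up for illustration and are not real genomic data.

```python
def gc_fraction(dna):
    """Fraction of the bases in `dna` that are G or C (0.0 for an empty string)."""
    dna = dna.upper()
    if not dna:
        return 0.0
    gc = sum(1 for base in dna if base in ("G", "C"))
    return gc / len(dna)


# Toy sequences: one AT-rich, one GC-rich.
print(gc_fraction("ATATATATGC"))  # 2 of 10 bases are G or C
print(gc_fraction("GCGCGCGCAT"))  # 8 of 10 bases are G or C
```

Both toy sequences obey the pairing rules once you add the partner strand, yet their GC content is very different, which is the point: G = C and A = T, but G+C need not equal A+T.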

So DNA has dA, dC, dG, and dT, while RNA uses U instead of T. Why? Such a simple question, but not many people bother to ask. There is more than one reason, but they’re all related to long-term protection of genetic information.

The cytosine base can be deaminated (removal of an amine group) to form uracil. In RNA, this mistaken identity would lead to an incorrect translation or perhaps a loss of function of a structural RNA. Fortunately, these are short-term problems because each RNA is short lived. But if U were used in DNA, then how would the repair enzymes know which U’s were correct and which were actually deaminated C’s?

Since dT is used in DNA instead of dU, any dU must be a deaminated C and should be replaced. If it were allowed to remain, then an incorrect U would be copied as an incorrect A (U is like T because it pairs with A) and this would be forever kept in the DNA - a permanent mistake. Not good.
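The repair logic is easy to act out in code. In this sketch (all names and the sequence are invented for illustration), a C is deaminated to U, and we compare what happens when the strand is copied twice without repair versus when every U is first restored to C. Strand direction is ignored to keep it short.

```python
# Pairing used when copying; U is read like T, so it pairs with A.
PAIR = {"A": "T", "T": "A", "G": "C", "C": "G", "U": "A"}


def deaminate(dna, position):
    """Simulate deamination: the C at `position` becomes U."""
    if dna[position] != "C":
        raise ValueError("only cytosine is deaminated in this sketch")
    return dna[:position] + "U" + dna[position + 1:]


def copy_strand(template):
    """Build the paired strand base by base (orientation ignored for simplicity)."""
    return "".join(PAIR[base] for base in template)


def repair_uracil(dna):
    """DNA uses T, so any U must be a damaged C and is put back."""
    return dna.replace("U", "C")


original = "GACCT"                 # toy sequence
damaged = deaminate(original, 2)   # the middle C becomes U
print(damaged)
print(copy_strand(copy_strand(damaged)))  # two rounds of copying, no repair
print(repair_uracil(damaged))             # repair first
```

Without repair, the U is copied as an A, and the next round turns that A into a T: the original C has permanently become a T. With repair, the original sequence comes back. That one-line `repair_uracil` only works because a legitimate U never appears in DNA.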

Second, uracil forms a stable product when damaged by radiation, while radiation damage to T’s can be detected and replaced by repair enzymes. So again, using dT in DNA leads to a more stable, more protected, long-term storage molecule.

A third reason for dT in DNA is related to base pairing. U pairs best with A, but it can base pair with G, T, or C. This increases the chances of mismatched pairs in the DNA double strand - not good for keeping information pristine in the long run. Protection against damage is also illustrated by the fact that dT is basically methylated U.

This is a cartoon representation of a tRNA that is charged with a phenylalanine amino acid. The different loops are associated with efficiency of action with the template, binding to the ribosome, and binding of the amino acids. The T loop actually contains a T base (at grey arrow). It’s an RNA, but it includes a T – that’s the very definition of an exception.

Methyl groups have a tendency to protect the bases from enzymes that break down DNA (nucleases). We will talk about this more next week. So again, using dT in DNA is more protective than using uracil.

Whew, good thing we use U for RNA and T for DNA, right? Well... not always. tRNAs are a huge exception, which we will talk about much more in future posts. Thymidine is found in the T arm or T loop of tRNA; here it is important for binding the tRNA to the ribosome during translation. A DNA nucleotide in an RNA? What gives?

Remember, T only occurs naturally as dT. T ends up in tRNA by virtue of a modification that methylates a U. Once modified, you can’t tell it from any other T – except that now it is bound to a ribose, not deoxyribose. English grammar seems a lot easier by comparison, doesn’t it?

Abdul-Sater AA, Tattoli I, Jin L, Grajkowski A, Levi A, Koller BH, Allen IC, Beaucage SL, Fitzgerald KA, Ting JP, Cambier JC, Girardin SE, Schindler C. (2013). Cyclic-di-GMP and cyclic-di-AMP activate the NLRP3 inflammasome. EMBO Rep.

Nagy GN, Marton L, Krámos B, Oláh J, Révész Á, Vékey K, Delsuc F, Hunyadi-Gulyás É, Medzihradszky KF, Lavigne M, Vial H, Cerdan R, Vértessy BG. (2013). Evolutionary and mechanistic insights into substrate and product accommodation of CTP:phosphocholine cytidylyltransferase from Plasmodium falciparum. FEBS J. DOI: 10.1111/febs.12282

Rao YS, Chai XW, Wang ZF, Nie QH, Zhang XQ. (2013). Impact of GC content on gene expression pattern in chicken. Genet Sel Evol. DOI: 10.1186/1297-9686-45-9

For more information and classroom activities, see:

Nucleotide/nucleoside –

http://www.accessexcellence.org/AE/AEC/CC/DNA_model.php

http://www.google.com/url?sa=t&rct=j&q=&esrc=s&source=web&cd=5&ved=0CEsQFjAE&url=http%3A%2F%2Fwww.chatham.edu%2Fpti%2Fcurriculum%2Funits%2F2002%2FZanetti.pdf&ei=7dgsUt_DIaiMyAH_7YFA&usg=AFQjCNF9FQxN5eIV6c74bQ8n56Tn5zXh-g&sig2=Bfn51nbH8TPsxihW7YbGsw

https://www.google.com/search?hl=en&biw=1610&bih=918&q=cGMP%20cAMP&ie=UTF-8&sa=N&tab=iw&ei=stMsUtvlOsXYyAHO5YDYBA#hl=en&q=purine+%22classroom+activity%22

http://butane.chem.uiuc.edu/pshapley/Enlist/Labs/GenTest/GenTest.html

http://tfscientist.hubpages.com/hub/What-is-DNA-Nucleotides-and-Information-Storage

http://pspruett.blogspot.com/2008/01/mit-biology-class-reading-between-lines.html

http://www.phschool.com/science/biology_place/biocoach/dnarep/structure.html

http://www.phschool.com/science/biology_place/biocoach/dnarep/chemstruc.html

http://dl.clackamas.edu/ch106-09/nucleoti.htm

http://www.vivo.colostate.edu/hbooks/genetics/biotech/basics/nastruct.html