Journal of Molecular Biology (2001) 313: 229-237
A Standard Reference Frame for the Description
of Nucleic Acid Base-pair Geometry
These preliminary recommendations were made at
the Tsukuba Workshop on Nucleic Acid Structure and Interactions held
on January 12-14, 1999 at the AIST-NIBHT Structural Biology Centre
in Tsukuba, Japan. The meeting was funded by the COE program of the
Science and Technology Agency, Japan and the CREST program of the
Japan Science and Technology Corporation. The meeting was organized
by Masashi Suzuki of the National Institute of Bioscience and Human-Technology
and Helen M. Berman and Wilma K. Olson of the Nucleic Acid Database
Project (supported by National Science Foundation (USA) grant DBI
Participants at the workshop included Manju
Bansal (Indian Institute of Science, Bangalore), Helen M. Berman (Rutgers
University), Stephen K. Burley (Rockefeller University), Richard E.
Dickerson (University of California, Los Angeles), Mark Gerstein (Yale
University), Stephen C. Harvey (University of Alabama at Birmingham),
Udo Heinemann (Max-Delbrück-Centrum), Stephen Neidle (Institute
of Cancer Research), Wilma K. Olson (Rutgers University), Zippora
Shakked (Weizmann Institute), Heinz Sklenar (Max-Delbrück-Centrum),
Masashi Suzuki (AIST-NIBHT Structural Biology Centre), Chang-Shung
Tung (Los Alamos National Laboratory), Eric Westhof (Strasbourg),
and Cynthia Wolberger (Johns Hopkins University). The survey of small
molecule crystal structures was performed by John Westbrook and Helen
M. Berman. The optimization of standard base-pair geometry and the
calculation of derived parameters were carried out by Xiang-Jun Lu
and Wilma K. Olson with support from U.S.P.H.S. grant GM20861.
A common point of reference is needed to describe
the three-dimensional arrangements of bases and base pairs in nucleic
acid structures. For example, parts of a structure that appear
"normal" according to one computational scheme may be highly unusual
according to another and vice versa. It is thus difficult to carry out
comprehensive comparisons of nucleic acid structures and to pinpoint
unique conformational features in individual structures. In order to
resolve these issues, a group of researchers who create and use the
different software packages have proposed the standard base reference
frames outlined below for nucleic acid conformational analysis. The
definitions build upon qualitative guidelines established previously
to specify the arrangements of bases and base pairs in DNA and RNA structures.
Base coordinates are derived from a survey of high resolution crystal
structures of nucleic acid analogs stored in the Cambridge Structural
Database. The coordinate frames are chosen so that complementary
bases form an ideal, planar Watson-Crick base pair in the undistorted
reference state with hydrogen bond donor-acceptor distances, C1'···C1'
virtual lengths, and purine N9-C1'···C1' and pyrimidine N1-C1'···C1'
virtual angles consistent with values observed in the crystal structures
of relevant small molecules. Conformational analyses performed in this
reference frame lead to interpretations of local helical structure that
are essentially independent of computational scheme. A compilation of
base-pair parameters from representative A-DNA, B-DNA, and protein-bound
DNA structures from the Nucleic Acid Database (NDB) provides
useful guidelines for understanding other nucleic acid structures.
Base coordinates. Models of the five common bases
(A, C, G, T, U) were generated from searches of the crystal structures
of small molecular weight analogs (e.g., free bases, nucleosides,
and nucleotides) in the most recent version of the Cambridge Structural
Database. The internal geometries and associated uncertainties in
this data set closely match numerical values reported in the recent
survey of nucleic acid base analogs by Clowney et al. Because the
minor changes in chemical structure have essentially no effect on either
the ideal base-pair frame or the computed rigid body parameters, the
Clowney et al. bases are retained as standards.
Coordinate frame. The right-handed coordinate frame
attached to each base (Figure 1) follows established qualitative
guidelines. The x-axis points in the direction of the major groove
along what would be the pseudo-dyad axis of an ideal Watson-Crick base
pair, i.e., the perpendicular bisector of the C1'···C1' vector spanning
the base pair. The y-axis runs along the long axis of the idealized
base pair in the direction of the sequence strand, parallel to the C1'···C1'
vector, and displaced so as to pass through the intersection on the
(pseudo-dyad) x-axis of the vector connecting the pyrimidine Y(C6) and
purine R(C8) atoms. The z-axis is defined by the right-handed rule,
i.e., z = x × y. For right-handed A- and B-DNA,
the z-axis accordingly points along the 5'- to 3'-direction of the sequence
strand.
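The frame construction described above can be sketched in a few lines. This is a minimal illustration, not the procedure used to generate the published coordinates: the function name `base_pair_frame`, its argument names, and the coordinates in the usage note are hypothetical, and the sketch assumes an ideal planar pair in which the Y(C6)···R(C8) line crosses the pseudo-dyad on the major-groove side of the C1'···C1' midpoint.

```python
import math

def _sub(a, b):
    return tuple(p - q for p, q in zip(a, b))

def _add(a, b):
    return tuple(p + q for p, q in zip(a, b))

def _scale(a, s):
    return tuple(p * s for p in a)

def _dot(a, b):
    return sum(p * q for p, q in zip(a, b))

def _unit(a):
    return _scale(a, 1.0 / math.sqrt(_dot(a, a)))

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def base_pair_frame(c1_seq, c1_comp, ring_seq, ring_comp):
    """Return (origin, x, y, z) for an ideal planar base pair.

    c1_seq, c1_comp: C1' atoms of the sequence-strand and complementary bases.
    ring_seq, ring_comp: the R(C8) or Y(C6) atom of each base.
    """
    # y-axis: parallel to the C1'...C1' vector, toward the sequence strand.
    y = _unit(_sub(c1_seq, c1_comp))
    # The pseudo-dyad passes through the C1'...C1' midpoint, normal to y.
    mid = _scale(_add(c1_seq, c1_comp), 0.5)
    # Origin: point on the C6...C8 line with no y-offset from the midpoint.
    line = _sub(ring_seq, ring_comp)
    t = _dot(_sub(mid, ring_comp), y) / _dot(line, y)
    origin = _add(ring_comp, _scale(line, t))
    # x-axis: along the pseudo-dyad, from the C1' midpoint toward the origin
    # (assumed here to be the major-groove direction).
    x = _unit(_sub(origin, mid))
    # z-axis completes the right-handed frame: z = x × y.
    z = _cross(x, y)
    return origin, x, y, z
```

With illustrative (not tabulated) coordinates such as C1' atoms at (-2.4, ±5.3, 0) and ring atoms at (0, ±4.9, 0), the function returns an origin at (0, 0, 0) and the unit vectors along x, y, and z.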
Figure 1 Illustration of idealized base-pair
parameters, dC1'···C1' and λ, used respectively to displace and pivot
complementary bases in the optimization of the standard reference
frame for right-handed A- and B-DNA, with the origin at • and the
x- and y-axes pointing in the designated directions.
The location of the origin depends upon the width
of the idealized base pair, i.e., the C1'···C1' spacing, dC1'···C1',
and the pivoting of complementary bases, λ, in the base-pair plane (see
Figure 1). The coordinates of the C1' atoms establish the pseudo-dyad
axis, i.e., the line in the base-pair plane where y = 0. The
rotations of each base about a normal axis passing through the C1' glycosyl
atoms determine the Y(C6) and R(C8) positions used to define the line
where x = 0.
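The two quantities that locate the origin can be measured directly from atomic coordinates. The sketch below assumes that the pivot angle of a base is the virtual angle at its C1' atom between the glycosyl bond (to purine N9 or pyrimidine N1) and the C1'···C1' virtual bond; the function names are hypothetical.

```python
import math

def c1_c1_distance(c1_a, c1_b):
    """C1'...C1' virtual length, in the units of the input coordinates."""
    return math.dist(c1_a, c1_b)

def pivot_angle(n_glycosyl, c1_own, c1_partner):
    """Virtual angle N-C1'...C1' in degrees, measured at c1_own."""
    u = [a - b for a, b in zip(n_glycosyl, c1_own)]
    v = [a - b for a, b in zip(c1_partner, c1_own)]
    cos_angle = sum(p * q for p, q in zip(u, v)) / (math.hypot(*u) * math.hypot(*v))
    # Clamp to guard against rounding slightly outside [-1, 1].
    return math.degrees(math.acos(max(-1.0, min(1.0, cos_angle))))
```

Applied to the purine and the pyrimidine of a pair in turn, `pivot_angle` gives the two angles that are compared against the common standard value in the optimization.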
Optimization. The atomic coordinates in Table 1
are expressed in the base-pair reference frames which optimize hydrogen-bond
donor-acceptor distances, dHB, and base "pivot" angles, λY and λR,
against corresponding standards (d0 = 3.0 Å and λ0 = 54.5°).
The departures from ideality are measured by the sum of the absolute
values of the relative deviations,
\[ \frac{|\lambda_Y - \lambda_0|}{\lambda_0} + \frac{|\lambda_R - \lambda_0|}{\lambda_0} + \sum_{\mathrm{HB}} \frac{|d_{HB} - d_0|}{d_0}, \]
where the last term runs over two (T·A) or three
(C·G) hydrogen bonds. (Optimization in terms of the sum of the squares
of the relative deviations of the λY, λR, and dHB yields similar results.)
Virtual distances and angles characterizing the optimized
configurations are detailed in Table 2.
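The measure of departure from ideality described above (the sum of the absolute values of the relative deviations of the two pivot angles and of each hydrogen-bond length) can be written as a short function. This is a sketch of the objective only, not of the optimization itself; the function and argument names are hypothetical, and the defaults are the standards quoted in the text (3.0 Å and 54.5°).

```python
def relative_deviation_score(pivot_y, pivot_r, hbond_lengths,
                             pivot_ref=54.5, hbond_ref=3.0):
    """Sum of absolute relative deviations from the reference values.

    pivot_y, pivot_r: pyrimidine and purine pivot angles in degrees.
    hbond_lengths: donor-acceptor distances in Å (two for T·A, three for C·G).
    """
    score = abs(pivot_y - pivot_ref) / pivot_ref
    score += abs(pivot_r - pivot_ref) / pivot_ref
    score += sum(abs(d - hbond_ref) / hbond_ref for d in hbond_lengths)
    return score
```

A configuration that matches every standard exactly scores zero; any search over the C1'···C1' spacing and the base pivots would seek to minimize this value.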
The minor differences in chemical bonding between T and C and between A and
G, in combination with the constraints of two or three hydrogen bonds,
give rise to slightly different standard orientations of T·A and C·G
base pairs (compare dC1'···C1', λY, and λR values in Table 2).
Notably, the hydrogen bonds closer to the minor groove edges of all
base pairs are shorter than those nearer the major groove edges, as
is observed in high resolution structures of Watson-Crick base-pair
co-crystal complexes [6,7]. The hydrogen bonds are slightly shorter
on average in the small molecule analogs, which are in turn distorted
to a small degree from the perfectly planar base-pair geometry assumed
here (see Table 2 for numerical values).
Minor changes in the imposed configurational constraints
have almost no influence on the preferred base-pair arrangements, e.g.,
the increase of λ0 from 54.5° to 55.5° shortens dC1'···C1'
by less than 0.1 Å and perturbs hydrogen bond lengths by
less than 0.05 Å. The assignment of different rest states
for N···H-N versus O···H-N hydrogen bonds consistent with the hydrogen
bonding observed in the crystal structures of small organic compounds
[9-11], e.g., dN···H-N = 3.0 Å and dO···H-N = 2.9 Å,
fails to reproduce the trends in hydrogen bond lengths noted above.
These differences in standard configurations also have a slight effect
on derived complementary base-pair parameters in representative oligonucleotide
structures, but virtually no effect on base-pair step parameters.
Computational independence. Local complementary
base-pair and dimer step parameters computed with respect to the standard
reference frames are nearly independent of analytical treatment (Figure 2).
The only significant discrepancies in derived values, illustrated here
for the DNA complexed with the TATA-box binding protein (TBP),
involve the Rise at highly kinked base-pair steps, which, as noted previously,
reflects an inconsistency in definition. The small differences
in Slide, Tilt, and Twist in this example stem from minor differences
in definition and in the choice of "middle frame."
Figure 2 Comparative analysis of local base-pair
(left) and dimer step (right) parameters (see schematic insets for
definitions) of the DNA associated with the yeast TATA-box binding
protein (TBP) in the 1.8 Å X-ray crystal complex (NDB
entry: pdt012). Parameters are calculated with the seven different
analysis schemes within 3DNA (Lu & Olson, in preparation) using
the standard reference frame detailed in Tables 1 and 2. The dotted line
connects Rise values computed using the Curves definition. Numerical
values are tabulated at the following URL: http://rutchem.rutgers.edu/~olson/Tsukuba
Base-pair geometry in high resolution
A-DNA and B-DNA crystal structures similarly shows limited dependence
on computational methodology. The average values and dispersion of individual
parameters in Table 3
are representative of numerical values obtained with the algorithms
used in many nucleic-acid-analysis programs. A complete listing of local
A- and B-DNA parameters, expressed in terms of the standard reference
frame and computed within 3DNA (Lu & Olson, in preparation) using
the mathematical definitions of several different programs (CEHS/SCHNAaP [13,14], CompDNA
[15,16], Curves [17,18], FREEHELIX, NGEOM [20,21], NUPARM [22,23],
and RNA [24-26]), is reported at our website (see below). Since the angular
parameters differ by no more than 0.1° and most distances by 0.02 Å
or less, the general trends in the table can be used in combination
with the characteristic patterns of A- and B-DNA backbone and glycosyl
torsion angles to classify local, right-handed, double helical
structures.
The subtle mathematical differences among nucleic-acid-analysis
programs, however, become critical in the construction of DNA models.
Seemingly minor numerical discrepancies can be magnified in polymeric
chains and in knowledge-based potentials derived from the
fluctuations and correlations of structural parameters. Duplex models
and simulations must accordingly be based on the algorithm from which
parameters are derived.
Conformational classification. The average values
of Roll, Twist, and Slide in Table 3
confirm conformational distinctions known since the earliest studies
of A- and B-DNA crystal structures [30,31]. Namely, the transformation
from B- to A-DNA tends to decrease Twist, increase Roll, and reduce
Slide. The standard deviations in recently accumulated crystallographic
data, however, show that only Slide retains the discriminating power
anticipated previously. Values of Slide below −0.8 Å
are typical of most A-DNA dimer steps and those greater than −0.8 Å
are found in the majority of B-forms. Slide is also more variable in
B-DNA vs. A-DNA dimer steps. The observed Twist and Roll angles, by
contrast, show significant overlaps over a broad range of values. Specifically,
Twist angles between 20° and 40° and Roll angles between 0°
and 15° are found in both A- and B-DNA structures. The values of
Twist and Roll are coupled with changes in Slide so that conformational
assignments should be made in the context of all three parameters.
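The screening logic of this paragraph can be illustrated as follows. This is a deliberately simple sketch, not a recommended classifier: the function names are hypothetical, the Slide cutoff is passed in explicitly rather than hard-coded, and the Twist and Roll ranges are the overlap ranges quoted above.

```python
def classify_by_slide(slide, cutoff):
    """Label a dimer step from Slide alone (Å); values below the cutoff read as A-like."""
    return "A-like" if slide < cutoff else "B-like"

def twist_roll_ambiguous(twist, roll):
    """True when Twist and Roll (degrees) both fall in the ranges shared by A- and B-DNA."""
    return 20.0 <= twist <= 40.0 and 0.0 <= roll <= 15.0
```

A step flagged by `twist_roll_ambiguous` cannot be assigned from its angles alone, which is why Slide, and ideally all three parameters together, should inform the assignment.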
The three remaining step parameters and the six complementary
base-pair parameters are unaffected by helical conformation. The mean
values and scatter of these values are roughly equivalent in high resolution
A- and B-DNA structures (Table 3). The constraints of hydrogen
bonding presumably give rise to the more limited variations in Opening
and Stretch compared to other complementary base-pair angles and distances.
Buckle, while fixed on average at zero, shows more pronounced fluctuations
than Propeller, which is decidedly perturbed from ideal, i.e., 0°,
planar geometry in all double helical structures.
Helical parameters. Parameters relating consecutive
residues with respect to a local helical axis can be computed using
CompDNA [15,16], NUPARM [22,23], RNA [24-26], and 3DNA (Lu & Olson,
in preparation), or in terms of a global axis with CEHS (as implemented
in the SCHNAaP software package), NEWHELIX, and Curves [17,18].
These angles and distances depend on how the helical axis is defined,
particularly in deformed segments of the double helical structure.
The local helical parameters of high resolution A- and B-DNA structures
in Table 3 complement the dimeric descriptions of these structures.
The x-displacement shares the same discriminating power as Slide in
differentiating A-DNA from B-DNA, as anticipated from model building,
whereas Inclination and Helical Twist span overlapping ranges
of values. The different mathematical definitions of local helical parameters
yield numerical similarities equivalent to those found with dimer step
parameters. Global helical parameters, which reflect a best-fit linear
or overall curved molecular axis, are not necessarily comparable with
these values (data not shown).
Intrinsic correlations. As is well known [1,25],
dimer step parameters depend on the choice of base-pair reference frame
and can be significantly perturbed by distortions of complementary base-pair
geometry. The base-pair reference frame in most nucleic-acid-analysis
programs is an intermediate between the coordinate frames of the constituent
bases. The origin of this "middle frame" is shifted by
significant distortions in Buckle and Opening, while the long y-axis
is rotated by perturbations of base-pair Shear and Stagger (Figure 3).
These changes, in turn, influence the step parameters describing the
orientation and positions of neighboring base pairs.
Figure 3 Schematic illustrations and scatter
plots of the intrinsic correlations of A- and B-DNA base-pair and
dimer step parameters associated with the standard reference frame.
Large distortions of Buckle and Opening move the origin (•) of the
base-pair reference frame, while significant changes in Shear and
Stagger reposition the long y-axis
of the base-pair frame.
The effects of complementary base-pair deformations
on dimer step parameters are most pronounced when perturbations of the
same type, but of the opposite sense, occur in successive residues,
i.e., Buckle, Opening, Shear, or Stagger is negative at base pair i
and positive at base pair i+1 or vice versa. For example, a large negative
difference in the buckle of consecutive base pairs, ΔBuckle
= Buckle(i+1) − Buckle(i), sometimes called Cup,
adds to the computed base-pair Rise of "extreme" dimer steps of high
resolution A- and B-DNA crystal structures (Figure 3). Similarly,
a large positive value of ΔOpening
increases Shift, while large negative values of ΔStagger
and large positive values of ΔShear
respectively enhance Tilt and Twist. Conversely, Rise, Shift, Tilt,
and Twist can be depressed, respectively, by large +ΔBuckle,
−ΔOpening, +ΔStagger, and −ΔShear
(Figure 3). On the other hand, Roll and Slide are not appreciably
influenced by base-pair deformations.
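The differences between like parameters at successive base pairs, e.g., Buckle(i+1) minus Buckle(i), are straightforward to tabulate along a duplex. The helper below is a minimal sketch with a hypothetical name; it applies equally to Buckle, Opening, Shear, or Stagger values listed in base-pair order.

```python
def step_differences(values):
    """Return value(i+1) - value(i) for each pair of consecutive base pairs."""
    return [later - earlier for earlier, later in zip(values, values[1:])]
```

For a list of n base-pair values the result has n - 1 entries, one per dimer step, so it can be compared step by step with Rise, Shift, Tilt, or Twist.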
Thus, extreme values of base-pair step parameters
may simply reflect distorted or altered, i.e., non-Watson-Crick, base-pairing
schemes. As a result, the computed Rise of a buckled dimer step with
a partially intercalated amino acid side chain in a protein-DNA complex
such as TBP-DNA may approach the base-pair separation found at
a planar, fully drug-intercalated step. The dispersion of step parameters
is similarly influenced by occasional deformations of complementary
base-pair geometry. That is, Rise, Shift, Tilt, and Twist may appear
intrinsically flexible in sets of structures with distorted base pairing.
Non-Watson-Crick base pairs. Direct application of
the proposed reference frame to the analysis of non-Watson-Crick base
pairs yields numerical parameters characteristic of the particular hydrogen-bonding
scheme. For example, "wobble" G·T and A+·C base pairs are "sheared"
~2 Å relative to the Watson-Crick configuration, the displacement
being positive for the Y·R pair and negative for the R·Y association.
These large displacements, in turn, affect Twist along the lines described
in Figure 3. For example, the G·T mismatches in the d(CGCGAATTTGCG)2
duplex structure (NDB entry: bdl009) introduce ~15° under-
and overtwisting in the associated CG and GA dimer steps since Shear
is negative at the former step and positive at the latter step. The
same principles apply in RNA structures where the G·U wobble assumes
an important role. On the other hand, Twist can be constrained
to typical A- or B-like values by proper choice of an intrinsic "wobble"
base-pair frame. The latter approach necessitates a carefully chosen
frame for each mode of base pairing. In the future, it may be necessary
to define standards for common non-Watson-Crick base-pairing schemes.
Protein-DNA interactions. Characterizing the geometry
of nucleic acids interacting with proteins raises a host of additional
geometrical issues. However, the standard description of
base-pair geometry described here can be carried over, to a large degree,
to this problem, and many of the geometrical issues involved in describing
the protein are somewhat simpler than for the DNA, e.g., the description
of helical geometry for an α-helix versus that for the DNA double helix.
Supplementary tables and figures are available at:
Questions regarding the construction of the standard
frames and the computation of local base-pair parameters can be addressed
Wilma K. Olson and Xiang-Jun Lu
610 Taylor Road
Piscataway, NJ 08854-8087, USA
1. Lu, X.-J. & Olson, W. K. (1999) "Resolving the discrepancies among
nucleic acid conformational analyses," J. Mol. Biol. 285,
2. Dickerson, R. E., Bansal, M., Calladine, C. R., Diekmann, S., Hunter, W. N.,
Kennard, O., von Kitzing, E., Lavery, R., Nelson, H. C. M., Olson,
W. K., Saenger, W., Shakked, Z., Sklenar, H., Soumpasis, D. M., Tung,
C.-S., Wang, A. H.-J. & Zhurkin, V. B. (1989) "Definitions and
nomenclature of nucleic acid structure parameters," J. Mol. Biol.
3. Allen, F. H., Bellard, S., Brice, M. D., Cartwright, B. A., Doubleday, A.,
Higgs, H., Hummelink, T., Hummelink-Peters, B. G., Kennard, O., Motherwell,
W. D. S., Rodgers, J. R. & Watson, D. G. (1979) "The Cambridge
Crystallographic Data Centre: computer-based search, retrieval, analysis
and display of information," Acta Crystallogr. B35,
4. Berman, H. M., Olson, W. K., Beveridge, D. L., Westbrook, J., Gelbin, A.,
Demeny, T., Hsieh, S. H., Srinivasan, A. R. & Schneider, B. (1992)
"The nucleic acid database: a comprehensive relational database of
three dimensional structures of nucleic acids," Biophys. J.
5. Clowney, L., Jain, S. C., Srinivasan, A. R., Westbrook, J., Olson, W. K. &
Berman, H. M. (1996) "Geometric parameters in nucleic acids: nitrogenous
bases," J. Am. Chem. Soc. 118, 509-518.
6. Fujita, S., Takenaka, A. & Sasada, Y. (1984) "A model for interactions
of amino acid side chains with Watson-Crick base pair of guanine and
cytosine. Crystal structure of 9-(2-carboxyethyl)guanine and its
crystalline complex with 1-methylcytosine," Bull. Chem. Soc. Jpn.
7. Fujita, S., Takenaka, A. & Sasada, Y. (1985) "Model for interactions of
amino acid side chains with Watson-Crick base pair of guanine and
cytosine: crystal structure of 9-(2-carbamoylethyl)guanine and 1-methylcytosine
complex," Biochemistry 24, 508-512.
8. Wilson, C. C. (1988) "Analysis of conformational parameters in nucleic acid
fragments. II. Co-crystal complexes of nucleic acid bases," Nucleic
Acids Res. 16, 385-393.
9. Llamas-Saiz, A. L. & Foces-Foces, C. (1990) "N-H···N sp2 hydrogen interactions
in organic crystals," J. Mol. Struct. 238, 367-382.
10. Gavezzotti, A. & Filippini, G. (1994) "Geometry of the intermolecular X-H···Y
(X, Y = N,O) hydrogen bond and the calibration of empirical hydrogen-bond
potentials," J. Phys. Chem. 98, 4831-4837.
11. Pirard, B., Baudoux, G. & Durant, F. (1995) "A database study of intermolecular
NH···O hydrogen bonds for carboxylates, sulfonates, and monohydrogen
phosphonates," Acta Crystallogr. B51, 103-107.
12. Kim, Y., Geiger, J. H., Hahn, S. & Sigler, P. B. (1993) "Crystal structure
of a yeast TBP/TATA-box complex," Nature 365, 512-520.
13. El Hassan, M. A. & Calladine, C. R. (1995) "The assessment of the
geometry of dinucleotide steps in double-helical DNA: a new local
calculation scheme with an appendix," J. Mol. Biol. 251,
14. Lu, X.-J., El Hassan, M. A. & Hunter, C. A. (1997) "Structure and
conformation of helical nucleic acids: analysis program (SCHNAaP),"
J. Mol. Biol. 273, 668-680.
15. Gorin, A. A., Zhurkin, V. B. & Olson, W. K. (1995) "B-DNA twisting correlates
with base pair morphology," J. Mol. Biol. 247, 34-48.
16. Kosikov, K. M., Gorin, A. A., Zhurkin, V. B. & Olson, W. K. (1999) "DNA
stretching and compression: large-scale simulations of double helical
structures," J. Mol. Biol. 289, 1301-1326.
17. Lavery, R. & Sklenar, H. (1988) "The definition of generalized helicoidal
parameters and of axis of curvature for irregular nucleic acids,"
J. Biomol. Struct. Dynam. 6, 63-91.
18. Lavery, R. & Sklenar, H. (1989) "Defining the structure of irregular nucleic
acids: conventions and principles," J. Biomol. Struct. Dynam.
19. Dickerson, R. E. (1998) "DNA bending: the prevalence of kinkiness and the virtues
of normality," Nucleic Acids Res. 26, 1906-1926.
20. Soumpasis, D. M. & Tung, C.-S. (1988) "A rigorous basepair oriented description
of DNA structures," J. Biomol. Struct. Dynam. 6,
21. Tung, C.-S., Soumpasis, D. M. & Hummer, G. (1994) "An extension of the
rigorous base-unit oriented description of nucleic acid structures,"
J. Biomol. Struct. Dynam. 11, 1327-1344.
22. Bhattacharyya, D. & Bansal, M. (1989) "A self-consistent formulation for analyses
and generation of non-uniform DNA structures," J. Biomol. Struct.
Dynam. 6, 93-104.
23. Bansal, M., Bhattacharyya, D. & Ravi, B. (1995) "NUPARM and NUCGEN: software
for analysis and generation of sequence dependent nucleic acid structures,"
CABIOS 11, 281-287.
24. Pednault, E. P. D., Babcock, M. S. & Olson, W. K. (1993) "Nucleic acids
structure analysis: a users guide to a collection of new analysis
programs," J. Biomol. Struct. Dynam. 11, 597-628.
25. Babcock, M. S., Pednault, E. P. D. & Olson, W. K. (1994) "Nucleic acid
structure analysis. Mathematics for local Cartesian and helical structure
parameters that are truly comparable between structures," J. Mol.
Biol. 237, 125-156.
26. Babcock, M. S. & Olson, W. K. (1994) "The effect of mathematics and coordinate
system on comparability and "dependencies" of nucleic acid structure
parameters," J. Mol. Biol. 237, 98-124.
27. Schneider, B., Neidle, S. & Berman, H. M. (1997) "Conformations of the sugar-phosphate
backbone in helical DNA crystal structures," Biopolymers 42,
28. Olson, W. K., Marky, N. L., Jernigan, R. L. & Zhurkin, V. B. (1993) "Influence
of fluctuations on DNA curvature. A comparison of flexible and static
wedge models of intrinsically bent DNA," J. Mol. Biol. 232,
29. Olson, W. K., Gorin, A. A., Lu, X.-J., Hock, L. M. & Zhurkin, V. B. (1998)
"DNA sequence-dependent deformability deduced from protein-DNA crystal
complexes," Proc. Nat. Acad. Sci., USA 95, 11163-11168.
30. Drew, H. R., Wing, R. M., Takano, T., Broka, C., Tanaka, S., Itakura, K.
& Dickerson, R. E. (1981) "Structure of a B-DNA dodecamer: conformation
and dynamics," Proc. Nat. Acad. Sci., USA 78, 2179-2183.
31. Calladine, C. R. & Drew, H. R. (1984) "A base-centered explanation of the
B-to-A transition in DNA," J. Mol. Biol. 178, 773-782.
32. Fratini, A. V., Kopka, M. L., Drew, H. R. & Dickerson, R. E. (1982) "Reversible
bending and helix geometry in a B-DNA dodecamer: CGCTAATTCGCG," J.
Biol. Chem. 257, 14686-14707.
33. Lu, X.-J., Babcock, M. S. & Olson, W. K. (1999) "Mathematical overview
of nucleic acid analysis programs," J. Biomol. Struct. Dynam.
34. Yanagi, K., Privé, G. G. & Dickerson, R. E. (1991) "Analysis of
local helix geometry in three B-DNA decamers and eight dodecamers,"
J. Mol. Biol. 217, 201-214.
35. Hunter, W. N., Brown, T., Kneale, G., Anand, N. N., Rabinovich, D. & Kennard,
O. (1987) "The structure of guanosine-thymidine mismatches in B-DNA
at 2.5 Ångstroms resolution," J. Biol. Chem. 262,
36. Masquida, B. & Westhof, E. (2000) "On the wobble GoU and related pairs,"
RNA 6, 9-15.
37. Privé, G. G., Yanagi, K. & Dickerson, R. E. (1991) "Structure of the
B-DNA decamer C-C-A-A-C-G-T-T-G-G and comparison with the isomorphous
decamers C-C-A-A-G-A-T-T-G-G and C-C-A-G-G-C-C-T-G-G," J. Mol.
Biol. 217, 177-199.