DNA Replication

Introduction

  • DNA replication is the biological process by which a cell makes an identical copy of its DNA before cell division.
  • It ensures that each daughter cell receives the same genetic information as the parent cell.
  • This process is fundamental for growth, development, repair, and reproduction in all living organisms.
  • The mechanism of DNA replication is described as semi-conservative: each new double helix contains one original (parental) strand and one newly synthesised strand.

  • Replication begins at specific sites on DNA called origins of replication and proceeds in both directions, forming replication forks.
  • Although the basic principle of replication is universal, the details vary between prokaryotes (such as bacteria, which have a single circular chromosome) and eukaryotes (such as humans, with multiple linear chromosomes).
  • Specialised enzymes and proteins—such as helicases, primase, DNA polymerases, ligase, and topoisomerases—work together in a highly coordinated way to unwind the DNA, synthesise new strands, correct errors, and ensure the process is accurate.


DNA structure


Basic building blocks

  • DNA = polymer of nucleotides. Each nucleotide = deoxyribose sugar + phosphate + base (A, T, C, G).

  • Bases are A (adenine), T (thymine), C (cytosine), G (guanine). A pairs with T (2 H-bonds), C pairs with G (3 H-bonds).

Strand orientation

  • DNA strands are antiparallel: one strand runs 5′ → 3′, the other 3′ → 5′.

  • Polymerases add new nucleotides only to a free 3′-OH, so synthesis is always 5′ → 3′.
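
The pairing and orientation rules above can be sketched in a few lines of Python. This is a minimal illustration, not a bioinformatics tool; the names `COMPLEMENT` and `reverse_complement` are chosen here for the example.

```python
# Watson-Crick partners for the four DNA bases.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def reverse_complement(strand_5_to_3: str) -> str:
    """Return the partner strand, also written 5'->3'."""
    # The partner runs antiparallel, so pair each base and read the
    # result in the opposite direction to keep the 5'->3' convention.
    return "".join(COMPLEMENT[base] for base in reversed(strand_5_to_3.upper()))


template = "ATGGCA"
print(template, "pairs with", reverse_complement(template))
```

Applying the function twice returns the original sequence, which is a quick check that the two strands carry the same information.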

Helix features relevant to replication

  • Major and minor grooves: proteins recognize sequences via grooves.

  • Base stacking stabilizes helix.

  • AT-rich regions (fewer H-bonds) are easier to unwind — origins of replication often have AT-rich “DNA-unwinding elements” (DUE).

  • Supercoiling: unwinding ahead of a fork creates positive supercoils; topoisomerases relieve them.
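
As a rough illustration of why AT-rich stretches stand out as candidate unwinding elements, the sketch below slides a window along a sequence and reports the most AT-rich position. The toy sequence, the window size, and the function names are assumptions made for this example; real origin prediction uses far more than AT content.

```python
def at_fraction(seq: str) -> float:
    """Fraction of bases in seq that are A or T."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


def most_at_rich_window(seq: str, window: int) -> tuple[int, float]:
    """Return (start index, AT fraction) of the first most AT-rich window."""
    if not 0 < window <= len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    best_start, best_score = 0, -1.0
    for start in range(len(seq) - window + 1):
        score = at_fraction(seq[start:start + window])
        if score > best_score:  # strict '>' keeps the earliest best window
            best_start, best_score = start, score
    return best_start, best_score


toy_sequence = "GCGCGGCCATATTTAAATGCGCGC"  # made-up sequence, GC flanks around an AT core
start, score = most_at_rich_window(toy_sequence, window=8)
print(f"most AT-rich 8-nt window starts at index {start} (AT fraction {score:.2f})")
```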

Semi-conservative nature

  • Each daughter DNA has one old (parental) strand and one new strand. This follows directly from complementary base pairing: each parental strand serves as the template for its new partner, which is the basis of accurate copying.
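
The semi-conservative rule can be followed over several generations with a small simulation. In this sketch, "old" means an original parental strand from generation 0 and "new" means any strand synthesised later; the function names are chosen for the example.

```python
from collections import Counter

# Label a duplex by how many original parental strands it contains.
LABELS = {2: "old/old", 1: "hybrid", 0: "new/new"}


def replicate(duplexes):
    """One round: each strand of every duplex templates a new partner."""
    daughters = []
    for strand_a, strand_b in duplexes:
        daughters.append((strand_a, "new"))
        daughters.append((strand_b, "new"))
    return daughters


def classify(duplexes):
    """Count duplexes of each type."""
    return Counter(LABELS[duplex.count("old")] for duplex in duplexes)


population = [("old", "old")]  # start from one fully parental duplex
for generation in range(1, 4):
    population = replicate(population)
    print(generation, dict(classify(population)))
```

However many rounds are run, exactly two hybrid duplexes remain, because the two original strands are never destroyed or split.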

 


DNA polymerase


General function

  • Catalyzes formation of phosphodiester bond between the 3′-OH of the growing strand and the incoming dNTP (deoxynucleotide triphosphate).

  • Requires a template strand and a primer (short oligo with a free 3′-OH).

Directionality & chemistry

  • Polymerization occurs 5′ → 3′.

  • Two-metal-ion mechanism at active site: metal ions (usually Mg²⁺) help position substrate and stabilize leaving group (pyrophosphate).

Fidelity

  • High fidelity comes from:

    • Base-pairing specificity (proper Watson–Crick pairing).

    • Proofreading (3′→5′ exonuclease) activity in replicative polymerases that removes the incorrectly paired nucleotide.

    • Post-replication mismatch repair corrects remaining errors.
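
Because the three fidelity layers act one after another, their contributions multiply. The sketch below shows the arithmetic only; the error rate and improvement factors are illustrative order-of-magnitude assumptions, not values taken from these notes, and the genome size is a hypothetical round number.

```python
def overall_error_rate(base_selection_error: float,
                       proofreading_factor: float,
                       mismatch_repair_factor: float) -> float:
    """Errors per base copied after all three fidelity layers."""
    # Each later layer removes most of the errors left by the previous one,
    # so the improvement factors divide the initial error rate.
    return base_selection_error / (proofreading_factor * mismatch_repair_factor)


# Assumed values for illustration only.
rate = overall_error_rate(base_selection_error=1e-5,
                          proofreading_factor=100,
                          mismatch_repair_factor=100)
genome_size_bp = 5_000_000  # hypothetical bacterial-sized genome
print(f"overall error rate: {rate:.1e} per base")
print(f"expected errors per genome copy: {rate * genome_size_bp:.3f}")
```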

Processivity

  • Processivity = number of nucleotides added per binding event.

  • Sliding clamps (β-clamp in bacteria, PCNA in eukaryotes) increase processivity by tethering polymerase to DNA.

  • Clamp loader (γ-complex in bacteria, RFC in eukaryotes) uses ATP to load the clamp.

Prokaryotic polymerases (examples & roles)

  • DNA Pol I: removes RNA primers (5′→3′ exonuclease) and fills gaps with DNA; has 3′→5′ proofreading too but low processivity.

  • DNA Pol III: main replicative enzyme in bacteria (high processivity with β-clamp).

  • DNA Pol II, IV, V: involved in repair or translesion synthesis (TLS).

Eukaryotic polymerases (major players)

  • Pol α (alpha): primase-associated; lays an RNA–DNA primer (short DNA after RNA primer).

  • Pol δ (delta): synthesizes the lagging strand, extending each Okazaki fragment and taking part in fragment processing (strand displacement).

  • Pol ε (epsilon): primarily elongates leading strand.

  • Pol γ: replicates mitochondrial DNA.

  • TLS polymerases (η, κ, ι, Rev1, Pol ζ): bypass DNA lesions, but with lower fidelity.

Polymerase switching

  • At the start of replication, Pol α synthesizes the primers; synthesis then switches to Pol δ/ε for processive elongation. The sliding clamp and clamp loader mediate this exchange.

 


Replication process


Replication consists of initiation, elongation, and termination. Each stage is expanded in detail below.

A. Initiation — getting started

Where

  • Origins of replication are specific DNA sequences where replication begins.

    • Bacteria: a single origin (OriC).

    • Eukaryotes: many origins distributed along chromosomes.

Key steps (broad)

  1. Origin recognition — origin-binding proteins identify origins.

  2. Origin opening — local unwinding of DNA to form a replication bubble.

  3. Helicase loading & activation — helicase placed on DNA then activated to unwind continuously.

  4. Primer synthesis — primase synthesizes short RNA primers to provide 3′-OH.

  5. Recruitment of polymerases & accessory factors — clamp, clamp loader, polymerases assemble.

Bacterial initiation (detailed)

  • DnaA recognizes DnaA-box sequences within OriC and oligomerizes, causing bending and opening of AT-rich DUE.

  • DnaC helps load DnaB helicase onto unwound DNA (requires ATP).

  • DnaG primase synthesizes RNA primers.

  • SSB (single-strand binding protein) binds exposed ssDNA to prevent reannealing.

  • DNA Pol III holoenzyme (core polymerase + β-clamp + clamp loader) begins DNA synthesis.

Eukaryotic initiation (detailed)

  • In G1 phase, origins are licensed by loading the pre-replication complex (pre-RC):

    • ORC (Origin Recognition Complex) binds origin.

    • Cdc6 and Cdt1 recruit the MCM2–7 helicase (loaded as an inactive double hexamer).

  • In S phase, kinases (S-CDK and DDK) phosphorylate initiation factors:

    • Cdc45 and GINS join MCM to form the active CMG helicase (Cdc45–MCM–GINS).

    • This converts the pre-RC to an active pre-initiation complex (pre-IC) and opens the origin.

  • Pol α-primase synthesizes RNA–DNA primers. Then Pol ε/δ are recruited to begin processive synthesis.

Important control point: Licensing occurs only in G1 — prevents origins from firing more than once per cell cycle.


B. Pre-replication complex (pre-RC)

Definition & function

  • Complex assembled at origins during G1 that marks where replication may start; it “licenses” origins.

  • Composition: ORC + Cdc6 + Cdt1 + MCM2–7.

  • MCM helicase is loaded in an inactive form (double hexamer) around dsDNA.

Why necessary

  • Prevents re-replication: after an origin fires in S-phase, re-loading of MCM is blocked until the cell exits mitosis and enters the next G1, which ensures one round of replication per cycle.

Regulation

  • Cdc6 and Cdt1 availability and phosphorylation controlled by CDKs and ubiquitin-mediated degradation.

  • Geminin (in metazoans) inhibits Cdt1 to prevent re-loading during S/G2.


C. Pre-initiation complex (pre-IC)

Activation step

  • In early S-phase, S-CDK and DDK phosphorylate components to convert pre-RC to active replication forks.

  • Cdc45 and GINS join MCM to make CMG, the active helicase that unwinds DNA.

  • Single-stranded region forms; RPA binds ssDNA; primase makes primers.

  • Polymerases, clamps, and other replisome factors assemble.

Note: Many proteins (Sld2, Sld3, Dpb11 in yeast; Treslin/TICRR in metazoans) mediate interactions and are regulated by phosphorylation.


Elongation 


Replication fork

  • Fork is a Y-shaped structure with two single-stranded templates and two nascent strands.

  • Two replication forks move away bidirectionally from each origin.

Leading strand

  • Synthesized continuously in 5′→3′ as the helicase exposes new template.

  • In eukaryotes: mainly Pol ε; in bacteria: Pol III.

Lagging strand

  • Synthesized discontinuously as Okazaki fragments (short stretches).

  • Each fragment requires:

    1. Primase synthesizes RNA primer.

    2. Polymerase extends until hitting previous fragment.

    3. Primer removal and replacement with DNA.

    4. DNA ligase seals the nick.
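
The four-step cycle above repeats once per fragment, so the number of primers and ligation events follows from the fragment length. The sketch below is a bookkeeping illustration only; the template and fragment lengths are assumed values, and the function name is chosen for the example.

```python
def okazaki_fragments(template_length: int, fragment_length: int) -> list[tuple[int, int]]:
    """Split a lagging-strand template into (start, end) fragment intervals."""
    if template_length <= 0 or fragment_length <= 0:
        raise ValueError("lengths must be positive")
    return [
        (start, min(start + fragment_length, template_length))
        for start in range(0, template_length, fragment_length)
    ]


# Assumed lengths for illustration: a 10 kb stretch copied in 1.5 kb fragments.
fragments = okazaki_fragments(template_length=10_000, fragment_length=1_500)
primers_needed = len(fragments)      # step 1 happens once per fragment
nicks_to_seal = len(fragments) - 1   # step 4 joins each neighbouring pair
print(f"{primers_needed} primers, {nicks_to_seal} ligation events")
```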

Processing Okazaki fragments (eukaryotes)

  • RNase H2 removes RNA portion of primers.

  • FEN1 (flap endonuclease 1) removes displaced flaps during strand-displacement synthesis.

  • DNA Pol δ performs strand-displacement synthesis.

  • DNA Ligase I joins fragments.

Lagging strand “trombone” model

  • The lagging template loops so that polymerases moving in the same physical direction can synthesize both strands. Each Okazaki fragment cycle: primer → extension → flap removal → ligation → loop release.

Chromatin reassembly

  • Nucleosomes are disrupted ahead of fork and reassembled behind using old histones + new histone deposition by chaperones (CAF-1, Asf1) to restore chromatin.

Accessory factors

  • PCNA (clamp) tethers polymerases.

  • RFC (replication factor C) loads PCNA using ATP.

  • RPA (replication protein A) binds ssDNA.

  • Topoisomerase I/II relieve torsional strain; in bacteria, gyrase (a type II topo) introduces negative supercoils, which offsets the positive supercoils that build up ahead of the fork.

 


Replication fork dynamics 


Coordination

  • Both strands synthesized simultaneously though directionality differs — replisome coordinates leading/lagging polymerases.

  • Continuous recruitment of primase to produce primers on lagging strand at intervals.

Stalling and restart

  • Fork may stall at DNA damage, tightly bound proteins, or secondary structures.

  • Cells stabilize stalled forks using checkpoint proteins (ATR/Chk1 pathway in eukaryotes) and fork-protection proteins to prevent collapse.

  • Fork collapse can cause double-strand breaks; restart often uses homologous recombination (Rad51 in eukaryotes, RecA in bacteria).

Translesion synthesis (TLS)

  • When replication encounters a lesion that blocks high-fidelity polymerases, TLS polymerases insert bases opposite lesions (error-prone) to allow replication to continue; PCNA ubiquitination helps recruit TLS polymerases.

 


DNA replication proteins 


Helicase

  • Unwinds dsDNA into ssDNA. Eukaryotes use MCM2–7, bacteria use DnaB.

Primase

  • Makes short RNA primers (bacterial DnaG; in eukaryotes, primase is part of Pol α complex).

DNA polymerases

  • Main enzymes for DNA synthesis (Pol III in bacteria; Pol δ/ε in eukaryotes).

Sliding clamp

  • β-clamp (bacteria) or PCNA (eukaryotes) — tethers polymerase to DNA increasing processivity.

Clamp loader

  • Loads sliding clamp onto DNA (γ complex in bacteria; RFC in eukaryotes).

Single-strand binding proteins

  • SSB in bacteria; RPA in eukaryotes. Prevent ssDNA secondary structure and protect from nucleases.

Topoisomerases

  • Relieve supercoils ahead of fork: Type I (single-strand cuts) and Type II (double-strand cuts — e.g., gyrase in bacteria, Topo II in eukaryotes).

DNA ligase

  • Seals nicks between Okazaki fragments or at repair sites (LigA in bacteria; Ligase I in eukaryotes).

RNase H & FEN1

  • Remove RNA primers and process DNA flaps.

Telomerase

  • Reverse-transcriptase that extends telomeres using its own RNA template — solves end-replication problem in eukaryotes.

ORC, Cdc6, Cdt1

  • Origin recognition and licensing proteins in eukaryotes.

CMG complex

  • Active helicase complex (Cdc45–MCM–GINS) in eukaryotes.

Checkpoint proteins

  • ATR/ATM, Chk1/Chk2 — detect replication stress / DNA damage and delay cell cycle.

Repair proteins

  • Mismatch repair (MutS/MutL in bacteria; MSH/MLH in eukaryotes), base excision repair, nucleotide excision repair components.

 


Replication machinery 

What is the replisome?

  • A multi-protein complex assembled at each replication fork to carry out coordinated replication of both strands.

Core components

  • Active CMG helicase (eukaryotes) / DnaB (bacteria).

  • DNA polymerases for leading & lagging strands.

  • Sliding clamps and clamp loaders.

  • Primase and SSB/RPA.

  • Additional factors: topoisomerases, ligase, nucleosome chaperones (eukaryotes).

How it works

  • Helicase unwinds; primase lays primer; clamp loader places PCNA/β-clamp; polymerase binds and extends; lagging strand loops and cycles to generate Okazaki fragments.

Coordination

  • Physical interactions or scaffolding proteins coordinate components so the fork moves as a single machine.

 


Termination 


Bacterial termination

  • Circular chromosome replicates until forks meet roughly opposite OriC.

  • Ter sites bound by Tus protein can block fork progression in a polar manner to help coordinate termination.

  • After replication, daughter circular chromosomes can be catenated (interlinked); topoisomerase IV, a type II topoisomerase, decatenates them.

Eukaryotic termination

  • Termination occurs when two forks converge.

  • Replication intermediates are resolved by nucleases and topoisomerases; replisome components are disassembled.

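  • End-replication problem: lagging-strand synthesis depends on RNA primers, and once the last primer is removed there is no upstream 3′-OH from which to fill the gap. The extreme 3′ end of the template strand therefore stays uncopied, and the new strand is shorter at its 5′ end, which leads to telomere shortening with each round.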

  • Telomerase elongates telomeres by adding TTAGGG repeats (the human sequence) using its RNA template; it is active in germline cells, stem cells, and many cancer cells, and low or absent in most somatic cells.
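
The balance between end-replication loss and telomerase extension can be sketched as simple arithmetic. All numbers below (starting length, loss per division, critical length, repeats added) are hypothetical values chosen to make the example easy to follow, and the function name is invented for this sketch.

```python
from typing import Optional

REPEAT = "TTAGGG"  # human telomeric repeat, as stated in the notes


def divisions_until_critical(start_bp: int,
                             loss_per_division_bp: int,
                             critical_bp: int,
                             repeats_added_per_division: int = 0) -> Optional[int]:
    """Count divisions until the telomere reaches the critical length.

    Returns None when telomerase adds at least as much as is lost,
    so the telomere never shortens in this model.
    """
    net_loss = loss_per_division_bp - repeats_added_per_division * len(REPEAT)
    if net_loss <= 0:
        return None
    length, divisions = start_bp, 0
    while length > critical_bp:
        length -= net_loss
        divisions += 1
    return divisions


# Hypothetical numbers for illustration only.
print(divisions_until_critical(10_000, 60, 4_000))                                  # no telomerase
print(divisions_until_critical(10_000, 60, 4_000, repeats_added_per_division=10))   # telomerase keeps pace
```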

Telomere protection

  • Telomeres are bound by shelterin complex to protect ends and regulate telomerase.

 


Regulation of DNA replication 


Once-per-cycle control

  • Licensing in G1 (loading of MCM) and activation in S-phase ensure each origin fires once.

  • CDKs (Cyclin-dependent kinases) and DDK regulate activation and prevent re-licensing (phosphorylation + degradation).

  • Geminin inhibits Cdt1 (in metazoans), preventing new MCM loading during S/G2.
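
The once-per-cycle logic can be pictured as a small state machine: an origin can only be licensed in G1, and firing in S phase consumes the license. The `Origin` class below is a hypothetical teaching model with names invented for this sketch; it ignores the underlying kinases and inhibitors and only captures the ordering rule.

```python
class Origin:
    """Toy model of one replication origin."""

    def __init__(self) -> None:
        self.licensed = False
        self.fired_count = 0

    def license(self, phase: str) -> bool:
        """Load MCM (license the origin); only possible in G1."""
        if phase == "G1" and not self.licensed:
            self.licensed = True
            return True
        return False

    def fire(self, phase: str) -> bool:
        """Activate the origin; only possible in S phase if licensed."""
        if phase == "S" and self.licensed:
            self.licensed = False  # firing uses up the license
            self.fired_count += 1
            return True
        return False


origin = Origin()
for phase, action in [("G1", "license"), ("S", "fire"), ("S", "license"), ("S", "fire")]:
    succeeded = getattr(origin, action)(phase)
    print(f"{phase}: {action} -> {succeeded}")
print("times fired this cycle:", origin.fired_count)
```

Because licensing and firing are confined to different phases, a second firing in the same S phase is impossible in this model.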

S-phase checkpoint

  • ATR-ATRIP senses RPA-coated ssDNA (stalled forks).

  • Activates Chk1, which:

    • Stabilizes stalled forks.

    • Inhibits new origin firing if the cell is under replication stress.

    • Pauses cell cycle progression (via Cdc25 inhibition).

Replication timing and origin efficiency

  • Not all origins fire simultaneously. Origins have different firing times (early vs late S-phase).

  • Origin choice influenced by chromatin structure, transcriptional activity, and cell type.

dNTP pools & metabolic control

  • Balanced dNTP pools (via ribonucleotide reductase) are essential for fidelity.

 


Eukaryotic replication 


Multiple origins

  • Eukaryotic chromosomes are large; multiple origins allow faster complete replication.
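
A back-of-the-envelope calculation shows why multiple origins matter. The sketch assumes evenly spaced origins that all fire at the same moment and forks that move at a constant speed, which is a deliberate simplification; the chromosome length and fork speed are illustrative values, not measurements from these notes.

```python
def replication_time_s(length_bp: float, fork_speed_bp_per_s: float, n_origins: int = 1) -> float:
    """Time to copy a chromosome with n evenly spaced, simultaneous origins."""
    if n_origins < 1:
        raise ValueError("need at least one origin")
    # Each origin launches two forks, so each fork covers length / (2 * n_origins).
    return length_bp / (2 * n_origins * fork_speed_bp_per_s)


# Assumed values for illustration only.
length = 100_000_000   # 100 Mb chromosome
speed = 50             # bp per second per fork
for origins in (1, 100, 1000):
    hours = replication_time_s(length, speed, origins) / 3600
    print(f"{origins:>4} origins: {hours:8.2f} h")
```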

Chromatin environment

  • DNA is packaged into nucleosomes. Replication requires:

    • Disassembly of nucleosomes before fork passage.

    • Reassembly behind fork using old histones + deposition of new histones (CAF-1, Asf1).

Replication foci (factories)

  • Replication occurs at discrete nuclear sites where multiple active replisomes cluster — visible as replication foci.

  • Each focus replicates a local region of chromatin; foci dynamics change across S-phase.

Telomeres & heterochromatin

  • Telomeric and heterochromatic regions replicate late; require specialized handling.

Mitochondrial DNA replication

  • Uses different machinery (Pol γ, distinct origins, different regulation).

 


Replication focus 


Definition

  • Replication focus (factory) = nuclear site where several replication forks are active simultaneously.

Properties

  • Observed with fluorescent markers: BrdU/EdU incorporation or PCNA-GFP.

  • Each focus contains many molecules of replication proteins working together — efficient local replication.

Functional significance

  • Spatial organization speeds replication and coordinates chromatin reassembly and repair.

 


Bacterial replication 


OriC in E. coli

  • OriC contains multiple DnaA boxes (DnaA-binding sites) and an AT-rich DUE (easier to open).

  • DnaA-ATP oligomerizes to open the origin and recruit DnaB helicase.

DnaB and DnaC

  • DnaC helps load DnaB onto ssDNA; DnaB unwinds DNA and recruits primase.

DNA Pol III holoenzyme

  • Multi-subunit complex with high processivity (β-clamp).

  • Contains multiple cores so both leading and lagging strands can be synthesized.

Regulation by methylation & SeqA

  • SeqA binds hemimethylated GATC sites (newly replicated DNA whose new strand has not yet been methylated by Dam methylase) and sequesters OriC, temporarily blocking re-initiation in rapidly growing cells.

Termination

  • Tus–Ter system directs forks and helps coordinate termination; decatenation by Topo IV finishes separation.

Plasmid replication

  • Plasmids may use theta replication (similar to chromosomal) or rolling-circle replication (one strand is nicked and its free 3′-OH end is extended around the circular template, displacing the old strand).

 


Problems with DNA replication


1. Base misincorporation → mutations

  • Solutions: polymerase proofreading (3′→5′ exonuclease) + Mismatch repair (MMR) (MutS/MutL/MutH in bacteria; MSH/MLH in eukaryotes).

2. DNA damage blocking replication (UV, alkylation, oxidative damage)

  • Nucleotide Excision Repair (NER) removes bulky lesions.

  • Base Excision Repair (BER) removes small damaged bases.

  • Translesion synthesis (TLS) allows bypass but is error-prone.

3. Replication fork stalling & collapse

  • Forks stall at lesions or protein blocks; checkpoint proteins stabilize forks.

  • Collapse can cause double-strand breaks → homologous recombination (HR) proteins (Rad51/RecA) restart replication.

4. Replication–transcription collisions

  • If transcription machinery is on the same template region, collisions can stall forks and create R-loops (RNA:DNA hybrids); enzymes (RNase H) and helicases resolve these.

5. Secondary structures and repetitive DNA

  • Palindromes (which can form hairpins) and G-quadruplexes can stall polymerases; microsatellite repeats can cause polymerase slippage → expansions/contractions.

  • Specialized helicases (WRN, BLM) and polymerases help resolve structured DNA.
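
Repeat tracts of the kind that promote slippage are easy to spot computationally. The sketch below uses a regular expression with a back-reference to find short tandem repeats; the thresholds (unit of 1 to 6 nt, at least 4 copies), the toy sequence, and the function name are assumptions made for this example.

```python
import re


def find_tandem_repeats(seq: str, max_unit: int = 6, min_copies: int = 4):
    """Return (start, unit, copies) for each short tandem repeat found."""
    # ([ACGT]{1,max_unit}?) captures the shortest repeat unit that works;
    # \1{min_copies-1,} then requires that unit to recur back to back.
    pattern = re.compile(r"([ACGT]{1,%d}?)\1{%d,}" % (max_unit, min_copies - 1))
    return [
        (match.start(), match.group(1), len(match.group(0)) // len(match.group(1)))
        for match in pattern.finditer(seq.upper())
    ]


toy_sequence = "GGTCAGCAGCAGCAGCAGTT"  # made-up sequence containing a CAG tract
for start, unit, copies in find_tandem_repeats(toy_sequence):
    print(f"index {start}: ({unit}) x {copies}")
```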

6. End-replication problem

  • Telomere shortening leads to replicative senescence. Telomerase extends ends in germline/stem cells.

7. dNTP imbalance

  • Alters fidelity and can cause mismatches. RNR (ribonucleotide reductase) regulates dNTP synthesis.

Diseases linked to replication problems

  • Bloom syndrome, Werner syndrome, and certain cancers are linked to defects in replication/repair proteins.

 


Polymerase Chain Reaction 


Goal

  • Amplify a specific DNA fragment exponentially in vitro.

Main components

  • Template DNA

  • Two primers (forward and reverse) flanking target

  • dNTPs

  • DNA polymerase (heat-stable, e.g., Taq)

  • Buffer with Mg²⁺

Thermal cycling steps

  1. Denaturation (~95°C): separation of dsDNA to ssDNA.

  2. Annealing (50–65°C): primers bind to complementary sequences.

  3. Extension (72°C): DNA polymerase extends primers; new DNA formed.

These three steps are repeated for ~25–40 cycles → exponential amplification (roughly 2ⁿ copies per starting template after n cycles, assuming every cycle doubles the target).
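
The doubling arithmetic can be written out directly. The sketch below adds a per-cycle `efficiency` parameter (1.0 means perfect doubling) as an assumption for illustration; real reactions also plateau in late cycles, which this simple model ignores.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Expected copy number after a number of cycles.

    efficiency is the fraction of templates copied per cycle
    (1.0 = every template is duplicated, giving 2**cycles growth).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must be between 0 and 1")
    return initial_copies * (1.0 + efficiency) ** cycles


for n in (10, 20, 30):
    ideal = pcr_copies(1, n)
    realistic = pcr_copies(1, n, efficiency=0.9)  # assumed 90% efficiency
    print(f"{n} cycles: ideal {ideal:.3g}, at 90% efficiency {realistic:.3g}")
```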

Primer design rules

  • Length: 18–25 nt.

  • GC content: 40–60%.

  • Melting temperature (Tm): ~55–65°C.

  • Avoid secondary structure and complementarity between primers (prevents primer-dimers).

  • Choose primer sequences that are unique to the target.
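
The numeric rules above can be turned into a quick screening function. This is a rough sketch: it estimates Tm with the simple Wallace rule (2 °C per A/T plus 4 °C per G/C), which is only a rule of thumb, and it does not check secondary structure, primer-dimers, or uniqueness. The example primer and the function names are made up for illustration.

```python
def gc_content(primer: str) -> float:
    """Fraction of bases that are G or C."""
    p = primer.upper()
    return (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer: str) -> int:
    """Rule-of-thumb melting temperature in degrees C (Wallace rule)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def check_primer(primer: str) -> dict[str, bool]:
    """Check a primer against the length, GC, and Tm ranges listed above."""
    return {
        "length_ok": 18 <= len(primer) <= 25,
        "gc_ok": 0.40 <= gc_content(primer) <= 0.60,
        "tm_ok": 55 <= wallace_tm(primer) <= 65,
    }


example = "AGCTGACCTGAAGCTGATCA"  # made-up 20-nt primer
print(len(example), gc_content(example), wallace_tm(example))
print(check_primer(example))
```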

Enzyme choices

  • Taq polymerase: heat-stable, fast, no 3′→5′ proofreading → errors possible.

  • High-fidelity polymerases (Pfu, Phusion): have proofreading; better for cloning or sequence-accurate work.

Variations

  • qPCR (real-time PCR): monitors amplification in real time using fluorescent dyes (SYBR Green) or probes (TaqMan) — allows quantification.

  • RT-PCR: reverse transcription of RNA to cDNA, then PCR — used for gene expression.

  • Multiplex PCR: multiple primer pairs amplify several targets in one reaction.

  • Touchdown PCR: the annealing temperature starts high and is lowered over the early cycles to increase specificity.

  • Hot-start PCR: polymerase activated only at higher temps to reduce nonspecific amplification.

Applications

  • Diagnostics, cloning, genotyping, forensic analysis, expression analysis.