
Introduction
- DNA replication is the biological process by which a cell makes an identical copy of its DNA before cell division.
- It ensures that each daughter cell receives the same genetic information as the parent cell.
- This process is fundamental for growth, development, repair, and reproduction in all living organisms.
- The mechanism of DNA replication is described as semi-conservative: each new double helix contains one original (parental) strand and one newly synthesised strand.
- Replication begins at specific sites on DNA called origins of replication and proceeds in both directions, forming replication forks.
- Although the basic principle of replication is universal, the details vary between prokaryotes (such as bacteria, which have a single circular chromosome) and eukaryotes (such as humans, with multiple linear chromosomes).
- Specialised enzymes and proteins—such as helicases, primase, DNA polymerases, ligase, and topoisomerases—work together in a highly coordinated way to unwind the DNA, synthesise new strands, correct errors, and ensure the process is accurate.
DNA structure
Basic building blocks
- DNA is a polymer of nucleotides. Each nucleotide consists of a deoxyribose sugar, a phosphate group, and one of four bases.
- The bases are A (adenine), T (thymine), C (cytosine), and G (guanine). A pairs with T (2 H-bonds); C pairs with G (3 H-bonds).
Strand orientation
- DNA strands are antiparallel: one strand runs 5′ → 3′, the other 3′ → 5′.
- Polymerases add new nucleotides only to a free 3′-OH, so synthesis is always 5′ → 3′.
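The pairing and orientation rules above can be made concrete with a short Python sketch. It is only an illustration: the function names and the toy sequence are assumptions made for this example, not part of the notes.

```python
# Watson-Crick pairing rules: A pairs with T, C pairs with G.
COMPLEMENT = {"A": "T", "T": "A", "C": "G", "G": "C"}


def complement(strand: str) -> str:
    """Return the base-by-base complement, in the same left-to-right order."""
    return "".join(COMPLEMENT[base] for base in strand.upper())


def reverse_complement(strand_5_to_3: str) -> str:
    """Return the partner strand, also written 5'->3'.

    Because the two strands are antiparallel, the partner must be
    reversed after complementing to be read in the 5'->3' direction.
    """
    return complement(strand_5_to_3)[::-1]


template = "ATGGC"  # toy sequence, written 5'->3'
print("template (5'->3'):", template)
print("partner  (3'->5'):", complement(template))
print("partner  (5'->3'):", reverse_complement(template))
```

Taking the reverse complement twice returns the original sequence, which is a quick sanity check on the pairing table.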
Helix features relevant to replication
- Major and minor grooves: proteins recognize specific sequences through contacts in the grooves.
- Base stacking stabilizes the helix.
- AT-rich regions (fewer H-bonds) are easier to unwind; origins of replication often contain AT-rich DNA-unwinding elements (DUEs).
- Supercoiling: unwinding ahead of a fork creates positive supercoils, which topoisomerases relieve.
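As a rough illustration of why AT-rich stretches stand out, the sketch below scans a sequence with a sliding window and reports the most AT-rich window. The function names, window size, and toy sequence are assumptions for this example; identifying a real origin or DUE takes far more than base composition.

```python
from typing import Tuple


def at_fraction(seq: str) -> float:
    """Fraction of bases in seq that are A or T."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)


def most_at_rich_window(seq: str, window: int) -> Tuple[int, float]:
    """Return (start index, AT fraction) of the most AT-rich window.

    If several windows tie, the first one is returned.
    """
    if window <= 0 or window > len(seq):
        raise ValueError("window must be between 1 and len(seq)")
    best_start, best_frac = 0, -1.0
    for start in range(len(seq) - window + 1):
        frac = at_fraction(seq[start:start + window])
        if frac > best_frac:
            best_start, best_frac = start, frac
    return best_start, best_frac


# Toy sequence: a GC-rich flank, an AT-rich core, another GC-rich flank.
toy = "GCGCGGCC" + "ATTATAAT" + "GGCCGCGC"
start, frac = most_at_rich_window(toy, window=8)
print(f"most AT-rich window starts at {start}, AT fraction {frac:.2f}")
```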
Semi-conservative nature
- Each daughter DNA has one old (parental) strand and one new strand. This follows directly from complementary base pairing: each parental strand serves as the template for its new partner, so the sequence is preserved in both daughters.
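A small simulation can show what the semi-conservative rule implies over several rounds. This is a minimal sketch with hypothetical names: each duplex is modelled as a pair of strand labels, and every round each strand is kept and paired with a newly made one.

```python
def replicate(duplexes):
    """One round of semi-conservative replication.

    Each duplex (strand_a, strand_b) separates, and each strand
    is paired with a freshly synthesized partner.
    """
    daughters = []
    for strand_a, strand_b in duplexes:
        daughters.append((strand_a, "new"))
        daughters.append((strand_b, "new"))
    return daughters


def fraction_with_parental_strand(rounds: int) -> float:
    """Fraction of duplexes that still contain an original strand."""
    duplexes = [("parental", "parental")]
    for _ in range(rounds):
        duplexes = replicate(duplexes)
    with_parental = sum(1 for duplex in duplexes if "parental" in duplex)
    return with_parental / len(duplexes)


for n in range(1, 5):
    print(n, "rounds:", fraction_with_parental_strand(n))
```

After the first round every duplex is a hybrid; from then on the two original strands persist while the population doubles, so the fraction halves each round.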
DNA polymerase
General function
- Catalyzes formation of a phosphodiester bond between the 3′-OH of the growing strand and the incoming dNTP (deoxynucleoside triphosphate).
- Requires a template strand and a primer (a short oligonucleotide with a free 3′-OH).
Directionality & chemistry
- Polymerization occurs 5′ → 3′.
- Two-metal-ion mechanism at the active site: metal ions (usually Mg²⁺) help position the substrate and stabilize the leaving group (pyrophosphate).
Fidelity
- High fidelity comes from:
  - Base-pairing specificity (proper Watson–Crick pairing).
  - Proofreading: the 3′→5′ exonuclease activity of replicative polymerases removes an incorrectly paired nucleotide.
  - Post-replication mismatch repair, which corrects remaining errors.
Processivity
- Processivity is the number of nucleotides added per binding event.
- Sliding clamps (β-clamp in bacteria, PCNA in eukaryotes) increase processivity by tethering the polymerase to DNA.
- The clamp loader (γ-complex in bacteria, RFC in eukaryotes) uses ATP to load the clamp.
Prokaryotic polymerases (examples & roles)
- DNA Pol I: removes RNA primers (5′→3′ exonuclease) and fills the gaps with DNA; it also has 3′→5′ proofreading activity but low processivity.
- DNA Pol III: main replicative enzyme in bacteria (high processivity with the β-clamp).
- DNA Pol II, IV, V: involved in repair or translesion synthesis (TLS).
Eukaryotic polymerases (major players)
- Pol α (alpha): primase-associated; lays down an RNA–DNA primer (a short RNA primer extended by a short stretch of DNA).
- Pol δ (delta): elongates the lagging strand; carries out Okazaki fragment synthesis and processing.
- Pol ε (epsilon): primarily elongates the leading strand.
- Pol γ: replicates mitochondrial DNA.
- TLS polymerases (η, κ, ι, Rev1, Pol ζ): bypass DNA lesions, but with lower fidelity.
Polymerase switching
- At the start of replication, Pol α synthesizes the primers; synthesis then switches to Pol δ/ε for processive elongation. The sliding clamp and clamp loader mediate this exchange.
Replication process
Replication consists of initiation, elongation, and termination. Below each is expanded in detail.
A. Initiation — getting started
Where
- Origins of replication are specific DNA sequences where replication begins.
  - Bacteria: a single origin (OriC).
  - Eukaryotes: many origins distributed along chromosomes.
Key steps (broad)
- Origin recognition: origin-binding proteins identify origins.
- Origin opening: local unwinding of DNA to form a replication bubble.
- Helicase loading & activation: the helicase is placed on DNA, then activated to unwind continuously.
- Primer synthesis: primase synthesizes short RNA primers to provide a 3′-OH.
- Recruitment of polymerases & accessory factors: clamp, clamp loader, and polymerases assemble.
Bacterial initiation (detailed)
- DnaA recognizes DnaA-box sequences within OriC and oligomerizes, causing bending and opening of the AT-rich DUE.
- DnaC helps load DnaB helicase onto the unwound DNA (requires ATP).
- DnaG primase synthesizes RNA primers.
- SSB (single-strand binding protein) binds exposed ssDNA to prevent reannealing.
- DNA Pol III holoenzyme (core polymerase + β-clamp + clamp loader) begins DNA synthesis.
Eukaryotic initiation (detailed)
- In G1 phase, origins are licensed by loading the pre-replication complex (pre-RC):
  - ORC (Origin Recognition Complex) binds the origin.
  - Cdc6 and Cdt1 recruit the MCM2–7 helicase (loaded as an inactive double hexamer).
- In S phase, kinases (S-CDK and DDK) phosphorylate initiation factors:
  - Cdc45 and GINS join MCM to form the active CMG helicase (Cdc45–MCM–GINS).
  - This converts the pre-RC to an active pre-initiation complex (pre-IC) and opens the origin.
- Pol α-primase synthesizes RNA–DNA primers. Pol ε and Pol δ are then recruited to begin processive synthesis.
Important control point: licensing occurs only in G1, which prevents origins from firing more than once per cell cycle.
B. Pre-replication complex (pre-RC)
Definition & function
- Complex assembled at origins during G1 that marks where replication may start; it “licenses” origins.
- Composition: ORC + Cdc6 + Cdt1 + MCM2–7.
- MCM helicase is loaded in an inactive form (double hexamer) around dsDNA.
Why necessary
- Prevents re-replication: after an origin fires in S-phase, re-loading of MCM is blocked until the cell has completed mitosis and entered the next G1, which ensures one round of replication per cycle.
Regulation
- Cdc6 and Cdt1 availability and phosphorylation are controlled by CDKs and ubiquitin-mediated degradation.
- Geminin (in metazoans) inhibits Cdt1 to prevent re-loading during S/G2.
C. Pre-initiation complex (pre-IC)
Activation step
- In early S-phase, S-CDK and DDK phosphorylate components to convert the pre-RC into active replication forks.
- Cdc45 and GINS join MCM to make CMG, the active helicase that unwinds DNA.
- A single-stranded region forms; RPA binds the ssDNA; primase makes primers.
- Polymerases, clamps, and other replisome factors assemble.
Note: Many proteins (Sld2, Sld3, Dpb11 in yeast; Treslin/TICRR in metazoans) mediate interactions and are regulated by phosphorylation.
Elongation
Replication fork
- The fork is a Y-shaped structure with two single-stranded templates and two nascent strands.
- Two replication forks move away bidirectionally from each origin.
Leading strand
- Synthesized continuously in the 5′→3′ direction as the helicase exposes new template.
- In eukaryotes: mainly Pol ε; in bacteria: Pol III.
Lagging strand
- Synthesized discontinuously as Okazaki fragments (short stretches).
- Each fragment requires:
  - Primase synthesizes an RNA primer.
  - Polymerase extends until it reaches the previous fragment.
  - The primer is removed and replaced with DNA.
  - DNA ligase seals the nick.
Processing Okazaki fragments (eukaryotes)
- RNase H2 removes the RNA portion of primers.
- FEN1 (flap endonuclease 1) removes the flaps displaced during strand-displacement synthesis.
- DNA Pol δ performs strand-displacement synthesis.
- DNA Ligase I joins fragments.
Lagging strand “trombone” model
- The lagging template loops so that polymerases moving in the same physical direction can synthesize both strands. Each Okazaki fragment cycle: primer → extension → flap removal → ligation → loop release.
Chromatin reassembly
- Nucleosomes are disrupted ahead of the fork and reassembled behind it from recycled old histones plus new histones deposited by chaperones (CAF-1, Asf1), restoring chromatin.
Accessory factors
- PCNA (clamp) tethers polymerases.
- RFC (replication factor C) loads PCNA using ATP.
- RPA (replication protein A) binds ssDNA.
- Topoisomerases I and II relieve torsional strain; in bacteria, gyrase (a type II topoisomerase) introduces negative supercoils.
Replication fork dynamics
Coordination
- Both strands are synthesized simultaneously even though their directionality differs; the replisome coordinates the leading- and lagging-strand polymerases.
- Primase is recruited repeatedly to lay down primers on the lagging strand at intervals.
Stalling and restart
- A fork may stall at DNA damage, tightly bound proteins, or secondary structures.
- Cells stabilize stalled forks using checkpoint proteins (ATR/Chk1 pathway in eukaryotes) and fork-protection proteins to prevent collapse.
- Fork collapse can cause double-strand breaks; restart often uses homologous recombination (Rad51 in eukaryotes, RecA in bacteria).
Translesion synthesis (TLS)
- When replication encounters a lesion that blocks high-fidelity polymerases, TLS polymerases insert bases opposite the lesion (an error-prone step) so that replication can continue; PCNA ubiquitination helps recruit TLS polymerases.
DNA replication proteins
Helicase
- Unwinds dsDNA into ssDNA. Eukaryotes use MCM2–7; bacteria use DnaB.
Primase
- Makes short RNA primers (DnaG in bacteria; in eukaryotes, primase is part of the Pol α complex).
DNA polymerases
- Main enzymes for DNA synthesis (Pol III in bacteria; Pol δ/ε in eukaryotes).
Sliding clamp
- β-clamp (bacteria) or PCNA (eukaryotes): tethers the polymerase to DNA, increasing processivity.
Clamp loader
- Loads the sliding clamp onto DNA (γ complex in bacteria; RFC in eukaryotes).
Single-strand binding proteins
- SSB in bacteria; RPA in eukaryotes. They prevent ssDNA secondary structure and protect ssDNA from nucleases.
Topoisomerases
- Relieve supercoils ahead of the fork: Type I (single-strand cuts) and Type II (double-strand cuts; e.g., gyrase in bacteria, Topo II in eukaryotes).
DNA ligase
- Seals nicks between Okazaki fragments or at repair sites (LigA in bacteria; Ligase I in eukaryotes).
RNase H & FEN1
- Remove RNA primers and process DNA flaps.
Telomerase
- Reverse transcriptase that extends telomeres using its own RNA template; solves the end-replication problem in eukaryotes.
ORC, Cdc6, Cdt1
- Origin recognition and licensing proteins in eukaryotes.
CMG complex
- Active helicase complex (Cdc45–MCM–GINS) in eukaryotes.
Checkpoint proteins
- ATR/ATM, Chk1/Chk2: detect replication stress or DNA damage and delay the cell cycle.
Repair proteins
- Mismatch repair (MutS/MutL in bacteria; MSH/MLH in eukaryotes), base excision repair, and nucleotide excision repair components.
Replication machinery
What is the replisome?
- A multi-protein complex assembled at each replication fork to carry out coordinated replication of both strands.
Core components
- Active CMG helicase (eukaryotes) / DnaB (bacteria).
- DNA polymerases for leading & lagging strands.
- Sliding clamps and clamp loaders.
- Primase and SSB/RPA.
- Additional factors: topoisomerases, ligase, nucleosome chaperones (eukaryotes).
How it works
- Helicase unwinds; primase lays a primer; the clamp loader places PCNA/β-clamp; polymerase binds and extends; the lagging strand loops and cycles to generate Okazaki fragments.
Coordination
- Physical interactions or scaffolding proteins coordinate the components so that the fork moves as a single machine.
Termination
Bacterial termination
- The circular chromosome replicates until the forks meet roughly opposite OriC.
- Ter sites bound by Tus protein can block fork progression in a polar (direction-dependent) manner, which helps coordinate termination.
- After replication, the daughter circular chromosomes can be catenated (interlinked); topoisomerase IV, a type II topoisomerase, decatenates them.
Eukaryotic termination
- Termination occurs when two forks converge.
- Replication intermediates are resolved by nucleases and topoisomerases; replisome components are disassembled.
- End-replication problem: lagging-strand synthesis requires priming, so once the final RNA primer is removed there is no upstream 3′-OH from which to fill the gap. The extreme 5′ end of the new strand (opposite the 3′ end of the template) is left uncopied, which leads to telomere shortening.
- Telomerase elongates telomeres by adding TTAGGG repeats (in humans) using its RNA template; it is active in the germline, stem cells, and many cancer cells, and absent in most somatic cells.
Telomere protection
- Telomeres are bound by the shelterin complex, which protects the ends and regulates telomerase.
Regulation of DNA replication
Once-per-cycle control
- Licensing in G1 (loading of MCM) and activation in S-phase ensure that each origin fires at most once per cycle.
- CDKs (cyclin-dependent kinases) and DDK regulate activation and prevent re-licensing (through phosphorylation and degradation of licensing factors).
- Geminin inhibits Cdt1 (in metazoans), preventing new MCM loading during S/G2.
S-phase checkpoint
- ATR-ATRIP senses RPA-coated ssDNA (stalled forks).
- Activates Chk1, which:
  - Stabilizes stalled forks.
  - Inhibits new origin firing if the cell is under replication stress.
  - Pauses cell cycle progression (via Cdc25 inhibition).
Replication timing and origin efficiency
- Not all origins fire simultaneously. Origins have different firing times (early vs late S-phase).
- Origin choice is influenced by chromatin structure, transcriptional activity, and cell type.
dNTP pools & metabolic control
- Balanced dNTP pools (via ribonucleotide reductase) are essential for fidelity.
Eukaryotic replication
Multiple origins
- Eukaryotic chromosomes are large; replicating from many origins in parallel shortens the time needed to copy each chromosome.
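The benefit of multiple origins can be seen with back-of-the-envelope arithmetic. The sketch below assumes evenly spaced origins that all fire at the same moment and forks that move at a constant speed; it ignores chromosome ends and staggered firing times. The chromosome length and fork speed are hypothetical round numbers chosen only to show the scaling, not measured values.

```python
def replication_time_s(length_bp: float, fork_speed_bp_per_s: float,
                       n_origins: int) -> float:
    """Idealized time (seconds) to copy a chromosome.

    Each origin launches two forks, so 2 * n_origins forks share the work.
    """
    if n_origins < 1 or fork_speed_bp_per_s <= 0:
        raise ValueError("need at least one origin and a positive fork speed")
    return length_bp / (2 * fork_speed_bp_per_s * n_origins)


# Hypothetical round numbers (assumptions for illustration only).
CHROMOSOME_BP = 100_000_000
FORK_SPEED_BP_PER_S = 30

for origins in (1, 100, 1000):
    hours = replication_time_s(CHROMOSOME_BP, FORK_SPEED_BP_PER_S, origins) / 3600
    print(f"{origins:>5} origins: about {hours:,.1f} h")
```

Under these assumptions the time falls in direct proportion to the number of origins, which is the point of the bullet above.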
Chromatin environment
- DNA is packaged into nucleosomes. Replication requires:
  - Disassembly of nucleosomes before fork passage.
  - Reassembly behind the fork using old histones + deposition of new histones (CAF-1, Asf1).
Replication foci (factories)
- Replication occurs at discrete nuclear sites where multiple active replisomes cluster, visible as replication foci.
- Each focus replicates a local region of chromatin; foci dynamics change across S-phase.
Telomeres & heterochromatin
- Telomeric and heterochromatic regions replicate late and require specialized handling.
Mitochondrial DNA replication
- Uses different machinery (Pol γ, distinct origins, different regulation).
Replication focus
Definition
- Replication focus (factory) = nuclear site where several replication forks are active simultaneously.
Properties
- Observed with fluorescent markers: BrdU/EdU incorporation or PCNA-GFP.
- Each focus contains many molecules of replication proteins working together, giving efficient local replication.
Functional significance
- Spatial organization speeds replication and coordinates chromatin reassembly and repair.
Bacterial replication
OriC in E. coli
- OriC contains multiple DnaA boxes (DnaA-binding sites) and an AT-rich DUE (easier to open).
- DnaA-ATP oligomerizes to open the origin and recruit DnaB helicase.
DnaB and DnaC
- DnaC helps load DnaB onto ssDNA; DnaB unwinds DNA and recruits primase.
DNA Pol III holoenzyme
- Multi-subunit complex with high processivity (β-clamp).
- Contains multiple cores so that the leading and lagging strands can be synthesized at the same time.
Regulation by methylation & SeqA
- SeqA binds hemimethylated DNA (the state of newly replicated DNA, before the new strand is methylated) and temporarily prevents immediate re-initiation at OriC in rapidly growing cells.
Termination
- The Tus–Ter system directs forks and helps coordinate termination; decatenation by Topo IV finishes separation.
Plasmid replication
- Plasmids may use theta replication (similar to chromosomal replication) or rolling-circle replication (one strand of the circle is nicked and its free 3′ end is extended).
Problems with DNA replication
1. Base misincorporation → mutations
- Solutions: polymerase proofreading (3′→5′ exonuclease) plus mismatch repair (MMR; MutS/MutL/MutH in bacteria, MSH/MLH in eukaryotes).
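The layered defences combine multiplicatively if each layer is treated as an independent filter. The sketch below uses order-of-magnitude values of the kind quoted in textbooks; they are assumptions for illustration and do not come from these notes, and real rates vary by polymerase and organism.

```python
def overall_error_rate(selection_error: float,
                       proofreading_escape: float,
                       mmr_escape: float) -> float:
    """Per-base error rate after three independent fidelity layers.

    selection_error:     chance the polymerase inserts a wrong base
    proofreading_escape: chance a misinserted base escapes proofreading
    mmr_escape:          chance a surviving mismatch escapes mismatch repair
    """
    for p in (selection_error, proofreading_escape, mmr_escape):
        if not 0.0 <= p <= 1.0:
            raise ValueError("probabilities must lie in [0, 1]")
    return selection_error * proofreading_escape * mmr_escape


# Assumed order-of-magnitude values (illustrative only).
SELECTION_ERROR = 1e-5
PROOFREADING_ESCAPE = 1e-2
MMR_ESCAPE = 1e-2

rate = overall_error_rate(SELECTION_ERROR, PROOFREADING_ESCAPE, MMR_ESCAPE)
GENOME_BP = 4_600_000  # roughly the size of the E. coli chromosome
print(f"errors per base copied: {rate:.1e}")
print(f"expected errors per genome copy: {rate * GENOME_BP:.4f}")
```

Removing any one layer in this model (setting its escape probability to 1) raises the error rate by the factor that layer normally contributes.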
2. DNA damage blocking replication (UV, alkylation, oxidative damage)
- Nucleotide Excision Repair (NER) removes bulky lesions.
- Base Excision Repair (BER) removes small damaged bases.
- Translesion synthesis (TLS) allows bypass but is error-prone.
3. Replication fork stalling & collapse
- Forks stall at lesions or protein blocks; checkpoint proteins stabilize forks.
- Collapse can cause double-strand breaks; homologous recombination (HR) proteins (Rad51/RecA) then restart replication.
4. Replication–transcription collisions
- If transcription machinery is on the same template region, collisions can stall forks and create R-loops (RNA:DNA hybrids); enzymes (RNase H) and helicases resolve these.
5. Secondary structures and repetitive DNA
- Palindromes, G-quadruplexes, and microsatellite repeats can stall polymerases or cause polymerase slippage, leading to repeat expansions or contractions.
- Specialized helicases (WRN, BLM) and polymerases help resolve structured DNA.
6. End-replication problem
- Telomere shortening leads to replicative senescence. Telomerase extends ends in germline/stem cells.
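A toy model makes the link between telomere shortening and a finite number of divisions explicit. All numbers here (starting length, loss per division, critical length) are hypothetical round values used only for illustration; the function and parameter names are likewise assumptions of this sketch.

```python
def divisions_until_senescence(initial_bp: int,
                               loss_per_division_bp: int,
                               critical_bp: int,
                               telomerase_gain_bp: int = 0):
    """Count divisions before the telomere would drop below critical_bp.

    Returns None when telomerase adds back at least as much as is lost,
    because the telomere then never shortens in this model.
    """
    net_loss = loss_per_division_bp - telomerase_gain_bp
    if net_loss <= 0:
        return None
    length = initial_bp
    divisions = 0
    # Divide only while the resulting length stays at or above the threshold.
    while length - net_loss >= critical_bp:
        length -= net_loss
        divisions += 1
    return divisions


# Hypothetical values: 10 kb telomere, 100 bp lost per division, 5 kb threshold.
print("without telomerase:", divisions_until_senescence(10_000, 100, 5_000))
print("with telomerase:   ",
      divisions_until_senescence(10_000, 100, 5_000, telomerase_gain_bp=100))
```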
7. dNTP imbalance
- An imbalance alters fidelity and can cause mismatches. RNR (ribonucleotide reductase) regulates dNTP synthesis.
Diseases linked to replication problems
- Bloom syndrome (BLM helicase), Werner syndrome (WRN helicase), and certain cancers are linked to defects in replication/repair proteins.
Polymerase Chain Reaction
Goal
- Amplify a specific DNA fragment exponentially in vitro.
Main components
- Template DNA
- Two primers (forward and reverse) flanking the target
- dNTPs
- DNA polymerase (heat-stable, e.g., Taq)
- Buffer with Mg²⁺
Thermal cycling steps
- Denaturation (~95°C): separation of dsDNA into ssDNA.
- Annealing (50–65°C): primers bind to complementary sequences.
- Extension (72°C): DNA polymerase extends the primers; new DNA is formed.
These three steps are repeated for ~25–40 cycles, giving exponential amplification: ideally the target doubles each cycle, so n cycles yield roughly 2ⁿ copies per starting template.
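The doubling arithmetic can be sketched in a few lines. The `efficiency` parameter is an assumption added for this example: 1.0 means perfect doubling each cycle (the 2ⁿ case), and lower values model cycles in which only a fraction of templates are copied. The model ignores the plateau that real reactions reach when reagents run out.

```python
def pcr_copies(initial_copies: float, cycles: int,
               efficiency: float = 1.0) -> float:
    """Idealized copy number after a number of PCR cycles.

    Each cycle multiplies the copy number by (1 + efficiency).
    """
    if not 0.0 <= efficiency <= 1.0:
        raise ValueError("efficiency must lie in [0, 1]")
    if cycles < 0:
        raise ValueError("cycles must be non-negative")
    return initial_copies * (1 + efficiency) ** cycles


for n in (10, 20, 30):
    ideal = pcr_copies(1, n)
    lossy = pcr_copies(1, n, efficiency=0.9)
    print(f"{n} cycles: ideal {ideal:.3g} copies, at 90% efficiency {lossy:.3g}")
```

Even a modest drop in per-cycle efficiency compounds over 30 cycles, which is one reason yields in practice fall short of 2ⁿ.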
Primer design rules
- Length: 18–25 nt.
- GC content: 40–60%.
- Melting temperature (Tm): ~55–65°C.
- Avoid secondary structure and complementarity between primers (prevents primer-dimers).
- Choose primer sequences that are unique to the target.
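The numeric rules above lend themselves to a quick automated check. This is a minimal sketch, not a primer-design tool: the Tm estimate uses the Wallace rule of thumb, Tm ≈ 2(A+T) + 4(G+C) in °C, which is an assumption brought in for this example and is only a rough guide (dedicated software uses nearest-neighbor thermodynamics). The primer sequence is hypothetical, and secondary structure and uniqueness are not checked.

```python
def gc_percent(primer: str) -> float:
    """Percentage of G and C bases in the primer."""
    p = primer.upper()
    return 100.0 * (p.count("G") + p.count("C")) / len(p)


def wallace_tm(primer: str) -> int:
    """Rough Tm estimate in degrees C (Wallace rule; an approximation)."""
    p = primer.upper()
    at = p.count("A") + p.count("T")
    gc = p.count("G") + p.count("C")
    return 2 * at + 4 * gc


def check_primer(primer: str) -> list:
    """Return a list of rule violations; an empty list means all checks pass."""
    problems = []
    if not 18 <= len(primer) <= 25:
        problems.append(f"length {len(primer)} nt is outside 18-25 nt")
    gc = gc_percent(primer)
    if not 40 <= gc <= 60:
        problems.append(f"GC content {gc:.0f}% is outside 40-60%")
    tm = wallace_tm(primer)
    if not 55 <= tm <= 65:
        problems.append(f"estimated Tm {tm} C is outside 55-65 C")
    return problems


candidate = "AGCTGACCTGAAGTTCATCT"  # hypothetical 20-nt primer
print(candidate, "->", check_primer(candidate) or "passes the basic checks")
print("ATATATATAT ->", check_primer("ATATATATAT"))
```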
Enzyme choices
- Taq polymerase: heat-stable, fast, no 3′→5′ proofreading, so errors are possible.
- High-fidelity polymerases (Pfu, Phusion): have proofreading; better for cloning or sequence-accurate work.
Variations
- qPCR (real-time PCR): monitors amplification in real time using fluorescent dyes (SYBR Green) or probes (TaqMan), which allows quantification.
- RT-PCR: reverse transcription of RNA to cDNA, then PCR; used for gene expression analysis.
- Multiplex PCR: multiple primer pairs amplify several targets in one reaction.
- Touchdown PCR: the annealing temperature starts high and is lowered over the early cycles to increase specificity.
- Hot-start PCR: polymerase activated only at higher temperatures to reduce nonspecific amplification.
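The touchdown idea in the list above can be written out as a per-cycle annealing schedule. The starting temperature, final temperature, and step size below are hypothetical values picked for illustration; actual protocols choose them from the primer Tm.

```python
def touchdown_schedule(start_c: int, final_c: int, step_c: int,
                       total_cycles: int) -> list:
    """Annealing temperature for each cycle of a touchdown program.

    The temperature drops by step_c each cycle until it reaches final_c,
    then stays there for the remaining cycles.
    """
    if step_c <= 0 or start_c < final_c:
        raise ValueError("need step_c > 0 and start_c >= final_c")
    temps = []
    current = start_c
    for _ in range(total_cycles):
        temps.append(current)
        current = max(final_c, current - step_c)
    return temps


# Hypothetical program: start at 65 C, step down 1 C per cycle to 60 C.
print(touchdown_schedule(start_c=65, final_c=60, step_c=1, total_cycles=8))
```

The early high-temperature cycles favour only the best-matched primer binding; once the specific product dominates, the later cycles at the lower temperature amplify it efficiently.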
Applications
- Diagnostics, cloning, genotyping, forensic analysis, expression analysis.