DNA Replication - Allied Guru

Introduction

DNA replication is the biological process by which a cell makes an identical copy of its DNA before cell division.
It ensures that each daughter cell receives the same genetic information as the parent cell.
This process is fundamental for growth, development, repair, and reproduction in all living organisms.
The mechanism of DNA replication is described as semi-conservative: each new double helix contains one original (parental) strand and one newly synthesised strand.

Replication begins at specific sites on DNA called origins of replication and proceeds in both directions, forming replication forks.
Although the basic principle of replication is universal, the details vary between prokaryotes (such as bacteria, which have a single circular chromosome) and eukaryotes (such as humans, with multiple linear chromosomes).
Specialised enzymes and proteins—such as helicases, primase, DNA polymerases, ligase, and topoisomerases—work together in a highly coordinated way to unwind the DNA, synthesise new strands, correct errors, and ensure the process is accurate.

DNA structure

Basic building blocks

DNA = polymer of nucleotides. Each nucleotide = deoxyribose sugar + phosphate + base (A, T, C, G).
Bases are A (adenine), T (thymine), C (cytosine), G (guanine). A pairs with T (2 H-bonds), C pairs with G (3 H-bonds).

Strand orientation

DNA strands are antiparallel: one strand runs 5′ → 3′, the other 3′ → 5′.
Polymerases add new nucleotides only to a free 3′-OH, so synthesis is always 5′ → 3′.
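As a small illustration of antiparallel pairing, the sketch below derives the partner strand of a sequence, with both strands written 5′ → 3′. The function name `reverse_complement` is chosen for this example only.

```python
def reverse_complement(strand_5to3: str) -> str:
    """Return the partner strand, also written 5' -> 3'.

    The partner runs in the opposite direction, so the complemented
    bases are read in reverse order.
    """
    pair = {"A": "T", "T": "A", "C": "G", "G": "C"}
    return "".join(pair[base] for base in reversed(strand_5to3.upper()))

print(reverse_complement("ATGCC"))  # GGCAT
```

Reading the result from its 3′ end back to its 5′ end gives TACGG, the base-by-base complement of the input.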

Helix features relevant to replication

Major and minor grooves: proteins recognize sequences via grooves.
Base stacking stabilizes helix.
AT-rich regions (fewer H-bonds) are easier to unwind — origins of replication often have AT-rich “DNA-unwinding elements” (DUE).
Supercoiling: unwinding ahead of a fork creates positive supercoils; topoisomerases relieve them.
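The point about AT-rich regions can be made concrete with a sliding-window scan. This is only a toy sketch: the sequence `toy` is invented, the helper names are hypothetical, and real origin prediction relies on much more than AT content.

```python
def at_fraction(seq: str) -> float:
    """Fraction of bases in seq that are A or T."""
    seq = seq.upper()
    return (seq.count("A") + seq.count("T")) / len(seq)

def most_at_rich_window(seq: str, window: int) -> tuple[int, float]:
    """Return (start index, AT fraction) of the most AT-rich window."""
    best_start, best_frac = 0, -1.0
    for start in range(len(seq) - window + 1):
        frac = at_fraction(seq[start:start + window])
        if frac > best_frac:  # keep the first window that reaches the maximum
            best_start, best_frac = start, frac
    return best_start, best_frac

toy = "GCGCGGCCATATTAATGCGCCG"  # invented sequence with an AT-rich core
print(most_at_rich_window(toy, window=8))  # (8, 1.0)
```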

Semi-conservative nature

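Each daughter DNA has one old (parental) strand and one new strand. This is the direct consequence of complementary pairing: each parental strand serves as the template for its new partner, which is what allows the sequence to be copied accurately into both daughter molecules.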
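A toy simulation can show what semi-conservative copying implies over several rounds. The labels "parental" and "new" and the function `replicate` are made up for this sketch; it tracks strands only, not sequence.

```python
def replicate(duplexes):
    """One semi-conservative round: every strand templates a new partner."""
    next_generation = []
    for strand_a, strand_b in duplexes:
        next_generation.append((strand_a, "new"))
        next_generation.append((strand_b, "new"))
    return next_generation

population = [("parental", "parental")]
for generation in range(1, 4):
    population = replicate(population)
    with_parental = sum("parental" in duplex for duplex in population)
    print(generation, len(population), with_parental)
# 1 2 2
# 2 4 2
# 3 8 2
```

The number of duplexes doubles each round, but only two ever carry an original strand, because the two parental strands are never destroyed or split.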

DNA polymerase

General function

Catalyzes formation of phosphodiester bond between the 3′-OH of the growing strand and the incoming dNTP (deoxynucleotide triphosphate).
Requires a template strand and a primer (short oligo with a free 3′-OH).

Directionality & chemistry

Polymerization occurs 5′ → 3′.
Two-metal-ion mechanism at active site: metal ions (usually Mg²⁺) help position substrate and stabilize leaving group (pyrophosphate).

Fidelity

High fidelity comes from:
- Base-pairing specificity (proper Watson–Crick pairing).
- Proofreading (3′→5′ exonuclease) activity in replicative polymerases that removes the incorrectly paired nucleotide.
- Post-replication mismatch repair corrects remaining errors.
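The three layers multiply, which is why the overall error rate is so low. The arithmetic below uses round, order-of-magnitude figures that are assumptions for illustration, not measurements from this article; actual values vary by polymerase and organism.

```python
# Assumed, order-of-magnitude values (illustrative only).
base_selection_error = 1e-5  # errors per base from base pairing alone
proofreading_escape = 1e-2   # fraction of those errors that proofreading misses
mismatch_repair_escape = 1e-2  # fraction of the remainder that mismatch repair misses

overall_error_rate = (base_selection_error
                      * proofreading_escape
                      * mismatch_repair_escape)
print(f"{overall_error_rate:.0e} errors per base copied")  # 1e-09
```

Under these assumptions, about one error remains per billion bases copied.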

Processivity

Processivity = number of nucleotides added per binding event.
Sliding clamps (β-clamp in bacteria, PCNA in eukaryotes) increase processivity by tethering polymerase to DNA.
Clamp loader (γ-complex in bacteria, RFC in eukaryotes) uses ATP to load the clamp.

Prokaryotic polymerases (examples & roles)

DNA Pol I: removes RNA primers (5′→3′ exonuclease) and fills gaps with DNA; has 3′→5′ proofreading too but low processivity.
DNA Pol III: main replicative enzyme in bacteria (high processivity with β-clamp).
DNA Pol II, IV, V: involved in repair or translesion synthesis (TLS).

Eukaryotic polymerases (major players)

Pol α (alpha): primase-associated; lays an RNA–DNA primer (short DNA after RNA primer).
Pol δ (delta): elongates lagging strand, does Okazaki fragment synthesis/processing.
Pol ε (epsilon): primarily elongates leading strand.
Pol γ: replicates mitochondrial DNA.
TLS polymerases (η, κ, ι, Rev1, Pol ζ): bypass DNA lesions but lower fidelity.

Polymerase switching

At start of replication Pol α initiates primers; then switch to Pol δ/ε for processive elongation. Sliding clamp and clamp loader mediate this exchange.

Replication process

Replication consists of initiation, elongation, and termination. Each stage is expanded in detail below.

A. Initiation — getting started

Where

Origins of replication are specific DNA sequences where replication begins.
- Bacteria: a single origin (OriC).
- Eukaryotes: many origins distributed along chromosomes.

Key steps (broad)

Origin recognition — origin-binding proteins identify origins.
Origin opening — local unwinding of DNA to form a replication bubble.
Helicase loading & activation — helicase placed on DNA then activated to unwind continuously.
Primer synthesis — primase synthesizes short RNA primers to provide 3′-OH.
Recruitment of polymerases & accessory factors — clamp, clamp loader, polymerases assemble.

Bacterial initiation (detailed)

DnaA recognizes DnaA-box sequences within OriC and oligomerizes, causing bending and opening of AT-rich DUE.
DnaC helps load DnaB helicase onto unwound DNA (requires ATP).
DnaG primase synthesizes RNA primers.
SSB (single-strand binding protein) binds exposed ssDNA to prevent reannealing.
DNA Pol III holoenzyme (core polymerase + β-clamp + clamp loader) begins DNA synthesis.

Eukaryotic initiation (detailed)

In G1 phase, origins are licensed by loading the pre-replication complex (pre-RC):
- ORC (Origin Recognition Complex) binds origin.
- Cdc6 and Cdt1 recruit the MCM2–7 helicase (loaded as an inactive double hexamer).
In S phase, kinases (S-CDK and DDK) phosphorylate initiation factors:
- Cdc45 and GINS join MCM to form the active CMG helicase (Cdc45–MCM–GINS).
- This converts the pre-RC to an active pre-initiation complex (pre-IC) and opens the origin.
Pol α-primase synthesizes RNA–DNA primers. Then Pol ε/δ are recruited to begin processive synthesis.

Important control point: Licensing occurs only in G1 — prevents origins from firing more than once per cell cycle.

B. Pre-replication complex (pre-RC)

Definition & function

Complex assembled at origins during G1 that marks where replication may start; it “licenses” origins.
Composition: ORC + Cdc6 + Cdt1 + MCM2–7.
MCM helicase is loaded in an inactive form (double hexamer) around dsDNA.

Why necessary

Prevents re-replication: after an origin fires in S-phase, re-loading of MCM is blocked until the cell has completed mitosis and re-entered G1, which ensures one round of replication per cycle.
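The licensing logic can be sketched as a tiny state machine. This is a deliberately simplified model: the class `Origin` and its methods are hypothetical, and the real control involves the CDK, DDK, Cdt1 and geminin regulation described in this section.

```python
class Origin:
    """Toy model of one origin; phases are the strings 'G1', 'S', 'G2', 'M'."""

    def __init__(self) -> None:
        # True while an inactive MCM2-7 double hexamer is loaded.
        self.licensed = False

    def try_license(self, phase: str) -> bool:
        """MCM loading succeeds only in G1."""
        if phase != "G1":
            return False
        self.licensed = True
        return True

    def try_fire(self, phase: str) -> bool:
        """Firing needs S phase and a licence, and uses the licence up."""
        if phase == "S" and self.licensed:
            self.licensed = False
            return True
        return False

origin = Origin()
events = [
    origin.try_license("G1"),  # licensed in G1
    origin.try_fire("S"),      # fires once in S
    origin.try_license("S"),   # re-licensing is refused outside G1
    origin.try_fire("S"),      # cannot fire again without a licence
]
print(events)  # [True, True, False, False]
```

Separating the loading window (G1) from the firing window (S) is what makes a second firing impossible within the same cycle.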

Regulation

Cdc6 and Cdt1 availability and phosphorylation controlled by CDKs and ubiquitin-mediated degradation.
Geminin (in metazoans) inhibits Cdt1 to prevent re-loading during S/G2.

C. Pre-initiation complex (pre-IC)

Activation step

In early S-phase, S-CDK and DDK phosphorylate components to convert pre-RC to active replication forks.
Cdc45 and GINS join MCM to make CMG, the active helicase that unwinds DNA.
Single-stranded region forms; RPA binds ssDNA; primase makes primers.
Polymerases, clamps, and other replisome factors assemble.

Note: Many proteins (Sld2, Sld3, Dpb11 in yeast; Treslin/TICRR in metazoans) mediate interactions and are regulated by phosphorylation.

Elongation

Replication fork

Fork is a Y-shaped structure with two single-stranded templates and two nascent strands.
Two replication forks move away bidirectionally from each origin.

Leading strand

Synthesized continuously in 5′→3′ as the helicase exposes new template.
In eukaryotes: mainly Pol ε; in bacteria: Pol III.

Lagging strand

Synthesized discontinuously as Okazaki fragments (short stretches).
Each fragment requires:
1. Primase synthesizes RNA primer.
2. Polymerase extends until hitting previous fragment.
3. Primer removal and replacement with DNA.
4. DNA ligase seals the nick.
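The four-step cycle above repeats once per fragment, so the number of primers and ligation events follows from the fragment size. In this sketch the template length and the 150 nt fragment length are assumed values, and `okazaki_fragments` is a hypothetical helper.

```python
def okazaki_fragments(template_len: int, fragment_len: int) -> list[tuple[int, int]]:
    """Split a lagging-strand template into (start, end) fragment intervals."""
    return [(start, min(start + fragment_len, template_len))
            for start in range(0, template_len, fragment_len)]

fragments = okazaki_fragments(template_len=1000, fragment_len=150)
primers_needed = len(fragments)      # one RNA primer per fragment
nicks_to_seal = len(fragments) - 1   # ligase joins each pair of neighbours

print(fragments[-1])                  # (900, 1000)
print(primers_needed, nicks_to_seal)  # 7 6
```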

Processing Okazaki fragments (eukaryotes)

RNase H2 removes RNA portion of primers.
FEN1 (flap endonuclease 1) removes displaced flaps during strand-displacement synthesis.
DNA Pol δ performs strand-displacement synthesis.
DNA Ligase I joins fragments.

Lagging strand “trombone” model

The lagging template loops so that polymerases moving in the same physical direction can synthesize both strands. Each Okazaki fragment cycle: primer → extension → flap removal → ligation → loop release.

Chromatin reassembly

Nucleosomes are disrupted ahead of fork and reassembled behind using old histones + new histone deposition by chaperones (CAF-1, Asf1) to restore chromatin.

Accessory factors

PCNA (clamp) tethers polymerases.
RFC (replication factor C) loads PCNA using ATP.
RPA (replication protein A) binds ssDNA.
Topoisomerase I/II relieve torsional strain; in bacteria, gyrase (a type II topo) introduces negative supercoils.

Replication fork dynamics

Coordination

Both strands synthesized simultaneously though directionality differs — replisome coordinates leading/lagging polymerases.
Continuous recruitment of primase to produce primers on lagging strand at intervals.

Stalling and restart

Fork may stall at DNA damage, tightly bound proteins, or secondary structures.
Cells stabilize stalled forks using checkpoint proteins (ATR/Chk1 pathway in eukaryotes) and fork-protection proteins to prevent collapse.
Fork collapse can cause double-strand breaks; restart often uses homologous recombination (Rad51 in eukaryotes, RecA in bacteria).

Translesion synthesis (TLS)

When replication encounters a lesion that blocks high-fidelity polymerases, TLS polymerases insert bases opposite lesions (error-prone) to allow replication to continue; PCNA ubiquitination helps recruit TLS polymerases.

DNA replication proteins

Helicase

Unwinds dsDNA into ssDNA. Eukaryotes use MCM2–7, bacteria use DnaB.

Primase

Makes short RNA primers (bacterial DnaG; in eukaryotes, primase is part of Pol α complex).

DNA polymerases

Main enzymes for DNA synthesis (Pol III in bacteria; Pol δ/ε in eukaryotes).

Sliding clamp

β-clamp (bacteria) or PCNA (eukaryotes) — tethers polymerase to DNA increasing processivity.

Clamp loader

Loads sliding clamp onto DNA (γ complex in bacteria; RFC in eukaryotes).

Single-strand binding proteins

SSB in bacteria; RPA in eukaryotes. Prevent ssDNA secondary structure and protect from nucleases.

Topoisomerases

Relieve supercoils ahead of fork: Type I (single-strand cuts) and Type II (double-strand cuts — e.g., gyrase in bacteria, Topo II in eukaryotes).

DNA ligase

Seals nicks between Okazaki fragments or at repair sites (LigA in bacteria; Ligase I in eukaryotes).

RNase H & FEN1

Remove RNA primers and process DNA flaps.

Telomerase

Reverse-transcriptase that extends telomeres using its own RNA template — solves end-replication problem in eukaryotes.

ORC, Cdc6, Cdt1

Origin recognition and licensing proteins in eukaryotes.

CMG complex

Active helicase complex (Cdc45–MCM–GINS) in eukaryotes.

Checkpoint proteins

ATR/ATM, Chk1/Chk2 — detect replication stress / DNA damage and delay cell cycle.

Repair proteins

Mismatch repair (MutS/MutL in bacteria; MSH/MLH in eukaryotes), base excision repair, nucleotide excision repair components.

Replication machinery

What is the replisome?

A multi-protein complex assembled at each replication fork to carry out coordinated replication of both strands.

Core components

Active CMG helicase (eukaryotes) / DnaB (bacteria).
DNA polymerases for leading & lagging strands.
Sliding clamps and clamp loaders.
Primase and SSB/RPA.
Additional factors: topoisomerases, ligase, nucleosome chaperones (eukaryotes).

How it works

Helicase unwinds; primase lays primer; clamp loader places PCNA/β-clamp; polymerase binds and extends; lagging strand loops and cycles to generate Okazaki fragments.

Coordination

Physical interactions or scaffolding proteins coordinate components so the fork moves as a single machine.

Termination

Bacterial termination

Circular chromosome replicates until forks meet roughly opposite OriC.
Ter sites bound by Tus protein can block fork progression in a polar manner to help coordinate termination.
After replication, the daughter circular chromosomes can be catenated (interlinked); topoisomerase IV, a bacterial type II topoisomerase, decatenates them.

Eukaryotic termination

Termination occurs when two forks converge.
Replication intermediates are resolved by nucleases and topoisomerases; replisome components are disassembled.
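End-replication problem: on the lagging strand, removing the final RNA primer leaves a gap at the 5′ end of the new strand that cannot be filled (there is no upstream 3′-OH to extend), so the extreme 3′ end of the template is not copied and telomeres shorten with each division.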
Telomerase elongates telomeric repeats, adding TTAGGG (in human) repeats using its RNA template; active in germline, stem cells, many cancer cells (absent in most somatic cells).
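The balance between shortening and telomerase extension can be sketched numerically. All numbers below (starting length, loss per division, critical length) are invented for illustration, and `divisions_until_critical` is a hypothetical helper, not a model of real telomere dynamics.

```python
def divisions_until_critical(start_bp, loss_per_division, critical_bp,
                             gain_per_division=0, max_divisions=1000):
    """Count divisions until telomere length falls to critical_bp.

    Returns None if the threshold is not reached within max_divisions.
    """
    length = start_bp
    for division in range(1, max_divisions + 1):
        length += gain_per_division - loss_per_division
        if length <= critical_bp:
            return division
    return None

# Without telomerase: steady loss until the critical length is reached.
print(divisions_until_critical(10_000, 100, 5_000))       # 50
# With telomerase adding back what is lost: length is maintained.
print(divisions_until_critical(10_000, 100, 5_000, 100))  # None

print("TTAGGG" * 3)  # three human telomeric repeats: TTAGGGTTAGGGTTAGGG
```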

Telomere protection

Telomeres are bound by shelterin complex to protect ends and regulate telomerase.

Regulation of DNA replication

Once-per-cycle control

Licensing in G1 (loading of MCM) and activation in S-phase ensure each origin fires once.
CDKs (Cyclin-dependent kinases) and DDK regulate activation and prevent re-licensing (phosphorylation + degradation).
Geminin inhibits Cdt1 (in metazoans), preventing new MCM loading during S/G2.

S-phase checkpoint

ATR-ATRIP senses RPA-coated ssDNA (stalled forks).
Activates Chk1, which:
- Stabilizes stalled forks.
- Inhibits new origin firing if the cell is under replication stress.
- Pauses cell cycle progression (via Cdc25 inhibition).

Replication timing and origin efficiency

Not all origins fire simultaneously. Origins have different firing times (early vs late S-phase).
Origin choice influenced by chromatin structure, transcriptional activity, and cell type.

dNTP pools & metabolic control

Balanced dNTP pools (via ribonucleotide reductase) are essential for fidelity.

Eukaryotic replication

Multiple origins

Eukaryotic chromosomes are large; multiple origins allow faster complete replication.
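A back-of-the-envelope calculation shows why origin number matters. The chromosome length and fork speed below are assumed round figures, and the model is idealised: evenly spaced origins that all fire at once, each sending out two forks at a constant rate.

```python
def replication_time_s(length_bp: float, fork_rate_bp_s: float, origins: int) -> float:
    """Idealised time in seconds to copy a chromosome of length_bp."""
    forks = 2 * origins  # replication is bidirectional from each origin
    return length_bp / (forks * fork_rate_bp_s)

chromosome_bp = 100_000_000  # assumed chromosome length
fork_rate = 50               # assumed fork speed in bp per second

one_origin = replication_time_s(chromosome_bp, fork_rate, origins=1)
many_origins = replication_time_s(chromosome_bp, fork_rate, origins=1000)
print(f"1 origin:     {one_origin / 3600:.0f} h")     # 278 h
print(f"1000 origins: {many_origins / 60:.0f} min")   # 17 min
```

Because real origins fire at different times in S-phase, actual replication takes longer than this lower bound suggests.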

Chromatin environment

DNA is packaged into nucleosomes. Replication requires:
- Disassembly of nucleosomes before fork passage.
- Reassembly behind fork using old histones + deposition of new histones (CAF-1, Asf1).

Replication foci (factories)

Replication occurs at discrete nuclear sites where multiple active replisomes cluster — visible as replication foci.
Each focus replicates a local region of chromatin; foci dynamics change across S-phase.

Telomeres & heterochromatin

Telomeric and heterochromatic regions replicate late; require specialized handling.

Mitochondrial DNA replication

Uses different machinery (Pol γ, distinct origins, different regulation).

Replication focus

Definition

Replication focus (factory) = nuclear site where several replication forks are active simultaneously.

Properties

Observed with fluorescent markers: BrdU/EdU incorporation or PCNA-GFP.
Each focus contains many molecules of replication proteins working together — efficient local replication.

Functional significance

Spatial organization speeds replication and coordinates chromatin reassembly and repair.

Bacterial replication

OriC in E. coli

OriC contains multiple DnaA boxes (DnaA-binding sites) and an AT-rich DUE (easier to open).
DnaA-ATP oligomerizes to open the origin and recruit DnaB helicase.

DnaB and DnaC

DnaC helps load DnaB onto ssDNA; DnaB unwinds DNA and recruits primase.

DNA Pol III holoenzyme

Multi-subunit complex with high processivity (β-clamp).
Contains multiple cores so both leading and lagging strands can be synthesized.

Regulation by methylation & SeqA

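Dam methylase methylates GATC sites, so just after replication the new strand is still unmethylated and the DNA is hemimethylated. SeqA binds hemimethylated GATC sites at OriC and temporarily prevents immediate re-initiation in rapidly growing cells.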

Termination

Tus–Ter system directs forks and helps coordinate termination; decatenation by Topo IV finishes separation.

Plasmid replication

Plasmids may use theta replication (similar to chromosomal replication) or rolling-circle replication (one strand is nicked and its free 3′-OH is extended around the circle, displacing the old strand).

Problems with DNA replication

1. Base misincorporation → mutations

Solutions: polymerase proofreading (3′→5′ exonuclease) + Mismatch repair (MMR) (MutS/MutL/MutH in bacteria; MSH/MLH in eukaryotes).

2. DNA damage blocking replication (UV, alkylation, oxidative damage)

Nucleotide Excision Repair (NER) removes bulky lesions.
Base Excision Repair (BER) removes small damaged bases.
Translesion synthesis (TLS) allows bypass but is error-prone.

3. Replication fork stalling & collapse

Forks stall at lesions or protein blocks; checkpoint proteins stabilize forks.
Collapse can cause double-strand breaks → homologous recombination (HR) proteins (Rad51/RecA) restart replication.

4. Replication–transcription collisions

If transcription machinery is on the same template region, collisions can stall forks and create R-loops (RNA:DNA hybrids); enzymes (RNase H) and helicases resolve these.

5. Secondary structures and repetitive DNA

Palindromes, G-quadruplexes, microsatellite repeats can cause polymerase slippage → expansions/contractions.
Specialized helicases (WRN, BLM) and polymerases help resolve structured DNA.
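Tandem repeats of the kind that promote slippage are easy to spot computationally. The sketch below counts the longest run of a chosen repeat unit; the example sequence and the helper name `longest_tandem_run` are made up for illustration.

```python
import re

def longest_tandem_run(seq: str, unit: str) -> int:
    """Largest number of back-to-back copies of unit found in seq."""
    pattern = f"(?:{re.escape(unit.upper())})+"
    runs = re.findall(pattern, seq.upper())
    return max((len(run) // len(unit) for run in runs), default=0)

print(longest_tandem_run("GGCAGCAGCAGCAGTT", "CAG"))  # 4
print(longest_tandem_run("ACGT", "CAG"))              # 0
```

Comparing this count between a template and its copy would reveal an expansion or contraction.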

6. End-replication problem

Telomere shortening leads to replicative senescence. Telomerase extends ends in germline/stem cells.

7. dNTP imbalance

Alters fidelity and can cause mismatches. RNR (ribonucleotide reductase) regulates dNTP synthesis.

Diseases linked to replication problems

Bloom syndrome, Werner syndrome, and certain cancers are linked to defects in replication/repair proteins.

Polymerase Chain Reaction

Goal

Amplify a specific DNA fragment exponentially in vitro.

Main components

Template DNA
Two primers (forward and reverse) flanking target
dNTPs
DNA polymerase (heat-stable, e.g., Taq)
Buffer with Mg²⁺

Thermal cycling steps

Denaturation (~95°C): separation of dsDNA to ssDNA.
Annealing (50–65°C): primers bind to complementary sequences.
Extension (72°C): DNA polymerase extends primers; new DNA formed.

These three steps are repeated for ~25–40 cycles → exponential amplification. In the ideal case each cycle doubles the target, giving roughly 2ⁿ copies per starting template after n cycles.
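The exponential growth can be checked with a one-line formula. The `efficiency` parameter is an assumption added for this sketch: 1.0 means perfect doubling, and real reactions usually fall somewhat short of that and plateau in late cycles.

```python
def pcr_copies(initial_copies: float, cycles: int, efficiency: float = 1.0) -> float:
    """Copies after `cycles` rounds; efficiency 1.0 means perfect doubling."""
    return initial_copies * (1 + efficiency) ** cycles

print(pcr_copies(1, 30))       # 1073741824.0, i.e. 2**30
print(pcr_copies(1, 30, 0.9))  # fewer copies when each cycle falls short of doubling
```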

Primer design rules

Length: 18–25 nt.
GC content: 40–60%.
Melting temperature (Tm): ~55–65°C.
Avoid secondary structure within a primer and complementarity between primers (prevents primer-dimers).
Choose primer sequences that are unique to the target.
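The numeric rules can be turned into a quick screening function. This is a rough sketch: the Tm estimate uses the simple Wallace rule (2 °C per A/T, 4 °C per G/C), which is only a crude approximation, the primer sequence is invented, and the function does not test for secondary structure, primer-dimers, or uniqueness.

```python
def check_primer(primer: str) -> dict:
    """Screen a primer against the length, GC and Tm ranges given above."""
    primer = primer.upper()
    at = primer.count("A") + primer.count("T")
    gc = primer.count("G") + primer.count("C")
    gc_percent = 100 * gc / len(primer)
    tm_estimate = 2 * at + 4 * gc  # Wallace rule, a rough estimate only
    return {
        "length_ok": 18 <= len(primer) <= 25,
        "gc_ok": 40 <= gc_percent <= 60,
        "tm_ok": 55 <= tm_estimate <= 65,
        "gc_percent": gc_percent,
        "tm_estimate": tm_estimate,
    }

result = check_primer("ATGCGTACCTGAGCTTAGCA")  # invented 20-mer
print(result["gc_percent"], result["tm_estimate"])  # 50.0 60
```

Dedicated primer-design software uses nearest-neighbour thermodynamics for Tm, so treat these numbers as a first filter only.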

Enzyme choices

Taq polymerase: heat-stable, fast, no 3′→5′ proofreading → errors possible.
High-fidelity polymerases (Pfu, Phusion): have proofreading; better for cloning or sequence-accurate work.

Variations

qPCR (real-time PCR): monitors amplification in real time using fluorescent dyes (SYBR Green) or probes (TaqMan) — allows quantification.
RT-PCR: reverse transcription of RNA to cDNA, then PCR — used for gene expression.
Multiplex PCR: multiple primer pairs amplify several targets in one reaction.
Touchdown PCR: the annealing temperature starts above the primers' estimated Tm and is lowered stepwise over the early cycles to increase specificity.
Hot-start PCR: polymerase activated only at higher temps to reduce nonspecific amplification.
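As an example of how a touchdown programme might be laid out, the sketch below generates an annealing temperature for each cycle. The start and final temperatures, the 1 °C step, and the function name `touchdown_schedule` are assumptions for illustration; actual settings depend on the primers and the thermocycler.

```python
def touchdown_schedule(start_temp: float, final_temp: float,
                       step: float, total_cycles: int) -> list[float]:
    """Annealing temperature per cycle: step down, then hold at final_temp."""
    temps = []
    temp = start_temp
    for _ in range(total_cycles):
        temps.append(temp)
        temp = max(final_temp, temp - step)  # never drop below the final value
    return temps

schedule = touchdown_schedule(start_temp=65.0, final_temp=55.0,
                              step=1.0, total_cycles=15)
print(schedule[:3])   # [65.0, 64.0, 63.0]
print(schedule[-3:])  # [55.0, 55.0, 55.0]
```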

Applications

Diagnostics, cloning, genotyping, forensic analysis, expression analysis.
