Wednesday, July 1, 2026

Building Contigs from Cloned Genome Fragments: Coverage, Assembly & Statistical Analysis

A detailed note on building contigs from cloned genome fragments, covering the calculation of the number of clones required, the concept of coverage, the construction of restriction maps and contigs, and the statistical analysis of progress in contig assembly.

by Shibasis Rath
July 1, 2026
in BIOINFORMATICS, STUDENT PORTAL

The analysis of large genomes usually begins by breaking them into smaller pieces because the genomes of free-living (non-viral) organisms are much larger than the insert sizes that most cloning vectors can accept. Microbial genomes are generally larger than 0.5 × 10⁶ bp and mammalian genomes exceed 10⁹ bp, while cloning vectors such as lambda or cosmid vectors carry inserts of the order of 10⁴ bp and BAC or YAC vectors carry 10⁵ to 10⁶ bp. The genome is therefore represented by a genomic library, which is a collection of clones containing the genome in the form of overlapping inserts in a specified vector. The process of assembling these overlapping clones into continuous segments is called building contigs.

Determination of Number of Clones Required and Concept of Coverage

The number of clones (N) required in a genomic library depends on three parameters: genome length G (in bp), insert length L (in bp), and the probability f_c that any chosen base pair is represented in the library.

The probability that a particular base is recovered in one clone is L/G. Therefore, the probability that it is not covered by one clone is 1 – L/G. After N independent clones, the probability that a base is not covered is (1 – L/G)^N. Hence:

1 – f_c = (1 – L/G)^N

Solving for N (after taking logarithm on both sides):

N = log(1 – f_c) / log(1 – L/G)

Figure: Coverage of a genome of length G by a genomic library containing clones with insert length L. Some genomic regions (shaded) are represented multiple times, while others are represented only once or are absent from the library. Genome coverage refers to the average number of times each nucleotide position is represented by cloned inserts. The broken lines mark the boundaries of the three contigs, which are continuous, gap-free assemblies formed from overlapping clones.

Example: For E. coli genome (G = 4.6 × 10⁶ bp) using cosmid clones (L = 4 × 10⁴ bp) at f_c = 0.95, N = 343 clones. The total DNA represented is 343 × 4 × 10⁴ = 14 × 10⁶ bp, which is about three times the genome size. Some portions are represented multiple times and others not at all, but on average each position is covered approximately three times.
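The E. coli figure can be reproduced with a few lines of Python. This is a minimal sketch of the formula N = log(1 – f_c) / log(1 – L/G); the function name `clones_required` is an illustrative choice and does not come from the source.

```python
import math

def clones_required(genome_bp: float, insert_bp: float, f_c: float) -> float:
    """Clones N needed so that a given base pair is in the library with probability f_c."""
    if not 0 < f_c < 1:
        raise ValueError("f_c must lie strictly between 0 and 1")
    if not 0 < insert_bp < genome_bp:
        raise ValueError("insert length must be positive and shorter than the genome")
    # N = log(1 - f_c) / log(1 - L/G)
    return math.log(1 - f_c) / math.log(1 - insert_bp / genome_bp)

# E. coli with cosmid inserts, values taken from the example above
n = clones_required(genome_bp=4.6e6, insert_bp=4e4, f_c=0.95)
print(round(n))             # number of clones, rounded to a whole clone
print(round(n) * 4e4 / 4.6e6)  # total cloned DNA in genome equivalents
```

The result is approximately 343 clones, or roughly three genome equivalents of cloned DNA, in line with the worked example.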

Coverage (c) is the average number of times any base pair is contained in the inserts of the library and is defined as:

c = NL / G

This can be related to the probability by the Poisson approximation:

1 – f_c ≈ e^{-c} or f_c = 1 – e^{-c}
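The step from the exact expression to the exponential form can be written out as follows. This is a standard first-order approximation and assumes the insert is much shorter than the genome (L ≪ G).

```latex
\begin{align*}
1 - f_c &= \left(1 - \frac{L}{G}\right)^{N}
         = \exp\!\left[N \ln\!\left(1 - \frac{L}{G}\right)\right] \\
        &\approx \exp\!\left(-\frac{NL}{G}\right) = e^{-c},
\qquad \text{since } \ln(1 - x) \approx -x \text{ for } x = L/G \ll 1 .
\end{align*}
```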

The coverage required for different probabilities is:

Coverage (c)    Probability (f_c)
1               0.632
2               0.865
3               0.950
4               0.982
5               0.993
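The table values follow directly from f_c = 1 – e^{-c}. The short sketch below recomputes them and also inverts the relation to find the coverage needed for a target probability; both function names are illustrative assumptions, not part of the source.

```python
import math

def coverage_probability(c: float) -> float:
    """Poisson approximation: probability that a base pair is in the library at coverage c."""
    return 1.0 - math.exp(-c)

def coverage_needed(f_c: float) -> float:
    """Inverse relation: coverage c required to reach probability f_c."""
    if not 0 <= f_c < 1:
        raise ValueError("f_c must satisfy 0 <= f_c < 1")
    return -math.log(1.0 - f_c)

# Recompute the table rows for c = 1..5
for c in range(1, 6):
    print(c, f"{coverage_probability(c):.3f}")

# Coverage required for 99% probability
print(f"{coverage_needed(0.99):.2f}")
```

For f_c = 0.99 the required coverage comes out between four and five genome equivalents.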

For 99% probability, the Poisson relation gives c = –ln(0.01) ≈ 4.6, so about five genome equivalents are needed. Mammalian genomes require nearly half a million lambda clones for reasonable completeness and more for 99% coverage. Handling such libraries is labour-intensive and requires robotics and databases. At higher coverage, random clone selection becomes inefficient for closing the remaining gaps, so directed experimental strategies are adopted.

Building Restriction Maps and Contigs from Cloned Fragments

Cloning disrupts the natural ordering of DNA segments. Inserts are generated by shearing or restriction digestion and are usually size-selected before ligation (lambda vectors accept 9–20 kb, cosmids ~40 kb). The genomic positions of clones are initially unknown, so the complete restriction map is built piece by piece using the “bottom-up” approach.

Restriction maps of individual clones are determined using incomplete digestion. If two clones share some of the same restriction sites (producing restriction fragments of identical length), they overlap. The maps are joined at the overlap and extended on both sides to form a contig — a genome segment represented by two or more overlapping clones.
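The idea of detecting overlap through shared restriction fragments can be sketched in code. The example below is a deliberate simplification: it compares unordered fragment lengths only, whereas real mapping also uses the order of sites. The function names, the 3% length tolerance and the threshold of three matching fragments are all illustrative assumptions, not values from the source.

```python
def shared_fragments(frags_a, frags_b, rel_tol=0.03):
    """Count fragments of clone A that pair with a distinct fragment of clone B of similar length."""
    a = sorted(frags_a)
    b = sorted(frags_b)
    i = j = matches = 0
    # Walk both sorted lists in parallel, pairing lengths that agree within rel_tol
    while i < len(a) and j < len(b):
        if abs(a[i] - b[j]) <= rel_tol * max(a[i], b[j]):
            matches += 1
            i += 1
            j += 1
        elif a[i] < b[j]:
            i += 1
        else:
            j += 1
    return matches

def likely_overlap(frags_a, frags_b, min_matches=3, rel_tol=0.03):
    """Declare an overlap only when several fragments match (hypothetical threshold)."""
    return shared_fragments(frags_a, frags_b, rel_tol) >= min_matches

# Hypothetical fragment lengths in bp for two clones
clone_x = [1200, 3400, 5100, 800, 2600]
clone_y = [5100, 800, 2600, 4300, 1900]
print(shared_fragments(clone_x, clone_y))
print(likely_overlap(clone_x, clone_y))
```

Requiring several matches rather than one reduces the chance that a coincidental fragment length from an unrelated region is mistaken for an overlap.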

Figure: Mapping a large genome by assembling overlapping representative clones. (A) Clones X and Y are identified as overlapping because they share common restriction fragments, allowing their individual restriction maps to be merged into a single contig. (B) Cloned inserts may overlap sufficiently for the overlap to be detected (clones 1 and 2), overlap too little for reliable detection (clones 3 and 4), or exist as isolated singletons (clone 5) with no overlap information from neighboring clones. Contigs, singletons, and selected clones with undetected overlaps together form genomic islands. (C) After overlapping clones are assembled across an entire genome or genomic region, multiple clones typically cover each position (average coverage c ≈ 5). To eliminate redundancy, a set of minimally overlapping clones (shown by heavy lines) is selected while maintaining complete genome coverage. The ordered sequence of these clones (indicated by arrows) is known as the minimal tiling path.

In a large genomic library, nearly all clonable portions of the genome are represented, often many times (heterochromatin is difficult to clone). The goal is to place each clone in its correct genomic location. Where many clones overlap (e.g., at coverage c ≈ 5), redundancy is reduced by selecting a subset of minimally overlapping clones that still spans the whole genome. The ordered path through these clones is called the minimal tiling path.
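One way to picture the selection of a minimal tiling path is as an interval-cover problem. The sketch below uses a standard greedy strategy and assumes that clone positions within a single contig are already known; it is not the procedure described in the source, and the clone names and coordinates are hypothetical.

```python
def minimal_tiling_path(clones):
    """Pick a small set of clones covering one contig.

    clones: list of (name, start, end) tuples assumed to form a gap-free contig.
    Returns the clone names in left-to-right order.
    """
    if not clones:
        return []
    ordered = sorted(clones, key=lambda c: (c[1], -c[2]))
    target_end = max(c[2] for c in ordered)
    covered_to = ordered[0][1]  # left edge of the contig
    path = []
    i = 0
    while covered_to < target_end:
        best = None
        # Among clones starting at or before the covered position, take the one reaching furthest right
        while i < len(ordered) and ordered[i][1] <= covered_to:
            if best is None or ordered[i][2] > best[2]:
                best = ordered[i]
            i += 1
        if best is None or best[2] <= covered_to:
            raise ValueError("gap in coverage: clones do not form a single contig")
        path.append(best[0])
        covered_to = best[2]
    return path

# Hypothetical clone coordinates in kb
clones = [("A", 0, 40), ("B", 10, 50), ("C", 30, 75),
          ("D", 45, 90), ("E", 70, 110), ("F", 85, 120)]
print(minimal_tiling_path(clones))
```

This sketch treats clones that merely abut as connected. A practical version would require each consecutive pair to share at least the minimum detectable overlap.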

Cloned inserts may overlap sufficiently for detection, overlap insufficiently (undetectable), or exist as singletons with no neighbours. Contigs, singletons, and undetected overlaps together constitute islands.

Progress in Contig Assembly and Statistical Analysis

Contig assembly depends on recognising overlaps. Some true overlaps are too short to detect (shorter than the average restriction fragment) or are missed because of experimental error in fragment length measurement. Conversely, unrelated regions may yield fragments of similar size and suggest a false overlap. Therefore, matches among several fragments are required before an overlap is declared.

Parameters used for analysis:

  • N = number of clones
  • L = insert length
  • G = genome length
  • Ω = minimum overlap required for detection
  • θ = Ω / L (fractional overlap)
  • c = NL / G (coverage)

Overlaps are detected only if they exceed Ω = θL. The expected number of islands (Γ) is derived from the Poisson distribution of clone end positions:

Γ = (G/L) × c × e^{-(1-θ)c} = N × e^{-(1-θ)c}

The curve of number of islands versus coverage shows that the number first rises (new singletons dominate), then falls as contigs grow and merge by overlapping ends. New clones falling inside existing contigs do not increase island number. Eventually, the number approaches 1 (complete genome in one contig), but the final gap closure is slow because the probability of clones landing in small gaps decreases.
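The rise and fall of the curve can be located by differentiating the island formula with respect to coverage. The short derivation below treats c as a continuous variable with G, L and θ held fixed.

```latex
\begin{align*}
\Gamma(c) &= \frac{G}{L}\, c\, e^{-(1-\theta)c} \\
\frac{d\Gamma}{dc} &= \frac{G}{L}\, e^{-(1-\theta)c}\,\bigl[1 - (1-\theta)c\bigr] = 0
\;\Longrightarrow\; c^{*} = \frac{1}{1-\theta} \\
\Gamma(c^{*}) &= \frac{G}{L}\cdot\frac{1}{e\,(1-\theta)}
\end{align*}
```

For θ = 0.19, for example, the expected number of islands peaks at c* ≈ 1.23 and declines at higher coverage.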

The expected number of singletons is:

Number of singletons = N × e^{-2(1-θ)c}

Reality Check Example (Kohara et al., 1987): A complete restriction map of E. coli was produced using 1025 lambda clones (L = 1.55 × 10⁴ bp, G = 4.7 × 10⁶ bp). Coverage c = 3.38 and θ ≈ 0.19 (based on six consecutive sites from eight enzymes with ~32 sites per insert).

Calculated: ~66.3 islands and ~4.3 singletons. Experimental: 70 islands, of which 7 were singletons.
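The calculated values can be checked numerically from the two formulas above. This is a minimal sketch using the parameters quoted for the Kohara et al. library; the function names are illustrative assumptions.

```python
import math

def coverage(n_clones, insert_bp, genome_bp):
    """Coverage c = NL / G."""
    return n_clones * insert_bp / genome_bp

def expected_islands(n_clones, insert_bp, genome_bp, theta):
    """Expected number of islands: N * exp(-(1 - theta) * c)."""
    c = coverage(n_clones, insert_bp, genome_bp)
    return n_clones * math.exp(-(1.0 - theta) * c)

def expected_singletons(n_clones, insert_bp, genome_bp, theta):
    """Expected number of singleton clones: N * exp(-2 * (1 - theta) * c)."""
    c = coverage(n_clones, insert_bp, genome_bp)
    return n_clones * math.exp(-2.0 * (1.0 - theta) * c)

# Parameters quoted in the example: N = 1025, L = 1.55e4 bp, G = 4.7e6 bp, theta = 0.19
N, L, G, THETA = 1025, 1.55e4, 4.7e6, 0.19
print(f"c = {coverage(N, L, G):.2f}")
print(f"islands = {expected_islands(N, L, G, THETA):.1f}")
print(f"singletons = {expected_singletons(N, L, G, THETA):.1f}")
```

With these inputs the sketch gives a coverage of about 3.38, roughly 66 islands and roughly 4 singletons, consistent with the calculated figures.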

The close agreement between the predicted and observed numbers of islands supports the model. Similar early success was reported by Olson et al. (1986).

Figure: Expected number of genomic islands as a function of genome coverage for different values of the fractional overlap (θ) required to reliably detect overlaps between cloned DNA fragments. Increasing genome coverage generally reduces the number of islands by improving clone connectivity, whereas higher overlap detection thresholds (θ) increase the number of islands because fewer clone overlaps are recognized.
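The island formula can also be compared with a small Monte Carlo experiment. The sketch below is an illustration under simplifying assumptions that are not stated in the source: clone start positions are uniformly random on a linear genome, all inserts have the same length, and two neighbouring clones are joined only when they overlap by at least θL. Function names, the number of trials and the random seed are arbitrary choices.

```python
import math
import random

def count_islands(starts, insert_bp, theta):
    """Count islands given clone start positions and a fixed insert length."""
    if not starts:
        return 0
    # Largest start-to-start distance that still leaves a detectable overlap of theta * L
    max_step = (1.0 - theta) * insert_bp
    ordered = sorted(starts)
    islands = 1
    for left, right in zip(ordered, ordered[1:]):
        if right - left > max_step:
            islands += 1  # overlap too short or absent: a new island begins
    return islands

def simulate_islands(n_clones, insert_bp, genome_bp, theta, trials=200, seed=0):
    """Average island count over repeated random libraries."""
    rng = random.Random(seed)
    total = 0
    for _ in range(trials):
        starts = [rng.uniform(0, genome_bp - insert_bp) for _ in range(n_clones)]
        total += count_islands(starts, insert_bp, theta)
    return total / trials

# Compare simulation and formula for the library parameters used in the example above
N, L, G, THETA = 1025, 1.55e4, 4.7e6, 0.19
theory = N * math.exp(-(1.0 - THETA) * N * L / G)
print(f"simulated: {simulate_islands(N, L, G, THETA):.1f}")
print(f"formula:   {theory:.1f}")
```

Because inserts have equal length here, checking only neighbouring clones in sorted order is sufficient: the clone that starts immediately to the left always reaches furthest to the right.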

Conclusion

Building contigs from cloned genome fragments is a methodical process that starts with proper library construction and coverage calculation, proceeds through restriction mapping and overlap detection, and is guided by Poisson statistics to monitor assembly progress. It efficiently converts random clones into ordered contigs and a minimal tiling path, providing a complete physical map of the genome. This strategy is highly effective for large-scale genome analysis.
