Sunday, July 5, 2026
SAVED POSTS
  • Login
  • Register
RathBiotaClan
No Result
View All Result
  • HOME
  • HEALTH SCIENCE

    TRENDING ON HEALTH (TOP)

    Fick Method Underestimates Heart Problems in Children After Heart Transplant, Study Finds

    For Women on Antidepressants, Creatine Showed a Possible Extra Boost

    Did the iPhone Quietly Reshape When and Whether Americans Have Children?

    For People Antidepressants Never Helped, a 30-Minute Home Session Is Now FDA-Approved

    NOW ON AIR (RBC)

    Choices for Gap Weights: Constant, Affine & Arbitrary Gap Weights in Sequence Alignment
    BIOINFORMATICS

    Choices for Gap Weights: Constant, Affine & Arbitrary Gap Weights in Sequence Alignment

    July 4, 2026
    BIOINFORMATICS

    Gaps in Sequence Alignment and Their Role in cDNA Matching

    July 4, 2026
    Local Alignment: Finding Substrings of High Similarity
    BIOINFORMATICS

    Local Alignment: Finding Substrings of High Similarity | Notes

    July 4, 2026
    BIOINFORMATICS

    Weighted Edit Distance Explained: Operation-Weight & Alphabet-Weight (Notes)

    July 4, 2026
  • NEUROSCIENCE
    • PHYSIOLOGY
    • IMMUNOLOGY
    • CANCER
  • DISCOVERIES
    • SPOTLIGHTS
    • STUDENT PORTAL
    • SCIENCE FEATURED
  • MOLECULAR BIOLOGY
    • GENETICS
    • BIOTECHNOLOGY
    • BIOINFORMATICS
    • BIOCHEMISTRY
    • BIOPHYSICS
  • ZOOLOGY & ECOLOGY
    • ENVIRONMENTAL SCIENCE
    • ECOLOGY
    • EVOLUTION
  • MICRO & PLANT SCIENCE
    • MICROBIOLOGY
    • CELL BIOLOGY
    • DEVELOPMENTAL BIOLOGY
  • PSYCHOLOGY
RathBiotaClan
RathBiotaClan
No Result
View All Result
Home BIOINFORMATICS

Choices for Gap Weights: Constant, Affine & Arbitrary Gap Weights in Sequence Alignment

Shibasis Rath by Shibasis Rath
July 4, 2026
in BIOINFORMATICS, STUDENT PORTAL
Reading Time: 9 mins read
0
A A
0
Choices for Gap Weights: Constant, Affine & Arbitrary Gap Weights in Sequence Alignment

In alignment problems, the use of gaps in the objective function helps in finding alignments that satisfy an expected shape, as seen in examples like cDNA matching. However, the effectiveness of the gap concept depends critically on how gaps are weighted. There are four general types of gap weights: constant, affine, convex, and arbitrary.

1. Constant Gap Weight

This is the simplest choice of gap weight. Here, each individual space is free, and each gap (regardless of the number of spaces in it) is given a fixed weight of Wg.

If Wm and Wms denote the weights for matches and mismatches respectively, the operator-weight version of the problem is stated as:

Find an alignment A to maximize [Wm(# matches) − Wms(# mismatches) − Wg(# gaps)].

ADVERTISEMENT

If alphabet-dependent weights are used instead (a similarity function for matches/mismatches), the objective becomes maximizing the sum of similarity scores s(x, y) over aligned characters, minus Wg times the number of gaps, where s(x, −) = s(−, x) = 0 for every character x.

READ ALSO

Gaps in Sequence Alignment and Their Role in cDNA Matching

Local Alignment: Finding Substrings of High Similarity | Notes

2. Affine Gap Weight

This is a generalization of the constant gap model, obtained by adding a weight Ws for each space within the gap.

ADVERTISEMENT
  • Wg is called the gap initiation weight — it represents the cost of starting a gap.
  • Ws is called the gap extension weight — it represents the cost of extending the gap by one space.

The operator-weight version of the objective becomes:

ADVERTISEMENT

Find an alignment to maximize [Wm(# matches) − Wms(# mismatches) − Wg(# gaps) − Ws(# spaces)].

This model is called the affine gap weight model because the weight contributed by a single gap of length q is given by the affine function:

Wg + qWs

The constant gap weight model is simply the affine model with Ws = 0. (The affine model is sometimes also called the linear weight model.)

The alphabet-weight version again sets s(x, −) = s(−, x) = 0, with the objective of maximizing the sum of similarity scores minus Wg(# gaps) minus Ws(# spaces).

The affine gap weight model is the most commonly used gap model in molecular biology literature, although there is considerable disagreement about the correct values of Wg and Ws (and also Wm and Wms). For example, the widely used search program FASTA uses default settings of Wg = 10 and Ws = 2 for aligning amino acid strings.

3. Convex Gap Weight

It has been suggested that certain biological phenomena are better modeled by a gap weight function where each additional space in a gap contributes less to the total weight than the preceding space — that is, a function with a negative second derivative. Such a function is called convex (some call it concave), as opposed to affine.

An example of such a function is:

Wg + logₑ q

where q is the length of the gap. Some biologists have further suggested that a gap function which initially increases to a maximum value and then decreases to near zero would best reflect a combination of different biological phenomena responsible for insertion or deletion of DNA.

4. Arbitrary Gap Weight

This is the most general gap weight model, where the weight of a gap is an arbitrary function w(q) of its length q. The constant, affine, and convex weight models are all special cases of this arbitrary weight model.

Time Bounds for Gap Choices

The time needed to solve the alignment problem optimally depends on the type of gap weight chosen:

  • If w(q) is a totally arbitrary function of gap length, the optimal alignment can be found in O(nm² + n²m) time, where n and m (m ≥ n) are the lengths of the two strings.
  • If w(q) is convex, the time can be reduced to O(nm log m).
  • If w(q) is affine (or constant), the time bound reduces to O(nm) — the same time bound as for alignment without any gap concept at all.

Recurrences for Arbitrary Gap Weights

To align strings S1 and S2, prefixes S1[1..i] and S2[1..j] are considered. Any alignment of these two prefixes falls into one of three types:

  1. Alignments where S1(i) is aligned to a character strictly to the left of S2(j) — the alignment ends with a gap in S1.
  2. Alignments where S1(i) is aligned strictly to the right of S2(j) — the alignment ends with a gap in S2.
  3. Alignments where S1(i) and S2(j) are aligned directly opposite each other (whether matching or mismatching).

These three cases cover all possibilities. Accordingly:

  • E(i, j) = maximum value of any type 1 alignment
  • F(i, j) = maximum value of any type 2 alignment
  • G(i, j) = maximum value of any type 3 alignment
  • V(i, j) = maximum of E(i, j), F(i, j), G(i, j)

The recurrences are:

V(i, j) = max[E(i, j), F(i, j), G(i, j)]

G(i, j) = V(i−1, j−1) + s(S1(i), S2(j))

E(i, j) = max₀≤k<j [V(i, k) − w(j − k)]

F(i, j) = max₀≤k<i [V(k, j) − w(i − k)]

Base cases — if all spaces (including end spaces) are counted in the objective function, the optimal value is found in cell (n, m), with:

V(i, 0) = −w(i), V(0, j) = −w(j) E(i, 0) = −w(i), F(0, j) = −w(j)

Here G(0,0) = 0, but G(i, j) is undefined when exactly one of i or j is zero, and V(0,0) is naturally taken as zero.

When end gaps are free, the optimal alignment value is the maximum value found in row n or column m, with base case V(0, j) = V(i, 0) = 0.

Time Analysis: Filling the (n+1) × (m+1) table row by row, evaluating G(i, j) requires examining one cell, evaluating E(i, j) requires examining j cells of row i, and evaluating F(i, j) requires examining i cells of column j. This gives O(m²) work per row for E-values and O(n²) work per column for F-values, leading to an overall time bound of O(nm² + n²m).

This increase over the gap-free case (O(nm)) occurs because V(i, j) now depends on looking j cells to the left and i cells above, rather than just the three adjacent cells.

Recurrences for Affine (and Constant) Gap Weights

The key insight for greater efficiency in the affine case is that the increase in gap weight contributed by each additional space is a constant Ws, independent of the gap’s length so far — that is, w(q+1) − w(q) = Ws for any gap length q > 0. This differs from the arbitrary case, where no such fixed relationship exists between w(q) and w(q+1).

Because of this, when evaluating E(i, j) or F(i, j), it is not necessary to track exactly where the gap began — only whether a gap has already started or a new one is being initiated.

Base cases — when end gaps are included:

V(i, 0) = E(i, 0) = −Wg − iWs

V(0, j) = F(0, j) = −Wg − jWs

When end gaps are free: V(i, 0) = V(0, j) = 0.

General recurrences:

V(i, j) = max[E(i, j), F(i, j), G(i, j)]

G(i, j) = V(i−1, j−1) + Wm (score depending on match/mismatch)

E(i, j) = max[E(i, j−1), V(i, j−1) − Wg] − Ws

F(i, j) = max[F(i−1, j), V(i−1, j) − Wg] − Ws

Explanation of E(i, j): Since S1(i) is aligned to the left of S2(j), either:

  • S1(i) is exactly one place to the left of S2(j), so a new gap begins opposite S2(j), giving E(i, j) = V(i, j−1) − Wg − Ws, or
  • S1(i) is to the left of S2(j−1), so the same gap already in progress is now opposite S2(j) as well, giving E(i, j) = E(i, j−1) − Ws.

F(i, j) is explained analogously, and G(i, j) simply represents aligning S1(i) directly opposite S2(j).

The optimal alignment value is found in cell (n, m) if end spaces are counted; otherwise it is the maximum value in row n or column m.

Note: It may be asked why V(i, j−1), and not G(i, j−1), is used in the recurrence for E(i, j). Using G(i, j−1) − Wg would miss alignments having a gap in S2 ending opposite character i of S1, followed immediately by a gap in S1. Although the fully expanded recurrence — max[E(i, j−1), G(i, j−1) − Wg, V(i, j−1) − Wg] − Ws — would also be correct, the middle term is redundant since it is already covered by V(i, j−1) − Wg.

Time Analysis: Since V(i, j), E(i, j), F(i, j), and G(i, j) at each cell are each evaluated using only a constant number of references to previously computed values, arithmetic operations, and comparisons, the entire (n+1) × (m+1) table can be filled in O(nm) time — the same as the gap-free alignment model.

Conclusion

Thus, gap weighting in alignment problems ranges from the simplest constant model to the most general arbitrary model, with affine weights forming the standard choice in practice due to their balance between biological interpretability and computational efficiency. While arbitrary and convex gap weights allow more flexible modeling of biological insertion/deletion phenomena, they come at increased computational cost, whereas the affine (and constant) model achieves the same O(nm) time bound as alignment without gaps at all.

  • Share on WhatsApp (Opens in new window) WhatsApp
  • Share on Facebook (Opens in new window) Facebook
  • Share on Reddit (Opens in new window) Reddit
  • Share on X (Opens in new window) X
  • Print (Opens in new window) Print
Shibasis Rath

Shibasis Rath

"𝓒𝓸𝓷𝓷𝓮𝓬𝓽𝓲𝓷𝓰 𝓡𝓮𝓼𝓮𝓪𝓻𝓬𝓱 𝓣𝓸 𝓡𝓮𝓪𝓵𝓲𝓽𝔂" 𝓲𝓼𝓷'𝓽 𝓙𝓾𝓼𝓽 𝓪 𝓜𝓸𝓽𝓽𝓸 - 𝓘𝓽'𝓼 𝓜𝔂 𝓜𝓲𝓼𝓼𝓲𝓸𝓷

Related Posts

Gaps in Sequence Alignment and Their Role in cDNA Matching
BIOINFORMATICS

Gaps in Sequence Alignment and Their Role in cDNA Matching

July 4, 2026
Local Alignment: Finding Substrings of High Similarity
BIOINFORMATICS

Local Alignment: Finding Substrings of High Similarity | Notes

July 4, 2026
Weighted Edit Distance Explained: Operation-Weight & Alphabet-Weight (Notes)
BIOINFORMATICS

Weighted Edit Distance Explained: Operation-Weight & Alphabet-Weight (Notes)

July 4, 2026

Leave a Reply Cancel reply

Your email address will not be published. Required fields are marked *

I agree to the Terms & Conditions and Privacy Policy.

POPULAR NEWS

Chewing gum releases thousands of microplastic particles directly into your mouth with every piece you chew

Chewing gum releases thousands of microplastic particles directly into your mouth with every piece you chew

by Shibasis Rath
May 8, 2026
0

Microplastics are turning up in places researchers never expected: deep-sea sediments, Arctic ice, and human blood. Now, a UCLA pilot...

woman in white tank top lying on bed

New Studys Says Gen Z is the least sexually active young cohort in modern recorded history

by Shibasis Rath
January 24, 2026
0

A generation that grew up with dating apps in their pockets, pornography a tap away, and sex discussed more openly...

grayscale photo of girl in polka dot long sleeve shirt

Yelling Isn’t Just Yelling: How a Hostile Home Rewires a Child’s Brain for Constant Alert

by Shibasis Rath
March 8, 2026
0

To a parent in the heat of the moment, a raised voice may feel like simple frustration. To a child...

a group of gen Z kids walking down a street

Is Gen Z the First Generation Less Intelligent Than Their Parents?

by Shibasis Rath
February 5, 2026
0

Gen Z intelligence decline is emerging as a serious concern among neuroscientists and education researchers. For over a century, each...

72-Hour Fasting Can Reset Your Entire Immune System, USC Study Shows

72-Hour Fasting Can Reset Your Entire Immune System, USC Study Shows

by Shibasis Rath
February 28, 2026
0

A 72-hour fast can trigger a powerful immune system reset. Scientists call this stem cell regeneration. The process clears old...

EDITOR CHOICE‘S

  • All
  • NEWS
  • SPOTLIGHTS
Choices for Gap Weights: Constant, Affine & Arbitrary Gap Weights in Sequence Alignment

Choices for Gap Weights: Constant, Affine & Arbitrary Gap Weights in Sequence Alignment

by Shibasis Rath
July 4, 2026
0

In alignment problems, the use of gaps in the objective function helps in finding alignments that satisfy an expected shape,...

Gaps in Sequence Alignment and Their Role in cDNA Matching

Gaps in Sequence Alignment and Their Role in cDNA Matching

by Shibasis Rath
July 4, 2026
0

In the study of string alignment and dynamic programming, the basic elements used to evaluate an alignment are matches, mismatches,...

Local Alignment: Finding Substrings of High Similarity

Local Alignment: Finding Substrings of High Similarity | Notes

by Shibasis Rath
July 4, 2026
0

Sequence comparison is one of the fundamental problems in computational biology and string algorithms. While global alignment compares two strings...

Weighted Edit Distance Explained: Operation-Weight & Alphabet-Weight (Notes)

Weighted Edit Distance Explained: Operation-Weight & Alphabet-Weight (Notes)

by Shibasis Rath
July 4, 2026
0

Edit distance measures how many insertions, deletions, and substitutions are required to transform one string into another. The basic edit...

ADVERTISEMENT

RathBiotaClan – RBC

RathBiotaClan – Connecting Research To Reality

Your trusted source for life science news, biology research & discoveries. Covering neuroscience, genetics, ecology, and more — connecting research to reality.

About Us

Privacy Policies

Contact Us

Editorial Standard

Latest Posts

  • Choices for Gap Weights: Constant, Affine & Arbitrary Gap Weights in Sequence Alignment
  • Gaps in Sequence Alignment and Their Role in cDNA Matching
  • Local Alignment: Finding Substrings of High Similarity | Notes
  • Weighted Edit Distance Explained: Operation-Weight & Alphabet-Weight (Notes)

SHIBASIS RATH

Contact Mail

rathbiotaclan@gmail.com

No Result
View All Result
MSME (Udyam) Certified Science Platform
Govt. of India

Get Us On PlayStore

playstore app for rathbiotaclan
  • About Us
  • Advertise With Us
  • Cancellation and Refund Policy
  • Contact Us
  • Contribute
  • Editorial Standards
  • Home
  • Pricing Details
  • Privacy Policies
  • Shipping Policy
  • Terms & Conditions

© 2026 RathBiotaClan. All rights reserved.

Welcome Back!

Sign In with Google
OR

Login to your account below

Forgotten Password? Sign Up

Create New Account!

Sign Up with Google
OR

Fill the forms below to register

*By registering into our website, you agree to the Terms & Conditions and Privacy Policy.
All fields are required. Log In

Retrieve your password

Please enter your username or email address to reset your password.

Log In

Add New Playlist

No Result
View All Result
  • HOME
  • HEALTH SCIENCE
  • NEUROSCIENCE
    • PHYSIOLOGY
    • IMMUNOLOGY
    • CANCER
  • DISCOVERIES
    • SPOTLIGHTS
    • STUDENT PORTAL
    • SCIENCE FEATURED
  • MOLECULAR BIOLOGY
    • GENETICS
    • BIOTECHNOLOGY
    • BIOINFORMATICS
    • BIOCHEMISTRY
    • BIOPHYSICS
  • ZOOLOGY & ECOLOGY
    • ENVIRONMENTAL SCIENCE
    • ECOLOGY
    • EVOLUTION
  • MICRO & PLANT SCIENCE
    • MICROBIOLOGY
    • CELL BIOLOGY
    • DEVELOPMENTAL BIOLOGY
  • PSYCHOLOGY
  • Login
  • Sign Up
SAVED POSTS

© 2026 RathBiotaClan. All rights reserved.

This website uses cookies. By continuing to use this website you are giving consent to cookies being used. Visit our Privacy and Cookie Policy.