Current Bioinformatics

ISSN (Print): 1574-8936
ISSN (Online): 2212-392X

Research Article

Validating the Distinctiveness of the Omicron Lineage within the SARS-CoV-2 based on Protein Language Models

In Press, (this is not the final "Version of Record"). Available online 27 April, 2024
Author(s): Ke Dong and Jingyang Gao*
Published on: 27 April, 2024

DOI: 10.2174/0115748936291075240409080924

Abstract

Introduction: Five variants of concern (VOCs) of severe acute respiratory syndrome coronavirus 2 (SARS-CoV-2) have been identified: Alpha, Beta, Gamma, Delta, and Omicron. This study uses a protein language model to examine the mutations of the Omicron lineage and how they differ from those of the other lineages.

Methods: The SARS-CoV-2 wild-type sequence was given as input to the protein language model ESM-1v (Evolutionary Scale Modeling-1v) to obtain a score for every possible amino acid substitution at each position. These scores were then used to calculate the overall trend of the mutation scores for each newly emerging variant of concern.
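
The scoring step described in the Methods can be illustrated with a minimal sketch. It assumes the model returns one logit per amino acid at each sequence position, and it scores a substitution as the log-probability of the mutant residue minus that of the wild-type residue, a log-odds convention common for masked protein language models. The abstract does not state the exact formula, so this convention, the function names, the toy three-residue sequence, and the toy logits are all illustrative assumptions rather than the authors' implementation.

```python
import math

AMINO_ACIDS = "ACDEFGHIKLMNPQRSTVWY"  # the 20 standard residues


def log_softmax(logits):
    """Numerically stable log-softmax over one position's logits."""
    m = max(logits)
    log_z = m + math.log(sum(math.exp(x - m) for x in logits))
    return [x - log_z for x in logits]


def score_table(wild_type, logits_per_position):
    """Score every substitution at every position.

    Assumed convention: score = log p(mutant) - log p(wild type).
    Returns one dict per position, mapping amino acid -> score.
    """
    table = []
    for wt_aa, logits in zip(wild_type, logits_per_position):
        log_probs = dict(zip(AMINO_ACIDS, log_softmax(logits)))
        table.append({aa: log_probs[aa] - log_probs[wt_aa] for aa in AMINO_ACIDS})
    return table


def parse_mutation(mutation):
    """Split a label such as 'N501Y' into ('N', 501, 'Y')."""
    return mutation[0], int(mutation[1:-1]), mutation[-1]


def mean_variant_score(table, wild_type, mutations):
    """Average the scores of a variant's substitutions (1-based positions)."""
    scores = []
    for label in mutations:
        wt_aa, pos, mut_aa = parse_mutation(label)
        if wild_type[pos - 1] != wt_aa:
            raise ValueError(f"{label} does not match the wild-type sequence")
        scores.append(table[pos - 1][mut_aa])
    return sum(scores) / len(scores)


def toy_row(wt_aa, favoured):
    """Hypothetical logits: wild type 2.0, one favoured residue 1.0, rest 0.0."""
    return [2.0 if aa == wt_aa else 1.0 if aa == favoured else 0.0
            for aa in AMINO_ACIDS]


# Toy stand-ins for the real sequence and real model output (not real data).
toy_wild_type = "MFV"
toy_logits = [toy_row("M", "L"), toy_row("F", "L"), toy_row("V", "I")]

table = score_table(toy_wild_type, toy_logits)
print(mean_variant_score(table, toy_wild_type, ["F2L", "V3A"]))
```

Under this convention the wild-type residue always scores 0 and a more negative score marks a substitution the model considers less likely. In practice the toy logits would be replaced by the model's output for the full wild-type sequence, and a per-variant average like the one above is one simple way to summarize an overall trend.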

Results: When the ratio of unobserved to observed mutations was 4:15, Omicron still generated a large number of newly emerging mutations. Both the overall mutation score and the overall ranking of the Omicron family were low.

Conclusion: Amino acid mutations in the Omicron lineage differ from those in other lineages. The findings of this paper deepen the understanding of the spatial distribution of spike protein amino acid mutations and of the overall trends of newly emerging mutations across the different variants of concern. They also provide insights for simulating the evolution of the Omicron lineage.


© 2025 Bentham Science Publishers | Privacy Policy