Note: This article is currently in the "Article in Press" stage and is not the final "Version of record". It has been accepted, copy-edited, and formatted, but it is still undergoing proofreading and corrections by the authors, so the text may still change before final publication. Although "Articles in Press" may not have all bibliographic details available, they can still be cited using the DOI and the year of online publication. The citation should include the article title, DOI, publication year, and author(s). Once the final "Version of record" becomes available, it will replace the "Article in Press".
Abstract
Accurate prediction of breeding values is challenging because of the genotype-phenotype relationship, yet it is crucial for producing crops with elite genotypes. This paper investigates and predicts the phenotypic traits height and yield from genotype.
Background: Most existing studies focus on either genetic methods or machine learning models. In this work, we implemented a hybrid combination of genetic methods and machine learning models that accurately predicted the phenotypic traits yield and height, as well as subpopulation.
Methodology: Our proposed methodology for genomic prediction of yield in Oryza sativa (rice) involves a two-level classification approach. First, we classify biological sequences and cluster them on a phylogenetic tree using the UPGMA algorithm. Then, we use machine learning techniques such as Random Forest and K-Nearest Neighbours to predict genomic estimated breeding values (GEBVs) with 85-95% accuracy on rice subpopulations.
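The two levels described above can be illustrated with a minimal sketch. It is not the authors' pipeline: the toy sequences, the Hamming distance, the choice of two clusters, and k = 3 are all assumptions made for illustration, and only a K-Nearest Neighbours vote on cluster labels is shown (Random Forest and the GEBV prediction itself are omitted). UPGMA is implemented here as plain average-linkage agglomeration using only the Python standard library.

```python
from collections import Counter
from itertools import combinations


def hamming(a, b):
    """Number of positions at which two equal-length sequences differ."""
    return sum(x != y for x, y in zip(a, b))


def upgma_clusters(seqs, n_clusters):
    """Merge sequences by average linkage (UPGMA) until n_clusters remain.

    Returns a sorted list of clusters, each a sorted list of indices into seqs.
    """
    clusters = [[i] for i in range(len(seqs))]

    def avg_dist(c1, c2):
        # UPGMA distance: unweighted mean over all cross-cluster member pairs.
        total = sum(hamming(seqs[i], seqs[j]) for i in c1 for j in c2)
        return total / (len(c1) * len(c2))

    while len(clusters) > n_clusters:
        # Find the closest pair of clusters and merge them.
        a, b = min(
            combinations(range(len(clusters)), 2),
            key=lambda p: avg_dist(clusters[p[0]], clusters[p[1]]),
        )
        merged = sorted(clusters[a] + clusters[b])
        clusters = [c for k, c in enumerate(clusters) if k not in (a, b)]
        clusters.append(merged)
    return sorted(clusters)


def knn_predict(train_seqs, train_labels, query, k=3):
    """Majority vote among the k training sequences nearest to the query."""
    order = sorted(range(len(train_seqs)),
                   key=lambda i: hamming(train_seqs[i], query))
    votes = Counter(train_labels[i] for i in order[:k])
    return votes.most_common(1)[0][0]


# Hypothetical toy marker sequences (not data from the paper).
seqs = ["AAAAAAAA", "AAAAAAAT", "AAAAAATT",
        "GGGGGGGG", "GGGGGGGC", "GGGGGGCC"]

# Level 1: cluster the sequences with UPGMA (two clusters is an assumption).
clusters = upgma_clusters(seqs, n_clusters=2)

# Use the cluster index as a stand-in subpopulation label for each sequence.
labels = [None] * len(seqs)
for cluster_id, members in enumerate(clusters):
    for idx in members:
        labels[idx] = cluster_id

# Level 2: assign unseen sequences to a cluster with k-nearest neighbours.
pred_a = knn_predict(seqs, labels, "AAAAATTT")
pred_g = knn_predict(seqs, labels, "GGGGGCCC")
print(clusters, pred_a, pred_g)
```

In practice one would more likely rely on library implementations; for example, SciPy's `scipy.cluster.hierarchy.linkage` with `method="average"` performs the same average-linkage (UPGMA) clustering on a precomputed distance matrix.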
Results: We achieved an accuracy of 93%, which we compare against results from the other literature cited in this paper.
Conclusion: This hybrid approach overcomes the limitations of using genetic methods or machine learning models alone and effectively enhances crop breeding by capturing the genotype-phenotype relationship.