Evaluation of predictive accuracy of gene expression in pigs using machine learning models
2025
Tianle ZHOU | Jinyan TENG | Zhiting XU | Zhe ZHANG
Objective: The goal was to compare the performance of various machine learning models in predicting gene expression in pigs from single nucleotide polymorphisms (SNPs), and to investigate how cis-heritability (cis-h²) and the number of cis-SNPs relate to the prediction accuracy of different models.

Method: Using pig protein-coding genes from the muscle tissue data of the PigGTEx project, we trained 18 distinct machine learning models on cis-SNPs located within a ±1 Mb window of each gene's transcription start site. We then evaluated the prediction accuracy of each model.

Result: The prediction accuracy of the machine learning models was positively correlated with the cis-h² of genes. The elastic net regression model and the Lasso regression model had the highest overall prediction accuracy, with mean R² values of 0.0362 and 0.0358, respectively. Prediction accuracy was also positively correlated with the number of cis-SNPs around the genes, within a certain range.

Conclusion: The accuracy of machine learning models in predicting gene expression in pigs is largely influenced by both the cis-h² and the number of cis-SNPs of genes. Selecting a machine learning model suited to the cis-h² and the number of cis-SNPs of each gene can therefore improve the accuracy of predicted pig gene expression levels.
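The two mechanical steps described in the Method, selecting cis-SNPs in a ±1 Mb window around a transcription start site and scoring predictions with R², can be illustrated with a minimal Python sketch. It is not the authors' pipeline: the `Snp` record, the function names, and the choice to compute R² as the squared Pearson correlation between observed and predicted expression are all assumptions made for illustration, since the abstract does not state how R² was defined.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Snp:
    """Hypothetical SNP record; field names are illustrative only."""
    snp_id: str
    chrom: str
    pos: int  # base-pair position on the chromosome


def cis_snps(snps, chrom, tss, window=1_000_000):
    """Return SNPs on `chrom` lying within `window` bp on either side of `tss`."""
    return [s for s in snps if s.chrom == chrom and abs(s.pos - tss) <= window]


def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values.

    Assumption: R² is taken as the squared correlation; other definitions
    (e.g. the coefficient of determination) can give different numbers.
    """
    n = len(observed)
    if n != len(predicted) or n < 2:
        raise ValueError("need two equal-length sequences with at least 2 values")
    mean_obs = sum(observed) / n
    mean_pred = sum(predicted) / n
    cov = sum((o - mean_obs) * (p - mean_pred) for o, p in zip(observed, predicted))
    var_obs = sum((o - mean_obs) ** 2 for o in observed)
    var_pred = sum((p - mean_pred) ** 2 for p in predicted)
    if var_obs == 0 or var_pred == 0:
        return 0.0  # a constant vector carries no predictive signal
    return (cov * cov) / (var_obs * var_pred)
```

In a full analysis, the genotypes of the selected cis-SNPs would serve as features for each model (elastic net, Lasso, and so on), and `r_squared` would be applied to held-out samples for each gene.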
Bibliographic information
This bibliographic record has been provided by Directory of Open Access Journals