Detection of viral contamination in cell lines using ViralCellDetector
2025
Rama Shankar | Shreya Paithankar | Suchir Gupta | Bin Chen
Background and aims: Cell lines are widely used in biomedical research to investigate various biological processes, including gene expression, cancer progression, and drug responses. However, cross-contamination with bacteria, mycoplasma, and viruses remains a persistent challenge. While the detection of bacterial and mycoplasma contamination is relatively straightforward, identifying viral contamination is more difficult. To address this issue, we developed ViralCellDetector, a tool designed to detect viral contamination by mapping RNA-seq data to a comprehensive viral genome library.
Methods: ViralCellDetector processes RNA-seq data from any host species by first aligning reads to the host reference genome, followed by mapping the unmapped reads to the NCBI viral genome database. Viral presence is determined using stringent criteria based on the number of mapped reads and viral genome coverage. To further enable the detection of viral contamination from unknown sources, we identified host genes that are differentially expressed during viral infection and used these markers to train a machine learning model for classification.
Results: Using ViralCellDetector, we found that approximately 10% (110 samples) of RNA-seq datasets involving MCF7 cells were likely contaminated with viruses. The tool demonstrated high sensitivity in detecting viral sequences. Furthermore, the machine learning model effectively distinguished infected from non-infected samples based on human gene expression profiles, achieving an AUC of 0.91 and an accuracy of 0.93.
Conclusion: Our mapping-based approach enables robust detection of viral contamination in RNA-seq data from any host organism, while the marker-based approach accurately identifies viral infections specifically in human cell lines. This capability can help researchers detect and avoid the use of contaminated cell lines, thereby improving the reliability of experimental outcomes.
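The abstract states that viral presence is called from the number of mapped reads and the viral genome coverage, but it does not give the thresholds. The following is a minimal Python sketch of how such a two-part filter could look, not the tool's actual implementation. The names `ViralHit`, `covered_fraction`, and `is_viral_hit`, and the default cutoffs `min_reads=100` and `min_coverage=0.5`, are hypothetical and chosen only for illustration.

```python
from dataclasses import dataclass
from typing import List, Tuple


@dataclass
class ViralHit:
    """Reads from one sample that aligned to one viral genome (hypothetical container)."""
    name: str
    genome_length: int
    # One half-open (start, end) interval per aligned read, in genome coordinates.
    intervals: List[Tuple[int, int]]


def covered_fraction(intervals: List[Tuple[int, int]], genome_length: int) -> float:
    """Fraction of the genome covered by at least one read, merging overlapping intervals."""
    if genome_length <= 0:
        return 0.0
    covered = 0
    cur_start = cur_end = None
    for start, end in sorted(intervals):
        if cur_end is None or start > cur_end:
            # Close the previous merged block and open a new one.
            if cur_end is not None:
                covered += cur_end - cur_start
            cur_start, cur_end = start, end
        else:
            # Overlapping or adjacent read: extend the current block.
            cur_end = max(cur_end, end)
    if cur_end is not None:
        covered += cur_end - cur_start
    return covered / genome_length


def is_viral_hit(hit: ViralHit, min_reads: int = 100, min_coverage: float = 0.5) -> bool:
    """Apply both criteria; the default thresholds are assumptions, not values from the paper."""
    enough_reads = len(hit.intervals) >= min_reads
    enough_coverage = covered_fraction(hit.intervals, hit.genome_length) >= min_coverage
    return enough_reads and enough_coverage
```

Requiring both criteria guards against a common false positive: many reads piling up on one short, low-complexity region yields a high read count but low genome coverage.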
This record was provided by the Directory of Open Access Journals.