Journal of Applied Biotechnology Reports

Combining Machine Learning Algorithms with Meta-Analysis and WGCNA to Identify Biomarker-Responsive Genes to Environmental Stresses in Thermus thermophilus HB8

Document Type: Original Article

Authors
1 Department of Cell and Molecular Biology, Faculty of Life Sciences and Biotechnology, Shahid Beheshti University, Tehran, Iran
2 Department of Molecular Biosciences, Wenner-Gren Institute, Stockholm University, SE 106 91 Stockholm, Sweden
Abstract
Introduction: Thermus thermophilus is a thermophilic bacterium known for its resilience in extreme environments. Investigating its transcriptomic responses to environmental stresses can uncover critical adaptive mechanisms.
Materials and Methods: This study analyzed transcriptomic data from 10 microarray datasets comprising 63 samples (36 stress-exposed and 27 controls). Stress conditions included copper, cold, zinc, iron, heat, salt, H2O2, tetracycline, diamide, and alkylation. Differentially expressed genes (DEGs) were identified through meta-analysis, followed by Gene Ontology (GO) enrichment analysis. Weighted gene co-expression network analysis (WGCNA) was employed to detect stress-associated gene modules. Six machine learning approaches (decision tree, logistic regression, random forest, adaptive boosting, SVM-RFE, and XGBoost) were used to prioritize key genes.
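For readers unfamiliar with WGCNA, its core step turns pairwise gene correlations into a weighted network by raising them to a soft-thresholding power. The following is a minimal sketch of that step only, not the authors' pipeline: the gene names and expression values are hypothetical, and the power beta = 6 is an assumed illustrative value that is not reported in this abstract.

```python
import math

def pearson(x, y):
    """Pearson correlation of two equal-length expression profiles."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def soft_threshold_adjacency(expr, beta=6):
    """Unsigned adjacency a_ij = |cor(i, j)| ** beta for every gene pair.

    beta is an assumed value; in practice it is tuned per dataset.
    """
    genes = list(expr)
    return {
        (g, h): abs(pearson(expr[g], expr[h])) ** beta
        for g in genes
        for h in genes
        if g != h
    }

def connectivity(adjacency):
    """Sum of adjacencies per gene; highly connected genes are hub candidates."""
    totals = {}
    for (g, _), weight in adjacency.items():
        totals[g] = totals.get(g, 0.0) + weight
    return totals

# Hypothetical toy data: one expression profile (four samples) per gene.
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0],
    "geneB": [2.0, 4.0, 6.0, 8.0],    # perfectly correlated with geneA
    "geneC": [1.0, -1.0, 1.0, -1.0],  # only weakly correlated with geneA
}

adjacency = soft_threshold_adjacency(toy_expr, beta=6)
hub_scores = connectivity(adjacency)
```

Raising the absolute correlation to a power suppresses weak correlations far more than strong ones, which is why the geneA/geneC edge nearly vanishes in this toy example while the geneA/geneB edge stays close to 1.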
Results: Meta-analysis revealed 54 upregulated and 196 downregulated genes under stress. GO analysis highlighted significant enrichment in ion transport, localization processes, and transmembrane transporter activity. WGCNA identified two stress-related modules, cyan and lightcyan. SVM-RFE and XGBoost outperformed the other machine learning models in accuracy, precision, recall, and F1-score. TTHA0798 emerged as a hub gene identified consistently by the machine learning, DEG, and WGCNA analyses.
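The four metrics used to compare the models all derive from the same confusion-matrix counts. As a minimal sketch, assuming a binary labelling in which 1 denotes a stress-exposed sample and 0 a control, they could be computed as follows; the labels and predictions are hypothetical and are not results from the study.

```python
def classification_metrics(y_true, y_pred):
    """Accuracy, precision, recall and F1 for binary labels (1 = positive)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)  # true positives
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)  # true negatives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)  # false positives
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)  # false negatives

    accuracy = (tp + tn) / len(pairs)
    # Guard against division by zero when a class is never predicted.
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {
        "accuracy": accuracy,
        "precision": precision,
        "recall": recall,
        "f1": f1,
    }

# Hypothetical labels: 1 = stress-exposed, 0 = control.
y_true = [1, 1, 1, 1, 0, 0, 0, 0]
y_pred = [1, 1, 1, 0, 0, 0, 0, 1]

metrics = classification_metrics(y_true, y_pred)
```

Because F1 is the harmonic mean of precision and recall, a model scores well on it only when both are high, which makes it a useful complement to accuracy when the classes are unbalanced, as with 36 stress-exposed and 27 control samples.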
Conclusions: This study provides a comprehensive analysis of the stress responses of T. thermophilus, identifying TTHA0798 as a key hub gene. The integration of transcriptomic data, co-expression analysis, and machine learning offers valuable insights into the adaptive mechanisms of this extremophile, paving the way for further functional studies. 
Keywords

Volume 12, Issue 4
Autumn 2025
Pages 1852-1864

  • Received: 26 November 2024
  • Revised: 09 January 2025
  • Accepted: 01 March 2025