Identification of Key Responsive Genes to some Abiotic Stresses in Arabidopsis thaliana at the Seedling Stage based on Coupling Computational Biology Methods and Machine Learning

Document Type : Original Article


1 Department of Plant Sciences and Biotechnology, Faculty of Life Sciences and Biotechnology, Shahid Beheshti University, Tehran, Iran

2 Department of Biotechnology, Institute of Applied Sciences and Humanities, GLA University, Mathura, India


Introduction: Abiotic stresses such as water deficit, high temperature, salinity, and cold are among the main constraints on plant growth worldwide. To obtain a comprehensive view of the plant response to abiotic stresses, we applied an integrated bioinformatics approach combining meta-analysis, weighted gene co-expression network analysis (WGCNA), and machine learning.
Materials and Methods: In this study, 32 samples covering four different stresses were selected for analysis. A cross-platform combination method was used to conduct the meta-analysis. WGCNA was performed to identify gene co-expression modules related to the stress conditions. Machine learning methods were applied to validate the most important hub genes.
Results: The meta-analysis detected 275 differentially expressed genes (DEGs), and WGCNA identified 28 distinct modules under these stresses. Seven potential hub genes, At1g07430 (HAI2), At5g52300 (LTI65), At1g60190 (PUB19), At5g50360, At1g77120 (ADH1), At1g56600 (GolS2), and At5g57050, were detected by network analysis and validated by machine learning methods. These genes are involved in different pathways of the cellular response to abiotic stresses.
Conclusions: Among the hub genes, the analysis identified At5g50360 as a novel candidate gene. As such, At5g50360 can be used in plant breeding programs for the development of abiotic stress-tolerant crops.
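
As a rough illustration of the network step summarized above, the following sketch computes an unsigned WGCNA-style adjacency (the absolute Pearson correlation raised to a soft-thresholding power) and ranks genes by their connectivity, which is one common way to nominate hub genes. The gene names, the expression values, the power beta = 6, and the function names are all illustrative assumptions and are not the data or settings of this study; an actual analysis would typically rely on the WGCNA R package.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation between two equal-length expression profiles."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / sqrt(var_x * var_y)


def connectivity(expr, beta=6):
    """Sum of unsigned adjacencies |cor|^beta from each gene to all others.

    beta is an assumed soft-thresholding power, not a value from the study.
    """
    genes = list(expr)
    return {
        g: sum(abs(pearson(expr[g], expr[h])) ** beta for h in genes if h != g)
        for g in genes
    }


def rank_hubs(expr, beta=6):
    """Genes ordered from highest to lowest connectivity."""
    k = connectivity(expr, beta)
    return sorted(k, key=k.get, reverse=True)


# Made-up toy expression profiles (five samples per gene), for illustration only.
toy_expr = {
    "geneA": [1.0, 2.0, 3.0, 4.0, 5.0],
    "geneB": [2.0, 4.0, 6.0, 8.0, 10.0],  # perfectly correlated with geneA
    "geneC": [1.1, 2.3, 2.9, 4.2, 4.8],   # closely tracks geneA
    "geneD": [3.0, 1.0, 4.0, 1.0, 5.0],   # only weakly correlated with the rest
}

if __name__ == "__main__":
    for gene in rank_hubs(toy_expr):
        print(gene, round(connectivity(toy_expr)[gene], 4))
```

In this toy example the weakly correlated gene receives the lowest connectivity, because raising the correlation to a power suppresses weak associations while keeping strong ones close to 1.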