An Insight Into SARS-CoV-2 Phylogenetics and Genomics for Sixty Isolates Occurring in India

Document Type: Original Article


1 Department of Biotechnology, Koneru Lakshmaiah Education Foundation (Deemed to be University), Green Fields, Vaddeswaram-522502, Guntur, Andhra Pradesh, India

2 Eduard-Zintl-Institut für Anorganische und Physikalische Chemie, Centre of Smart Interfaces, Technische Universität Darmstadt, Alarich-Weiss-Straße 10, 64287, Darmstadt, Germany

3 Department of Chemical Engineering, Federal University of São Carlos, Rod. Washington Luiz, São Carlos-13565905, Brazil


Introduction: Analyzing genome sequences for encoded proteins and motifs is a widely used approach to predicting new drug and vaccine targets. Because it relies on computational techniques, it offers practical advantages in the search for new drugs and vaccines.
Materials and Methods: We examined the diversity and evolution of Severe Acute Respiratory Syndrome Coronavirus-2 (SARS-CoV-2) isolates from different geographical regions of India using phylogenetic tree analysis. A dataset of 172 Indian SARS-CoV-2 genome sequences was collected from a database, and a phylogenetic tree was constructed from it.
Results: The phylogenetic analysis identified 6 different clusters. From each cluster, we chose 10 genome sequences in which to find open reading frames (ORFs) and common encoded proteins. We found 4 encoded proteins common to all 60 genome sequences: ORF7a protein, membrane glycoprotein, surface glycoprotein, and nucleocapsid phosphoprotein. Our results also highlight 6 conserved motifs with a high frequency of occurrence, suggesting their potential use in further study.
Conclusions: The encoded proteins and the sequence motifs detected in them might be useful for screening potential drug and vaccine candidates against Indian SARS-CoV-2 isolates in the current epidemic situation.
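
As an illustration of the ORF-finding step mentioned in the Results, the following is a minimal sketch of how open reading frames can be located in a nucleotide sequence by scanning all six reading frames for a start codon followed by an in-frame stop codon. It is not the tool used in the study, which the abstract does not name. The function names (find_orfs, reverse_complement), the min_codons threshold, and the toy input sequence are assumptions made for the example.

```python
# Illustrative sketch only: a toy six-frame ORF scanner.
# This is NOT the software used in the study; names and thresholds are assumptions.

START = "ATG"
STOPS = {"TAA", "TAG", "TGA"}
_COMPLEMENT = str.maketrans("ACGT", "TGCA")


def reverse_complement(seq):
    """Return the reverse complement of a DNA sequence (A, C, G, T only)."""
    return seq.translate(_COMPLEMENT)[::-1]


def find_orfs(seq, min_codons=2):
    """Find ORFs on both strands.

    Returns a list of (strand, frame, start, end, orf_sequence) tuples.
    start/end are 0-based offsets on the scanned strand, end exclusive,
    and the reported ORF includes its stop codon.
    min_codons is the minimum number of codons before the stop codon
    (a hypothetical filter; real pipelines use much larger thresholds).
    """
    seq = seq.upper()
    orfs = []
    for strand, s in (("+", seq), ("-", reverse_complement(seq))):
        for frame in range(3):
            start = None  # position of the first ATG of the currently open ORF
            for i in range(frame, len(s) - 2, 3):
                codon = s[i:i + 3]
                if start is None and codon == START:
                    start = i
                elif start is not None and codon in STOPS:
                    if (i - start) // 3 >= min_codons:
                        orfs.append((strand, frame, start, i + 3, s[start:i + 3]))
                    start = None  # close the ORF and keep scanning this frame
    return orfs


if __name__ == "__main__":
    # Toy sequence, not real SARS-CoV-2 data.
    for orf in find_orfs("CCATGAAATAGCC"):
        print(orf)
```

On a real genome, the ORFs returned by such a scan would typically be translated and compared across isolates to identify the encoded proteins they share.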