Comparison of 303,250 human SARS-CoV-2 spike protein sequences with the Wuhan-Hu-1 reference sequence showed that ∼96.5% of the residue positions in the spike protein have mutated since the outbreak of the COVID-19 pandemic, which was first reported in December 2019. A total of 1,269,629 mutations were detected, corresponding to 1,229 distinct mutation sites in the spike protein, which comprises 1,273 amino acid residues. Thus, ∼3.5% of the human SARS-CoV-2 spike protein sequence has remained invariant over the past two years. Because different mutations can occur at the same site, a total of 4,729 distinct mutations were observed and are catalogued in the present work. The WHO/CDC (U.S.A.) classification and definitions for the current variants being monitored (VBM) and variants of concern (VOC) are assigned to the SARS-CoV-2 spike protein mutations identified in the present work, along with a list of other amino acid substitutions observed for the variants. All 195 amino acid residues in the receptor-binding domain (Thr333-Pro527) were associated with mutations, including Lys417, Tyr449, Tyr453, Ala475, Asn487, Thr500, Asn501 and Gly502, which interact with the ACE-2 receptor at distances ≤3.2 Å in the crystal structure of the complex available in the Protein Data Bank (PDB code: 6LZG). However, not all of these residues were mutated in the same spike protein. In particular, Gly502 was mutated in only two, and Tyr449 in only seven, of the spike protein sequences analysed, so these residues constitute potential sites for the design of suitable inhibitors/drugs. Further, forty-four invariant residues were observed, corresponding to ten domains/regions in the SARS-CoV-2 spike protein, and some of the surface-exposed residues among them may serve as epitope targets for developing monoclonal antibodies.
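
The counts reported above (total mutations, distinct mutations, distinct mutation sites, invariant residues) can be illustrated with a minimal sketch. This is not the authors' pipeline: the function name `catalogue_mutations`, the toy sequences, and the assumption that every sequence is already aligned to the reference at equal length (substitutions only, no insertions or deletions) are all hypothetical simplifications introduced here.

```python
from collections import Counter


def catalogue_mutations(reference, sequences):
    """Compare pre-aligned sequences with a reference, residue by residue.

    Assumption (not from the paper): all sequences have the same length as
    the reference, so only substitutions are considered.
    """
    # (position, reference residue, observed residue) -> number of occurrences
    counts = Counter()
    for seq in sequences:
        if len(seq) != len(reference):
            raise ValueError("sequences must be aligned to the reference length")
        for pos, (ref_aa, aa) in enumerate(zip(reference, seq), start=1):
            if aa != ref_aa:
                counts[(pos, ref_aa, aa)] += 1

    # A site is counted once even if several different substitutions occur there.
    sites = {pos for pos, _, _ in counts}
    return {
        "total_mutations": sum(counts.values()),
        "distinct_mutations": len(counts),
        "mutated_sites": len(sites),
        "invariant_sites": len(reference) - len(sites),
        "percent_mutated": 100.0 * len(sites) / len(reference),
    }


# Toy data for illustration only; these are not real spike sequences.
toy_reference = "MFVFLVLLPL"
toy_sequences = ["MFVFLVLLPL", "MFVSLVLLPL", "MFVALVLLPF", "MFVSLVLLPL"]
summary = catalogue_mutations(toy_reference, toy_sequences)
print(summary)
```

Applying the same arithmetic to the reported figures, 1,229 mutated sites out of 1,273 residues gives about 96.5%, leaving 1,273 − 1,229 = 44 invariant residues (about 3.5%).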

Source: Current Research in Structural Biology
Available online 9 February 2022