key: cord-0027738-b5bopll5
authors: Anjum, Farah; Joshia, Namrata; Mohammad, Taj; Shafie, Alaa; Alhumaydhi, Fahad A.; Aljasir, Mohammad A.; Shahwan, Moyad J. S.; Abdullaev, Bekhzod; Adnan, Mohd; Elasbali, Abdelbaset Mohamed; Pasupuleti, Visweswara Rao; Hassan, Md Imtaiyaz
title: Impact of Single Amino Acid Substitutions in Parkinsonism-Associated Deglycase-PARK7 and Their Association with Parkinson’s Disease
date: 2022-02-05
journal: J Pers Med
DOI: 10.3390/jpm12020220
sha: f6f53a5b4150c6dae660905f950da5bc33da45eb
doc_id: 27738
cord_uid: b5bopll5

Parkinsonism-associated deglycase-PARK7/DJ-1 (PARK7) is a multifunctional protein with significant roles in inflammatory and immune disorders and in cell protection against oxidative stress. Mutations in PARK7 may result in the onset and progression of neurodegenerative disorders such as Parkinson’s disease. This study analyzed the non-synonymous single nucleotide polymorphisms (nsSNPs) that result in single amino acid substitutions in PARK7 to explore its disease-causing variants and their structural consequences. Initially, we retrieved the mutational dataset of PARK7 from the Ensembl database and performed detailed analyses using sequence-based and structure-based approaches. Pathogenicity prediction was then performed on the PARK7 variants to distinguish the destabilizing/deleterious ones. Aggregation propensity, noncovalent interactions, packing density, and solvent accessible surface area analyses were carried out on the selected pathogenic mutations. The SODA analysis suggested that mutations in PARK7 promote aggregation and alter the disorder, helix, and strand propensities. The mutations alter the number of hydrogen bonds and hydrophobic interactions in PARK7, as calculated with the Arpeggio server. The study indicated that alterations in the hydrophobic contacts and frustration of the protein could alter the stability of the missense variants of PARK7, which might result in disease progression. This study provides a detailed understanding of the destabilizing effects of single amino acid substitutions in PARK7.

Parkinsonism-associated deglycase-PARK7/DJ-1 (PARK7) is a multifunctional protein that plays a crucial role in inflammatory diseases, immune disorders, and cell protection against oxidative stress. It is localized in the nucleus, cytoplasm, and mitochondria [1]. It belongs to the PfpI/Hsp31/DJ-1 superfamily, which has a conserved exposed cysteine residue [2]. It has various functions, acting as a peroxidase, a protease, a glyoxalase, a chaperone for synuclein, and an apoptosis inhibitor [2]. It protects the cell from oxidative stress, which is associated with various complex disorders such as Parkinson's disease, cancer, asthenozoospermia, and Alzheimer's disease [3, 4]. Parkinson's disease is an incurable, progressive neurodegenerative disorder, ranked as the second most common neurodegenerative disorder after Alzheimer's disease [5]. It is increasingly recognized as a multicentric disorder that affects the nervous system and results in rest tremor, postural tremor, muscle stiffness, and bradykinesia [5].

Apart from Parkinson's disease, PARK7 has been implicated in various conditions such as cancer and infertility [6, 7]. A study of families in the Netherlands and Italy found early-onset autosomal recessive Parkinson's disease caused by homozygous mutations in the PARK7 gene [8]. PARK7 is found in the cytoplasm and nucleus, and its expression has been seen in many tissues, including brain, eye, and endocrine tissues [8]. In the subcortical region, high expression of PARK7 mRNA may be crucial for basal ganglia function [8]. In overexpression models, the subcellular distribution shows that PARK7 is mainly found in the nucleus and the cytoplasm, whereas a smaller amount exists in mitochondria under stress conditions [6]. In the mouse brain, an endogenous pool of PARK7 in the intermembrane space and mitochondrial matrix was also shown by subcellular fractionation and immunogold electron microscopy [6].

In humans, the PARK7 gene is located at chromosome 1p36 and encodes a 189-amino-acid protein that shows structural similarity to the bacterial proteins ThiJ and PfpI, which are involved in thiamine synthesis and protease activity, respectively [8]. The PARK7 structure has been analyzed in detail, and crystal structures of both the monomer and the dimer have been reported independently [6]. The PARK7 monomer comprises 8 alpha-helices and 11 beta-strands arranged asymmetrically in a helix-strand-helix sandwich similar to the Rossmann fold [9]. Eight pairs of hydrogen bonds and various van der Waals interactions mediate dimerization of PARK7 [6, 9]. Its homolog PH1704 displays proteolytic activity at a conserved cysteine residue. C106, a highly conserved cysteine residue of PARK7, has a possible role in protease activity. The main catalytic triad of Cys-His-Glu in the PH1704 active site is absent in PARK7, along with C106.

PARK7 loses its functionality when mutated, and this loss is related to mitochondrial dysfunction, resulting in the early onset of Parkinson's disease [10, 11]. PARK7 mutations have also been associated with reduced sperm motility and infertility in humans and other species [10]. Mutation-associated loss of function in PARK7 is related to abnormal mitochondrial morphology and dynamics, altered calcium homeostasis, and increased sensitivity to oxidative stress [10]. PARK7 was recognized as a member of a novel glyoxalase family of detoxifying proteins [10]. PARK7 acts as both a disease-causative gene and an oncogene and plays a crucial role in oxidative stress [10]. Because mutations in PARK7 cause various complex diseases, we studied several variants of PARK7 using state-of-the-art computational approaches [12] [13] [14] [15] [16]. We took 152 mutations spanning the whole protein to explore their consequences in disease progression. The present study offers an in-depth analysis of the effects of mutations on the structure of PARK7 and their possible implication in Parkinson's disease.

The protein sequence of PARK7 was downloaded from the UniProt database (ID: Q99497). A list of nonsynonymous single nucleotide polymorphisms (nsSNPs) was prepared from the Ensembl [17] , dbSNP [18] , HGMD [19] , and ClinVar [20] databases. The duplicate variants were removed from the list. The PARK7 protein structure was downloaded from the RCSB Protein Data Bank (PDB ID: 1P5F). We used multiple tools for sequence-based and structure-based predictions to enhance the confidence score of the predicted results [21] [22] [23] [24] . The overview of the computational aspects to predict the pathogenic mutations in PARK7 is illustrated in Figure 1. 
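The merging and de-duplication step described above can be illustrated with a minimal Python sketch. It assumes each database export has already been reduced to a list of substitution codes such as `L166P`; the function name `merge_variants` and the toy lists are hypothetical and are not part of the original workflow.

```python
def merge_variants(*sources):
    """Merge variant lists from several databases, dropping duplicates.

    Variants are compared as normalized strings (e.g. "L166P");
    the order in which variants are first seen is preserved.
    """
    seen = set()
    merged = []
    for source in sources:
        for variant in source:
            key = variant.strip().upper()  # normalize case and whitespace
            if key and key not in seen:
                seen.add(key)
                merged.append(key)
    return merged


# Hypothetical toy lists; the real lists would come from the
# Ensembl, dbSNP, HGMD, and ClinVar exports.
list_a = ["L166P", "M26I", "A104T"]
list_b = ["l166p", "D149A"]
print(merge_variants(list_a, list_b))
```

Normalizing before comparison is a design choice made here for illustration; the source does not state how duplicates were detected.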

Figure 1. Overview of the computational aspects to predict the pathogenic mutations of the PARK7 protein at the sequential, structural, and functional levels.

The SIFT tool (http://sift.jcvi.org/, accessed on 1 December 2021) predicts whether a mutation in a protein is deleterious based on the physical characteristics of the amino acids and on the sequence homology of the protein. A mutation with a SIFT score less than or equal to 0.05 is predicted to be deleterious. A total of 152 missense variants of PARK7 were examined and categorized as deleterious, neutral, or of unknown significance.


PolyPhen-2 (http://genetics.bwh.harvard.edu/pph2/, accessed on 1 December 2021) is another sequence-based tool that predicts the damaging probability of mutation by considering the physical and comparative properties of the sequences. It provides the PSIC (Position-Specific Independent Count) score for the missense variants and then calculates the score divergence with the wild-type.

PROVEAN (http://provean.jcvi.org/, accessed on 1 December 2021) also predicts the deleterious impact in a protein. Here, if the calculated score is less than −2.5 for a mutation, it is considered damaging, whereas a mutation with scores greater than −2.5 is considered neutral.

Mutation Assessor (http://mutationassessor.org/r3/, accessed on 1 December 2021) is another sequence-based tool used to analyze the effect of missense mutations in a protein. It is based on evolutionarily conserved residues and a multiple sequence alignment approach. This tool accepts a UniProt accession ID as input for the protein sequence. It categorizes the missense mutations as neutral, low, or medium for damaging effects and provides an FI score for each mutation. If the FI score is greater than 2.00, the missense mutation is considered damaging.
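The score cutoffs quoted above for SIFT, PROVEAN, and Mutation Assessor can be written as simple predicates. The following Python sketch only restates those cutoffs; the function names and the example scores are hypothetical, and the handling of a PROVEAN score of exactly −2.5 is an assumption because the text does not specify it.

```python
def sift_is_deleterious(score):
    """SIFT: a score less than or equal to 0.05 is predicted deleterious."""
    return score <= 0.05


def provean_is_damaging(score):
    """PROVEAN: a score below -2.5 is damaging, above -2.5 is neutral.

    A score of exactly -2.5 is treated as neutral here (assumption).
    """
    return score < -2.5


def mutation_assessor_is_damaging(fi_score):
    """Mutation Assessor: an FI score greater than 2.00 is damaging."""
    return fi_score > 2.00


# Hypothetical scores for a single variant, for illustration only.
scores = {"SIFT": 0.01, "PROVEAN": -4.2, "Mutation Assessor": 2.6}
calls = {
    "SIFT": sift_is_deleterious(scores["SIFT"]),
    "PROVEAN": provean_is_damaging(scores["PROVEAN"]),
    "Mutation Assessor": mutation_assessor_is_damaging(scores["Mutation Assessor"]),
}
print(calls)
```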

PON-P2 (http://structure.bmc.lu.se/PON-P2/, accessed on 1 December 2021) is another sequence-based approach that predicts pathogenic missense mutations using a machine learning technique. This tool categorizes the missense variants of a protein into unknown, neutral, and pathogenic categories and returns results quickly. It uses physical properties, evolutionary sequence conservation, and biochemical properties of the protein. The missense variant data can be submitted to PON-P2 in different file formats.

SDM2 (http://marid.bioc.cam.ac.uk/sdm2, accessed on 1 December 2021) is a structure-based tool that estimates the change in protein stability between the wild-type and the mutant. It accepts the PDB file format as input. The SDM2 server predicts the OSP (residue-occluded packing density), RSA (relative side-chain solvent accessibility), and residue depth for the mutant and wild-type protein. As a score, if the ∆∆G is >0 for a given mutant, SDM2 predicts it to be destabilizing.

mCSM (http://biosig.unimelb.edu.au/mcsm/, accessed on 1 December 2021) is a web server used to predict destabilizing mutations of a protein using a graph-based approach. This tool provides better insights into the missense mutations associated with different disorders. A missense mutation is considered as destabilizing if the mCSM score (∆∆G) < 0.
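Because the ∆∆G sign that marks a destabilizing mutation differs between tools as described here, it can help to keep the rule next to the tool name. The sketch below encodes only the conventions as stated in this text (SDM2: ∆∆G > 0; mCSM: ∆∆G < 0). The function name is hypothetical, and the conventions should be checked against each server's own documentation before reuse.

```python
def is_destabilizing(tool, ddg):
    """Return True if a ddG value counts as destabilizing for the given tool.

    Sign conventions follow the text of this document, not the servers'
    documentation: SDM2 uses ddG > 0, mCSM uses ddG < 0.
    """
    if tool == "SDM2":
        return ddg > 0
    if tool == "mCSM":
        return ddg < 0
    raise ValueError(f"no sign convention recorded for tool: {tool}")


# Hypothetical values, for illustration only.
print(is_destabilizing("SDM2", 1.2))   # True under the stated rule
print(is_destabilizing("mCSM", -0.8))  # True under the stated rule
```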

PhD-SNP (http://gpcr.biocomp.unibo.it/cgi/predictors/PhD-SNP/PhD-SNP.cgi/, accessed on 1 December 2021) is a structure-based tool that differentiates disease-causing nsSNPs from neutral. This is an SVM-based tool that depends on the local environment of substitution for prediction. PhD-SNP accepts FASTA file format or PDB ID of the protein as an input. This tool classifies mutation into disease or neutral. Its prediction is based on sequence-based, profile-based, and hybrid methods.

Rhapsody (http://rhapsody.csb.pitt.edu/, accessed on 1 December 2021) is a web server that predicts the pathogenicity of missense variants from the sequence conservation and structural properties of the protein. Rhapsody predicts residue-averaged pathogenicities of the missense mutations better than EVmutation and PolyPhen-2. Rhapsody accepts the PDB ID of a protein as an input. The tool accepts batch queries of up to 10,000 missense variants, which saves time.

The replacement of a large amino acid with a smaller one can alter residue depth and solvent accessibility and influence the formation of active-site cavities in proteins [25]. In addition to the analyses mentioned above, we calculated the OSP, RSA, and residue depth for the mutant and wild-type protein using the SDM2 web server. SDM2 is a freely accessible and user-friendly interface for studying proteins. It uses environment-specific substitution tables to evaluate RSA, OSP, and residue depth. The RSA value is calculated using Lee and Richards' method. RSA, residue depth, and OSP are considered important properties of the protein structure for predicting the stability of the protein.

SODA (http://protein.bio.unipd.it/soda/, accessed on 1 December 2021) is a sequence- and structure-based approach for the prediction of the solubility of a protein. This tool estimates the changes in the protein's aggregation, disorder, helix, and strand propensity that arise from missense mutations. SODA accepts both FASTA and PDB file formats as input.

The Arpeggio server (http://biosig.unimelb.edu.au/arpeggioweb/, accessed on 1 December 2021) is a web-based tool that calculates the number of interatomic interactions in a protein structure, such as van der Waals interactions, hydrogen bonds, aromatic interactions, and hydrophobic interactions. Arpeggio can estimate about 15 types of interatomic interactions. This tool accepts the PDB file format as input and provides a downloadable list of the number of different types of interactions.

Conservation plays a vital role in the structure and function of any protein. Conservation analyses can be performed using multiple sequence alignments of similar proteins. A web tool named ConSurf (https://consurf.tau.ac.il/, accessed on 1 December 2021) measures the degree of conservation of the sequence. We used ConSurf-DB, which has pre-calculated evolutionary profiles of various proteins with known structures.

The Frustratometer web tool (http://frustratometer.qb.fcen.uba.ar/, accessed on 1 December 2021) was used to examine the residual frustration in PARK7 and its mutants. It evaluates the energy of a protein structure and compares it to the energy of a set of 'decoy' states. The single-residue and configurational frustration indices for all three systems were computed. In this calculation, a contact is highly frustrated/destabilizing if its Z-score is < −1. In contrast, a contact is minimally frustrated/stabilizing if its Z-score is > 0.78. If the value lies between these two cutoffs, the contact is considered neutral.
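The three-way classification of contacts by Z-score can be sketched as a small Python function. The upper cutoff of 0.78 is taken from the text. The lower cutoff of −1 is an assumption based on the convention commonly used for the frustration index, and both are exposed as parameters so they can be changed. The function name is hypothetical.

```python
def classify_contact(z_score, high_cutoff=-1.0, minimal_cutoff=0.78):
    """Classify a contact from its frustration Z-score.

    z_score < high_cutoff      -> highly frustrated (destabilizing)
    z_score > minimal_cutoff   -> minimally frustrated (stabilizing)
    otherwise                  -> neutral
    The default high_cutoff of -1.0 is an assumed conventional value.
    """
    if z_score < high_cutoff:
        return "highly frustrated"
    if z_score > minimal_cutoff:
        return "minimally frustrated"
    return "neutral"


# Hypothetical Z-scores, for illustration only.
for z in (-1.5, 0.2, 1.1):
    print(z, classify_contact(z))
```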

At the beginning of the study, a total of 152 reported mutations were retrieved from the Ensembl (https://asia.ensembl.org/index.html, accessed on 1 December 2021) database. This study is based on sequence- and structure-based analyses of these mutations. A multilevel approach was used to predict the functional and structural effects of mutations on the PARK7 protein, combining sequence-based and structure-based methods to obtain high-confidence disease-associated mutations [26] [27] [28]. All of the mutations were first analyzed with five sequence-based web servers: PON-P2, Mutation Assessor, SIFT, PolyPhen-2, and PROVEAN. Only the high-confidence mutations (predicted to be 'deleterious' by at least three predictors) of the PARK7 protein were taken forward to the structure-based analysis. The stability of the selected mutations was then predicted using SDM2, mCSM, and MUpro. The input for all of these structure-based stability prediction tools was the PDB coordinate file of the protein. Mutations that decreased stability were then selected for disease phenotype analysis using Rhapsody, PMut, PhD-SNP, and MutPred2. The packing density and accessible surface area, degree of solubility, and aggregation propensity were also studied using different approaches. The PARK7 nsSNPs retrieved from the Ensembl database were categorized into four types, graphed in Figure 2.
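The selection rule described above, in which a variant is kept when at least three of the five sequence-based predictors call it deleterious, can be sketched as a simple vote count. This is an illustrative sketch only: the dictionary layout, the function name `is_high_confidence`, and the example calls are hypothetical.

```python
SEQUENCE_TOOLS = ("PON-P2", "Mutation Assessor", "SIFT", "PolyPhen-2", "PROVEAN")


def is_high_confidence(calls, min_votes=3):
    """Return True if at least `min_votes` sequence tools call the variant deleterious.

    `calls` maps a tool name to True (deleterious) or False (not deleterious).
    Tools missing from `calls` contribute no vote.
    """
    votes = sum(1 for tool in SEQUENCE_TOOLS if calls.get(tool) is True)
    return votes >= min_votes


# Hypothetical per-tool calls for one variant, for illustration only.
example_calls = {
    "PON-P2": True,
    "Mutation Assessor": False,
    "SIFT": True,
    "PolyPhen-2": True,
    "PROVEAN": False,
}
print(is_high_confidence(example_calls))
```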


Multiple tools based on different algorithms and approaches were used to identify disease-associated mutations, since relying on a single tool can yield false positives; combining tools reduces false predictions and improves the reliability of the outcomes. For the sequence-based analysis, PON-P2, Mutation Assessor, SIFT, PolyPhen-2, and PROVEAN were used. All 152 mutations of human PARK7 were first analyzed using sequence-based tools (Table S1).

Figure 3. Distribution of deleterious, unknown, and neutral missense mutations identified by sequence-based approaches across the entire sequence of the PARK7 protein.

Following the sequence-based analysis, structure-based stability prediction was carried out using SDM2, MUpro, and mCSM. The structure-based approach was performed on the 76 variants with high-confidence deleteriousness from the sequence-based predictions (Table S2). Out of the 76 variants, SDM2, mCSM, and MUpro predicted 54 (71.05%), 73 (96.05%), and 73 (96.05%) missense variants as destabilizing, respectively (Figure 4). Furthermore, to raise the confidence level, we selected those variants identified as disease-associated by at least three different sequence-based approaches and at least two different structure-based approaches. Using this approach, 69 variants (45.39% of the total) were selected, which were identified as deleterious or destabilizing by both approaches. The disease phenotype identification was carried out on these 69 missense variants.
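The percentages quoted above follow directly from the reported counts, and the combined selection rule is a pair of vote thresholds. The short Python sketch below recomputes the percentages from the counts given in the text and expresses the rule as a predicate; the function names are hypothetical.

```python
def percent(part, whole):
    """Percentage of `part` in `whole`, rounded to two decimals."""
    return round(100 * part / whole, 2)


def passes_combined_filter(sequence_votes, structure_votes):
    """At least three sequence-based and two structure-based tools must agree."""
    return sequence_votes >= 3 and structure_votes >= 2


# Counts taken from the text.
print(percent(54, 76))   # SDM2 destabilizing calls among 76 variants
print(percent(73, 76))   # mCSM and MUpro destabilizing calls among 76 variants
print(percent(69, 152))  # variants retained out of all 152
```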

We identified the disease phenotype associated with the selected mutations using the Rhapsody and PhD-SNP web servers. Based on the pathogenicity scores obtained from these web servers, we predicted the disease phenotype of the selected variants (Table 1). The Rhapsody and PhD-SNP web servers categorized the missense variants as either 'neutral' or 'diseased/deleterious'. Of the 69 missense variants, 25 were predicted to be disease-associated by both predictors.

Change in solvent accessibility is considered one of the critical parameters for understanding the structural features of proteins [25]. Any replacement of a large amino acid with a smaller one can alter residue depth and solvent accessibility and influence the formation of cavities [29]. Analysis of solvent accessibility provides information about the packing density before and after mutation. Using the SDM2 server, we calculated the OSP, RSA, and residue depth for 25 mutants of PARK7 and its wild-type structure (Table 2). The changes in OSP, RSA, and residue depth of the selected variants led to the identification of 21 mutations that reduce the structural stability and integrity of the PARK7 protein.

The solubility of a protein influences its function to a great extent. Diseases such as Parkinson's disease, amyloidosis, and Alzheimer's disease are caused by the aggregation of insoluble parts of the proteins [30] [31] [32] [33] [34] . To predict the solubility of PARK7, we calculated the variants' solubility using SODA (Solubility based on Disorder and Aggregation). Mutations alter the aggregation, disorder, helix, and strand propensity of the protein variants, and these parameters were predicted by SODA. From the disease phenotype prediction of 21 high confidence missense variants, 12 variants raised the solubility of the PARK7 protein, whereas the rest decreased the solubility of the protein (Table 3) . 

It is known that alteration of the hydrophobic contacts of a protein can alter its stability [35]. Missense mutations in the PARK7 protein can induce large alterations and thus affect the structural stability of the protein. Using the Arpeggio server, we calculated the number of hydrogen bonds, ionic interactions, van der Waals interactions, electrostatic interactions, and hydrophobic interactions of the PARK7 mutants (Table 4). The effect of a mutation appears as a decrease or increase in the number of the different types of interactions. All 21 selected missense variants were examined with the Arpeggio server. We found that, compared to the wild-type protein, PARK7 loses several contacts, especially hydrophobic and van der Waals interactions, in most of the selected mutants (Table 4).

Parkinson's disease is characterized by dopaminergic dysfunction. Mutations in PARK7 have been related to early-onset Parkinson's disease [6, 36]. After an extensive literature survey, we found that out of the 21 mutations, 8 pathogenic mutations (L10P, G13E, E16G, A104T, A107P, T154A, L166P, and L172Q) have been widely explored in various research works. Of these, two mutations (L166P and L172Q) are damaging, but their structural consequences have not been studied much. The L172Q mutation resulting from an SNP does not reduce the expression of PARK7 mRNA but destabilizes the protein to a point where it is barely detectable by western blot [37, 38]. L172Q is highly unstable, is rapidly degraded by the proteasome, behaves very similarly to L166P, and possibly retains chaperoning activity [6, 39]. Studies reported that wild-type PARK7 and several missense mutants (i.e., M26I, R98Q, A104T, and D149A) were stable proteins, whereas only the L166P mutant was unstable in cells [6, [39] [40] [41]. In addition, the L166P mutant was degraded by proteasome-mediated endoproteolytic cleavage in vitro [40]. Taken together, deletion or point mutation in PARK7 results in loss of function, which might give rise to disease development.

Conservation analysis of the amino acids in a protein provides better insights into residue evolution [42, 43]. Here, the conservation analysis showed that the amino acid stretches 1-30, 67-75, 102-127, 145-169, 181-182, and 186-189 of PARK7 were highly conserved, while the stretches 38-66, 76-101, 128-144, and 170-180 were less conserved. The analysis suggested that amino acids L166 and L172 are relatively conserved, and any mutations at these sites may result in the structural destabilization of PARK7 (Figure 5).
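The stretches listed above can be turned into a small lookup that reports which stretch a substituted position falls in. The sketch below is an illustration built only from the residue ranges quoted in the text; the function names are hypothetical. It works at the level of whole stretches, which is coarser than per-residue ConSurf grades, so an individual residue inside a "less conserved" stretch may still be relatively conserved. Positions not covered by any listed stretch are reported as unassigned.

```python
import re

# Residue ranges as quoted in the text (inclusive).
HIGHLY_CONSERVED = [(1, 30), (67, 75), (102, 127), (145, 169), (181, 182), (186, 189)]
LESS_CONSERVED = [(38, 66), (76, 101), (128, 144), (170, 180)]


def parse_substitution(code):
    """Split a code such as 'L166P' into (wild_type, position, mutant)."""
    match = re.fullmatch(r"([A-Z])(\d+)([A-Z])", code)
    if match is None:
        raise ValueError(f"not a single amino acid substitution: {code}")
    return match.group(1), int(match.group(2)), match.group(3)


def stretch_label(position):
    """Return the stretch-level conservation label for a residue position."""
    if any(start <= position <= end for start, end in HIGHLY_CONSERVED):
        return "highly conserved stretch"
    if any(start <= position <= end for start, end in LESS_CONSERVED):
        return "less conserved stretch"
    return "unassigned"


for code in ("L166P", "L172Q"):
    _, position, _ = parse_substitution(code)
    print(code, stretch_label(position))
```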


It is well established that the energy landscape of proteins is funneled toward the native ensemble, characterized by global minima [44]. Frustration analysis helps identify the frustration levels and their locations in a protein structure, which can help in understanding how mutations affect protein conformations and structural stability [44, 45]. To this end, the frustration energetics were investigated in PARK7 and its mutants (L166P and L172Q). We comprehensively explored the local frustration indices in all three systems (Figure 6). The frustration indices showed that frustration increased at the C-terminus (residues 175-188, highlighted as dotted rectangles) when PARK7 was mutated. However, the frustration decreased slightly at the N-terminus, in the first few residues spanning 1-5. Overall, the frustration indices suggested that the mutations L166P and L172Q alter the frustration of PARK7, which might be responsible for the instability of the protein.

SNPs are among the most frequent genetic variations related to various human diseases. Broad investigation of SNPs can help explain disease-causing mechanisms and support the discovery of effective therapies for various complex diseases. In the current study, we examined various mutations in PARK7. The sequence-based and structure-based approaches showed that out of 152 mutations in the PARK7 protein, 76 are considered destabilizing and deleterious. Out of these 76 mutations, 25 were found to be pathogenic. Aggregation propensity analysis of the 21 mutations with reduced stability found that 9 pathogenic mutants accumulate and become insoluble. Structural alterations caused by the gain or loss of noncovalent interatomic interactions significantly affect the substituted residue and may render the variant pathogenic, as shown by the extensive structural analysis. This study presents a thorough understanding of the pathogenic mutations and their possible effect on disease progression. After an extensive literature survey and analysis, two mutations (L166P and L172Q) were selected and explored in detail. The study suggested that the L166P and L172Q mutations can alter the structure and function of PARK7, which might be responsible for disease progression. A detailed understanding of PARK7 mutations may help in designing therapeutic strategies for associated diseases, including Parkinson's disease.


Supplementary Materials: The following are available online at https://www.mdpi.com/article/10.3390/jpm12020220/s1. Table S1: Sequence-based analysis of 152 mutations of PARK7 protein, Table S2: Structure-based analysis of 76 mutations of PARK7 protein.

The authors declare no conflict of interest.