key: cord-0885770-6ygwx0rh authors: Whitefield, Cassidy; Hong, Nansook; Mitchell, Joshua A.; Jackson, Colin J. title: Computational design and experimental characterisation of a stable human heparanase variant date: 2022-02-15 journal: RSC Chem Biol DOI: 10.1039/d1cb00239b sha: 58d512fa45fa1fb9e67a187934edebd7d08d7ea3 doc_id: 885770 cord_uid: 6ygwx0rh

Heparanase is the only human enzyme known to hydrolyse heparan sulfate and is involved in many important physiological processes. However, it is also upregulated in many disease states, such as cancer, diabetes and Covid-19. It is thus an important drug target, yet the heterologous production of heparanase is challenging and only possible in mammalian or insect expression systems, which limits the ability of many laboratories to study it. Here we describe the computational redesign of heparanase to allow high-yield expression in Escherichia coli. This mutated form of heparanase exhibits essentially identical kinetics, inhibition, structure and protein dynamics to the wild type protein, despite the presence of 26 mutations. This variant will facilitate wider study of this important enzyme and contributes to a growing body of literature showing that evolutionarily conserved and functionally neutral mutations can have significant effects on protein folding and expression.

Heparan sulfate (HS) consists of 1-4 linked disaccharide units that are negatively charged and structurally heterogeneous due to variable sulfation, deacetylation and epimerization during biosynthesis [1]. HS is often covalently linked to proteins and peptides to form heparan sulfate proteoglycans (HSPGs) [2]. HSPGs are themselves a major component of the extracellular matrix (ECM) and basement membranes, forming a protective barrier by interacting with other major components of the ECM such as collagen, fibronectin and laminin.
Their structural diversity and negative charge attract various cationic proteins and water, forming porous hydrogels that are able to store bioactive molecules including growth factors [3,4], chemokines [5] and enzymes [6].

Heparanase (HPSE) is the only mammalian enzyme that is known to hydrolyse HS [7-10]. In adults, HPSE is normally expressed at low levels and is found only in platelets, immune cells and the placenta [11-13], but increased expression of HPSE has been observed in many disease states, including cancer and Covid-19 [14-16]. When overexpressed, HPSE catalyses the hydrolysis of HS, resulting in weakening of the ECM barrier, which can promote inflammation [17,18], cancer cell invasion, growth and migration [19,20], as well as angiogenesis [21,22]. HPSE is also associated with tumour initiation by up-regulating proinflammatory mediators [23,24]. Moreover, animal studies have shown that HPSE genetic knock-outs improve cancer prognosis and increase survival without significant side effects [25,26].

Owing to its roles in many diseases, HPSE has been a drug target for many years. For instance, HPSE expression promotes resistance to chemotherapy, whereas targeting HPSE with inhibitors can overcome chemoresistance and tumour relapse [27]. Indeed, many groups have attempted to produce drug-like HPSE inhibitors over recent decades [1,26,28,29]. However, HPSE production currently relies on complex and expensive eukaryotic expression systems such as mammalian [7,8,30] and insect cells [31,32]. While some prokaryotic HPSE expression methods have been reported [33,34], they have not been sufficiently robust for widespread adoption. HPSE has many features that are known to reduce soluble expression in prokaryotic systems such as Escherichia coli, including multiple disulfide bonds and large positive regions on the surface [35-37], as well as N-glycosylation.
[22,38] Moreover, HPSE is natively expressed as a pre-proheparanase which undergoes proteolytic cleavage of a signal peptide and then a linker segment, resulting in an active heterodimer composed of 8 kDa and 50 kDa subunits (Fig. 1) [39]. Thus, in prokaryotic expression systems the 8 kDa and 50 kDa subunits have to be expressed separately and assemble into a heterodimeric complex [33,34].

Many experimental and computational methods have been developed to improve enzyme function and stability, including bioinformatics-based approaches such as consensus design [40,41] or ancestral sequence reconstruction [42,43], and forcefield-based approaches such as Rosetta [44] or FoldX [45]. However, both approaches have limitations. The Protein Repair One Stop Shop (PROSS) algorithm combines forcefield-based Rosetta modelling and phylogenetic sequence information to create variants with improved stability [46]. Here, we describe the use of PROSS to generate the first stable human HPSE variant to be expressed in E. coli. We demonstrate that it has significantly increased solubility, very similar catalytic activity and identical inhibition by competitive inhibitors when compared to wild type human HPSE produced from mammalian cells. Our results are supported by an X-ray crystal structure and molecular dynamics simulations, which demonstrate that the introduced mutations stabilise HPSE with almost no effect on the three-dimensional structure or dynamics. This mutant HPSE should significantly reduce the costs and technical barriers to the development of HPSE inhibitors and to the widespread study of the enzyme.

We first tested bacterial expression of human HPSE (wild type) by cloning the two subunits (8 kDa and 50 kDa, Tables S1 and S2, ESI†) into a dual expression vector (pETDuet-1).
To optimize the chances of obtaining soluble, properly folded protein, we co-expressed the protein with chaperones (trigger factor and GroEL/GroES) [47-49] and used E. coli SHuffle T7 Express cells [50], which allow disulfide bonds to form in the cytosol. Under these conditions, the 50 kDa subunit was totally insoluble, while the 8 kDa subunit was partially soluble (Fig. S1, ESI†).

Given that the molecular structure of HPSE has recently been solved [31,32], it is now an appropriate candidate to be engineered to allow expression in simple and inexpensive expression systems, such as E. coli. Recently, the PROSS algorithm [46] has demonstrated its utility in designing stable variants of challenging proteins for soluble and functional expression in bacteria [51-53]. Unlike conventional consensus mutagenesis approaches, in which poorly conserved residues are mutated back to their consensus identity (from a multiple sequence alignment) [54], PROSS combines this approach with computational modelling in Rosetta [55], generating a set of variants, each containing multiple mutations that ideally act together to increase stability [46]. We therefore used PROSS to redesign the insoluble 50 kDa subunit based on the crystal structure of the insect cell expressed human HPSE (PDB ID: 5E9C). The substrate binding site and the heterodimer interface residues were restrained to maintain function and preserve the interaction with the 8 kDa subunit. Seven variants with accumulated mutations were generated (Fig. S2, ESI†), which were subsequently synthesized and sub-cloned into multiple cloning site 2 of the pETDuet-1 vector. The 8 kDa subunit with an N-terminal polyhistidine tag was sub-cloned into multiple cloning site 1. All variants were tested (Fig. S3, ESI†) and the most soluble design (HPSE P6), containing 26 amino acid substitutions, was identified and purified using Ni2+-NTA, heparin and size exclusion chromatography.
This resulted in pure, homogeneous, heterodimeric HPSE with a final yield of 4 mg from 1 L of E. coli culture (Fig. 2). Notably, PROSS is not infallible; many of the designs did not produce soluble protein, which emphasises the need to test multiple different variants.

Fig. 1 (A) In mammalian cells, pre-proheparanase (Met1-Ile543) undergoes successive cleavage of the N-terminal signal peptide (Met1-Ala35) and linker (Ser110-Gln157, red cartoon) segments to produce mature HPSE. The resulting heterodimeric assembly of two subunits (8 kDa subunit (Gln36-Glu109, yellow cartoon) and 50 kDa subunit (Lys158-Ile543, blue cartoon)) consists of a (β/α)8 TIM barrel and a β-sandwich fold. The sequence of HPSE is shown on top as a bar representation in which glycosylation sites are shown as green sticks and cysteine residues are indicated by black arrows. In non-mammalian systems, active protein must instead be produced via co-expression of the two subunits [31]. (B) Crystal structure of human HPSE expressed in an insect expression system (PDB ID: 5E8M), shown as grey cartoons (bottom-left). Six N-glycosylation sites (Asn162, 178, 200, 217, 238, 459) are shown as green sticks and four cysteines (Cys179, 211, 437, 542) are shown as yellow sticks, two of which form a disulfide bond (Cys437-Cys542) in the β-sandwich domain. Catalytic residues (Glu343, Glu225) at the TIM face are shown as red sticks. (C) Electrostatic potential surface calculated with APBS using the amino acid residues in the crystal structure (glycans were not included in the calculation). This shows two large positively charged patches, at the TIM domain and at the β-sandwich domain, which may promote aggregation in the nucleic acid rich microenvironment [37].

Given the large number of mutations and the loss of glycosylation sites, it was important to test whether these changes had any effect on the activity of the protein.
The catalytic activity of purified HPSE P6 was tested by a colorimetric assay using fondaparinux [56] (Arixtra), a synthetic analogue of HS. Although the catalytic rate (kcat) of HPSE P6 was slightly (less than 2-fold) higher (kcat = 2.94 ± 0.13 s⁻¹) than that of HPSE WT (kcat = 1.72 ± 0.07 s⁻¹), the binding affinity (KM = 11.6 ± 2.8 µM and 11.83 ± 2.7 µM, respectively) was the same, demonstrating that the introduced mutations have no effect on the interaction between enzyme and substrate (Fig. 2C, Table 1). In fact, the slight increase in kcat is most likely due to the higher purity of HPSE P6 compared to the commercially available HPSE WT. The loss of the six glycosylation sites had no effect on the activity of the enzyme, suggesting that these sites may instead be important for protein solubility in mammalian systems.

Having established that the enzyme kinetic parameters are comparable to those of HPSE WT, we then tested whether the HPSE P6 variant would interact identically with heparan sulfate mimetic inhibitors, in this case pentosan polysulfate [57] (Fig. 2C and D). As with the enzyme kinetics, the inhibitory response to the model inhibitor pentosan polysulfate was near identical between HPSE WT and HPSE P6, with IC50 values of 12.46 ± 1.26 nM and 12.43 ± 2.47 nM, respectively (Fig. 2D).

The thermal stability of HPSE P6 was measured using circular dichroism (CD) by observing the loss of helicity at 222 nm over 20-90 °C. This revealed that HPSE P6 is somewhat thermostable, undergoing a transition to an unfolded state with a Tm value of 63.6 ± 0.19 °C (Fig. 3A). This Tm value is similar to those of other engineered variants of human proteins produced using PROSS [46,53], and significantly exceeds the normal temperature that human proteins are exposed to (~37 °C).
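Returning to the kinetic parameters reported above, the size of the difference between the two enzymes can be checked with a few lines of arithmetic. The following is a minimal sketch that uses only the kcat and KM values quoted in the text; the `KineticParams` class and its method names are illustrative assumptions, not part of any analysis code from the study. Because only ratios are compared, the result does not depend on the concentration unit of KM.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class KineticParams:
    """Michaelis-Menten parameters for one enzyme (illustrative helper, not from the paper)."""
    name: str
    k_cat: float  # turnover number, s^-1
    k_m: float    # Michaelis constant, in the concentration unit reported in the text

    def turnover(self, substrate: float) -> float:
        """Rate per enzyme molecule (s^-1) at a substrate concentration in the same unit as k_m."""
        return self.k_cat * substrate / (self.k_m + substrate)

    @property
    def efficiency(self) -> float:
        """Catalytic efficiency k_cat / K_M."""
        return self.k_cat / self.k_m


# Mean values quoted in the text (uncertainties omitted for simplicity).
P6 = KineticParams("HPSE P6", k_cat=2.94, k_m=11.6)
WT = KineticParams("HPSE WT", k_cat=1.72, k_m=11.83)

kcat_fold = P6.k_cat / WT.k_cat
efficiency_fold = P6.efficiency / WT.efficiency

print(f"kcat fold change (P6/WT): {kcat_fold:.2f}")
print(f"kcat/KM fold change (P6/WT): {efficiency_fold:.2f}")
```

Both ratios come out at roughly 1.7, consistent with the "less than 2-fold" difference described in the text; since the two KM values agree within their reported errors, essentially all of the difference comes from kcat.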
Fig. 2 (caption fragment) ... and HPSE WT, where the catalytic rate (kcat) of HPSE P6 was 70% higher (kcat 2.94 ± 0.13) compared to WT (kcat 1.72 ± 0.07). The binding affinity (KM 11.6 ± 2.8 µM and 11.83 ± 2.7 µM, respectively) was the same. (D) Pentosan polysulfate was used to compare the design with human HPSE expressed in mammalian cells. This was measured using the colorimetric method with fondaparinux. Error bars represent the standard error from a minimum of three measurements.

To understand how the 26 mutations in HPSE P6 result in enhanced protein folding and stability, we solved the crystal structure of HPSE P6 at 1.30 Å resolution (Table S3, ESI†). The protein crystallised in the P2₁2₁2₁ space group within 1 day, forming rod-shaped crystals. This compares to WT HPSE crystallising in the P2₁ space group after 1-3 days. Despite 26 mutations, the crystal structure of HPSE P6 shows almost identical overall backbone and active site structures to HPSE WT expressed in an insect system (with the exception of the absence of any glycosylation). The Cα RMSD was 0.645 Å, with an alignment score of 0.017 (Fig. 3C). Many subtle changes were observed due to the 26 mutations. Firstly, surface polarity, which is known to contribute positively to folding and stability [59], was increased by substitution of surface leucine and alanine residues with more polar or smaller amino acids (e.g. Leu197Gly, Leu354Gly, Leu498Gln, Leu230Arg, Ala195Ser) (Fig. 3C). Secondly, additional interactions that stabilise the folded state were introduced, including extended hydrogen bonding networks and improved hydrophobic packing. For example, new hydrogen bonding interactions were introduced by Leu483His, Ile318Thr, Lys477Gln and Ser322Gln, and new hydrophobic interactions were introduced by Ser530Ala, Ser292Ala and Arg307Leu in partially solvent-exposed areas. Interestingly, we observed an indirect conformational change of Phe258 caused by Ser212Ala (Fig. 3C.iv).
Finally, the disulfide bond (Cys437-Cys542) was possibly stabilized by the introduction of a proline at position 540 (Ala540Pro) on the loop (Fig. S4, ESI†).

In previous applications of PROSS, it was noted that large positively charged patches, which could promote aggregation in the nucleic acid rich microenvironment [37], were eliminated or reduced [60,61]. In the case of HPSE, the electrostatic potential surface of HPSE WT shows two large positive patches, around the active site and the β-sandwich domain (Fig. 3B). For HPSE P6, the theoretical isoelectric point was the same as that of the wild type (pI 9.4). The electrostatic surface potential shows that while the large positive patch around the β-sandwich domain was slightly diminished by two lysine mutations (Lys427Asp and Lys477Gln), the electrostatic potential around the active site at the TIM face was maintained, as this area was constrained during the design process (Fig. 3B).

It has previously been shown that the dynamics and function of similar proteins can be very different despite their ground state structures appearing very similar in terms of Cα RMSD [62]. Crystallographic B-factors are commonly used to probe differences in the conformational flexibility of proteins within a crystal lattice, although this approach can be limited by crystallographic artifacts, whereby flexible regions on the protein surface are stabilized by interactions with symmetry mates in the lattice. Comparison between the B-factors of HPSE WT and HPSE P6 reveals that the overall trend in terms of regions with high or low B-factors is conserved, although the P6 variant shows a decrease in overall B-factors in the (β/α)8 TIM domain, mostly around the surface loops of the active site (Fig. S4A, ESI†). However, this analysis is confounded by the higher resolution, lower Wilson B-factor and different crystal packing of the HPSE P6 variant.
To complement the structural analysis, we also performed molecular dynamics simulations to examine the effects of these mutations on the conformational sampling and motions of the protein. To identify whether the dynamic range of HPSE P6 is the same as that of HPSE WT, a total simulation time of 1 µs per protein was completed. Principal component analysis was conducted to visualize the motions that represent the major fluctuations of the system. Principal components 1 and 2 of HPSE WT and HPSE P6 (10.4% and 9.0%) overlap, demonstrating that the breathing motion of the active site is conserved (Fig. 4A). The third major component, which contributes only 6.5% of the total movement of the protein, shows slight differences and is comprised predominantly of the movement of surface-exposed loops. No other principal component showed any difference between the two proteins (up to 20 components).

Root mean square fluctuations (RMSF) were also analysed to identify the displacement of amino acids over the course of the simulation (Fig. 4B). The average RMSF values and their 95% confidence intervals (where 95% of the residue's displacement occurs in that region) are overlaid for both proteins. This demonstrates that the HPSE WT and P6 fluctuations overlap closely. There were very few differences overall; the most consistent change is a decrease in the magnitude of surface loop fluctuations for HPSE P6 in comparison to the WT simulations. Even though these residues have a very slightly decreased magnitude, the RMSF profile still has the same overall shape. The only residues with RMSF values outside of the 95% CI are residues 488-495. This is a surface loop on the β-sandwich domain with two introduced mutations: Leu483His and His486Asp. These mutations allow an increase in hydrogen bonding, causing a slight rigidification of this loop (Fig. 4C). Overall, the conservation of the protein dynamics despite 26 mutations is striking and unexpected.
Indeed, the great majority of these mutations (identified as grey spheres in Fig. 4C) do not cause any significant difference in the dynamics of the protein. This analysis is also fully consistent with the functional data, which revealed almost no effect of the mutations on activity or inhibition.

Despite the widespread interest in HPSE as an important enzyme in human physiology and a drug target, the difficulty of obtaining large quantities of pure recombinant protein has limited the ability of many groups to study the protein. Bacterial expression systems, such as E. coli, are widely accessible and allow protein to be expressed in high yields and at low cost. While prokaryotic HPSE expression methods have been reported [33,34], none have been repeated in the literature or widely adopted. Here, a stable version of human HPSE has been computationally designed, which allowed the mature heterodimeric enzyme to be expressed at reasonable yield in soluble form in E. coli. The subsequent characterization of this designed version (HPSE P6) showed that the enzyme behaves essentially identically to HPSE WT in terms of its interactions with substrates (Table 1) and inhibitors (Fig. 2D). Thus, this computationally designed HPSE P6 variant should be a useful surrogate for the wild-type enzyme in structural biology, inhibitor screening and kinetic analyses.

It is notable that despite 26 mutations, the enzyme is essentially structurally isomorphous to the wild type, with no significant changes to the Cα backbone or side chain rotamer sampling, especially in the vicinity of the active site. Additionally, the dynamics of the enzyme were largely identical to those of the wild-type enzyme, suggesting that there were no significant changes to the relative stabilities of different conformational substates.
This reinforces the functional neutrality of many mutations and the power of bioinformatics-inspired approaches such as consensus design and PROSS; these mutations were acquired through phylogenetic analysis, i.e., they are known to be tolerated in related enzymes. Indeed, while their individual effects might be small, the summation of the effects can become considerable. However, the route to HPSE P6 was not simple or trivial; P6 was the only design of the seven that we tested that was effective. Thus, while the combinatorial effects of the mutations can be powerful, the unpredictable effects of the mutations, and their epistatic interactions, make it imperative that a range of designs is trialled.

Our structural analysis of HPSE P6 shows that many of the mutations appear to have effects that can be rationalised in terms of our understanding of how proteins fold: increasing surface polarity, forming additional non-covalent interactions such as hydrogen bonds, increasing packing within the hydrophobic core, etc. The lack of major structural changes, such as strong salt bridges or significant changes to internal cavities, which are characteristic of rational or computationally designed stabilising mutations, meant that the structural dynamics of the protein, and thus its catalytic function, were largely unperturbed. This study is thus an interesting example of protein stabilisation: on the one hand, 26 mutations could be considered a large number, but the counter argument is that introducing 26 functionally neutral mutations that have almost no effect on the structure and dynamics is in fact a very conservative method for stabilizing a protein, in comparison to a smaller number of mutations that might have a larger effect on the structure, dynamics and function of the enzyme.

Chain A of the crystal structure of the ligand-bound human HPSE (PDB ID: 5E9C) was submitted to the PROSS stability design algorithm [46] on the web server (http://pross.weizmann.ac.il), with the residues that have contacts with the ligand (Dp4) and with chain B constrained. This generated 7 mutants.

The linear 8 kDa (Gln36-Glu109) and 50 kDa (Lys158-Ile543) subunits of human HPSE were E. coli codon optimized and synthesized by IDT (Australia). The seven PROSS designs were E. coli codon optimized and synthesized by Twist Bioscience. The 8 kDa subunit was amplified and sub-cloned into multiple cloning site 1 of the linearized pETDuet-1 vector (Novagen) through the BamHI and NotI restriction sites (Fast Digest, Thermo) by Gibson one-step isothermal assembly [63]. The resulting plasmid DNA was linearized using NdeI and XhoI restriction enzymes (Fast Digest, Thermo) and the designs were inserted into multiple cloning site 2 by Gibson assembly [63]. The assembled DNA was transformed into E. coli TOP10 cells, and the plasmid DNA was extracted and sent to the Garvan Institute (Australia) for Sanger sequencing to confirm the sequences.

The wild type and the 7 designs were transformed into E. coli SHuffle T7 Express cells (NEB), together with different combinations of chaperones in a pACYC vector, and spread on an agar plate with ampicillin and chloramphenicol. A 1% overnight seed culture from a single colony was inoculated into 1 L of LB medium supplemented with ampicillin (100 mg L⁻¹) and chloramphenicol (34 mg L⁻¹), then incubated at 37 °C for 5 hours. Overexpression was induced by adding IPTG to a final concentration of 0.05 mM and the culture was further incubated for 3 hours at 37 °C. The cell pellets were resuspended in buffer A (20 mM HEPES pH 8, 300 mM NaCl, 5 mM β-mercaptoethanol, 10% glycerol, 0.05% Tween, 20 mM Imidazole) with Turbonuclease (Sigma) and lysed by sonication (Omni Sonic Ruptor 400 Ultrasonic homogenizer). The lysate was filtered (0.45 µm), loaded onto a Ni-NTA column (GE Healthcare) and eluted with 100% buffer B (buffer A + 500 mM Imidazole).
The peak eluent was diluted 5 times with buffer C (20 mM HEPES pH 7.4, 200 mM NaCl, 1 mM DTT, 10% glycerol, 0.05% Tween20), loaded onto a heparin affinity column (GE Healthcare) and eluted with 100% buffer D (buffer C + 1.5 M NaCl). The peak eluent was loaded onto a size exclusion column (HiLoad 26/600 Superdex 200 pg, GE Healthcare) and eluted in buffer E (20 mM HEPES pH 7.4, 200 mM NaCl, 10% glycerol, 1 mM DTT, 0.05% Tween20). The final concentration of the monomeric heparanase from the gel filtration was estimated by absorbance at 280 nm using a NanoDrop One (Thermo), and the yield was more than 2 mg per litre of LB culture.

Assays were conducted using the colorimetric assay designed by Hammond et al. [56] The plates were resealed and developed at 60 °C for 60 minutes, and the absorbance was measured at 584 nm. Kinetics were carried out with a standard curve constructed with D-galactose as the reducing sugar standard, prepared in the same buffer and volume over the range of 0-2 mM. All curve fitting to calculate IC50 values and Michaelis-Menten constants was done using GraphPad Prism software (version 8.1).

The size exclusion fraction was used directly to measure CD on a Chirascan CD spectrometer (Applied Photophysics). The thermal stability of the protein (0.15 mg mL⁻¹) was measured over a temperature range of 20-90 °C by monitoring the ellipticity at 222 nm using a cuvette with a 1 mm path length. Data analysis was performed using GraphPad Prism, within which the mid-point of the melting curve was calculated using a Boltzmann sigmoid equation.

Well-diffracting single crystals were obtained by the hanging-drop vapor-diffusion method at 18 °C by combining the protein (6-8 mg mL⁻¹) and the well solution (1.9 M (NH4)2SO4) at a ratio of 1.5 : 1.5 µL. Crystals appeared within a week and continued to grow for 1-2 months. The crystal was cryoprotected by adding 30% glycerol to the mother liquor before flash freezing in liquid nitrogen.
Crystallographic data were collected at 100 K at the Australian Synchrotron (MX2 [64], 0.9537 Å). The diffraction data were indexed and integrated with XDS [65]. Resolution estimation and data truncation were performed using the AIMLESS program in CCP4 [66] on the basis of the overall half-dataset correlation of the datasets (a CC1/2 value of 0.3) [67]. All structures were solved by molecular replacement using the Molrep program in CCP4 [66], with the structure deposited under PDB accession code 5E9M as a starting model. The models were refined using phenix.refine [68] and subsequently optimized by iterative model building with the program COOT v0.8 [69]. Alternative conformations were modelled based on mFo-DFc density, and the occupancies and B-factors were determined using phenix.refine [68]. The structures were then evaluated using MolProbity [70] in Phenix. Details of the refinement statistics were produced by Phenix v1.17 [71] and are summarized in Table S3 (ESI†). The structures were visualized and analysed using PyMOL v2.3 [72] or Maestro [73]; the APBS [58] program in PyMOL was used to calculate the electrostatic potential and the protein alignment program in Maestro was used to calculate the Cα RMSD.

Molecular dynamics simulations were performed using the GROMACS 2018.8 engine with parameters from the CHARMM22* force field [74,75]. All chain termini were capped with neutral acetyl or methylamide groups. Protonation states were assigned with the PDB2PQR server for pH 5.0 [58]. Completed structures were solvated with the TIP3P water model [76] using a rhombic dodecahedron simulation box with a minimum distance of 12 Å between the protein and the box edge, followed by the addition of 200 mM NaCl to the aqueous phase and sufficient ions to neutralise the system charge.
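As a rough illustration of the solvation step described above, the number of NaCl ion pairs needed to reach 200 mM can be estimated from the box volume. The sketch below is an assumption-laden estimate, not the procedure used in the study: the protein extent of 7 nm is a hypothetical placeholder (box dimensions are not reported in the text), the function names are illustrative, and the volume occupied by the protein is ignored, so the result is an upper bound that excludes the extra neutralising ions. It relies on the standard geometric result that a rhombic dodecahedron with periodic image distance d has volume (√2/2)·d³.

```python
import math

AVOGADRO = 6.02214076e23   # particles per mole
LITRES_PER_NM3 = 1.0e-24   # 1 nm^3 = 1e-24 L


def dodecahedron_volume_nm3(image_distance_nm: float) -> float:
    """Volume of a rhombic dodecahedron with periodic image distance d: (sqrt(2)/2) * d^3."""
    return math.sqrt(2.0) / 2.0 * image_distance_nm ** 3


def ion_pairs(conc_molar: float, volume_nm3: float) -> int:
    """Salt ion pairs needed for conc_molar in volume_nm3, rounded to the nearest integer."""
    return round(conc_molar * volume_nm3 * LITRES_PER_NM3 * AVOGADRO)


# Hypothetical protein extent (assumption, not from the text),
# padded by 1.2 nm (12 Å) of solvent on each side.
protein_extent_nm = 7.0
image_distance = protein_extent_nm + 2 * 1.2
volume = dodecahedron_volume_nm3(image_distance)

print(f"box volume: {volume:.0f} nm^3")
print(f"NaCl pairs for 200 mM (upper bound): {ion_pairs(0.2, volume)}")
```

The rhombic dodecahedron is a common choice for roughly globular proteins because, for the same image distance, it encloses about 71% of the volume of a cubic box and therefore needs fewer water molecules.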
Simulation systems of HPSE WT and HPSE P6 were relaxed using standard steepest descent minimization for at least 10 000 steps before being equilibrated for 1 ns in the isothermal-isobaric (NPT) ensemble to stabilize the system. Ten replicates of each system were simulated for 100 ns under NPT. Periodic boundary conditions were used, and long-range electrostatics were calculated using the particle-mesh Ewald method with a cutoff of 1.2 nm [77]. Non-bonded interactions were evaluated using a Verlet cut-off scheme. The temperature in all simulations was set to 300 K and controlled via the Bussi-Donadio-Parrinello stochastic velocity rescaling thermostat [78]; the initial velocities of all particles were pseudo-randomly generated. Pressure coupling was handled with the Berendsen barostat during equilibration and the Parrinello-Rahman barostat for production [79,80]. The LINCS (Linear Constraint Solver) algorithm was used to constrain bonds involving hydrogen, in conjunction with an integration time step of 2 fs [81]. Constraints were applied to the starting configuration of the production run. Analyses of the simulations were performed using the tools provided in the GROMACS package. Data were collected from the last 90 ns of each production simulation, as the RMSF had stabilised by this time. Principal component analysis was performed using the MDTraj Python library and the scikit-learn machine learning tool [82,83]. Using the aligned and concatenated trajectory, a merged dataset was created, from which the WT and P6 systems were projected. Data were plotted in GraphPad Prism.

Coordinates and structure factors have been deposited in the Protein Data Bank under accession code PDB 7RG8.

This study describes the production of a new variant of HPSE that is functionally identical to the wild-type protein in terms of activity, inhibition, structure and dynamics, is easily expressed in E. coli, and crystallises within a day, yielding high resolution crystals.
This protein should make the study of HPSE function and the development of inhibitors significantly easier and less expensive. It is notable that the large number of mutations in HPSE P6 were functionally neutral. This contributes to a growing understanding of the relationship between protein sequence and folding, in which evolutionarily conserved and functionally neutral consensus-like mutations can significantly affect the efficiency of protein folding and expression, as well as protein thermostability.

CJ has received funding from Beta Therapeutics to work on heparanase inhibitors.

We acknowledge the ARC Centre of Excellence for Innovations in Peptide and Protein Science (CE200100012), the ARC Centre of Excellence in Synthetic Biology (CE200100029) and an ARC Linkage Grant (LP160101552). We thank the staff of the MX2 beamline at the Australian Synchrotron, part of ANSTO, which made use of the Australian Cancer Research Foundation (ACRF) detector. The table of contents entry was created with BioRender.com.