key: cord-0004542-qxzuplrf authors: Sharma, Ashish K.; Phue, Jenie; Khatipov, Emir; Dalal, Nimish; Anderson, Eric D.; Shiloach, Joseph title: Effect of restricted dissolved oxygen on expression of Clostridium difficile toxin A subunit from E. coli date: 2020-02-20 journal: Sci Rep DOI: 10.1038/s41598-020-59978-1 sha: 28816d307a3147a3f13699b2716419e2836a603e doc_id: 4542 cord_uid: qxzuplrf The repeating unit of the C. difficile Toxin A (rARU, also known as CROPS [combined repetitive oligopeptides]) C-terminal region, was shown to elicit protective immunity against C. difficile and is under consideration as a possible vaccine against this pathogen. However, expression of recombinant rARU in E. coli using the standard vaccine production process was very low. Transcriptome and proteome analyses showed that at restricted dissolved oxygen (DO) the numbers of differentially expressed genes (DEGs) was 2.5-times lower than those expressed at unrestricted oxygen. Additionally, a 7.4-times smaller number of ribosome formation genes (needed for translation) were down-regulated as compared with unrestricted DO. Higher rARU expression at restricted DO was associated with up-regulation of 24 heat shock chaperones involved in protein folding and with the up-regulation of the global regulator RNA chaperone hfq. Cellular stress response leading to down-regulation of transcription, translation, and energy generating pathways at unrestricted DO were associated with lower rARU expression. Investigation of the C. difficile DNA sequence revealed the presence of cell wall binding profiles, which based on structural similarity prediction by BLASTp, can possibly interact with cellular proteins of E. coli such as the transcriptional repressor ulaR, and the ankyrins repeat proteins. At restricted DO, rARU mRNA was 5-fold higher and the protein expression 27-fold higher compared with unrestricted DO. The report shows a strategy for improved production of C. difficile vaccine candidate in E. coli by using restricted DO growth. This strategy could improve the expression of recombinant proteins from anaerobic origin or those with cell wall binding profiles. . Bioprocess parameters, culture kinetics, rARU production and data processing to identify DEGs. Gene enrichment of DEGs from venn analysis. The up-regulated and down-regulated gene sets from the restricted and unrestricted aeration were analyzed using Venn diagram. When grown at unrestricted DO, 1564 DEGs were found to be up-regulated, and of these, 1208 were differentially expressed only in the unrestricted DO. While 349 of the 1564 DEGs were also found to be up-regulated at restricted DO, only 7 up-regulated DEGs were found to be down-regulated at restricted DO. A significantly lower number of genes were differentially expressed uniquely in the restricted DO culture; only 224 were down-regulated and 154 were up-regulated (Fig. 2B) . Gene enrichment analysis was conducted on the gene sets extracted from the Venn analysis and only pathways with a p-value < 0.05 were considered (Supplementary Table 2 ). The 1208 genes that were up-regulated at unrestricted DO were associated with carbohydrate metabolism, pyruvate metabolism, iron sulfur cluster, sulfur relay system, osmotic stress, and oxidative stress. yggE is a periplasmic protein, which is associated with the inner membrane and found to up-regulate under oxidative stress 32 . See Supplementary Table 1 for the complete list of pathways with their enrichment scores and p-values as an outcome of gene enrichment analysis. The 349 up-regulated DEGs that were identified in both unrestricted and restricted DO were involved in catabolic processes (e.g., organic acids, amino acids, small molecules), fatty acid metabolism, Tricarboxylic acid cycle (TCA), and amino acid degradation. Seven up-regulated DEGs out of the 1564 were down-regulated in restricted DO. Of these 7, 5 genes were related to membrane formation. From the 1618 differentially down-regulated genes that were identified in the unrestricted DO culture, 1066 were down-regulated only in unrestricted DO (and not in the restricted DO). Therefore, it is possible that the lower activities of these DEGs were associated with the lower expression of the protein at unrestricted DO. Most of these 1066 DEGs were related to cellular processes like ribosome formation (52 DEGs with an enrichment score of 26.9), aminoacyl-tRNA biosynthesis (19 DEGs with an enrichment score of 13.2), biosynthesis of amino acids (48 DEGs with an enrichment score of 5.8), RNA polymerase (4 DEGs with an enrichment score of 5.8), energy generating processes such as oxidative phosphorylation (21 DEGs with an enrichment score of 5.8), and protein export (10 DEGs with an enrichment score of 3.5) were significantly down-regulated. This list of 1066 DEGs indicated that translation, amino acid biosynthesis, and energy generating processes were affected at the unrestricted DO condition since all these functions were down-regulated. From the same group of 1618 differentially down-regulated genes identified in the unrestricted DO, 508 were down-regulated in both unrestricted and restricted DO cultures. These DEGs were associated with glycerophospholipid metabolism (5 DEGs with an enrichment score of 4.9), purine metabolism (8 DEGs with an enrichment score of 3.5), pyrimidine metabolism (6 DEGs with an enrichment score of 2.9), and biosynthesis of secondary metabolites (15 DEGs with an enrichment score of 1.3). Another group of genes from the 1618 differentially down-regulated genes at unrestricted DO were the 44 genes up-regulated in the restricted DO. The top pathways associated with this group belonged to the TCA cycle (8 DEGs with an enrichment score of 19.8), carbon metabolism (9 DEGs with an enrichment score of 10.4), oxidative phosphorylation (4 DEGs with an enrichment score of 5.5), fatty acid biosynthesis (2 DEGs with an enrichment score of 3.9), ribosome formation (4 DEGs with an enrichment score of 3.3), and biosynthesis of secondary metabolites (9 DEGs with an enrichment score of 3.1). This group of 44 genes is an indication that translation and energy generating pathways were down-regulated in the unrestricted DO while the same cellular processes were up-regulated in the restricted DO. From the 1286 DEGs at restricted DO, 547 were up-regulated and 154 of the 547 were exclusively up-regulated in restricted DO. These 154 DEGs were associated with energy generating pathways such as oxidative phosphorylation (6 DEGs with an enrichment score of 6.4), TCA cycle (3 DEGs with an enrichment score of 3.0), and carbon metabolism (7 DEGs with an enrichment score of 3). From the same 1286 DEGs, 224 that were exclusively down-regulated at the restricted DO were also overrepresented in processes related to ribosome formation (7 DEGs with an enrichment score of 5.4), biosynthesis of siderophore group non-ribosomal peptide (2 DEGs with an enrichment score of 4.2), glycerolipid metabolism (2 DEGs with an enrichment score of 3.2), and biosynthesis of secondary metabolites (13 DEGs with an enrichment score of 3.0). Pathway analysis in E. coli producing rARU at restricted and unrestricted DO. Pathway analysis of the genes at log and late-log phases of the cultures grown at restricted and unrestricted DO was identified by the microarray. It was conducted by determining the overall gene expression flux through pathways using the omics dashboard tool of EcoCyc 33 . The differential gene expression was estimated by applying a 1-way ANOVA on samples from log and late-log phases of the restricted and unrestricted DO. After filtering out the non-significant genes, the identified genes were grouped into major and sub-major pathways by EcoCyc. The change in the gene expression at log and late-log phases of cellular functions related to biosynthesis, energy metabolism, central dogma, heat and osmotic stress, and cell structural components at both aeration conditions are shown in Fig. 3 . The small dot represents fold change expression values for subsystem or individual genes within the subsystem, while the large dot represents the average (mean) fold change expression values for all the subsystems in that pathway. When the gene expression distribution in the log phase was compared, most pathways behaved similarly at both restricted and unrestricted DO, which matched the correlation shown in the PCA among the principal components of the microarray gene expression profiles. As seen in Fig. 3 , the large dots were close to the baseline representing no-significant difference when gene expression of the log phase of restricted and unrestricted DO was compared. However, when the gene expression distribution in the late-log phase was compared, significant differences associated with energy metabolism pathways were observed. A 4-fold increase in expression of genes associated with the TCA cycle and a 2-fold increase in expression of genes associated with aerobic respiration in late-log restricted DO was observed as compared with the late-log phase at unrestricted DO. At the same time, no significant change in the gene expression of glycolysis, pentose phosphate pathway (PPP), and fermentation was observed when the restricted and unrestricted DO conditions were compared. However, when processes such as translation, RNA metabolism, protein metabolism, and protein folding/ secretion were evaluated, there was an increase in overall gene expression of ~3.5, ~1, ~2 and ~1.5-fold, respectively at restricted DO compared with unrestricted DO. Heat stress related genes showed a ~2-fold increase and cell structural component synthesis processes like cell wall, plasma membrane, periplasmic membrane, and outer membrane biosynthesis genes were differentially expressed. Proteomics analysis. Proteomics analysis was performed on the same samples used for the gene expression analysis. The PCA of the proteomics data showed correlation among the three replicates ( Supplementary Fig. 1 ) and were correlated with the sample distribution that was observed from the gene expression data. The proteomics data were analyzed with a 1-way ANOVA and used a filtration cutoff of log 2 fold change of ≥1.5≤ and p-value of ≤ 0.05. The 167 down-regulated and 45 up-regulated differentially expressed proteins were detected when the culture was transitioned from log to late-log (late-log vs log phase) at unrestricted DO. At restricted DO, 197 differentially expressed proteins were up-regulated and 34 were down-regulated. The gene expression data were used to identify affected pathways, and the proteomics data were used for confirmation at the protein level of identified key genes. The complete proteomics data is provided in Supplementary Table 4 ). The main BLASTp hits were separated into two types: a) single proteins (e.g., choline binding, dextransucrase, autolysin, and endolysin), and b) clostridium Toxin A or Toxin B. Other hits were structures of toxin fragments bound to molecules such as single domain antibodies (e.g., VHH, PA50 Fab), DARPin, the surface protein PspA [Streptococcus pneumoniae R6] (NP_357715.1), and the cell wall protein [Clostridioides difficile 630] (YP_001089224.2). VHH protein, as a known interactant of TcdA, was used to identify proteins in the E. coli that may have similar structures to VHH, and therefore can unexpectedly interact with the recombinant TcdA. Therefore, subsequent BLASTp searches were performed with single domain antibody sequence VHH (accession 4NC0_B) as a query against the UniProtKB/Swiss-Prot (Swiss-Prot) database using DELTA-BLAST algorithm and word size adjusted to 2. These searches retrieved a short fragment of E. coli UlaR protein (accessions P0A9W2.1 and B7UQK0.1) which is a transcription regulator responsible for the utilization of l-ascorbate under anaerobic conditions. By using PSI-BLAST with the A26.8 VHH protein sequence, the following proteins were identified: YahD (accession number: P77736) and ArpA (accession number: P23325) ("ankyrin repeat" proteins), the outer membrane usher protein HtrE (accession P33129.3), Formylglycinamide ribonucleotide amidotransferase (accession Q1R8H7.3), DNA-invertase PinE (accession P03014.2), and Wzc protein (a Tyrosine-protein kinase required for the extracellular polysaccharide colanic acid synthesis with relatively weak yet informative homologies). To assess if the reason for the difference in rARU expression in restricted and unrestricted DO is associated with special properties of the rARU gene, the initial 300 and 600 bp sequence of the rARU gene were fused with the eGFP gene. The resulting constructs were cloned in the pRSETb backbone creating pRSETb-300rARU + eGFP and pRSETb-600rARU + eGFP. The 300 and 600 bp of the rARU contained 3 and 7 cell wall binding motifs respectively. The strain was transformed with the two plasmids and eGFP expression was evaluated at restricted and unrestricted DO. The results in Fig. 4A ,B showed that the expression of both fusion constructs was affected by the DO conditions. Expression of the 300rARU + eGFP and the 600rARU + eGFP fusion proteins were 1.8-and 1.99-fold higher in the log phase in cells grown at restricted DO as compared with the log phase of cells grown at unrestricted DO. Correspondingly, late-log phase showed 2.8and 2.7-fold higher expression at restricted DO as compared with the unrestricted DO condition. The fold change protein expression was calculated using IQTL (ImageQuant TL 8.2) image analysis software. Recombinant rARU from C. difficile was over expressed in E. coli only when the bacteria were grown at restricted DO concentrations. Therefore, an optimized expression was established by supplying the growing culture with 2 vvm air that generated restricted DO conditions and supported ~27-fold higher rARU concentrations. The possibility that the restricted DO conditions affected the T7 promoter was ruled out since at these conditions the expression of the T7 RNA polymerase was not affected (Supplementary Figure 3) . www.nature.com/scientificreports www.nature.com/scientificreports/ Transcriptomic and proteomic response to restricted and unrestricted DO. A larger number of DEGs were down-regulated at unrestricted DO as compared with restricted DO. A total of 1618 DEGs were down-regulated at unrestricted DO and of these, 1066 were down-regulated exclusively at unrestricted DO. Particularly important is the down-regulation of the RNA polymerases (rpoB -5.67-fold, rpoC -4.46-fold, rpoZ -3.39-fold and rpoA -6.58) which are essential for transcription initiation. In contrast, except for rpoD which was down-regulated, rpoA, rpoC, and rpoZ were up-regulated at restricted DO (1.2-fold, 1.0-fold, and 1.31-fold respectively). Since the translation process depends on the availability of ribosomes, amino acids, aminoacyl tRNA, and ATP molecules, it is not surprising that the genes associated with these processes were also differentially down-regulated in the unrestricted DO. This group includes 52 DEGs related to ribosome formation, 19 related to aminoacyl tRNA biosynthesis, 44 associated amino acid biosynthesis, and 17 involved in oxidative phosphorylation. The down-regulation of genes associated with energy related metabolism, central metabolism, and ribosomal RNA were previously shown to be associated with the cellular stress response (CSR) generated during recombinant protein overexpression 35, 36 . The down-regulation of the above-mentioned processes displays a behavior that supports the lower amount of rARU expressed at unrestricted DO. Previous reports showed that the presence of the rarely used codons of arginine and leucine in clostridial proteins is a possible bottleneck in their translations 18, 21, 22 . The aminoacyl tRNA genes for arginine (argS) and leucine (leuS) in late-log phase were up-regulated (f.c.1.9, and 1.67 respectively) at restricted DO condition as compared with unrestricted DO condition. In addition to the down-regulation of the DEGs mentioned above, genes associated with protein folding were also down-regulated at unrestricted DO. This included the 24 heat shock response genes, an indication that, rARU expression was too low to induce protein folding stress. Zhang et al. reported 37 that during recombinant protein expression, there is up-regulation of heat shock genes (chaperones and foldases) which are required for proper folding 37 ; however, a high expression of heat shock response genes was observed only when the cells grew at restricted DO thereby indicating folding stress associated with higher rARU expression. In both unrestricted and restricted DO, 508 DEGs were down-regulated and were associated with glycerophospholipid, purine, and pyrimidine metabolism, and with the biosynthesis of secondary metabolites. The plsY, glpC, glpQ, glpB, and glpD DEGs from the glycerophospholipid metabolism were down-regulated in www.nature.com/scientificreports www.nature.com/scientificreports/ unrestricted DO (−1.84-fold, −4.96-fold, −2.47-fold, −1.62-fold, and −4.99-fold respectively) and in restricted DO (−1.63-fold, −8.69-fold, −4.97-fold, −2.38-fold, and −7.03-fold respectively). Glycerophospholipids are essential components of the bacterial cell wall and fatty acid 38 as well as for glycerol consumption through the glycerol-3-phosphate regulon. The finding that the repressor of the glycerol-3-phosphate regulon (glpR) was up-regulated higher in unrestricted DO (1.28-fold) as compared with restricted DO (1.03-fold) can explain the higher inhibition of glycerol utilization at unrestricted DO. Culture grown at restricted DO showed a smooth transition from log to late-log possibly because 2.5 times less genes were differentially expressed as compared with unrestricted DO growth. Only 224 genes involved in ribosome formation, siderophore, and secondary metabolite biosynthesis were down-regulated when the cells grew in restricted DO. The number of down-regulated ribosome formation related genes were 7.4 times less in restricted DO compared with unrestricted DO, suggesting a possible reason for the increased translation at restricted DO. At the same time, 154 DEGs were up-regulated only at restricted DO (Fig. 2) and these genes were associated with oxidative phosphorylation, TCA cycle, and carbon metabolism. The oxidative phosphorylation genes sdhA (2.63-fold), cyoD (3.15-fold), sdhC (3.66-fold), cyoA (2.03-fold), and sdhB (1.67-fold) were up-regulated at restricted DO whereas sdhA (−1.18-fold), cyoD (−1.06-fold), sdhC (−1.27-fold), cyoA (−1.31-fold), and sdhB (−1.10-fold) were down-regulated at unrestricted DO. As indicated by all the above the transcription, translation, and energy generating pathways were much more active at restricted DO in comparison with the unrestricted DO. Changes in the global regulators. The ArcA/ArcB two-component system involved in sensing oxygen availability, acts by triggering oxidative response, regulates central carbon metabolism, and suppresses the TCA cycle under anaerobic conditions 39, 40 . No significant differences in the expression level of arcA and arcB at the two growth conditions were observed and is an indication that there is no difference in the regulation of the central carbon metabolism and the TCA cycle at restricted and unrestricted DO. The global regulator lrp is a leucine-responsive transcriptional regulator which acts by down-regulating nutrient uptake in rich medium 41, 42 . lrp was up-regulated in restricted DO (1.6-fold), thus reducing nutrient uptake and lowering growth. However, at unrestricted DO, lrp was down-regulated (−2.17-fold) which was associated with higher glycerol consumption and a higher growth rate. The increase in glycerol consumption was also supported by the proteomic data which showed up-regulation of glycerol kinase GlpK (2.17-fold) at unrestricted DO as compared to a 1.07-fold up-regulation at restricted DO condition. The global regulator csrA affects ppGpp expression through relA, which changes σ s behavior to affect the transcription of stress survival genes and DNA replication 43 . csrA was found to be up-regulated at unrestricted DO (2.07-fold higher) than at restricted DO (1.26-fold). Although no changes were observed concerning the relA expression, there is a possibility that differential expression of csrA can affect various cellular processes by altering the translation and stability of target molecules such as tRNA and rRNA 44 . The higher expression of csrA could be the reason for down-regulating translation (ribosomal, tRNA, and amino acids) and energy-related genes at the unrestricted DO condition. The global DNA-binding transcriptional dual regulator fis, was up-regulated at restricted DO (1.92-fold) and down-regulated at unrestricted DO (−4.27-fold). fis acts by controlling gyrase and changing the expression of topoisomerase I 45 and alleviates stress response, energy, and carbon metabolism processes. fis also up-regulates processes involved in flagellar biosynthesis and translation (rRNA and tRNA genes) 46 . Therefore, it is possible that fis down-regulation is the reason for down-regulating 54 genes associated with translation-related processes at unrestricted DO. An additional transcriptional regulator that may have a possible role in different DO conditions is the RNA chaperone hfq. This molecule binds to small RNA and is involved in elongating poly(A) tails and in stabilizing mRNA 47, 48 . However, there is a possibility that hfq can induce mRNA decay by sRNA mediated regulation, which has been shown to be related to different growth conditions 49 . hfq was up-regulated (1.34-fold) at restricted DO thereby potentially enhancing rARU mRNA level, and down-regulated (−1.33-fold) at unrestricted DO possibly lowering rARU mRNA levels A ProSite scan showed that the expressed rARU gene (Supplementary Table 5 (S5a)) contained 31 cell wall binding repeats, each composed of 20 amino acids (QNRFLHLLGKIYYFGNNSKA) 50 belonging to the class PS51170 which has a unique structure. So far, only 15 protein sequences were identified to contain a similar cell wall binding repeat profile (Supplementary Table 5 (S5b)). The Blast search showed proteins with similar structures in E. coli that potentially can bind to rARU (Supplementary Table 4 ). These were: the transcriptional repressor ulaR, the tyrosine protein kinase Wzc, the outer membrane Usher protein HtrE, ankyrin proteins (YahD and ArpA), the phosphoribosylformylglycinamide synthetase PurL, and the site-specific DNA recombinase PinE. Additional interactions with cellular proteins are possible due to the presence of the two cysteine residues in rARU, which can form disulfide bridges at high dissolved oxygen levels. The possibility that higher expression at restricted DO is likely due to the interaction between cell wall binding motifs and cellular proteins and was further tested by the fusion of a partial rARU sequence containing cell wall binding motifs with eGFP. Initially to capture the sequence feature, we fused 600 bp of rARU (with 7 cell wall binding motifs) N-terminal region to eGFP to see if this affects the protein expression. As a result, the rARU expression was higher at restricted DO. Then we tried smaller sequence size of 300 bp (with only 3 cell wall binding motifs), which gave similar results. The expression of eGFP fused to both 300 and 600 bp of rARU at the N-terminal region was higher at restricted DO. The selected 300 and 600 bp from the rARU N-terminal contained respectively 3 and 7 cell wall binding repeats (Fig. 5) , which was likely responsible for the higher expression at restricted DO. Although the mechanism behind this interaction is unknown, it is possible that the interaction of the rARU sequence with cellular proteins triggered a higher expression at restricted DO condition. www.nature.com/scientificreports www.nature.com/scientificreports/ The interaction of rARU with the transcriptional repressor ulaR may influence the biosynthesis of NADH and pentoses by affecting the L-ascorbate uptake 51 . ulaR operates by interaction with L-ascorbate-6-P and by binding to the intergenic region of ulaA and ulaG and inhibition of the ulaABCDEF operon transcription. UlaR was down-regulated in the late-log phase (ulaR f.c. −1.03) at restricted DO, which may cause the up regulation of ulaG. As a result, ulaB, ulaD, ulaE, and ulaF of the ulaABCDEF operon remained active in the late log phase of restricted DO enabling the cells to efficiently utilize L-ascorbate and support PPP to produce NADH and pentoses. The interaction of rARU with the "ankyrin like" proteins of E. coli such as YahD (yahD f.c. 1.25 at log phase and f.c. 1.09 at late-log phase) and ArpA (arpA f.c. −1.05 at log phase and f.c. 1.54 at late-log phase) could also interfere with the integrity of the cell membrane and affect cell growth. The "ankyrin repeat" containing proteins were discovered in mammalian cells and were involved in cell signaling, regulation, and structural integrity. A previous study on the overexpression of an "ankyrin repeat" containing protein (e.g., RNase L) caused RNA degradation and inhibition of cell growth in E. coli 28 . Interestingly, the role of "ankyrin repeat" proteins at signaling and regulatory levels was never explored in E. coli. A list of many other E. coli proteins which may possibly interact with rARU is provided in Supplementary Table 4 . The list includes the Wzc protein (wzc f.c. −1.12 at log phase and f.c. −1.25 at late-log phase) which is involved in making exopolysaccharide colanic acid (M-antigen) which increases E. coli's ability to survive in environmental stresses 52 . Recently, this ability was shown to be regulated through the master regulator csrA 53 , which was up-regulated at unrestricted DO in this study. A schematic representing the state of the major metabolic processes and regulators in E. coli at restricted and unrestricted DO during rARU expression are presented in Fig. 6 . Recombinant rARU, a partial fragment of Toxin A from the anaerobic bacterium C. difficile, was expressed efficiently from E. coli only at low DO conditions. We observed that rARU transcript accumulation was highest only when the culture was grown at a restricted DO concentration. It is very likely that this unusual expression condition is an indication that a special gene sequence and structure of the produced protein are related to the expression. Therefore, significant production was possible only by applying restricted DO growth strategy. A possible reason for this special phenomenon was the presence of cell wall binding repeats motifs that allowed transcription and translation at restricted DO and therefore increased its expression at these conditions. The role of the heterologous protein sequence in the metabolic toxicity due to overexpression of recombinant proteins was reported, but no analysis was performed to understand the functional features associated with the sequence and how it interacted with the host cellular machinery [54] [55] [56] . This report demonstrated a connection between rARU expression and restricted DO, which presents the possibility of the improved expression of other recombinant proteins of anaerobic origin or of proteins that are associated with cell wall binding repeat profiles. The implemented restricted DO expression strategy used in this work will likely improve the expression of other similar proteins in E. coli. Bacterial strain and expression vector. E. coli Tuner (DE3) pLacI/TZ56 [F-ompT hsdSB (rB-mB-) gal dcm lacY1 (DE3) pLacI (CmR)] (Novagen, Madison, WI) was used for the growth and production experiments. Expression plasmid pRSET-B-rARU-KanR containing rARU gene under T7 promoter was transformed in the E. coli strain. The pRSET-B-rARU plasmid possesses kanamycin resistance along with entire C-terminal repeating region of toxin A (rARU) (861 amino acids plus 4 amino acids upstream) having a Molecular weight (MW) of 104 kDa 16 , along with a 6x His tag at the C-terminal. Both plasmid and strain were provided by TechLab, Inc. (Blacksburg, VA). Two fusion constructs were designed consisting of partial N-terminal regions (i.e., 300 bp and 600 bp of rARU with the eGFP). The fusion constructs were renamed as pRSET-B-300rARU + eGFP and pRSET-B-600rARU + eGFP. Samples for rARU, acetic acid, and total RNA were collected at regular intervals. After centrifugation at 14,000 g for 10 min at 4 °C, the supernatant was kept at −20 °C for glycerol and acetate analysis, and the cell pellets were quickly frozen by dry ice and stored at −80 °C for RNA extraction. Biomass was monitored by measuring the OD 600 . Glycerol and acetate concentration were determined using the Cedex Bio HT Analyzer (Roche, Penzberg, Germany). rARU quantification. Culture samples were diluted with 1X PBS to bring the cell concentration to 1.0 OD 600 and the diluted culture was centrifuged at 16,000 g for 5 min, the supernatant was removed, and the pellet was frozen at −20 °C. Each cell pellet was thawed and resuspended in 200 μl of lysis buffer (BugBuster Protein Extraction Reagent, Novagen, Madison, WI). Processed samples were centrifuged at 16,000 g for 20 minutes at 4 °C, and then supernatant was collected. The ELISA based method using monoclonal antibody conjugated to horseradish peroxidase of C. difficile Toxin A and B II enzyme immunoassay kit (TechLab, Inc., Blacksburg, VA) was used for quantitation of rARU in the processed supernatant (Fig. 1B) . Purified rARU (List Biological Laboratories, Inc. Campbell, CA) was used as the standard, and assay plates were read on UV/VIS microplate spectrophotometer (SpectraMax190 Molecular Devices, Sunnyvale, CA) at 450 nm and 620 nm. Total RNA purification and northern blot analysis. Total RNA was isolated with MasterPure RNA Purification Kit (Epicentre Technologies, Madison, WI) which resulted in RNA with a A260/A280 ratio of 1.85-1.95. To ensure equivalency between individual samples, the 23S and 16S rRNA from each sample were analyzed by Agilent 2100 Bioanalyzer (Agilent Technologies, Santa Clara, CA) and the rRNA ratio (23S/16S) for all samples was calculated to be greater than 1.5. The isolated RNA (5 ng/well) was separated using 1% agarose/formaldehyde denaturing gel at 75 V. The gels were blotted on Nytran Super Charge membrane (11 cm × 14 cm) (Schleicher & Schuell, Keene, NH) at room temperature (25 °C) in 20 X SSC, and the membranes were fixed by UV-induced cross-linking. Amplification of DNA fragments of rARU gene were performed using the following primers: Forward primer: 5′-TGCACCTGCTAATACGGATG-3′; Reverse primer: 5′-GCCATCCAGTAACTGCAACA-3′. PCR products were purified and labeled with 32 P using Ready-To-Go DNA Labelling Beads (Amersham Pharmacia Biotech, Piscataway, NJ) and were purified by Probe Quant G-50 Micro column (Amersham Pharmacia Biotech, Piscataway, NJ). Hybridization with 32 P-labeled DNA probes was performed with Quickhyb www.nature.com/scientificreports www.nature.com/scientificreports/ solution (Stratagene, La Jolla, CA) as recommended by the manufacturer. Northern blots were repeated in triplicate to verify reproducibility and quantification was performed by phosphor image scanning. rARU and T7 RNA polymerase mRNA quantitation with qPCR. Total RNA was extracted from cultures grown at restricted and unrestricted DO conditions. The extracted RNA was converted to cDNA using maxima first strand cDNA synthesis kit (Thermo Scientific, Waltham, MA). The cDNA was used as the template for qPCR amplification of the rARU gene transcript and used the following thermocycler steps: 1 cycle: 95 °C for 10 min; 40 cycles: 95 °C for 15 sec, 60 °C for 1 min. The primers used were: Forward primer: 5′-TGCACCTGCTAATACGGATG-3′; and Reverse primer: 5′-GCCATCCAGTAACTGCAACA-3′. The primers used for the T7 RNA polymerase were: Forward primer: 5′-ACAGCCTTCCAGTTCCTGCAAGAAATCAAG-3 ′; and Reverse primer: 5′-CATTTTGGCGGTGTAAGCTAACCATTCCGG-3′. The thermocycler steps used for the qPCR amplification of T7 RNA polymerase transcript were: 1 cycle: 95 °C for 3 min; 39 cycles: 95 °C for 10 sec, 60 °C for 40 sec. Transcriptomic profiling with microarray. RNA isolation. culture samples were treated with RNA protect, the suspension was centrifuged at 5000 rpm for 10 min and the pellet was resuspended in TRIzol reagent (Life Technologies, Carlsbad, CA) and immediately frozen in dry ice and stored at −80 °C. RNA was extracted by suspension of the cell pellets in 0.5% SDS, 20 mM NaAc, and 10 mM EDTA and extracted twice with hot acid phenol: chloroform, followed by 2 additional extractions in phenol: chloroform isoamyl alcohol. Absolute ethanol was added, the extract stored at −80 °C for 15 min, and then the stored RNA extract was centrifuged at 14,000 g for 15 min and washed with 70% ethanol and dried. The dried pellet was resuspended in ultrapure water (KD Medical, Columbia, MD) and was evaluated in a Bioanalyzer (Agilent, Santa Clara, CA) using RNA integrity number > 8.0 as a metric of sample quality. To avoid batch effect, all samples were extracted at the same time. Proteomics with LC/MS/MS. Peptides were obtained from each sample with reduction and alkylation of cysteines using a detergent approach 58 . One-third of peptides obtained from each sample were pooled together. The pool and the remaining sample were loaded (separately) on C8/C18 STAGE tips and subjected to on-column reductive demethylation 59 with each sample receiving one label from the pool and one from the remaining sample. After elution, the same portion of the pool elute was added to each sample eluate, and these thoroughly "doped" samples were dried under nitrogen and analyzed as described previously 60 . Statistical analysis. Time point samples were processed using biological triplicates from three different batch runs on different dates. ANOVA was performed to compare transcriptomic profiles among log and late-log phase samples at unrestricted DO and restricted DO. Significant DEGs were obtained by using filtering criteria of p-value and fold change cut-off of less than or equal to 0.05 and 1.5 respectively. Microarray and proteomics data analysis. Raw microarray data were processed and normalized using Partek Genomics Suite7.0 and annotations for E. coli BL21 were used to identify the gene ID. The normalized data files were analyzed with PCA to determine the significant differences in the spatial distribution among the populations of biological replicates from the different samples collected from cultures growing at unrestricted (30%) and restricted DO conditions. Twelve samples were processed: three biological replicates from two time points (log and late-log phase) from each growth condition for both microarray and proteomics studies. Fold change values were calculated among two different contrasts: 1a) log phase (restricted DO) vs log phase (unrestricted DO), and 1b) late-log phase (restricted DO) vs late-log phase (unrestricted DO); and 2a) restricted DO condition (late-log phase vs log phase), and 2b) unrestricted DO (late-log phase vs log phase). Each comparison was performed using ANOVA to generate fold change values among the two groups (ANOVA estimates the geometric means of the samples in each group to calculate the fold change for the contrast conditions chosen). The significant DEGs were identified by applying a filtering criterion of >1.5 and <−1.5-fold change with p-value < 0.05. Gene ontology was performed using Fisher's exact test with the 'gene set analysis' tool of Partek Genomics Suite 7.0 by integrating the KEGG database for the E. coli BL21 strain (organism code 'ebl'). The significant DEGs gene list was used to find the overall gene expression flow of subsystems in the enriched pathways, to predict the relative up or down-regulation of pathways using the omics dashboard tool of EcoCyc 33 . MaxQuant (version 1.3.0.5, Max-Planck-Institute of Biochemistry, Munich, Germany) 61 was used with a binary setting and requiring either light or heavy di-methylation of N-termini and lysine for protein raw data analysis. The data was searched using protein sequences from E. coli as well as known contaminants and a list of control proteins used to monitor reaction efficiency. Default values were used but matches between runs was enabled with a 0.5 min allowable difference in elution time (which was applied after spline fitting the elution patterns). Clostridium difficile-associated diarrhea in developing countries: A systematic review and meta-analysis Clostridium difficile infection: A worldwide disease Clostridium difficile toxins a and b: Insights into pathogenic properties and extraintestinal effects The role of toxin a and toxin b in clostridium difficile infection Mutagenesis of the clostridium difficile toxin b gene and effect on cytotoxic activity Characterization of a toxin a-negative, toxin b-positive strain of clostridium difficile responsible for a nosocomial outbreak of clostridium difficile-associated diarrhea Clostridium difficile toxoid vaccine candidate confers broad protection against a range of prevalent circulating strains in a nonclinical setting Development of a novel candidate vaccine New insights for vaccine development against clostridium difficile infections Deciphering the domain specificity of c. Difficile toxin neutralizing antibodies Clostridium difficile toxoid vaccine candidate confers broad protection against a range of prevalent circulating strains in a nonclinical setting Status of vaccine research and development for clostridium difficile Protection against clostridium difficile infection with broadly neutralizing antitoxin monoclonal antibodies Serum antitoxin antibodies mediate systemic and mucosal protection from clostridium difficile disease in hamsters Clostridium difficile recombinant toxin a repeating units as a carrier protein for conjugate vaccines: Studies of pneumococcal type 14, escherichia coli k1, andshigella flexneri type 2a polysaccharides in mice Regulation of toxin synthesis in clostridium difficile by an alternative rna polymerase sigma factor Frameshift suppression at tandem aga and agg codons by cloned trna genes: Assigning a codon to argu trna and t4 trna(arg) Simple and efficient method for heterologous expression of clostridial proteins Protective vaccination with a recombinant fragment of clostridium botulinum neurotoxin serotype a expressed from a synthetic gene in escherichia coli Engineering and validation of a vector for concomitant expression of rare transfer rna (trna) and hiv-1 nef genes in escherichia coli Optimizing scaleup yield for protein production: Computationally optimized DNA assembly (coda) and translation engineering Expression of tetanus toxin fragment c in e. Coli: High level expression by removing rare codons Systematic comparison of strategies to achieve soluble expression of plasmodium falciparum recombinant proteins in e. Coli High level expression and purification of recombinant flounder growth hormone in e. Coli Reprogramming chaperone pathways to improve membrane protein expression in escherichia coli Chaperone-based procedure to increase yields of soluble recombinant proteins produced in e. Coli Expression of interferon-inducible recombinant human rnase l causes rna degradation and inhibition of cell growth in escherichia coli Comparative transcriptional analysis of bacillus subtilis cells overproducing either secreted proteins, lipoproteins or membrane proteins Comparative transcriptomic profile analysis of fed-batch cultures expressing different recombinant proteins in escherichia coli Volcano plots in analyzing differential expressions with mrna microarrays The gene ygge functions in restoring physiological defects of escherichia coli cultivated under oxidative stress conditions The omics dashboard for interactive exploration of gene-expression data Scanprosite: Detection of prosite signature matches and prorule-associated functional and structural residues in proteins Bacterial growth inhibition by overproduction of protein A comparative study of global stress gene regulation in response to overexpression of recombinant proteins in escherichia coli Heat-shock response transcriptional program enables high-yield and high-quality recombinant protein production in escherichia coli Biosynthesis and function of phospholipids in escherichia coli Arca (dye), a global regulatory gene in escherichia coli mediating repression of enzymes in aerobic pathways Aerobic-anaerobic gene regulation in escherichia coli: Control by the arcab and fnr regulons The leucine-responsive regulatory protein, a global regulator of metabolism in escherichia coli Escherichia coli lrp regulates one-third of the genome via direct, cooperative, and indirect routes Ppgpp: A global regulator in escherichia coli Circuitry linking the csr and stringent response global regulatory systems Differential regulation of escherichia coli topoisomerase i by fis Effects of fis on escherichia coli gene expression during different growth stages Coincident hfq binding and rnase e cleavage sites on mrna and small regulatory rnas Host factor hfq of escherichia coli stimulates elongation of poly(a) tails by poly(a) polymerase i Hfq chaperone brings speed dating to bacterial srna Clostridium difficile surface proteins are anchored to the cell wall using cwb2 motifs that recognise the anionic polymer psii Utilization of l-ascorbate by escherichia coli k-12: Assignments of functions to products of the yjf-sga and yia-sgb operons Influence of tyrosine-kinase wzc activity on colanic acid production in escherichia coli k12 cells Metabolome and transcriptome-wide effects of the carbon storage regulator a in enteropathogenic escherichia coli A support vector machine-based method for predicting the propensity of a protein to be soluble or to form inclusion body on overexpression in escherichia coli Production of soluble mammalian proteins in escherichia coli: Identification of protein features that correlate with successful expression Analysis of high throughput protein expression in escherichia coli High-yield growth of e. Coli at different temperatures in a bench scale fermentor Sodium laurate, a novel protease-and mass spectrometry-compatible detergent for mass spectrometry-based membrane proteomics Multiplex peptide stable isotope dimethyl labeling for quantitative proteomics Effect of amino acids on transcription and translation of key genes in e. Coli k and b grown at a steady state in minimal medium Maxquant enables high peptide identification rates, individualized p.P.B.-range mass accuracies and proteomewide protein quantification The authors thank the National Institute of Diabetes and Digestive and Kidney Diseases Proteomics Core for assistance with LCMS/MS, and Alicia Livinski, Biomedical Librarian at the NIH Library, for manuscript assistance. The work was supported by the Intramural Research Programs of the National Institute of Diabetes and Digestive and Kidney Diseases (NIDDK) at the National Institutes of Health, Bethesda, MD, USA. A.S., J.S. and J.P. conceived and planed the experiment, A.S., J.P. and N.D. performed the experiment, E.D.A. performed LCMS/MS proteomics, E.K. performed sequence and structural analysis. A.S. and J.S. analyzed the data and wrote the manuscript, A.S. and J.P. performed the molecular biology work, interpreted the results, reviewed and contributed to the final version of the manuscript. The authors declare no competing interests. Supplementary information is available for this paper at https://doi.org/10.1038/s41598-020-59978-1.Correspondence and requests for materials should be addressed to J.S.Reprints and permissions information is available at www.nature.com/reprints.Publisher's note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations. License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article's Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article's Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.