key: cord-0722866-3c1j9mut authors: Simões, Tânia; Charro, Nuno; Blonder, Josip; Faria, Daniel; Couto, Francisco M.; Chan, King C.; Waybright, Timothy; Isaaq, Haleem J.; Veenstra, Timothy D.; Penque, Deborah title: Molecular profiling of the human nasal epithelium: A proteomics approach date: 2011-12-10 journal: Journal of Proteomics DOI: 10.1016/j.jprot.2011.05.012 sha: d03d96612a27c7ed946acb2d03f45f2a8261b712 doc_id: 722866 cord_uid: 3c1j9mut Abstract A comprehensive proteomic profiling of nasal epithelium (NE) is described. This study relies on simple subcellular fractionation used to obtain soluble- and membrane-enriched fractions followed by 2-dimensional liquid chromatography (2D-LC) separation and tandem mass spectrometry (MS/MS). The cells were collected using a brushing technique applied on NE of clinically evaluated volunteers. Subsequently, the soluble- and the membrane-protein enriched fractions were prepared and analyzed in parallel using 2D-LC-MS/MS. In a set of 1482 identified proteins, 947 (63.9%) proteins were found to be associated to membrane fraction. Grand average hydropathy value index (GRAVY) analysis, the transmembrane protein mapping and annotations of primary location deposited in the Human Protein Reference Database (HPRD) confirmed an enrichment of hydrophobic proteins on this dataset. Ingenuity Pathway Analysis (IPA) of soluble fraction revealed an enrichment of molecular and cellular functions associated with cell death, protein folding and drug metabolism while in membrane fraction showed an enrichment of functions associated with molecular transport, protein trafficking and cell-to-cell signaling and interaction. The IPA showed similar enrichment of functions associated with cellular growth and proliferation in both soluble and membrane subproteomes. This finding was in agreement with protein content analysis using exponentially modified protein abundance index (emPAI). 
A comparison of our data with previously published studies focusing on respiratory tract epithelium revealed similarities related to identification of proteins associated with physical barrier function and immunological defence. In summary, we extended the NE molecular profile by identifying and characterizing proteins associated with pivotal functions of a respiratory epithelium, including the control of fluid volume and ionic composition at the airways' surface, physical barrier maintenance, detoxification and immunological defence. The extent of similarities supports the applicability of a less invasive analysis of NE to assess prognosis and treatment response of lung diseases such as asthma, cystic fibrosis and chronic obstructive pulmonary disease.

As air passes through the nose, three distinct respiratory processes take place in the nasal cavities: air filtering, heating and humidifying. Altogether, these functions constitute the air conditioning within the upper respiratory airways [1]. In the vestibular region, these cavities are lined by a skin-like epithelium that gradually changes towards the nasal valves to the characteristic pseudo-stratified columnar ciliated lining termed the respiratory epithelium [2]. As part of the nose's functions, nasal epithelium (NE), along with mucous airway surface liquid (ASL), provides an important physicochemical and immunological barrier against different factors targeting lower airways [2]. These defence mechanisms depend on mucociliary clearance, its regenerative capacity and ability to participate in immune responses by secreting pro- and anti-inflammatory cytokines, among others [3]. In fact, all respiratory passages, from the nose to the terminal bronchioles, share these features, reflecting their common susceptibility to various agents such as irritants/occupational sensitizers, allergens and microorganisms [2].
In both upper and lower airways, physiological responses to these agents include narrowing of the lumen, mucous secretion and inflammation [1, 2]. Given the anatomical and physiological similarities, the concept of united airways is being increasingly discussed in clinical medicine, leading to the notion that reactions in the upper airways (i.e., nose) influence or reflect complications in lower airways (i.e., lungs) [4] [5] [6] [7]. Therefore, NE cells, harvested by a non-invasive brushing procedure, have been used to assess upper and lower airway diseases [8, 9]. In this way, extensive characterization of the NE proteome would allow better understanding of signaling pathways and networks driving the physiology or pathophysiology of the respiratory system. To the best of our knowledge, only three studies aiming to investigate the human NE proteome have been published, using two-dimensional electrophoresis (2DE), one of them by our own group to search for proteins associated with cystic fibrosis [10, 11]. Min-Man et al. [12] investigated proteins associated with the pathogenesis of nasal polyps and chronic sinusitis while Lee et al. constructed a partial 2DE reference map of nasal mucosa proteins. However, 2DE-based proteomics delivers poor coverage of membrane proteins [13]. Herein, we present a deeper investigation of the normal human NE proteome using subcellular fractionation combined with a solution-based approach targeting membrane proteins that relies on strong cation exchange-reverse-phase liquid chromatography (SCX-RPLC) coupled with tandem mass spectrometry (MS/MS) to profile NE soluble- and membrane-enriched subproteomes. We anticipate that a global proteomic characterization of the human NE will expand our knowledge of airway epithelium physiology, allowing further comparative investigations to identify pathological profiles associated with cystic fibrosis (CF), asthma and chronic obstructive pulmonary disease (COPD).
Materials and methods

Ethical approval and informed consent were obtained from participating institutions and enrolled individuals, respectively. In total, 129 volunteers [84 F, 45 M; (36 ± 9) years] were clinically characterized by pulmonologists and recruited for sample collection. Nasal cells were harvested by a non-invasive brushing procedure as previously described [14]. Immediately after brushing, recovered cells (around 2 × 10^6) were washed with ice-cold PBS, pelleted and preserved at −80°C until analysis. Before cell pelleting, an aliquot was fixed and saved at 4°C for cytological analysis. Cellular preparations were qualitatively evaluated using conventional light microscopy to confirm the collection of a sufficient number of intact cells and the absence of evident contamination by blood cells caused by occasional bleeding that might occur during sample collection.

Soluble- and membrane-enriched fractions were prepared by differential centrifugation. Briefly, NE pellets were thawed on ice, pooled and resuspended in 1 mL of lysis buffer containing 10 mM Tris-HCl, 1 mM EDTA (pH 7.6) and protease inhibitors cocktail (P8340-5 mL, Sigma). Cell lysis was assisted by intermittent sonication cycles (10 × 10 s pulses interleaved by 30 s) on ice. The sample was clarified by centrifugation at 2000 ×g for 3 min at 4°C to remove cellular debris. Soluble proteins were obtained from this clarified sample by ultracentrifugation at 100,000 ×g for 90 min at 4°C. The resulting pellet was resuspended and incubated in 100 mM Na2CO3 for 2 h at 4°C on a rotisserie, followed by ultracentrifugation at 100,000 ×g for 90 min at 4°C. The resulting pellet was washed twice with ddH2O to obtain the membrane-enriched fraction. Protein content of both fractions was measured using BCA™ Protein Assay Kit (Pierce).
A total of 100 μg of lyophilized proteins from the soluble- or membrane-enriched fractions were digested overnight using sequence-grade modified trypsin (Promega, Madison, WI, USA) at an enzyme:protein ratio of 1:50 or 1:20 in either 50 mM NH4HCO3 or 50 mM NH4HCO3/60% (v/v) methanol, respectively [15]. Tryptic peptides were desalted by SPE (3M™ Empore™ High Performance Extraction Disk Cartridges) according to the manufacturer's instructions and lyophilized to dryness prior to 2D-LC-MS/MS analyses.

2D-LC-MS/MS analysis, data processing and bioinformatic analysis

Peptides were solubilized in 45% (v/v) ACN/0.1% (v/v) FA to achieve a concentration of approximately 0.5-1 μg/μl before separation and analysis by 2D-LC-MS/MS. The 2D-LC-MS/MS experiments were conducted as previously described [15, 16]. The CID spectra were analyzed using SEQUEST™ operating on a Beowulf 18-node parallel virtual machine cluster computer (ThermoElectron, Thermo Fisher Scientific, Waltham, MA, USA) using a UniProt non-redundant human proteome database (http://www.expasy.org, 03/2008 release). Only peptides with conventional tryptic termini (allowing for up to two internal missed cleavages) possessing delta-correlation scores (ΔCn) > 0. 1 3+ were considered as legitimate identifications. A final list of the identified proteins with the corresponding number of unique peptides (UPCs; non-redundant) and total counts for those peptides (TPCs) observed in two consecutive runs was obtained. The obtained identifications are the sum of 70 and 80 SCX fractions for the soluble and membrane-enriched fractions, respectively, each analyzed in duplicate. Results were further analyzed using in-house developed software for determination of unique peptides and proteins, considering only positive identifications when at least 2 UPs per protein were assigned. To increase identification confidence, only proteins identified in three independent experiments were considered positively identified.
The false discovery rate (FDR %) at the peptide level was calculated by searching the data against a decoy database and was estimated to be between 4.2% and 4.5%. To aid the biological interpretation of the extensive protein lists, proteins were categorized according to their Gene Ontology (GO) annotations using ProteinOn, a web tool focused on calculating GO-based protein semantic similarity [17], curated information deposited in the Human Protein Reference Database (HPRD) (http://www.hprd.org, release 8, July 2009) and Ingenuity Pathway Analysis (IPA) (Ingenuity® Systems, www.ingenuity.com). Putative transmembrane domains (TMD) and the grand average hydropathy value index (GRAVY) [18] were calculated by freely available tools (http://www.cbs.dtu.dk/services/TMHMM and http://www.geneinfinity.org/sms_proteingravy.html). Protein amount in the samples was estimated by the exponentially modified protein abundance index (emPAI) [19], an index that compares the number of parent ions per protein observed with the number of predicted peptides for each protein. For emPAI calculations, only predicted tryptic peptides with more than 5 amino acids (TP > 5aa) were considered, as smaller peptides are hardly identified by MS and fall out of the range of the MS analysis (>300 m/z). Protein contents in molar percentages were calculated by emPAI normalization between experiments, as described by Ishihama et al.

Herein, we report results of a large-scale characterization of the nasal epithelium (NE) proteome using a 2D-LC-MS strategy that relies on subcellular fractionation coupled with SCX-RPLC-MS/MS and that may facilitate elucidation of putative protein markers for respiratory diseases and/or drug targets. NE specimens were obtained by a non-invasive nasal brushing procedure followed by their quality evaluation by cytology.
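The emPAI estimate described in Materials and methods can be sketched as follows. This is a minimal illustration, not the in-house software used in the study: it assumes the standard definition emPAI = 10^(N_observed/N_observable) − 1 with molar percentages obtained by dividing each emPAI by the sum over all proteins, and it assumes a simple in silico trypsin rule (cleavage after K or R unless followed by P, no missed cleavages). All function names are hypothetical.

```python
def tryptic_peptides(sequence):
    """In silico digest: cleave after K or R unless the next residue is P.

    Assumption: no missed cleavages are considered in this sketch.
    """
    peptides, start = [], 0
    for i, residue in enumerate(sequence):
        next_residue = sequence[i + 1] if i + 1 < len(sequence) else ""
        if residue in "KR" and next_residue != "P":
            peptides.append(sequence[start:i + 1])
            start = i + 1
    if start < len(sequence):
        peptides.append(sequence[start:])  # C-terminal peptide
    return peptides


def count_observable(sequence, min_len=6):
    """Count predicted tryptic peptides with more than 5 amino acids."""
    return sum(1 for p in tryptic_peptides(sequence) if len(p) >= min_len)


def empai(n_observed, n_observable):
    """emPAI = 10 ** (observed / observable) - 1."""
    if n_observable <= 0:
        raise ValueError("protein has no observable peptides")
    return 10 ** (n_observed / n_observable) - 1


def molar_percent(empai_by_protein):
    """Normalize emPAI values so that they sum to 100 (mol %)."""
    total = sum(empai_by_protein.values())
    return {name: 100.0 * value / total
            for name, value in empai_by_protein.items()}
```

For instance, a protein with 4 observed peptides out of 8 observable ones would receive `empai(4, 8)`, and a dictionary of such values for one experiment can be passed to `molar_percent` to compare protein contents within that experiment.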
The cytological analysis showed that most of the cells (about 85%) in a brushing sample are epithelial cells (i.e., columnar, goblet and basal cells) as previously described (results not shown) [14, 20]. Soluble and membrane-enriched fractions were obtained using differential centrifugation to increase the proteome coverage. The complete list of proteins identified in this analysis, along with their respective peptides, is provided in supplementary data (Table SD1). A total of 535 and 1169 proteins were identified in the soluble (sNE) and membrane fractions (mNE), respectively, using the criteria described in Materials and methods. A total of 222 proteins were identified in both fractions (oNE), resulting in a net identification of 1482 unique NE proteins (Fig. 1, Table SD2). Annotation of the primary subcellular location of identified proteins was performed based on information deposited in the Human Protein Reference Database (HPRD). As anticipated, cytoplasmic proteins were predominant in the soluble fraction (Fig. 2A), whereas plasma membrane proteins were observed at a much greater percentage in the membrane fraction (Fig. 2B); plasma membrane proteins represented 5.4% of those identified in both fractions (Fig. 2C). The term integral to membrane was mainly assigned to proteins allocated to the membrane fraction (Fig. 2B and C). Putative transmembrane domains (TMDs) were mapped and the grand average hydropathy value index (GRAVY) was calculated for the identified proteins (Table SD3). The analysis revealed that 513 proteins contained one or more mapped TMDs. Of those, 497 proteins (96.8%) belong to the NE membrane fraction, which corresponds to 52.5% of all proteins identified in the membrane-enriched fraction (Fig. 3B). Considering that 21% of proteins within the entire human proteome contain one or more putative TMDs [21], our results represent a significant enrichment of this protein class.
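The identification counts and TMD percentages quoted above can be cross-checked with a few lines of arithmetic. The sketch below uses only numbers stated in the text; the figure of 947 membrane-associated proteins is obtained here as 1169 − 222, which matches the value given in the abstract, and the fold-enrichment ratio is our own illustrative calculation rather than a statistic reported by the authors.

```python
def union_size(n_a, n_b, n_both):
    """Inclusion-exclusion: unique items across two overlapping sets."""
    return n_a + n_b - n_both


def percent(part, whole):
    """Part of a whole, expressed as a percentage."""
    return 100.0 * part / whole


# Counts taken from the text.
SOLUBLE, MEMBRANE, BOTH = 535, 1169, 222
TMD_TOTAL, TMD_MEMBRANE = 513, 497
PROTEOME_TMD_PERCENT = 21.0  # share of the human proteome with >= 1 TMD [21]

unique_proteins = union_size(SOLUBLE, MEMBRANE, BOTH)   # 1482
membrane_only = MEMBRANE - BOTH                         # 947
share_membrane = percent(membrane_only, unique_proteins)
tmd_in_membrane = percent(TMD_MEMBRANE, TMD_TOTAL)      # about 96.88
tmd_share = percent(TMD_MEMBRANE, membrane_only)        # about 52.5

# Illustrative ratio (not reported in the paper): observed vs. proteome-wide
fold_enrichment = tmd_share / PROTEOME_TMD_PERCENT
```

Under these numbers the membrane fraction carries roughly two and a half times the proportion of TMD-containing proteins expected from the proteome-wide figure of 21%.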
The majority of proteins with five or more predicted TMDs exhibit positive GRAVY indexes (Fig. 3C), constituting about 14% of the membrane proteome. There is, however, a significant percentage of proteins in the membrane fraction with no predicted TMDs (46.7%, corresponding to 442 proteins). Observation of many structural constituents of ribosomes, components of the vesicle traffic machinery and some catalytic subunits of the mitochondrial membrane ATP synthase (e.g., ATP5C1, ATP5I) and vacuolar ATPase (e.g., ATP6V1E1, ATP6V1A), located in inner mitochondrial membranes and endosomes, respectively, seems to point to a significant contamination with soluble/cytosolic proteins entrapped within membrane vesicles during the lysis step. Other factors, such as the presence of proteins attached to membranes via lipid/glycolipid anchors or a deficiency of the algorithm in correctly classifying all potential TMDs, could also be considered. The GRAVY index calculation, a global descriptor of a protein's solubility (proteins exhibiting positive GRAVY values are recognized as hydrophobic while proteins exhibiting negative GRAVY values are recognized as hydrophilic) [18], revealed that 284 proteins (out of 1482) showed GRAVY values ranging from 0.0 to +1.2 and were found mainly in the membrane fraction. In contrast, the majority of proteins identified in the soluble and overlapping fractions had negative GRAVY values (Fig. 3A). Searching our dataset against the human proteome database (UniProt Human, release March 2008) showed that 95 of the NE identified proteins had previously been observed only at the transcript level; three proteins are annotated as uncertain and one as in-silico predicted (Table SD2-UniProt). Functional annotation of identified proteins was carried out using HPRD and IPA's knowledgebase.
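The GRAVY index discussed above is the mean Kyte-Doolittle hydropathy value over all residues of a sequence. The following is a minimal sketch of that calculation, assuming the standard Kyte-Doolittle scale; it is not the web tool cited in Materials and methods, and the function names are hypothetical.

```python
# Kyte-Doolittle hydropathy values for the 20 standard amino acids.
KYTE_DOOLITTLE = {
    "A": 1.8, "R": -4.5, "N": -3.5, "D": -3.5, "C": 2.5,
    "Q": -3.5, "E": -3.5, "G": -0.4, "H": -3.2, "I": 4.5,
    "L": 3.8, "K": -3.9, "M": 1.9, "F": 2.8, "P": -1.6,
    "S": -0.8, "T": -0.7, "W": -0.9, "Y": -1.3, "V": 4.2,
}


def gravy(sequence):
    """Grand average of hydropathy: sum of residue values / length."""
    sequence = sequence.upper()
    if not sequence:
        raise ValueError("empty sequence")
    try:
        total = sum(KYTE_DOOLITTLE[residue] for residue in sequence)
    except KeyError as err:
        raise ValueError(f"non-standard residue: {err.args[0]}") from None
    return total / len(sequence)


def is_hydrophobic(sequence):
    """Positive GRAVY is read as hydrophobic, negative as hydrophilic."""
    return gravy(sequence) > 0.0
```

A sequence rich in leucine, isoleucine and valine, as is typical of transmembrane helices, yields a positive value, whereas lysine- and glutamate-rich soluble proteins fall below zero.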
According to HPRD, proteins were mainly distributed among biological processes such as metabolism and energy pathways, cell communication and signal transduction, protein metabolism, transport, regulation of nucleobase and cellular growth and/or maintenance. A total of 158 identified proteins (10.7%) had no annotation at all while a total of 147 proteins (9.9%) had no biological function annotations (Fig. SD1 and Table SD2-HPRD). In addition, some proteins belonged to different functional families; therefore their participation in other crucial cellular processes cannot be ignored. Among the 1458 proteins mapped by IPA's knowledgebase, there was an overrepresentation of proteins involved in functions such as cell death, molecular transport, protein trafficking and synthesis, lipid metabolism, small molecule biochemistry, cellular growth and proliferation, cellular movement and cell-to-cell signaling and interaction (Table 1; detailed information available in Table SD4a). The subsets of 307, 938 and 213 proteins corresponding to the soluble, membrane and overlap fractions of NE, respectively, were further analyzed independently using IPA to explore their distinct roles and most significant cellular functions (detailed information available in Table SD4b). Among those mainly identified in the NE membrane fraction, an enrichment of proteins involved in molecular transport, protein trafficking and cell-to-cell signaling/interaction was observed (Fig. 4), as described in detail below. The top molecular and cellular function in the NE membrane fraction according to IPA analysis was molecular transport (Fig. 4 and Table SD4b). A large number of identified proteins are involved in ion, amino acid and lipid transport. The identification of several ion channels is consistent with the role of the respiratory epithelium in controlling the ionic composition and fluid volume at the airways surface, which is critical for normal lung physiology [22, 23] (Fig. 5).
Indeed, we have identified the electrogenic Na+/K+-ATPase subunits (ATP1A1, ATP1B1, ATP1B3), components of a pump located at the basolateral membrane that drives the absorption of Na+ at the apical membrane through the epithelial sodium channel ENaC [24]. The transepithelial electrical potential difference generated by Na+ absorption also drives a transepithelial Cl− absorption to maintain the electroneutrality of the ion transport process. The Cl− ion must first enter the cell through the basolateral membrane by means of the Na+-K+-Cl− co-transporter (NKCC2), which was also identified in this study, and is then secreted at the apical membrane, for example, via the cystic fibrosis transmembrane conductance regulator (CFTR). The electrical driving force for Cl− secretion is also supplied by basolateral K+ channels [25]. Although we were able to show the presence of CFTR in human NE by immunocytochemistry [14, 26, 27], CFTR, ENaC and basolateral K+ channels were not identified in this study under the criterion of two unique peptides identified in three independent runs. However, by lowering this threshold to one peptide, CFTR was identified in four independent experiments by a total of five unique peptides. The high degree of hydrophobicity and extensive glycosylation of the CFTR protein [28] and its low expression in NE [29] may explain our inability to identify this protein using stringent identification thresholds. Similarly, important basolateral K+ channels such as KCNN1, KCNN4, KCNJ5, KCND1 and KCNAB1 were identified only when using less stringent thresholds, suggesting the low abundance of these regulatory proteins. However, a subunit of the electrogenic H+/K+-ATPase (ATP12A) was identified using stringent criteria. This protein, along with Na+/K+-ATPase, is responsible for maintenance of osmotic balance and intracellular ionic composition.
Absorption or secretion of water in NE is driven by osmosis, determined by transepithelial NaCl movement, and proceeds through the lipid layer via the aquaporin family of proteins [30, 31]. Here, aquaporin 5 (AQP5) was identified as the only molecule involved in water movement across the nasal epithelium. Our data also revealed the presence of the Na+/H+ exchanger NHE1 and the bicarbonate transporter NBC1, which are involved in the maintenance of epithelial acid-base balance. The countertransport of H+ in an electroneutral manner (1 Na+:H+ stoichiometry) by NHE exchangers, propelled by the inward Na+ gradient established by plasma membrane Na+/K+-ATPase pumps, allows extrusion of excess H+ (acid equivalents) accumulated by cellular metabolism. Additionally, bicarbonate cotransporters allow extrusion of HCO3− (1 Na+:HCO3− stoichiometry), a cellular biological buffer that is important for the solubilization of ions and macromolecules such as mucins and digestive enzymes in secreted fluids. Thus, the activity of those proteins is crucial not only for fine control of intracellular pH but also for cell volume control, systemic electrolyte, acid-base and fluid volume homeostasis and protection of epithelia from injury [32, 33]. Other ATPases, namely calcium pumps, were also identified in this fraction. Three Ca2+-ATPases have been described in the cells of higher animals. They are located in the membranes of the endo(sarco)plasmic reticulum, including the nuclear envelope (SERCA pump), the Golgi network (SPCA pump) and the plasma membrane (PMCA pump) [34]. Each pump is a product of a multigene family, the number of isoforms being further increased by alternative splicing of the primary transcripts. In this study, several components of those three Ca2+ pumps were identified: ATP2A1, ATP2A2 and ATP2A3 belonging to the SERCA pump, ATP2C1 belonging to the SPCA pump and ATP2B1 and ATP2B4 belonging to the PMCA pump [34].
Table 1 footnote: Fisher's exact test was used to calculate a P value determining the probability that each biological function assigned to that dataset is due to chance alone; IPA = Ingenuity Pathway Analysis.

Fig. 4 - Molecular and cellular functions that were most significantly different across the three data sets corresponding to proteins identified in soluble (blue), membrane (lighter blue) and both (darker blue) fractions of nasal epithelial cells, according to Ingenuity software [25]. The x-axis identifies the biological functions, each represented by a bar. The y-axis shows the −log of the P value calculated based on Fisher's exact test. The dotted line represents the threshold above which there are statistically significantly more genes in a biological function than expected by chance.

Fig. 5 - Schematic representation of the most important ion transport systems in the airway epithelium. The airway surface fluid (ASF) is the fluid in which cilia move to enable mucociliary clearance. The correct depth of ASF is maintained by regulated absorption and secretion of ions and water. The epithelial sodium channel (ENaC) and the Na+/K+-ATPase membrane proteins are involved in Na+ absorption. TMEM16A (also known as anoctamin 1) and the cystic fibrosis transmembrane conductance regulator (CFTR) on the apical membrane and the sodium-potassium-chloride cotransporter (NKCC) on the basolateral membrane are involved in Ca2+- and cAMP-mediated Cl− secretion, respectively. The inward Na+ gradient established by plasma membrane Na+/K+-ATPase pumps also propels the countertransport of H+ by the sodium/proton exchanger (NHE), which, along with bicarbonate cotransporters, extrudes the excess of acid accumulated by cellular metabolism and by various H+ (acid equivalents) leak pathways, contributing to intracellular pH maintenance.

Components of V-type H+-ATPase were also identified, namely two subunits belonging to the membrane-embedded V0 complex (ATP6V0A4, ATP6V0C) and two subunits belonging to the cytoplasmic V1 complex (ATP6AP1, ATP6V1E1), responsible for H+ translocation across the membrane and mediation of ATP hydrolysis. While known to be associated with the endosomal membrane, this pump was also found in plasma membranes, where the proton pump energizes transport across cell membranes and entire epithelia [35]. The Cu2+-ATPase 7B (ATP7B), also observed in our data, is responsible for the delicate balance of Cu2+ in the cells and throughout the whole body in association with ATP7A. Its absence or malfunction is described in severe human genetic disorders, Menkes and Wilson diseases. ATP7B is primarily expressed in the liver but it was also observed at lower levels in brain, heart and lungs [36]. Combined data from various reports on polarized hepatic cells suggest a continuous recycling of ATP7B-containing vesicles between the sub-apical compartment and the apical membrane for release of Cu2+ as its levels increase in the cell [36]. Among the identified proteins, the membrane proteins ABCC1, ABCC3 and ABCC4, which are responsible for drug transport and belong to the ATP-binding cassette (ABC) transporter C family, should be highlighted. In particular, ABCC1 (also known as MRP1, Multidrug Resistance Protein 1) is highly expressed mainly at the basolateral side of bronchial epithelial cells, namely ciliated and goblet cells. This is consistent with our data, where a high number of unique peptides were detected for this protein (Table SD1). MRP1 confers resistance to several chemotherapeutic agents including vincristine, daunorubicin and methotrexate. Physiological substrates for MRP1 include, for example, leukotriene C4 (LTC4) and glutathione disulfide.
Interestingly, these substrates play an important role in lung physiology with respect to inflammation and oxidative stress [37]. The ABCC3 (or MRP3) gene is the closest MRP1 homologue. Its product is involved in resistance against the anti-cancer drugs etoposide and teniposide and, at higher concentrations, also methotrexate. The physiological function of ABCC4 (or MRP4) is still unknown. It may serve as an efflux pump of the nucleosides cAMP and cGMP at low affinity, likely in a GSH-independent manner, thus having different substrate specificities. In summary, expression levels of these transporters in airway epithelium, in particular in NE, should be crucial for detoxification and might have implications for therapeutic strategies for respiratory diseases. IPA also ascribed a large number of proteins in the membrane fraction to vesicular transport. From a total of 56 proteins, several are known to be involved in SNARE protein-mediated membrane trafficking responsible for docking and fusion of vesicles to the targeted membrane. The associated molecular machinery identified in this fraction includes vesicle-trafficking protein SEC22B and SEC23 interacting proteins, STX7, STX8 and STX18, VAMP2, VAPA, USE1, SNAP23 and GOSR1 proteins [38]. Also observed in this functional group are the secretory carrier membrane proteins 1, 2 and 3 (SCAMP1, SCAMP2 and SCAMP3), γ1-adaptin (AP1G1) and α-adaptin A (AP2A1), all components of adaptor protein complexes involved in cargo selection and vesicle formation associated with clathrin-dependent endocytosis [39] [40] [41]. Proteins belonging to the GTPase family, representing regulators of eukaryotic vesicular membrane traffic [42], such as RAB10, RAB13, RAB6A and RAB7A, were also grouped here. Concordantly, IPA revealed caveolar-mediated endocytosis as a canonical pathway enriched in the membrane fraction (Table SD4d). Caveolae are plasma membrane invaginations characterized by flask-shaped morphology, enriched in cholesterol and glycosphingolipids.
They are proposed to be important in lipid transport, acting as scaffolding or organizing platforms for signaling events that include endothelial nitric oxide synthase (eNOS) activation and recruitment of protein kinase C isoforms to the plasma membrane [43]. Two proteins, flotillin 1 and 2 (FLOT1 and FLOT2), known as structural molecules for caveolar support, were also identified. Curiously, caveolin-1 (CAV1) was only observed after relaxing the previously described criteria. According to some evidence, flotillins are involved in an alternative non-caveolin-dependent mechanism for structural support of caveolae [44, 45]. Whether our observation reflects the predominance of flotillins over caveolin-1 in NE or is an artifact of the sample preparation remains to be elucidated. Evidence shows a link between caveolar-mediated endocytosis and regulation of cell adhesion [46] through internalization of integrins (ITGA2, ITGA3, ITGA6, ITGAM, ITGAV, ITGB1, ITGB2 and ITGB4), also grouped here. The same was observed for several members of the small GTPase superfamily, RAB5A, RAB5B and RAB5C, which seem to have an important role in caveolae dynamics [47]. A significant number of proteins associated with cell-to-cell signaling/interaction was identified, including intercellular adhesion molecules that are part of tight (CLDN1, CLDN3, F11R), adherens (CDH1, CTNNA1, CTNNB1, CTNND1), desmosome (PKP2, DSG2, DSG3) and hemidesmosome junctions (ITGA2, ITGB1, ITGAV, ITGA6, ITGA3, ITGB2, ITGB4) [48]. These structures are located at the plasma membrane and have the important function of establishing physical interactions between adjacent cells, maintaining the integrity and impermeability of airway epithelia [49, 50]. In addition, a substantial number of proteins are involved in receptor activity, which is compatible with the important role of epithelial cells in sensing changes in their environment [51].
This group includes proteins belonging to the small GTPase superfamily (RAB13, RAB21, RAP1A, RAC1), the tyrosine protein kinase family (EGFR, IGF1R, ERBB2, EPHA2) and the tyrosine protein phosphatase family (PTPRC, PTPRJ). Some molecules involved in immune response were also grouped here. This is the case for the HLA-A and HLA-DRA proteins, constituents of MHC class I and II, respectively, the antigen binders IGHG1, IGKC and IGHM, and leukocyte surface markers such as CD47, CD9, CD59 and CD14. The latter marker was already described as cooperating with MD-2 and TLR4 to mediate the innate immune response to bacterial LPS [51, 52]. Other LPS binding proteins such as α-1 defensin, cathelicidin antimicrobial peptide and bactericidal permeability-increasing protein were also grouped here. Molecules involved in immune response can arise from several sources, including inflammatory cells resident in the epithelium. In accordance with current knowledge, their interaction with respiratory epithelia modulates innate and acquired immune responses against microbial infections and inhaled toxic pollutants [3, 53].

Biological functions enriched in the soluble fraction include cell death, drug metabolism and protein folding, typical of cytoplasm/intracellular compartments (Fig. 4). The most relevant functions are discussed below. The IPA analysis revealed cell death as the top molecular and cellular function in the soluble fraction, exhibiting significant enrichment when compared to the membrane and the overlapping fractions of NE (Fig. 4). Exploring this function, we observed proteins from different pathways related to oxidative stress response, Myc-mediated apoptosis signaling and cell cycle G2/M DNA damage (Table SD4b/d).
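The enrichment P values plotted in Fig. 4 are stated to come from Fisher's exact test. As a rough sketch of how such a value can be obtained, the code below computes a right-tailed Fisher's exact P value as the upper tail of the hypergeometric distribution, together with the −log10 transform used on the y-axis. It assumes a one-sided test and a user-supplied background size; IPA's actual reference set and settings are not given in the text, and the function names and any numbers passed in are hypothetical.

```python
from math import comb, log10


def fisher_right_tail(k, n, K, N):
    """P(X >= k) for a hypergeometric draw.

    N: proteins in the background (reference) set
    K: background proteins annotated to the function
    n: proteins in the dataset under study
    k: dataset proteins annotated to the function
    """
    if not (0 <= k <= n <= N and 0 <= K <= N):
        raise ValueError("inconsistent counts")
    upper = min(n, K)
    # math.comb returns 0 when the second argument exceeds the first,
    # so impossible table configurations contribute nothing to the tail.
    tail = sum(comb(K, i) * comb(N - K, n - i) for i in range(k, upper + 1))
    return tail / comb(N, n)


def neg_log10_p(k, n, K, N):
    """Transform used for the y-axis of an enrichment bar chart."""
    return -log10(fisher_right_tail(k, n, K, N))
```

A function is then called enriched in a fraction when its transformed value exceeds the chosen threshold line, for example −log10(0.05) if a 5% significance level is assumed.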
Concerning oxidative stress response, several identified proteins play a noteworthy role in glutathione (GSH) metabolism (GSR, GSTP1, GSTA1, GCLC, IDH3A, GSTO1, PRDX6, IDH1) as well as in NRF2-mediated oxidative stress response [54, 55] (AKR7A2, GSTP1, SOD1, GSTA1, NQO1, GCLC, GSTO1, GSR, AKR1A1, SOD2, ERP29, STIP1, CAT, TXN) and free radical scavenging (CLIC6, DNL, HNRNPA3, ILF2, LGALS3BP, PDIA6, PPIA, PRB3, PRDX1, PRDX6, S100A8, TALDO1, TST, UBXN, ALB, ALDH2, GPX1, HSD17B10, HSPB11), especially reduction and catabolism of hydrogen peroxide (MPO, GPX1, PRDX1, PRDX2, PRDX3, PRDX5). Glutathione is one of the most powerful endogenous antioxidants, protecting molecules by reducing hydrogen peroxide to water via glutathione peroxidase (GPX1). Oxidized glutathione (GSSG) is then regenerated to GSH by NADPH via glutathione reductase (GSR). NRF2 [nuclear factor (erythroid-derived 2)-like 2] is a transcription factor that regulates the antioxidant response [54] [55] [56] by inducing expression of genes involved in combating oxidative stress and activating the body's own protective response. Examples of these are glutathione-S-transferases (GSTs), a family of cytosolic, mitochondrial and microsomal enzymes that catalyze the conjugation of GSH with both endogenous and xenobiotic electrophiles, whose members GSTA1, GSTK1, GSTO1 and GSTP1 were identified here, along with glutamate-cysteine ligase (GCLC), the enzyme involved in the rate-limiting step of GSH synthesis. GST and GCLC are induced by NRF2 activation and represent an important route to eliminate potentially harmful and toxic compounds while balancing the redox state of a cell [57, 58]. This result is in agreement with the higher proportion of identified proteins involved in drug and xenobiotics metabolism (ALDH5A1, ALDH7A1, ALDH9A1, CAT, CES1, CES2, GCLC, GSTA1, GSTO1, GSTP1, HSP90AB1, NQO1, PPP2R1A).
All these proteins are involved in the maintenance of an adequate oxidant/antioxidant balance in NE cells, playing a decisive role in ameliorating oxidative stress and protecting the airways against injury promoted by the environment. Accumulation of ROS with concomitant attack on cellular membranes might compromise cellular structure maintenance and tissue integrity; combating oxidative stress is therefore a critical function in this tissue. A slight enrichment of DNA replication, recombination and repair functions in the soluble fraction was also observed (Table SD4b), which, in conjunction with the oxidative stress response machinery, may create a protein environment favorable to cellular division, in accordance with the rapid turnover and regeneration of NE. Concerning general metabolism, overrepresentation of proteins involved in glycolysis/gluconeogenesis, the pentose phosphate pathway and the citrate cycle was observed in the soluble fraction, as expected (Table SD4b). These processes are closely related to the function of NE as the first barrier between the environment and the organism's interior. This interface is constantly exposed to physical and chemical aggressions, resulting in increased wound healing and remodeling [3, 53, 59, 60]. The epithelium therefore has increased demands for high-energy compounds (ATP and NADH), reducing equivalents (NADPH) and pentoses, molecules derived from the highlighted processes, while maintaining a constant reservoir of glucose to guarantee proper physiological functions and assure tissue integrity and structure. Of particular interest is the generation of pentoses for the synthesis of nucleotides and nucleic acids, used in genomic material processing, and of NADPH, which functions in reductive biosynthesis (e.g. fatty acid synthesis) and in cell protection from oxidative stress through glutathione reduction [61].
Some abundant proteins associated with metabolism and energy pathways (PKM2, ALOX15, HK1, HADHB, LDHA, HADHA, PGK1, FDXR, ATP5B, ATP5A1) and with regulation of gene expression and nucleobase, nucleoside, nucleotide and nucleic acid metabolism (ILF3, PML, DDX3X, EWSR1, LGALS3, XRCC5), although predominantly found in the soluble fraction, were also observed in the membrane fraction. Protein folding and degradation also appear as processes overrepresented within the soluble fraction. These processes occur naturally in the cytosol in association with the ER, Golgi and transport vesicles, which constitute the pathways of protein processing and maturation into a final functional form. Proteins such as CALR, UGGT1, ERP29, ERP44, HSP90AB1, HSP90AA1, PDIA6, PPIA, HSPE1, ERO1L, TXN, ST13, HSPA5, HSPA8, HSPD1 and RUVBL2 are mainly assigned to folding, while ALDH1A1, ALDH3A1, MPO, HSP90B1 and HSPA5 strongly contribute to protein turnover mechanisms. Several of the identified proteins present chaperone activity, ensuring proper protein homeostasis and the upholding of a functional tissue. Several proteins were also assigned to post-translational modifications (Table SD4b), a late-stage phenomenon in protein synthesis that influences protein conformation and ultimately function. The modifications highlighted by those proteins are mainly amino acid-based, like auto-oxidation of cysteines and tyrosines by CP, ALB and MPO, respectively, deamination of glutamic acid by GLUD1 or exposure of lysines by ANXA2, while protein homotetramerization is the result of the action of ALDH1A1, DECR1 and HSD17B10. In general, post-translational modifications of amino acids extend the range of protein functions by attaching other biochemical functional groups to them, changing their chemical nature or making structural rearrangements. An interesting observation was the predominance of actins and actin-related proteins in the soluble fraction compared to the membrane fraction.
Proteins such as ACTN1, ACTR2, ACTR3, ARPC2, ARPC3 and ARPC4 but also CFL1, GSN, MSN, MYL6, MYL12A, PPP1CA, TMSB4X, VCL and ARHGDIA illustrate this result. Most of these cytoskeleton constituents are involved in cellular assembly, organization and Rho-dependent movement, biological processes also enriched in the soluble fraction subproteome. In fact, members of the Rho GTPase family have been shown to regulate many aspects of intracellular actin dynamics [62], acting like molecular switches of cell proliferation, apoptosis, cell polarity or vesicular trafficking. They also participate in the formation of lamellipodia and filopodia, by encouraging actin retraction [63], fundamental to cellular movement and wound healing. Corroborating this outcome was the identification of several subunits of the ARP2/3 complex (ARPC2, ARPC3, ARPC4, ACTR2, ACTR3), proteins present at microfilament junctions that help to create the actin meshwork [64]. Observation of the mentioned proteins substantiates their part in the maintenance of an architectural structure that fits the function of the epithelium.

The characterization of protein abundance in a particular cell/tissue may provide important information about its functional contribution to the organism's physiology. Qualitatively, some parameters such as the hit rank, score and number of peptides per protein, i.e., integrated peptide ion count measurements [65], can be considered as indicators of protein abundance in a sample [66]. More recently developed identification-based algorithms, including the emPAI, have shown a high correlation with the actual protein amount in complex mixtures with a wide dynamic range [19]. Although the accuracy of emPAI is inferior to that of absolute quantification using synthesized peptide standards, it is effective in providing information on protein abundances within the proteome of interest.
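As a minimal sketch of how emPAI values translate into molar percentages, the snippet below follows the definition given in [19]: emPAI = 10^(N_observed / N_observable) - 1, and mol % = emPAI / Σ(emPAI) × 100. The function names, the `top_n_share` helper and the peptide counts are hypothetical and serve only as illustration; they are not the software or the data used in this study.

```python
from typing import Dict, Tuple


def empai(n_observed: int, n_observable: int) -> float:
    """emPAI = 10 ** (N_observed / N_observable) - 1, following [19]."""
    if n_observable <= 0:
        raise ValueError("n_observable must be positive")
    return 10 ** (n_observed / n_observable) - 1


def molar_percent(counts: Dict[str, Tuple[int, int]]) -> Dict[str, float]:
    """Map protein -> (observed, observable) peptide counts to protein -> mol %."""
    values = {name: empai(obs, possible) for name, (obs, possible) in counts.items()}
    total = sum(values.values())
    if total == 0:
        raise ValueError("all emPAI values are zero; mol % is undefined")
    return {name: 100.0 * value / total for name, value in values.items()}


def top_n_share(mol_percent: Dict[str, float], n: int) -> float:
    """Summed mol % of the n most abundant proteins."""
    return sum(sorted(mol_percent.values(), reverse=True)[:n])


# Hypothetical peptide counts, for illustration only (not data from this study).
example = {"PROT_A": (10, 10), "PROT_B": (5, 10), "PROT_C": (0, 10)}
shares = molar_percent(example)
```

Calling `top_n_share(shares, 100)` on a full dataset would give the kind of cumulative figure that a "top 100" summary relies on.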
We used emPAI to estimate protein content in NE, expressed as a molar percentage for each identified protein (% mol), highlighting here the top 100 most abundant proteins in both subproteomes (Table SD5a/b). The top 100 most abundant proteins account for 49.4% and 39.8% of the total estimated protein content in the membrane and soluble fractions, respectively (Table SD5a/b). Mitochondrial and ribosomal proteins are highly represented among the top 100 most abundant proteins, being more predominant in the membrane fraction, as noted earlier. This observation is illustrated by the identification of several subunits of the mitochondrial electron transport chain, involved in energy production and ATP metabolism, and also of structural ribosomal constituents, namely several members of the 40S and 60S subunits of the ribosome. Components of the cytoskeleton such as keratins, among others, also belong to the top 100 most abundant proteins in nasal epithelium cells. In both NE fractions we observed type II (KRT1, KRT5, KRT7, KRT8) and type I (KRT10, KRT18, KRT19) keratins. The high abundance of keratins is in agreement with their known importance as structural stabilizers of epithelial cells, given their polymerization into keratin filaments. Intracellularly, they braid the nucleus, span the cytoplasm and are attached to the cytoplasmic plaques of the desmosomes. Thus, they seem to be an inherent part of the continuum of stability from the single cell to tissue formation [67]. Identification of tubulin isoforms (TUBB2A and TUBA1A) among the most abundant proteins (mNE) is in accordance with their role in airway epithelium as building blocks of the cilia of tall columnar cells, besides being important components of cytoskeleton microtubules [68]. Another key function assigned to the group of most abundant proteins in sNE is metabolism and detoxification of xenobiotics by cytochrome P450 (ALDH2, GSTP1, ADH7, GSTA1, ALDH3A1, GSTK1).
This occurs in close conjunction with the antioxidant response of the cells and with glutathione metabolism (GSTP1, GSTA1, IDH2, GSTK1) to eliminate potentially harmful compounds that might contribute to destabilization of the tissue. Reflecting the relevance of innate immunity in NE, as already discussed, we identified among the top 100 most abundant proteins in sNE the LPS-binding proteins calgranulin A and B. LPLUNC1 and PLUNC protein, although not included in the top 100, were also identified with relative abundance in this fraction. These proteins belong to the PLUNC protein family, whose members are structural homologues of LPS-binding protein and the bactericidal permeability-increasing protein, mediators of host defence against Gram-negative bacteria [69]. Neutrophil gelatinase-associated lipocalin, a member of the lipocalin superfamily, which is an important group of serous cell antimicrobial proteins [70], was also grouped here. Despite their abundance in nasal fluids [70], where they protect the epithelium from infection and chemical damage by binding inhaled microorganisms and particles that are then removed by the mucociliary system, mucins were not included in the group of the most abundant proteins. This is not surprising considering their extensive post-translational modification, including C-, O- and N-glycosylation and disulfide bonding through N- and C-terminal cysteines. Nevertheless, the following mucins were identified in this work: MUC1, MUC2, MUC4, MUC5AC, MUC5B, MUC13 and MUC16.

Comparison with other proteomic studies exploring respiratory epithelium in vivo

A literature search performed in February 2011 with the keywords "Human nasal epithelium" and "proteomics" returned only 19 studies among more than 20 million citations deposited in PubMed (Fig. SD2).
The majority of the studies focused on human nasal lavage fluid (NLF) and only three chose nasal cells as biological material for investigation of the human nasal epithelium proteome, all using a 2D-PAGE-MALDI-TOF-MS approach [10] [11] [12]. By comparing those data (160 proteins in total) with our own (1482 proteins), we observed only 61 proteins identified in common, confirming that 2D-PAGE is a complementary, although limited, tool in proteomics (Fig. 6) [71]. Among the studies describing the microenvironment of human nasal epithelium based on nasal lavage fluid analysis, a comprehensive investigation of this human sample was performed by Casado et al. (2005), using capillary LC coupled to an ESI-Q-TOF MS instrument to produce a profile of proteins in NLF from healthy subjects [70]. Of the 67 IPA-mapped proteins described in Casado's work, 43 were also found in our own work (results not shown). This group of proteins includes many secreted proteins involved in innate and acquired immunity (e.g., immunoglobulins and proteins secreted by serous and mucous cells) but also components of plasma resulting from exudation and of the cellular cytoskeleton (e.g., keratins). In order to compare our data with other proteomic studies focused on other human respiratory epithelia using a similar technological platform, we performed a literature search in PubMed with the keywords "human respiratory epithelium", "proteomic" and "liquid chromatography". This search returned five references (Fig. SD3), describing the proteomic studies referred to above on human nasal lavage [70, 72], human olfactory cleft mucus [73] and human bronchial airway epithelium brushing [74]. In the last study, Steiling and co-workers used bronchial epithelial cells harvested by brushing the mainstem bronchus to investigate molecular changes induced by tobacco exposure at the transcript and protein level.
The protein profiling data from that study were extracted (859 proteins) for further comparison with the NE proteome obtained here (1482 proteins). Comparison in IPA revealed that about 540 proteins are commonly expressed in both airway epithelia (Fig. SD4, details available in Table SD7). Although Steiling and co-workers used a 1D-PAGE-LC-MS/MS strategy, which is compatible with membrane analysis, these overlapping proteins are mainly soluble, and the major molecular function associated with this group is catalytic activity, including oxidoreductase and hydrolase activities. Among these, several proteins are annotated in HPRD as being expressed in the respiratory system (e.g., nasal and bronchial epithelium, lungs and bronchoalveolar fluid), as is the case of PLUNC protein, polymeric immunoglobulin receptor and aldehyde dehydrogenase 3B1, annotated as being expressed in bronchial epithelial cells, airway epithelia and lung, respectively. In the present work, proteins such as CD44, HLA-DRA, mucin 1 transmembrane and aquaporin 5 are examples of proteins described in mNE that are also annotated in HPRD as expressed in bronchial epithelial cells but were not identified in Steiling's work.

One of the challenges in clinical proteomics is the difficulty in selecting samples that provide insight into specific disease entities such as asthma, cystic fibrosis and smoking-related lung diseases. Several methods of sampling airway and lung fluids have been used to assess the microenvironment of the lungs, including induced sputum, bronchoalveolar lavage and exhaled breath condensate [75]. In our study, we were able to identify about 141 proteins previously associated with the development and/or progression of several respiratory diseases (Table SD4e). Lung cancer appears as the pathology with the most associated proteins (75 out of the 141). However, other diseases emerged from the IPA analysis, namely severe acute respiratory syndrome (SARS) and pneumonitis (Table SD4e).
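The dataset overlaps reported in this section (61 of the 160 proteins from the 2D-PAGE studies, roughly 38%, and about 540 of the 859 bronchial proteins, roughly 63%) amount to set intersections of protein lists. The sketch below shows that arithmetic under the assumption that both lists use the same identifier type (for example gene symbols); it does not reproduce the identifier mapping performed by IPA, and the function names and toy lists are hypothetical.

```python
from typing import Dict, Iterable, Set


def normalize(identifiers: Iterable[str]) -> Set[str]:
    """Trim whitespace and upper-case identifiers so that 'krt5 ' matches 'KRT5'."""
    return {item.strip().upper() for item in identifiers if item.strip()}


def overlap_summary(reference: Iterable[str], query: Iterable[str]) -> Dict[str, float]:
    """Count shared and dataset-specific identifiers between two protein lists."""
    ref, qry = normalize(reference), normalize(query)
    shared = ref & qry
    return {
        "shared": len(shared),
        "only_reference": len(ref - qry),
        "only_query": len(qry - ref),
        # Share of the reference list recovered in the query list.
        "pct_of_reference": 100.0 * len(shared) / len(ref) if ref else 0.0,
    }


# Toy lists for illustration only (not the published datasets).
published = ["krt5", "KRT8", "ALB"]
this_study = ["KRT5", "ALB ", "MUC1", "TXN"]
summary = overlap_summary(published, this_study)
```

Normalizing identifiers before intersecting matters in practice, because differences in case or stray whitespace would otherwise be counted as non-overlapping proteins.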
Two proteins, NOS2 and TPI1, have been associated with asthma, while MPO and LTF have been implicated in the development of cystic fibrosis [76, 77]. Significantly, several of those proteins are among the top 100 most abundant proteins, as is the case for AGR2, ALDH3A1, ALDOA, GSN, HSP90AA1, HSP90B1, KRT5, KRT19, PPIA, S100P, SERPINB3, TMSB4X, TPI1, TUBA1A, VIM, YWHAE, S100A4 and CTSB, previously associated with development, progression and/or metastasis of lung cancer, and also ACTA2, GAPDH, HBB, HSP90B1, S100A4, S100A9, S100P, TALDO1 and TXN, associated with other respiratory disorders. This is also the case of two key proteins previously described in a cystic fibrosis mouse model by our group [78], retinal dehydrogenase 1 (ALDH1A1) and aldehyde dehydrogenase (ALDH2), which, along with ADH7, were found to be crucial enzymes in the metabolism of retinoic acid, a hormone known as an important regulator of organ development and homeostasis, including the lung [79]. Expression of ALDH1A1 in airway epithelium has already been confirmed by in situ hybridization of embryonic lung [80]. As recently noted by Bossi and Lehner [81], these proteins might fall into the group of "universally expressed housekeeping proteins", which brings the clinical proteomics field to the future challenge of not only completing the characterization of each cell type that comprises airway epithelium but also, most importantly, defining particular interactions that can occur in that tissue and that could be associated with the pathophysiology of respiratory diseases such as asthma, CF and COPD.

The aim of the present study was the in-depth proteomic characterization of NE in support of the hypothesis that the NE proteome reflects the lower airways' protein environment and thus may be a useful alternative to lower airway specimens, which are accessible only by highly invasive biopsies, for biomarker research and discovery in lower respiratory diseases.
Although nasal fluid, bronchoalveolar lavage and sputum have been extensively used to assess airway physiology, they mainly provide secreted and tissue leakage molecules that may not effectively reflect tissue/cellular events. Here, we present the most complete proteome profiling of NE to date. We employed an improved subcellular fractionation method followed by a highly sensitive multidimensional protein identification technology, unraveling mechanisms and processes intimately correlated with normal and pathological lung function. Most of the identified NE proteome overlaps with those previously described for bronchial epithelia, further supporting the usefulness of these cells in research on lower airway epithelia. The exhaustive NE proteome data obtained here might be useful for further studies aimed at better understanding the molecular features underlying chronic respiratory diseases using non-invasive and easily collectable specimens that are physiologically and molecularly similar to those of the lower airways.

Supplementary data associated with this work can be found in the online version. The list of proteins and corresponding peptides identified by the shotgun approach is available in Table SD1. Annotation of proteins considered positively identified (proteins identified in three independent experiments, with more than one unique peptide) was performed independently using HPRD, IPA and UniProtKB and is available in Table SD2. Table SD3 provides the results of the hydrophobicity analysis. Protein data analysis performed by IPA was exported to Table SD4. Protein abundance analysis is available in Table SD5. Comparison of our data, performed by IPA, with other proteomes, namely nasal epithelia and bronchial epithelium, is available in Tables SD6 and SD7, respectively. Supplementary materials related to this article can be found online at doi:10.1016/j.jprot.2011.05.012.
Tânia Simões and Nuno Charro were supported by Fundação para Ciência e Tecnologia (FCT) fellowships. The present study was partially financed by FCT (research grant POCI/SAU-MMO/56163/2004), FCT-Poly Annual Funding and FEDER-Saúde XXI. Authors would like to acknowledge all volunteers and donors of nasal epithelium specimens and doctors for their valuable cooperation in this study.

[1] Pulmonary Ventilation
[2] Upper and lower airways: similarities and differences
[3] Breakdown in epithelial barrier function in patients with asthma: identification of novel therapeutic approaches
[4] The nose-lung interaction in allergic rhinitis and asthma: united airways disease
[5] Rhinitis and asthma: one airway, one disease
[6] Rhinitis and asthma: united airway disease
[7] The united airways concept: from bench to bedside
[8] The ultrastructure of nasal mucosa in children with asthma
[9] Mucosal and systemic inflammatory changes in allergic rhinitis and asthma: a comparison between upper and lower airways
[10] Proteomic analysis of nasal cells from cystic fibrosis patients and non-cystic fibrosis control individuals: search for novel biomarkers of cystic fibrosis lung disease
[11] Proteomic analysis of normal human nasal mucosa: establishment of a two-dimensional electrophoresis reference map
[12] Differential proteomic analysis of nasal polyps, chronic sinusitis, and normal nasal mucosa tissues
[13] Membrane proteins and proteomics: un amour impossible?
[14] Cystic fibrosis F508del patients have apically localized CFTR in a reduced number of airway cells
[15] Identification of membrane proteins from mammalian cell/tissue using methanol-facilitated solubilization and tryptic digestion coupled with 2D-LC-MS/MS
[16] Performance metrics for liquid chromatography-tandem mass spectrometry systems in proteomics analyses
[17] ProteinOn: Web Tool for protein Semantic Similarity
[18] A simple method for displaying the hydropathic character of a protein
[19] Exponentially modified protein abundance index (emPAI) for estimation of absolute protein amount in proteomics by the number of sequenced peptides per protein
[20] Modified epithelial cell distribution in chronic airways inflammation
[21] Proteomic analysis of extracellular matrix and vesicles
[22] Osmolytes and ion transport modulators: new strategies for airway surface rehydration
[23] Human airway ion transport. Part one
[24] Structure of acid-sensing ion channel 1 at 1.9 A resolution and low pH
[25] Molecular diversity and function of K+ channels in airway and alveolar epithelial cells
[26] CFTR localization in native airway cells and cell lines expressing wild-type or F508del-CFTR by a panel of different antibodies
[27] Human-specific cystic fibrosis transmembrane conductance regulator antibodies detect in vivo gene transfer to ovine airways
[28] Mapping of cystic fibrosis transmembrane conductance regulator membrane topology by glycosylation site insertion
[29] Respiratory epithelial gene expression in patients with mild and severe cystic fibrosis lung disease
[30] Physiological importance of aquaporins: lessons from knockout mice
[31] Aquaporin water channels in transepithelial fluid transport
[32] Diversity of the mammalian sodium/proton exchanger SLC9 gene family
[33] The solute carrier 26 family of proteins in epithelial ion transport
[34] Calcium pumps in health and disease
[35] The V-type H+ ATPase: molecular structure and function, physiological roles and regulation
[36] Trafficking of the copper-ATPases, ATP7A and ATP7B: role in copper homeostasis
[37] ATP-binding cassette (ABC) transporters in normal and pathological lung
[38] Extracting sequence motifs and the phylogenetic features of SNARE-dependent membrane traffic
[39] Secretory carrier membrane proteins interact and regulate trafficking of the organellar (Na+, K+)/H+ exchanger NHE7
[40] Life of a clathrin coat: insights from clathrin and AP structures
[41] Integrating molecular and network biology to decode endocytosis
[42] Rab35-a vesicular traffic-regulating small GTPase with actin modulating roles
[43] The multiple faces of caveolae
[44] Caveolae-from ultrastructure to molecular mechanisms
[45] Molecular mechanisms of clathrin-independent endocytosis
[46] Fibronectin matrix turnover occurs through a caveolin-1-dependent process
[47] Mechanisms of endocytosis
[48] A proteomic characterization of the plasma membrane of human epidermis by high-throughput mass spectrometry
[49] Human airway epithelial tight junctions
[50] Increased permeability of asthmatic epithelial cells to pollutants. Does this mean that they are intrinsically abnormal?
[51] How epithelial cells detect danger: aiding the immune response
[52] TLR4-dependent recognition of lipopolysaccharide by epithelial cells requires sCD14
[53] Immunological functions of the pulmonary epithelium
[54] Isolation of NF-E2-related factor 2 (Nrf2), a NF-E2-like basic leucine zipper transcriptional activator that binds to the tandem NF-E2/AP1 repeat of the beta-globin locus control region
[55] Molecular mechanisms of Nrf2-mediated antioxidant response
[56] The Nrf2-antioxidant response element signaling pathway and its activation by oxidative stress
[57] The Nrf2 transcription factor contributes both to the basal expression of glutathione S-transferases in mouse liver and to their induction by the chemopreventive synthetic antioxidants, butylated hydroxyanisole and ethoxyquin
[58] Glutamate-cysteine ligase modifier subunit: mouse Gclm gene structure and regulation by agents that cause oxidative stress
[59] Proteomics of lung physiopathology
[60] Mouse models of rhinovirus-induced disease and exacerbation of allergic airway inflammation
[61] Reactive oxygen species and endothelial activation
[62] GTP-binding proteins of the Rho/Rac family: regulation, effectors and functions in vivo
[63] A molecular model for axon guidance based on cross talk between rho GTPases
[64] Cortactin localization to sites of actin assembly in lamellipodia requires interactions with F-actin and the Arp2/3 complex
[65] Analysis of the Plasmodium falciparum proteome by high-accuracy mass spectrometry
[66] Toward a protein profile of Escherichia coli: comparison to its transcription profile
[67] The human keratins: biology and pathology
[68] When cilia go bad: cilia defects and ciliopathies
[69] Host defense in oral and airway epithelia: chromosome 20 contributes a new protein family
[70] Identification of human nasal mucous proteins using proteomics
[71] Gel-based and gel-free proteomic technologies
[72] Proteomics for nasal secretion analysis
[73] Identification of human olfactory cleft mucus proteins using proteomic analysis
[74] Comparison of proteomic and transcriptomic profiles in the bronchial airway epithelium of current and never smokers
[75] Proteomics in detection and monitoring of asthma and smoking-related lung diseases
[76] Myeloperoxidase-dependent oxidative metabolism of nitric oxide in the cystic fibrosis airway
[77] Identifying Peroxidases and Their Oxidants in the Early Pathology of Cystic Fibrosis
[78] Proteomic analysis of naphthalene-induced airway epithelial injury and repair in a cystic fibrosis mouse model
[79] Function of retinoid nuclear receptors: lessons from genetic and pharmacological dissections of the retinoic acid signaling pathway during mouse embryogenesis
[80] Embryonic retinoic acid synthesis is required for forelimb growth and anteroposterior patterning in the mouse
[81] Tissue specificity and the human protein interaction network