key: cord-0762185-ry9ayyia authors: Alston, Jhullian J.; Soranno, Andrea; Holehouse, Alex S. title: Integrating single-molecule spectroscopy and simulations for the study of intrinsically disordered proteins date: 2021-04-06 journal: Methods DOI: 10.1016/j.ymeth.2021.03.018 sha: 2d3713326569890273a89beb3c248e271108b78a doc_id: 762185 cord_uid: ry9ayyia Over the last two decades, intrinsically disordered proteins and protein regions (IDRs) have emerged from a niche corner of biophysics to be recognized as essential drivers of cellular function. Various techniques have provided fundamental insight into the function and dysfunction of IDRs. Among these techniques, single-molecule fluorescence spectroscopy and molecular simulations have played a major role in shaping our modern understanding of the sequence-encoded conformational behavior of disordered proteins. While both techniques are frequently used in isolation, when combined they offer synergistic and complementary information that can help uncover complex molecular details. Here we offer an overview of single-molecule fluorescence spectroscopy and molecular simulations in the context of studying disordered proteins. We discuss the various means in which simulations and single-molecule spectroscopy can be integrated, and consider a number of studies in which this integration has uncovered biological and biophysical mechanisms. A structure-centric perspective has dominated our models of molecular function since the first folded proteins were visualized over 60 years ago [1] [2] [3] [4] . Despite this, over a third of the eukaryotic proteome consists of regions or entire proteins that do not adopt a stable structure but instead sample a conformationally heterogeneous collection of structurally distinct states referred to as a conformational ensemble (Fig. 1 ) [5] [6] [7] [8] . 
These intrinsically disordered proteins and protein regions (collectively referred to hereafter as IDRs) play a wide variety of roles that are critical for biological function [9] [10] [11] . As a result, the classical view that protein function is determined by folded proteins has expanded to recognize that function is driven by the combination of structure, conformation, and dynamics. There exists a continuum of structural heterogeneity, with well-folded hyper-stable proteins at one end and heterogeneous disordered regions at the other (Fig 1) [12] . While well-folded proteins lend themselves to various functions, including mechanical strength or enzymatic activity, disordered proteins are ideally suited for molecular recognition or biological self-assembly [9, 13] . It is this repertoire of conformational plasticity that provides cells with a complex molecular toolkit, through which adaptive and responsive function can be encoded. The three-dimensional structure of a folded domain is encoded by its primary sequence, an observation that has generally been referred to as the sequence-to-structure relationship [15] [16] [17] . Although IDRs do not adopt a set three-dimensional structure, they are far from "featureless random noodles." As such, an analogous sequence-to-ensemble relationship exists for IDRs in which the amino acid sequence of an IDR determines the conformational biases associated with its ensemble [10, 18, 19] . Just as the last four decades have focused immense attention on understanding the physical principles that map sequence to structure, the same types of questions are now being asked of disordered regions. Beyond merely an exercise in understanding physical chemistry, the conformational biases in IDRs are a central determinant of their biological function [20] [21] [22] [23] [24] . 
As such, our ability to interpret IDR function rests at least partially on how well we understand their sequence-encoded conformational biases and transient structure. A major challenge in studying conformational behavior in IDRs is posed by the structural heterogeneity and rapid dynamics associated with their ensembles. Due to the absence of a standard 'reference' structure, techniques such as X-ray crystallography are inherently limited in their ability to provide molecular information in the context of IDRs. Similar limitations can be extended to cryogenic electron microscopy (cryoEM), where class averaging across multiple particles is often limited to a few conformational subsets. While various techniques have been instrumental in elucidating the conformational behavior of IDRs, single-molecule fluorescence spectroscopy and all-atom simulations have played essential roles in contributing to our understanding of IDR conformational behavior and IDR dynamics. In this review, we focus on how combining single-molecule fluorescence spectroscopy and computational methods can provide quantitative and complementary insights into the solution state behavior of IDRs. Single-molecule fluorescence spectroscopy offers a high-resolution readout of molecular behavior, making it ideal for investigating the complexities and heterogeneity of disordered proteins [25] [26] [27] . Specifically, single-molecule fluorescence spectroscopy enables measurements of intra-and inter-chain distances and protein dynamics with high temporal and spatial resolution. Paired with an understanding of the physics that underlie protein interactions, single-molecule approaches can be used to dissect the molecular mechanisms that drive protein behavior, dynamics, and binding. As an example, fluorescence correlation spectroscopy (FCS) allows the diffusion coefficient of an IDR to be measured, from which the overall hydrodynamic radius of the protein can be estimated. 
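As a rough illustration of the FCS example above, the sketch below converts a diffusion time into a diffusion coefficient and then into a hydrodynamic radius through the Stokes-Einstein relation. It assumes free three-dimensional diffusion through a Gaussian confocal volume and the viscosity of water at 25 °C; the function names, the beam waist, and the diffusion time are hypothetical values chosen for illustration, not taken from any measurement discussed here.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def diffusion_coefficient(beam_waist_m, diffusion_time_s):
    """D = w^2 / (4 * tau_D), assuming free 3D diffusion through a Gaussian focus."""
    return beam_waist_m ** 2 / (4.0 * diffusion_time_s)


def hydrodynamic_radius(diff_coeff_m2_s, temperature_k=298.15, viscosity_pa_s=8.9e-4):
    """Stokes-Einstein radius (in m) of the equivalent sphere."""
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * diff_coeff_m2_s)


# Hypothetical numbers: 0.3 um lateral beam waist, 225 us diffusion time.
d = diffusion_coefficient(0.3e-6, 225e-6)
r_h = hydrodynamic_radius(d)
print(f"D = {d * 1e12:.0f} um^2/s, R_h = {r_h * 1e9:.2f} nm")
```

In practice the beam waist is usually calibrated with a dye of known diffusion coefficient, so the absolute numbers depend on that calibration.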
Single-molecule Förster Resonance Energy Transfer (smFRET) provides a molecular ruler to quantify intramolecular distances within the protein [27] [28] [29] , which can be directly tied to fundamental descriptors of polymer physics. Finally, many single-molecule fluorescence approaches provide access to protein dynamics, over a broad range of times, from nanoseconds to hundreds of seconds, depending on the method of choice. Readouts of protein dynamics are often essential for adequately interpreting measured transfer efficiencies in smFRET experiments, particularly when discriminating whether a population reflects a static or dynamic conformation. More generally, experimentally-derived molecular dynamics offer an additional lens through which perturbations to an IDR (mutation, binding partners, solution changes) can be examined. Molecular simulations include a robust set of tools that provide structural insight at an effectively infinite spatial resolution [30] [31] [32] [33] . By generating large conformational ensembles, protein conformation and dynamics can be directly assessed (in the case of molecular dynamics), or ensemble-averaged properties can be computed (in the case of molecular dynamics and Monte Carlo simulations). Essentially any property that can be derived from the collection of conformations can be calculated, offering a window into a wide array of molecular information. Of particular relevance in the context of disordered proteins, all-atom simulations are especially well-poised to enable a structural interpretation of experimental data as a function of some perturbations, such as mutations, post-translational modification, and changes in solution properties such as temperature, ion concentrations, or pH. [20, [34] [35] [36] [37] [38] [39] [40] . Single-molecule fluorescence spectroscopy and all-atom simulations are highly complementary. 
Both techniques can, in principle, provide information at the resolution of a single molecule and do so at high temporal resolution. As such, the types of information available from single-molecule fluorescence spectroscopy and all-atom simulations are simultaneously overlapping, yet the assumptions and limitations are inherently orthogonal. As such, results from simulations can help interpret measurements made by single-molecule fluorescence spectroscopy, and vice versa. The remainder of this review is laid out as follows. We provide a brief description of all-atom simulations and single-molecule fluorescence spectroscopy approaches used in the context of disordered proteins. We discuss theoretical approaches through which results from simulations and experiments can be formally integrated. We then consider specific examples in which simulation and experiment have been integrated to provide complementary insight. Finally, we conclude by summarising the outstanding questions and challenges. on the degrees of freedom that are explicitly encoded within that scheme (Fig. 2) . We broadly categorize all-atom simulations here as those in which every biomolecule in the system is represented with atomistic detail, providing a one-to-one mapping between a simulated and real molecule. This would include representations that encode implicit and explicit hydrogens, given in both cases a clear mapping between a given biomolecule and atomic position are present. Commonly used modern forcefields that have shown good agreement in the context of disordered proteins include Amber ff03w, Amber ff03ws, Amber ff99SBws, a99SB-disp, DES-Amber, Amber ff99SBws-STQ, CHARMM36m, and the ABSINTH implicit solvent model [43] [44] [45] [46] [47] [48] [49] [50] [51] . In contrast to all-atom simulations, coarse-grained simulations sacrifice accuracy for a reduced number of degrees of freedom, facilitating larger, longer, and faster simulations. 
Disordered proteins have been well-described by a range of different coarse-grained models, including ultra-coarse-grained models, one-bead-per residue models, or mixed-resolution models [52] [53] [54] [55] [56] [57] [58] [59] [60] [61] [62] [63] [64] [65] . While coarse-grained simulations have had remarkable success in capturing conformational behavior in disordered proteins, here we focus on all-atom simulations [66] [67] [68] [69] [70] [71] . Not only must the protein of interest be represented, so too must the solution environment. The solvent can be represented using an explicit solvent model (in which water is described as individual molecules) or an implicit solvent (in which the solvation effects are 'felt' by the molecules through a mean-field interaction) [72] . Explicit solvents are generally computationally expensive but benefit from directly capturing information related to the local solvent structure. While implicit solvents sacrifice molecular detail, the performance enhancement by reducing the number of atoms in the system by 90-99% is substantial. In the context of disordered proteins, the strength of attractive protein-water interactions has been the subject of substantial investigation, and may be one of the most important factors that determines forcefield accuracy in the context of disordered proteins [46, 47, 50, 73, 74] . The update scheme reflects how the molecules defined by the representation scheme evolve as the simulation proceeds. In Molecular Dynamics (MD) simulations, the update scheme converts changes in energy with respect to the atomistic position into force [41, 75, 76] . This force dictates the evolution of the system through a series of timesteps in which new forces are calculated and used to alter the velocity of each atom in a sequential manner. 
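The MD update scheme described above can be sketched for a single particle in one dimension. The example uses the velocity Verlet integrator, which is one common choice and is assumed here purely for illustration; the harmonic force, unit mass, and timestep are hypothetical and stand in for a full forcefield.

```python
import math


def velocity_verlet(x, v, force, mass, dt, n_steps):
    """Propagate one particle; returns the list of (position, velocity) pairs."""
    f = force(x)
    trajectory = [(x, v)]
    for _ in range(n_steps):
        v_half = v + 0.5 * dt * f / mass   # half-step velocity update from the current force
        x = x + dt * v_half                # full-step position update
        f = force(x)                       # force at the new position (-dU/dx)
        v = v_half + 0.5 * dt * f / mass   # complete the velocity update
        trajectory.append((x, v))
    return trajectory


def harmonic_force(x, k=1.0):
    """Force for the toy potential U(x) = 0.5 * k * x^2."""
    return -k * x


trajectory = velocity_verlet(x=1.0, v=0.0, force=harmonic_force, mass=1.0, dt=0.01, n_steps=1000)
x_end, v_end = trajectory[-1]
energy_end = 0.5 * v_end ** 2 + 0.5 * x_end ** 2  # should stay close to the initial 0.5
print(f"x(t=10) = {x_end:.4f}, cos(10) = {math.cos(10.0):.4f}, E = {energy_end:.5f}")
```

Production MD engines apply the same idea to every atom simultaneously, with thermostats, constraints, and neighbor lists layered on top.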
MD simulations can be used to obtain both ensemble-average values for observables of interest (e.g., end-to-end distance, the radius of gyration, local transient structure) as well as information on chain dynamics [39, 44, 77, 78] . Monte Carlo (MC) simulations differ from MD simulations with respect to the update scheme. For MC simulations, changes to the protein conformation are made in a series of Monte Carlo steps [79] . During each step, i) a random perturbation (move) to the system is applied, leading to a temporary change in protein conformation ii) the potential energy associated with the new conformation is calculated, and iii) the new conformation is either accepted or rejected depending on the change in energy compared to an acceptance criterion. Typically Monte Carlo moves include rigid body moves (e.g., translation or rotation of a molecule of interest), local moves that act on a single degree of freedom (e.g., a single dihedral angle rotation or bond stretching), or more complex moves that perturb several degrees of freedom simultaneously (e.g., in the context of concerted pivot moves or moves to perturb systems in specific ways [80] [81] [82] [83] [84] .) The acceptance criterion determines how moves are accepted or rejected. The most commonly used criterion here is Metropolis-Hastings, and when combined with an ergodic moveset that maintains detailed balance, this approach ensures that the collection of conformations generated sample the canonical (NVT) distribution [85, 86] . Standard MC simulations cannot provide information on chain dynamics as there is no time component involved in the update scheme. However, if well-sampled ensembles are generated, equilibrium distributions of various ensemble properties such as global dimensions, average distances between residues, or transient structure can be obtained [24, 35, 87] . There are several caveats associated with the interpretation of intrinsically disordered proteins with all-atom simulations. 
One area that has received considerable attention is that of force field accuracy [44, 50, 74, 88] . Obtaining the correct balance of attractive and repulsive atomic interactions and dihedral angle distributions is an inherently challenging problem. For IDRs in particular, small inaccuracies can have a substantial impact on the final conformational ensemble due to the metastable nature of residual structure in IDRs. Many standard force fields lead to over-compaction of IDRs, both distorting the final ensemble and introducing local kinetic traps that can impair conformational sampling [46, 47, 88] . There has been a substantial effort over the last decade to address this challenge with IDRs in mind, with notable work from several key players including Best, Mittal, and Piana [31, [43] [44] [45] [46] [47] [48] [49] [50] 73, 74, [88] [89] [90] [91] . A related but distinct challenge is that of conformational sampling. The heterogeneous conformational landscape of an IDR means that the total number of energetically accessible conformations is vast, and much larger than for a folded protein of comparable size. Given that conformational rearrangement takes time, there is a real and practical challenge for MD simulations: adequate sampling in unbiased simulations will typically require many microseconds of simulation time, even in the best-case scenario where there are no kinetic traps. Unfortunately, this requirement is often forgotten, with simulations run as a single replica for just a few hundred nanoseconds. Such simulations can only explore a small slice of conformational space and will inevitably lead to biased or noisy conclusions. As mentioned above, simulations of IDRs also often experience kinetic traps, i.e., long-lived metastable states that impede conformational exploration. Both MD and MC simulations can suffer from these metastable states (Fig. 3) . 
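The effect of a kinetic trap on sampling can be illustrated with a toy Metropolis Monte Carlo sampler. The sketch below is not an IDR simulation: it assumes a one-dimensional double-well potential (a hypothetical stand-in for two metastable states), a symmetric random-displacement move, and the Metropolis acceptance criterion, and it counts how often the walker crosses the barrier top as the barrier is raised.

```python
import math
import random


def double_well(x, barrier):
    """Toy potential with minima at x = -1 and x = +1 and a barrier of the given height at x = 0."""
    return barrier * (x * x - 1.0) ** 2


def count_crossings(barrier, n_steps=20000, step_size=0.5, kT=1.0, seed=1):
    """Metropolis sampling; returns (number of sign changes of x, acceptance ratio)."""
    rng = random.Random(seed)
    x = 1.0
    energy = double_well(x, barrier)
    crossings = 0
    accepted = 0
    for _ in range(n_steps):
        x_new = x + rng.uniform(-step_size, step_size)   # i) random perturbation
        e_new = double_well(x_new, barrier)              # ii) energy of the trial state
        delta = e_new - energy
        # iii) Metropolis criterion: always accept downhill, uphill with prob. exp(-dE/kT)
        if delta <= 0.0 or rng.random() < math.exp(-delta / kT):
            if (x_new > 0.0) != (x > 0.0):
                crossings += 1
            x, energy = x_new, e_new
            accepted += 1
    return crossings, accepted / n_steps


low_barrier_crossings, _ = count_crossings(barrier=1.0)
high_barrier_crossings, _ = count_crossings(barrier=10.0)
print(low_barrier_crossings, high_barrier_crossings)
```

With the higher barrier the walker remains almost entirely in its starting well for the same number of steps, even though the move acceptance ratio can still look healthy, which is the sense in which a trapped simulation can give the illusion of good sampling.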
In the context of MC simulations, structurally cooperative energetic minima raise a specific challenge, whereby the probability of the specific move(s) necessary for escape becomes vanishingly small. In the context of MD simulations, large energetic barriers between distinct states can yield slow conformational rearrangements, so that locally trapped states dominate the ensemble. Even long MD or MC simulations may only sample a small region of phase space due to spending large fractions of the simulation in a single state. In both cases, local conformational traps can lead to disparate levels of conformational sampling along a single polypeptide, with locally trapped structural 'nuggets' that may give the illusion of good sampling. All told, substantial care should be taken when assessing ensembles for goodness of sampling [92, 93] .

Förster Resonance Energy Transfer (FRET) is a non-radiative energy transfer that can occur when the emission band of one fluorophore (the donor) overlaps in part with the absorption band of the other fluorophore (the acceptor), and the two fluorophores are in proximity to one another. FRET provides a spectroscopic ruler to measure distances between specific positions on a molecule of interest [94] , such as a disordered protein (Fig. 4a) . As derived by Förster, the rate of energy transfer, denoted here as $k_{FRET}$, depends on the inverse sixth power of the distance $r$ between the two fluorophores [95] ,

$k_{FRET}(r) = k_D \left( R_0 / r \right)^6$ (1)

Here $k_D$ is the inverse of the intrinsic fluorescence lifetime of the donor $\tau_D$ (i.e., in the absence of the acceptor) and $R_0$ is the Förster radius,

$R_0^6 = \frac{9000 \ln(10) \, \kappa^2 Q_D J}{128 \pi^5 n^4 N_A}$ (2)

where $Q_D$ is the fluorescence quantum yield of the donor, $n$ is the refractive index of the solution, $J$ is the spectral overlap integral, $N_A$ is Avogadro's constant, and $\kappa$ is the dipole orientation factor, which reports on the relative orientation of the dyes. 
The efficiency of the energy transfer $E(r)$ can be computed by comparing the rate of the transfer $k_{FRET}$ with the other radiative and non-radiative relaxation rates (in the absence of acceptor) from the excited state to the ground state of the donor, $k_{rad}$ and $k_{nrad}$,

$E(r) = \frac{k_{FRET}(r)}{k_{FRET}(r) + k_{rad} + k_{nrad}} = \frac{1}{1 + (r/R_0)^6}$ (3)

using Eq. 1 and the fact that $k_{rad} + k_{nrad} = k_D$. In single-molecule experiments, the transfer efficiency can be measured by comparing the number of acceptor photons $n_A$ with the total number of acceptor ($n_A$) and donor ($n_D$) photons,

$E(r) = \frac{n_A}{n_A + n_D}$ (4)

or by measuring the change in the lifetime of the donor in the presence and absence of the acceptor,

$E(r) = 1 - \frac{\tau_{DA}(r)}{\tau_D}$ (5)

where

$\tau_{DA}(r) = \left( k_{FRET}(r) + k_{rad} + k_{nrad} \right)^{-1}$ (6)

It is important to note that only a small number of photons are detected in a typical experiment. The low number of photons is determined by the relatively long interval between the detection of two consecutive photons (interphoton time), which is largely due to the fluorophores being trapped in long-lived dark states (e.g., triplet states on the microsecond timescale) after excitation. Therefore, the measurement of transfer efficiencies is affected by shot noise [96] . This means that, even when measuring a rigid distance across a folded domain where a single transfer efficiency is expected, a distribution of transfer efficiencies will be observed, from which the mean and width can usually be extracted. The mean value of a shot-noise limited distribution reports on the configuration of the chain. For a rigid protein, this will coincide with a single distance as follows from Eq. 3. For IDRs, this mean value reports on the average value of transfer efficiency across the multiple conformations of the protein, and the factors that determine the average transfer efficiency are detailed below. 
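The relations above, together with the effect of shot noise on a rigid distance, can be sketched in a few lines. The example assumes already-corrected photon counts, a fixed number of photons per burst, and that each detected photon is independently an acceptor photon with probability equal to the true transfer efficiency (a binomial model); the Förster radius and burst parameters are hypothetical.

```python
import math
import random


def fret_efficiency(r, r0):
    """Transfer efficiency for a fixed donor-acceptor distance r (same units as r0)."""
    return 1.0 / (1.0 + (r / r0) ** 6)


def efficiency_from_photons(n_acceptor, n_donor):
    """Apparent transfer efficiency from (already corrected) photon counts."""
    return n_acceptor / (n_acceptor + n_donor)


def simulate_bursts(true_e, photons_per_burst, n_bursts, seed=0):
    """Apparent efficiencies for a rigid distance under binomial photon partitioning."""
    rng = random.Random(seed)
    values = []
    for _ in range(n_bursts):
        n_a = sum(1 for _ in range(photons_per_burst) if rng.random() < true_e)
        values.append(efficiency_from_photons(n_a, photons_per_burst - n_a))
    return values


r0 = 5.4  # nm, hypothetical Förster radius
true_e = fret_efficiency(5.4, r0)  # r equal to R0
bursts = simulate_bursts(true_e, photons_per_burst=50, n_bursts=4000)
mean_e = sum(bursts) / len(bursts)
std_e = math.sqrt(sum((e - mean_e) ** 2 for e in bursts) / len(bursts))
print(f"true E = {true_e:.2f}, mean = {mean_e:.3f}, width = {std_e:.3f}")
```

Even though a single distance is simulated, the apparent efficiencies form a distribution whose width is set by the number of photons per burst, here close to the binomial value sqrt(E(1 - E)/N).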
The width of a shot-noise limited distribution depends on the average total number of detected photons according to,

$\sigma_{SN}^2 = \langle E \rangle \left( 1 - \langle E \rangle \right) \langle 1/N \rangle \leq \frac{\langle E \rangle \left( 1 - \langle E \rangle \right)}{N_T}$ (7)

where $\langle 1/N \rangle$ is the average of the inverse number of photons in a burst and $N_T$ is the minimum number of photons in a burst (usually set as the acceptance threshold for burst identification) [97] . Because this width is set by photon statistics alone, determining whether a single population in the transfer efficiency distribution represents a static, rigid distance (as for folded domains) or a dynamic, flexible polymer (as for IDRs) requires an orthogonal measure. Particularly helpful in this context are measurements that report on chain dynamics, and many single-molecule fluorescence approaches provide access to protein dynamics, including the analysis of transfer efficiencies vs. fluorescence lifetimes, transfer efficiencies vs. time binning, burst variance [98] , the use of Probability Distribution Analysis (PDA) [99] [100] [101] , and analysis of photon trajectories of immobilized molecules [23, [102] [103] [104] [105] [106] [107] . Since the measured transfer efficiency is an average over a given interval of time, the measured dynamics will reflect the conformational changes occurring on the characteristic timescale of observation. The diffusion time of molecules in the confocal volume and the camera detection rate in TIRF microscopy set an intrinsic timescale of reference for the corresponding measurements, usually in the range of milliseconds. Another timescale is given by the time bin used to analyze the data. There are no special limitations in the range of binning times that can be applied besides the intrinsic limitations due to the detection rate, whether related to the acquisition rate of the instrument (e.g., camera frame rate) or to the emission rate of the fluorophores (e.g., only a limited number of photons are observed in freely diffusing molecules). However, the choice of bin width dictates the averaging of FRET information over the selected time range. 
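The consequence of the bin width can be sketched numerically with a molecule that hops between two states of different transfer efficiency. The example is a deliberately simplified model: shot noise is ignored, the hopping is a discrete-time two-state process, and the time step, dwell time, and efficiencies are hypothetical values chosen for illustration.

```python
import random


def two_state_trace(n_steps, switch_prob, e_low=0.2, e_high=0.8, seed=0):
    """Transfer efficiency per time step for a molecule hopping between two states."""
    rng = random.Random(seed)
    state_high = False
    trace = []
    for _ in range(n_steps):
        if rng.random() < switch_prob:
            state_high = not state_high
        trace.append(e_high if state_high else e_low)
    return trace


def binned_means(trace, bin_size):
    """Average the trace over consecutive, non-overlapping bins of bin_size steps."""
    n_bins = len(trace) // bin_size
    return [sum(trace[i * bin_size:(i + 1) * bin_size]) / bin_size for i in range(n_bins)]


def variance(values):
    """Population variance of a list of numbers."""
    mean = sum(values) / len(values)
    return sum((v - mean) ** 2 for v in values) / len(values)


# Hypothetical settings: 10 us time step, mean dwell time of 1 ms in each state.
trace = two_state_trace(n_steps=200000, switch_prob=0.01)
var_fast_bins = variance(binned_means(trace, 10))      # 0.1 ms bins, shorter than the dwell time
var_slow_bins = variance(binned_means(trace, 10000))   # 100 ms bins, longer than the dwell time
print(f"variance with 0.1 ms bins: {var_fast_bins:.4f}, with 100 ms bins: {var_slow_bins:.4f}")
```

With short bins the binned efficiencies cluster near the two state values and the variance is large; with long bins the two states average into a single narrow population.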
As a tangible example of what this averaging can mean, let us consider the case of two conformational states with distinct transfer efficiencies. When exchange dynamics are slower than the binning time, the two states will appear as separated peaks with distinct mean transfer efficiencies and widths. When dynamics are faster than the binning time, the transfer efficiencies associated with the two states will be averaged together, giving rise to a single population. When using intermediate binning times, a partial averaging of the two populations occurs. Therefore, analysis of transfer efficiency histograms as a function of time binning can provide insights into conformational changes and dynamics [103, 108, 109] . When the distribution of transfer efficiencies is broader than shot noise, Probability Distribution Analysis (PDA) can provide insights into the underlying populations as well as interconversion between different states [99] [100] [101] . The method appears to be most sensitive to interconversion occurring on timescales between 0.01 and 10 times the burst duration [110] . Whereas PDA considers the differences in transfer efficiency among all the detected molecules, Burst Variance Analysis (BVA) quantifies how the transfer efficiency changes inside each molecule (burst) over time [98] . Consequently, BVA provides a measure of dynamics on timescales longer than the minimum binning of photons required to compute the transfer efficiency variance within the burst. Analysis of the photon trajectory with maximum likelihood methods does not require time binning and can provide access to fast dynamics (up to the microsecond timescale) by studying the statistics of detected photons [111] . Another intrinsic timescale in single-molecule measurements is the fluorescence lifetime of the fluorophore, which is typically in the nanosecond range. Therefore, contrasting the donor lifetime in the presence of the acceptor (Eq. 
5) with the transfer efficiency determined from the number of acceptor and donor photons detected in a burst (Eq. 4) provides a useful test for the occurrence of dynamics that are fast compared to the burst duration. Indeed, Eq. 5 provides information on the transfer efficiency adopted by the system on the lifetime timescale [112] [113] [114] [115] . Instead, Eq. 4 computes transfer efficiencies from the number of donor and acceptor photons detected in a burst and, therefore, probes the transfer efficiency on the timescale associated with burst duration (or with the data binning time). The burst duration of freely diffusing species is commonly on the millisecond timescale. In the case of a rigid distance, we expect an identical transfer efficiency on the nanosecond and millisecond timescale probed by lifetime and bursts, respectively, as indicated by the linear relation between the two terms in Eq. 4 and 5. As a result, the measured static distribution should fall on the corresponding predicted linear trend. A deviation from this linear behavior is expected when the molecule of interest samples a broad conformational ensemble on a timescale longer than nanoseconds but shorter than milliseconds, as in the case of many IDRs [111, 116, 117] ,

$\tau_{DA} / \tau_D = 1 - E + \frac{\sigma^2}{1 - E}$ (8)

where $\sigma^2$ represents the variance of the transfer efficiency due to fluctuations in the donor-acceptor distance. A similar dependence can also be found when studying the characteristic delay of the acceptor emission [111] . If we denote $P(r)$ as the distribution of interdye distances and we assume the interdye dynamics are slower than the dye tumbling but significantly faster than the interphoton times, we can compute the average $\tau_{DA}$ from the dynamic distribution as defined by,

$\langle \tau_{DA} \rangle = \frac{\int_0^{\infty} t \, I(t) \, dt}{\int_0^{\infty} I(t) \, dt}$ (9)

where $I(t)$ is the time-dependent fluorescence intensity and is given by [115] ,

$I(t) = \int_0^{\infty} P(r) \, e^{-t/\tau_{DA}(r)} \, dr$ (10)

By integrating over the distance r, Eq. 
10 assumes that the lifetime decay occurs faster than the conformational change in $r$ as sampled by the distribution of distances given by $P(r)$. The corresponding mean transfer efficiency is computed as,

$\langle E \rangle = \int_0^{l_c} E(r) P(r) \, dr$ (11)

where $l_c$ is the contour length between the dyes if the protein segment was fully extended. Dye orientation is commonly described in terms of the orientation factor $\kappa^2$, with the typical result of $\kappa^2 = 2/3$ for isotropic orientation of the fluorophores [25, 27] . This is commonly valid if the dye tumbling is faster than the protein dynamics. However, if the dynamics of the protein are instead much faster than the tumbling of the dyes, the relative orientation of the dyes becomes coupled to the transfer efficiency. Under this regime, the mean transfer efficiency is given by combining the distribution of distances sampled by the protein and the distribution of $\kappa^2$ sampled by the fluorophores with the dependence of the transfer efficiency on distance and $\kappa^2$:

$\langle E \rangle = \int_0^4 \int_a^{l_c} E(r, \kappa^2) P(r) \, p(\kappa^2) \, dr \, d\kappa^2$ (12)

where $a$ is the contact radius between the dyes, $P(r)$ is the inter-dye probability distribution as described previously, and $E(r, \kappa^2)$, the dependence of the transfer efficiency on $\kappa^2$, is given by,

$E(r, \kappa^2) = \left( 1 + \frac{2}{3 \kappa^2} \left( r / R_0 \right)^6 \right)^{-1}$ (13)

and the probability distribution $p(\kappa^2)$ is given by:

$p(\kappa^2, \, 0 \leq \kappa^2 \leq 1) = \frac{1}{2 \sqrt{3 \kappa^2}} \ln \left( 2 + \sqrt{3} \right)$ (14)

$p(\kappa^2, \, 1 \leq \kappa^2 \leq 4) = \frac{1}{2 \sqrt{3 \kappa^2}} \ln \left( \frac{2 + \sqrt{3}}{\sqrt{\kappa^2} + \sqrt{\kappa^2 - 1}} \right)$ (15)

Analogously, if the chain dynamics are faster than or comparable to the fluorescence lifetime, the energy transfer rate will depend on the distribution of states sampled by labeled molecules,

$\langle E \rangle = \frac{\int_a^{l_c} \left( R_0 / r \right)^6 P(r) \, dr}{1 + \int_a^{l_c} \left( R_0 / r \right)^6 P(r) \, dr}$ (16)

where, as before, $l_c$ is the contour length of the chain and $a$ the dye-dye contact radius. Experimentally, time-resolved lifetime and anisotropy measurements can provide information on the tumbling rate of the fluorophores [118] , and more extensive discussion of the influence of the different timescales at play on transfer efficiency histograms can be found in the fundamental works of Gopich and Szabo [96, 97, 112, 119] . 
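As a consistency check on the lifetime relations in this section, the dynamic relation between lifetime and transfer efficiency (Eq. 8) can be recovered from Eqs. 9 and 10. The short derivation below assumes that $P(r)$ is normalized, that the donor lifetime at fixed distance is $\tau_{DA}(r) = \tau_D [1 - E(r)]$ (Eqs. 5 and 6), and that angle brackets denote averages over $P(r)$, with $\sigma^2 = \langle E^2 \rangle - \langle E \rangle^2$.

```latex
\begin{align}
\int_0^\infty I(t)\,dt &= \int P(r)\,\tau_{DA}(r)\,dr
  = \tau_D \,\langle 1 - E \rangle \\
\int_0^\infty t\,I(t)\,dt &= \int P(r)\,\tau_{DA}(r)^2\,dr
  = \tau_D^2 \,\langle (1 - E)^2 \rangle \\
\frac{\langle \tau_{DA} \rangle}{\tau_D}
  &= \frac{\langle (1 - E)^2 \rangle}{\langle 1 - E \rangle}
  = \frac{(1 - \langle E \rangle)^2 + \sigma^2}{1 - \langle E \rangle}
  = 1 - \langle E \rangle + \frac{\sigma^2}{1 - \langle E \rangle}
\end{align}
```

For a single rigid distance $\sigma^2 = 0$ and the static line $\tau_{DA}/\tau_D = 1 - E$ is recovered; any distance heterogeneity sampled between the lifetime and burst timescales shifts the point above that line.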
Finally, the functional form of the inter-dye probability distribution $P(r)$ is typically approximated using simple polymer models or inferred from molecular simulations. While the mean transfer efficiency can be used to constrain the mean value of the distribution, the variance of transfer efficiency fluctuations $\sigma^2$ can be used as a further constraint [120] for the distribution given that,

$\sigma^2 = \langle E^2 \rangle - \langle E \rangle^2 = \int_0^{l_c} E(r)^2 P(r) \, dr - \left( \int_0^{l_c} E(r) P(r) \, dr \right)^2$ (17)

Various closed-form analytical and numerical models have been applied to describe FRET data, including the freely jointed (or Gaussian) chain, worm-like chain, and the self-avoiding walk [27, [121] [122] [123] [124] . The popularity of these models is largely due to the fact that each relies on a single fitting parameter, enabling association of the mean transfer efficiency with a mean square distance, persistence length, or excluded volume term. While the worm-like chain and self-avoiding walk distributions provide descriptive parameters to capture excluded volume effects (and repulsive interactions in general), more advanced polymer models are required to capture the transition from good to poor solvent often observed by tuning solution conditions (e.g., denaturant), temperature, or by altering the sequence [125] [126] [127] [128] [129] . Ziv et al. adapted the coil-to-globule theory of Sanchez, introducing a conversion factor between the mean radius of gyration and the corresponding distribution of end-to-end distances [121, 130, 131] . More recently, by comparing single-molecule FRET and small-angle X-ray scattering (SAXS) data with simulations, Zheng et al. have proposed an empirical adaptation of the self-avoiding walk distance distribution that depends on the solvent quality through the scaling exponent ν [120, 132] . These polymer models have been employed extensively to study disordered and unfolded proteins in many different contexts, where they have shown remarkable success [34, 35, [133] [134] [135] [136] . 
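To illustrate how a single-parameter polymer model links a mean transfer efficiency to a chain dimension, the sketch below uses the Gaussian chain distribution and inverts the mean efficiency by bisection to obtain a root-mean-square inter-dye distance. It is a simplified example: the integral is carried to six times the root-mean-square distance rather than to the contour length, $\kappa^2 = 2/3$ is assumed to be absorbed into $R_0$, and the function names and numerical values are hypothetical.

```python
import math


def gaussian_chain_pdf(r, rms):
    """Distance distribution of an ideal (Gaussian) chain with root-mean-square distance rms."""
    a = 3.0 / (2.0 * rms * rms)
    return 4.0 * math.pi * r * r * (a / math.pi) ** 1.5 * math.exp(-a * r * r)


def mean_efficiency(rms, r0, n_points=4000):
    """<E> = integral of E(r) P(r) dr, by the trapezoidal rule up to 6 * rms."""
    r_max = 6.0 * rms
    dr = r_max / n_points
    total = 0.0
    for i in range(n_points + 1):
        r = i * dr
        weight = 0.5 if i in (0, n_points) else 1.0
        total += weight * gaussian_chain_pdf(r, rms) / (1.0 + (r / r0) ** 6)
    return total * dr


def rms_from_efficiency(target_e, r0, lo=0.1, hi=50.0, tol=1e-6):
    """Bisection on rms, using the fact that <E> decreases as the chain expands."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if mean_efficiency(mid, r0) > target_e:
            lo = mid   # chain too compact, efficiency too high
        else:
            hi = mid
    return 0.5 * (lo + hi)


r0 = 5.4  # nm, hypothetical Förster radius
rms = rms_from_efficiency(0.6, r0)  # hypothetical measured mean efficiency
print(f"<E> = 0.60 corresponds to a root-mean-square distance of {rms:.2f} nm")
```

The same inversion can be repeated with any other one-parameter $P(r)$ by swapping out `gaussian_chain_pdf`; the inferred distance depends on the model chosen.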
Fluorescence correlation spectroscopy (FCS) is a powerful complementary tool to smFRET that measures the correlations of fluorescence fluctuations caused by diffusion and dynamics of labeled molecules as well as other photophysical effects [137] [138] [139] [140] [141] . This correlation can be computed as,

$G(\tau) = \frac{\langle I(t) \, I(t + \tau) \rangle_t}{\langle I(t) \rangle_t^2}$ (18)

where $\langle ... \rangle_t$ represents the average over all measured times, $\tau$ is the lag time at which the correlation is computed, and $I(t)$ and $I(t + \tau)$ are the fluorescence intensities at times $t$ and $(t + \tau)$. When applied to single-photon counting measurements, the expression in Eq. 18 can be interpreted as the joint probability of observing a photon at time $t$ and $(t + \tau)$ compared to the joint probability of observing two photons at any time,

$G(\tau) = \frac{p(\text{photon at } t \text{ and } t + \tau)}{p(\text{photon at any } t)^2}$ (19)

Eq. 19 provides an intuitive way to understand how the correlation decay in FCS relates to molecular diffusion through the confocal volume or other physical properties. If the lag time is shorter than the average residence time of a molecule in the confocal volume, the joint probability of observing two photons that are separated by that given lag time will be high since they are emitted by the same molecule. When the lag time increases and approaches the average residence time of the molecule in the confocal volume, the decrease in the joint probability reflects the increased probability of the emitting molecule exiting the confocal volume without being immediately replaced by a new one. Ultimately, if the lag time is much longer than the average residence time of the molecule, the joint probability of observing two photons at times $t$ and $(t + \tau)$ will be identical to the probability of observing two photons at any time. Therefore, for very long lag times, the correlation (as described by Eq. 18 and Eq. 19) tends to unity. The same reasoning can be applied to understand the correlation decay connected to photophysical effects that result in dark states (e.g., quenching or triplet states). 
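The normalized correlation described above can be estimated directly from a binned intensity trace. The sketch below is a naive estimator, assuming equally spaced time bins and lags expressed in units of the bin width; hardware correlators and multiple-tau schemes used in practice are more elaborate, and the traces shown are toy data.

```python
def autocorrelation(intensity, max_lag):
    """G(k) = <I(t) I(t+k)>_t / <I(t)>_t^2 for lags k = 1..max_lag (in bins)."""
    n = len(intensity)
    mean = sum(intensity) / n
    g = []
    for lag in range(1, max_lag + 1):
        # Average the product over every pair of bins separated by this lag.
        products = [intensity[t] * intensity[t + lag] for t in range(n - lag)]
        g.append(sum(products) / len(products) / (mean * mean))
    return g


constant_trace = [3] * 10        # no fluctuations: correlation stays at unity
blinking_trace = [2, 0] * 4      # toy on/off signal with a period of two bins
print(autocorrelation(constant_trace, 3))
print(autocorrelation(blinking_trace, 2))
```

A trace without fluctuations gives unity at all lags, whereas the toy on/off signal is anti-correlated at a lag of one bin and correlated at a lag of two bins.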
To better understand the properties of the correlation function, we can express the intensity as,

$I(t) = \sum_j i_j(t) + I_{bg}(t)$ (20)

with $i_j(t)$ and $I_{bg}(t)$ being the intensity of a single fluorophore and the background intensity at time $t$, respectively. Under this description, the correlation function from Eq. 18 adopts the form, Assuming that $\langle I_{bg}(t) \rangle_t$ has a negligible contribution compared to other quantities, Eq. 21 reduces to, (22) where the amplitude of the correlation clearly depends on the inverse of the average number of molecules $N$ observed in the confocal volume. As implied by Eq. 22, FCS is not exclusively restricted to the single-molecule regime and is often applied in conditions under which multiple molecules diffuse through the confocal volume. Importantly, when measurements are performed at sufficiently low concentrations of $N$ molecules, the background term may contribute to the correlation amplitude and, if not accounted for, can affect a proper determination of $N$. The ability to function at extremely low concentrations makes FCS an ideal technique in the context of IDRs that are prone to undergo self-assembly [142, 143] . Nanosecond FCS (nsFCS) extends conventional FCS to sub-microsecond timescales by distributing photons across multiple detectors. The application of multiple detectors circumvents the intrinsic limitations (dead time, afterpulsing) that affect the correlation on individual detectors. Access to the sub-microsecond timescales allows the assessment of the contribution of static quenching (e.g., caused by dye-residue and dye-dye stacking), protein dynamics, and other photophysical effects [134, 144, 145] . Of particular interest in the context of IDRs is the application of nsFCS to provide an estimate of chain conformational dynamics. 
ns-FRET-FCS provides a measure of the protein dynamics through the characteristic correlated relaxation in the donor-donor and acceptor-acceptor correlations and the anti-correlated relaxation of the donor-acceptor cross-correlation. The anti-correlated decay directly reflects the intrinsically anticorrelated nature of FRET, where an increase in acceptor emission corresponds to a decrease in donor emission and vice versa. When performed at the single-molecule level in a subpopulation-specific manner, the amplitude of the dynamic component of the correlation (in the absence of quenching) is directly related to the variance of transfer efficiency fluctuations in the solution [146, 147]. In the corresponding expression for the correlation functions, $A_{ij}$ is an amplitude component related to the number of fluorescent molecules in the confocal volume, $c_{AB}$ is the antibunching amplitude, τ is the lag time between the two detected photons, $\tau_{AB}$ is the correlation time of the antibunching component, $c_T$ is the amplitude of the triplet component, $\tau_T$ is the correlation time of the triplet component, and $\tau_b$ is the correlation time associated with chain dynamics. It is important to stress that the relaxation time $\tau_b$ of these three correlations represents a FRET-filtered value of the real reconfiguration time of the protein. Gopich et al. determined a simple correction factor that enables the extraction of the reconfiguration time of the protein [146]. This reconfiguration time can be directly linked to polymer quantities such as the characteristic times derived in Rouse and Zimm models [148] [149] [150] [151] [152]. Finally, since this approach provides access to the variance in the transfer efficiency fluctuations, it can be combined with single-molecule FRET and lifetime measurements to infer properties of the distribution of transfer efficiencies.
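The correlated and anti-correlated signatures described above can be illustrated with a deliberately simplified sketch. It assumes a single molecule whose transfer efficiency hops between two hypothetical values (0.3 and 0.7) with a fixed per-bin switching probability, and it ignores antibunching, triplet blinking, diffusion, and background. The function name `correlate` and all numerical values are assumptions for the example.

```python
import numpy as np

def correlate(a, b, lags):
    """Normalized correlation <a(t) b(t+tau)> / (<a><b>) for integer lags (in bins)."""
    a = np.asarray(a, dtype=float)
    b = np.asarray(b, dtype=float)
    norm = a.mean() * b.mean()
    return np.array([np.mean(a[:-k] * b[k:]) / norm for k in lags])

rng = np.random.default_rng(1)
n_bins = 200_000

# Hypothetical two-state chain dynamics: the transfer efficiency switches
# between 0.3 and 0.7 with probability 0.02 per time bin.
flips = rng.random(n_bins) < 0.02
state = np.cumsum(flips) % 2
efficiency = np.where(state == 0, 0.3, 0.7)

# Photon counts per bin: donor emission falls as acceptor emission rises.
brightness = 10.0  # assumed mean total photons per bin
donor = rng.poisson(brightness * (1.0 - efficiency))
acceptor = rng.poisson(brightness * efficiency)

lags = np.array([1, 10, 100, 1000])
g_dd = correlate(donor, donor, lags)        # donor-donor autocorrelation
g_aa = correlate(acceptor, acceptor, lags)  # acceptor-acceptor autocorrelation
g_da = correlate(donor, acceptor, lags)     # donor-acceptor cross-correlation

for k, dd, aa, da in zip(lags, g_dd, g_aa, g_da):
    print(f"lag {k:5d}: G_DD = {dd:.3f}, G_AA = {aa:.3f}, G_DA = {da:.3f}")
```

At short lags the two autocorrelations lie above 1 and the cross-correlation lies below 1, and all three relax towards 1 on the same timescale, set here by the switching probability. In a real measurement this relaxation would be superimposed on the antibunching and triplet components discussed in the text.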
Single-molecule contact formation dynamics can also be probed using photoinduced electron transfer (PET) between a single fluorophore and an aromatic residue (or other quencher attached to the protein) [134, 143, 145, 153, 154]. In PET-FCS experiments, the fluorophore forms transient static complexes with the quencher. Therefore, the amplitude $c_q$ and characteristic time $\tau_q$ associated with the static quenching in the correlation contain information on both the on- and off-rate of contact formation: $\tau_q = 1/(k_{on} + k_{off})$ and $c_q = k_{on}/k_{off}$. Importantly, static quenching is not diffusion limited, such that the on-rate must be calibrated with a known diffusion-limited quenching process to extract the real on-rate of contact formation. For comparison, the dynamic quenching between dyes and aromatic residues, as measured by changes in the fluorescence lifetime, has been found to be very close to the diffusion-limited regime and offers a convenient strategy for extracting correction factors for reaction-limited quenching. Furthermore, the on-rate of contact formation as measured in PET-FCS experiments can be related to the reconfiguration time measured by ns-FRET-FCS when computing the first passage time of the corresponding polymer model [134, 155, 156]. In the scenario where internal friction dominates the protein dynamics, Cheng et al. proposed a convenient equation where the contact time $\tau_c^{IF}$ is computed by using the Szabo-Schulten-Schulten theory [157] in terms of 1D diffusion in a potential of mean force for the Rouse and Zimm models with internal friction [150]. This leads to a remarkably simple expression in which $\tau_i$ is the internal friction characteristic time, R is the root-mean-square separation between the dye and the quencher, and $R_c$ is the contact radius for quenching. Recent cross-lab verification demonstrated that smFRET can provide highly reproducible results across different laboratories when the instruments are properly calibrated [158].
Calibration of experimental setups can be obtained by measuring reference samples that provide an estimate of the excitation and detection efficiency of the detectors and correct for the different quantum yields of the fluorophores [25, 27] . An elegant solution has recently been proposed [158] [159] [160] [161] and relies on the use of alternating-laser excitation (ALEX) [162, 163] or pulsed interleaved excitation (PIE) [164, 165] . In brief, fluorescence detection of donor-only and acceptor-only molecules provides insights on the direct excitation of acceptor and cross talk, while a comparison of the stoichiometry ratio of donor-acceptor labeled molecules as a function of transfer efficiency (e.g., polyproline or other systems of interest) enables estimates of the relative corrections for detection efficiency and quantum yield across the donor and acceptor channels. Investigating the dependence of the stoichiometry ratio vs. transfer efficiency requires either multiple samples with different mean transfer efficiency or altering transfer efficiency by changing the solution conditions of the same sample, although it should be noted that altered solution conditions may alter the quantum yield of the fluorophores or introduce quenching, which may further complicate this analysis. An important decision in designing smFRET experiments is the choice of the experimental strategy, e.g., whether one is investigating freely diffusing or immobilized molecules. Which approach to take is determined by several factors, including the accessible experimental setup and the biophysical or biochemical question being addressed. A common solution for the investigation of immobilized molecules is the use of Total Internal Reflection Fluorescence (TIRF). TIRF microscopy relies on evanescent illumination of samples tethered to the surface [166, 167] , reducing background fluorescence from labeled molecules in solution. 
TIRF microscopy often uses camera-based detection, enabling the simultaneous observation of multiple molecules and the study of out-of-equilibrium kinetics. Confocal single-molecule fluorescence microscopy enables measurements of both freely diffusing and immobilized molecules. The use of single-photon counting avalanche photodiodes and Time Correlated Single Photon Counting (TCSPC) electronics provides access to fast dynamics, kinetics, and photophysical properties of the system, such as triplet and fluorescence lifetimes. Owing to the high temporal resolution, confocal single-molecule fluorescence experiments have captured even rare events such as the transition path time from a folded to an unfolded state or from a bound to an unbound state [23, 168]. Several approaches have been developed to enable the investigation of higher concentration regimes and out-of-equilibrium phenomena in confocal setups. For example, zero-mode waveguides have been used to extend the concentration boundaries of single-molecule confocal detection up to micromolar concentrations [169, 170]. Similarly, microfluidic devices with fast mixing allow the kinetics of the system of interest to be followed at the single-molecule level, from hundreds of microseconds up to tens of seconds [171] [172] [173] [174] [175] [176] [177]. Recurrence analysis of single particles (RASP) also captures the kinetics of freely diffusing molecules by identifying those molecules that, after passing through the confocal volume, re-enter it. By studying how the conformations of these molecules change at different lag times, information on kinetics can be reconstructed [178, 179]. Once the experimental setup and strategy have been chosen, the next step is the selection of appropriate labeling positions.
The average Förster radius across the fluorophores suitable for single-molecule FRET lies between 5 and 7 nm, limiting the sensitivity of the method to distances larger than approximately 2-3 nm and smaller than 10 nm (see Fig. 4c). While knowledge of the protein structure allows dye placement to be tailored in folded proteins, the choice of label position is more difficult when studying IDRs, since the sequence properties of the chain can significantly alter the root-mean-square interdye distance. A separation of approximately 50-60 amino acids provides an appropriate dynamic range for sequences with a broad range of charge compositions, ranging from expanded polyelectrolytes to collapsed polyampholytes [180]. It is important to note that proline-rich sequences can adopt very extended configurations [38, 181]. Sampling different interdye positions within the same IDR can further improve the ability to quantify the dependence of the interdye distance on the sequence length of the measured segment, providing access to the associated scaling exponent [20, 65, 133, 182]. As mentioned, an estimate of the expected distance between a pair of residues can be derived using appropriate polymer models or simulations [66, 67, 109, 180, 181, 183, 184, 185, 186, 187]. The amino acid sequence raises additional constraints with respect to the optimal strategy for labeling. Both FCS and FRET measurements rely on covalently labeling proteins of interest with one or more fluorescent dyes. The labeling strategies typically take advantage of endogenous cysteine residues or introduce novel cysteines via mutation. These cysteine residues can be covalently modified with fluorescent dyes via maleimide chemistry [188]. Given the general scarcity of cysteine residues in most protein sequences, it is not uncommon for an IDR of interest to contain one or even zero endogenous cysteines.
In this scenario, mutations that convert small polar amino acids (e.g., serine or glutamine) to cysteine (or vice versa when removing unwanted endogenous cysteines) are generally expected to have minimal impact on the conformational behavior of a disordered protein owing to the approximate chemical equivalence of the residues. Nevertheless, scenarios can arise in which altering the number of cysteine residues in the protein is not viable, in which case alternative labeling strategies are required. The introduction of non-natural amino acids and enzymatic reactions for site-specific labeling presents a set of approaches that move beyond the intrinsic limitations of cysteine-based labeling methods. For example, the use of the enzyme sortase A has enabled site-specific labeling of proteins that contain many cysteines and would otherwise be unamenable to site-specific labeling by maleimide chemistry [189, 190]. Sortase A catalyzes the ligation of an "LPETG" motif with a "GGG" motif [191, 192]. In this way, a linker containing a fragment of a protein that harbors a single cysteine can be utilized to enable maleimide chemistry on the sole cysteine residue [193]. The rest of the protein, which contains multiple cysteines, can then be ligated to the single-cysteine-containing protein fragment. Conversely, the use of split-inteins can enable maleimide labeling of multiple cysteine residues across a protein that has been separated into fragments that contain one cysteine each, followed by ligation of the fragments with native chemical ligation [194, 195]. Non-natural amino acids, alone and in conjunction with Click chemistry, can enable site-specific labeling, which can be critical in the context of three- or four-color smFRET experiments, although the incorporation of non-natural amino acids can lead to complications in protein expression yields [135, 196, 197, 198, 199, 200, 201, 202].
Additionally, in sequences where mutating endogenous cysteine residues is likely to disrupt protein conformation, non-maleimide chemistry methods offer an alternative labeling strategy. For example, short peptide sequences (A4 and Q-tags) have been used to site-specifically label several proteins [203] [204] [205] [206] [207]. Q-tags utilize a transglutaminase-catalyzed reaction to ligate cadaverine-functionalized fluorophores to the glutamine residue present in Q-tag motifs (PNPQLPF, PKPQQFM, GQQQLG) [203]. Unlike sortase or maleimide chemistry, the A4 tag utilizes a phosphopantetheinyl transferase reaction to conjugate CoA-conjugated substrates to the serine present in the A4 motif (DSLDMLEM) [205, 208, 209, 210]. The subsequent key step rests on the choice of dye. With the advent of superresolution microscopy, a broad range of fluorophores and donor-acceptor combinations have become available, each with different photophysical and chemical properties. It is worth mentioning that, when targeting the cellular environment in single-molecule experiments, a choice of red-shifted fluorophores (compared to the often used 480-520 nm range of excitation) has proven to reduce the fluorescence contribution originating from the cellular background [179]. Each fluorophore differs not only in excitation and emission wavelengths, but also in terms of geometry, hydrophobicity, net charge, and linker flexibility and length. As a result, different dyes have different possible impacts on protein conformation, depending on the sequence-encoded physical chemistry of the given protein. Although several studies have implicated dyes as a source of non-native interactions that can alter conformational behavior [211] [212] [213] [214], with careful dye selection and validation, these issues can be minimized, and a number of studies have found that dyes can have a minimal impact on ensemble behavior [35, 65, 215, 216]. However, some relevant examples do require attention.
Due to its high hydrophobicity, the popular ATTO 647N dye has been reported to cause a substantial collapse of an IDR, in contrast to many other dyes [34]. This result suggests that caution must be taken when using this fluorophore on IDRs. Focusing on the role of dye charges, many of the commonly used fluorophores, such as Alexa 488 and 594, each carry a net charge of −2. This net charge may become particularly relevant when investigating polyampholytic sequences with local regions of net positive charge, or proteins that possess a net positive charge, such that these electrostatic effects must be accounted for when modeling or interpreting the experimental data [180]. Finally, the choice of the dye may also depend on the specific environment in which the protein is located: recent work has revealed preferential interaction of specific fluorophores with lipids [217]. The reality is that there is no "one-size-fits-all" solution for choosing dyes. For some proteins, certain dyes will likely have an impact on molecular details, while in others they will not. The determinants of dye effects reflect the physicochemical properties of dyes and the sequence-encoded physical chemistry present in a given protein. As such, due diligence is required when considering if and how dyes may be impacting conformational behavior. This may include testing different combinations of dye pairs to ascertain whether different dyes yield different results. Ideally, orthogonal verification with other techniques (computational and/or experimental) offers a convenient route to refute or confirm findings [215]. When approaching the data analysis of smFRET experiments, several assumptions underlie the conversion of the measured transfer efficiency into a distance distribution. The most commonly cited assumption is the isotropic orientation of the fluorophores, described by the $\kappa^2$ parameter in the definition of the Förster radius ($R_0$).
Although simulations may achieve a quantitative estimate of $\kappa^2$, a measurement of the steady-state and/or time-resolved anisotropy of the two fluorophores provides quantitative insight into possible conformational restrictions of fluorophore orientation [118, 218]. Less discussed but equally important when comparing results from single-molecule fluorescence experiments with simulations is an appropriate estimate of the characteristic timescales at play. It is crucial to consider the impact of these timescales on the interpretation of the mean transfer efficiency, with particular attention needed with respect to the rate of fluorophore tumbling and fluorescence lifetime, as well as chain dynamics. As mentioned in section 2.3, Eq. 6 assumes that the dynamics of the chain are fast compared to the interphoton time, but slower than both dye tumbling and the fluorescence lifetime (Fig. 5). This behavior is a precondition for invoking the approximation that fluorophores are experiencing isotropic orientation. Once all these aspects are considered, a root-mean-square distance is extracted based on the mean transfer efficiency ⟨E⟩. The measured distances provide a readout on the separation of the fluorophores, as opposed to a direct readout on the residue-residue distance between labeling positions, and the contribution of the fluorophore linkers needs to be accounted for [118]. For the dye pair Alexa 488 and 594, the dye linkers' contribution to the root-mean-square interdye distance for an unstructured protein corresponds to an increase in the protein sequence length of about nine amino acids [184, 185, 219]. Various theoretical frameworks appropriate for the integration of single-molecule fluorescence spectroscopy with results from atomistic simulations have emerged over the last decade, with many of these being directly applicable to the study of disordered proteins.
Rather than providing an exhaustive technical description of these methods, we will briefly overview the conceptual approaches and practical methodologies available. The most straightforward approach involves performing unbiased molecular simulations of a protein of interest without dyes, computing relevant observables from the simulations, and then comparing those observables with the analogous values obtained from experiment [20, 220, 221]. In the context of smFRET experiments, this would involve computing distributions of inter-residue distances and then comparing those distributions with the analogous distributions obtained from experiments [161]. For FCS, this would involve computing a hydrodynamic radius ($R_h$) from simulations and comparing that value with the apparent $R_h$ obtained from the diffusion constant [222, 223]. For nsFCS, this would involve computing molecular reconfiguration times and comparing those times with timescales measured by experiment [134, 224, 225]. This approach makes two key assumptions. Firstly, it assumes that the dyes do not significantly contribute to the conformational ensemble obtained from simulations, such that the ensembles generated in the presence and absence of dyes are equivalent. Secondly, it assumes that analytical models (i.e., P(r), for determining inter-dye distance, see section 2.3) offer an appropriate route to back-calculate molecular properties that can also be obtained from simulations. Both these assumptions are reasonable, well established for many systems, and often taken to be true irrespective of whether a comparison between simulation and experiment is to be performed. This naive comparison offers a convenient first approach to demonstrate agreement between simulation and experiment. Moreover, if the agreement is poor, it provides a starting point to diagnose the origin of discrepancies.
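The dye-free comparison described above can be sketched in a few lines. The example below is a minimal illustration under several stated assumptions: the standard Förster relation E(r) = 1/(1 + (r/R0)^6), a Förster radius of 5.4 nm (a value often quoted for Alexa 488/594, used here only as a placeholder), a Gaussian chain form for P(r), and chain dynamics in the averaging regime so that ⟨E⟩ is the P(r)-weighted mean of E(r). The "simulation" distances are synthetic draws rather than output from a real trajectory, and all function names are hypothetical.

```python
import numpy as np

R0 = 5.4  # nm; assumed Förster radius, replace with the value for the dye pair used

def transfer_efficiency(r, r0=R0):
    """Förster relation E(r) = 1 / (1 + (r / R0)^6)."""
    return 1.0 / (1.0 + (np.asarray(r, dtype=float) / r0) ** 6)

def gaussian_chain_mean_E(rms, r0=R0, n_grid=4000):
    """<E> for a Gaussian chain with root-mean-square distance `rms` (nm)."""
    r = np.linspace(1e-6, 6.0 * rms, n_grid)
    p = r**2 * np.exp(-1.5 * r**2 / rms**2)  # unnormalized Gaussian chain P(r)
    p /= p.sum()                              # normalize on the uniform grid
    return float(np.sum(p * transfer_efficiency(r, r0)))

def rms_from_mean_E(mean_E, r0=R0, lo=0.1, hi=50.0, tol=1e-6):
    """Invert <E>(rms) by bisection; <E> decreases monotonically with rms."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if gaussian_chain_mean_E(mid, r0) > mean_E:
            lo = mid   # efficiency too high, so the chain must be more expanded
        else:
            hi = mid
    return 0.5 * (lo + hi)

# Synthetic stand-in for per-frame inter-residue distances from a simulation:
# 3D Gaussian displacement vectors with a root-mean-square length of 6 nm.
rng = np.random.default_rng(2)
true_rms = 6.0
vectors = rng.normal(0.0, true_rms / np.sqrt(3.0), size=(50_000, 3))
distances = np.linalg.norm(vectors, axis=1)

sim_mean_E = float(transfer_efficiency(distances).mean())  # "simulated" <E>
sim_rms = float(np.sqrt(np.mean(distances**2)))            # direct from frames
inferred_rms = rms_from_mean_E(sim_mean_E)                 # via the polymer model

print(f"<E> from frames = {sim_mean_E:.3f}")
print(f"rms from frames = {sim_rms:.2f} nm, rms inferred from <E> = {inferred_rms:.2f} nm")
```

Because the synthetic distances were drawn from the same Gaussian chain model used for the inversion, the two root-mean-square estimates agree closely. For a real ensemble, a gap between them would flag either a deviation from the polymer model or one of the dye-related effects discussed in the text, and linker contributions would still need to be accounted for.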
While simulations lacking fluorophores are, by definition, reporting on the naturally occurring state of the protein, this approach has some shortcomings for a quantitative comparison with single-molecule spectroscopy. For one, the absence of explicit dyes ignores their conformationally heterogeneous nature, and as such, these simulations are unable to assess dye-protein interactions, should they occur. Furthermore, a simulation that lacks explicit dyes does not generally take dye photophysics into account. Consequently, an alternative approach involves the explicit inclusion of dyes in the simulations [147, 161, 215, 226]. Here, simulations of biomolecules with fluorescent dyes are run, and then relevant observables (e.g., FRET transfer efficiencies) are calculated from ensembles using dye orientation directly. The resulting computationally derived FRET results can then be directly compared with mean transfer efficiencies obtained from smFRET experiments. While conceptually appealing, the inclusion of fully parameterized dyes in all-atom simulations is somewhat less common than one might expect. This reflects several challenges that dyes introduce in the context of all-atom simulations. One challenge in the inclusion of explicit dyes is the availability of appropriate force field parameters. As mentioned, even for protein-only systems, correctly parameterized force fields that accurately describe IDR configurational rearrangement and dynamics are challenging. This is despite the wealth of data surrounding protein physical chemistry and structure. In contrast, large heterocyclic aromatic dyes are comparatively less well-studied. Consequently, the validity of dye parameters is less clear. Furthermore, there is good reason to expect that fixed-charge force fields may struggle to correctly capture the physical chemistry of large heterocyclic dyes due to the complex delocalized electron systems that are distributed across them.
Finally, interpreting transfer efficiencies directly from dyes requires consideration of the dye photophysics, including the orientational dependence of the dipole-induced energy transfer that gives rise to FRET [161, 226]. In principle, the explicit inclusion of fluorophores allows the impact of dye-protein interactions and the associated photophysics to be directly taken into account when computing transfer efficiencies, which, on the surface, appears ideal. However, in practice, it also introduces many potentially poorly defined parameters that may bias or confound the calculation of FRET transfer efficiencies if done incorrectly. Moreover, given that fluorescence lifetimes are inherently stochastic, sufficient sampling is needed to capture both IDR conformational rearrangement and dye rearrangement. Taken together, the inclusion of explicit dyes is certainly the appropriate long-term strategy. However, with the exception of a small number of groups who have pioneered solutions to the aforementioned technical and theoretical issues, in the absence of well-characterized parametrization of dye and protein force fields, it remains unclear if the additional challenges introduced by including explicit dyes are more of a help or a hindrance. A final approach is one in which simulations are performed initially without dyes, but in a post hoc processing step the resulting ensemble has dyes (or clouds of dyes per protein conformation) rebuilt [35, 87, 118, 227, 228, 229, 230]. Using this approach, transfer efficiencies (or dye-dye distances) can be back-calculated. This offers a convenient middle ground in that dye geometry and size are explicitly taken into account, yet the challenges associated with dye parameters are avoided. It does, however, operate under the assumption that the presence of dyes has no impact on the conformational ensemble explored in the simulations.
Depending on the implementation details, this approach also runs the risk of over-representing conformations in which dye-attachment residues are more exposed, given that only conformers where dyes can be added are included in the calculations of transfer efficiencies. Finally, this type of reconstruction requires further assumptions regarding the timescales associated with fluorophore tumbling. The reconstruction of dye ensembles can be achieved in a number of ways and is facilitated by specific software tools [187, 228, 231]. The three approaches described thus far operate under the assumption that simulation and experiment will agree "out of the box". In reality, simulations and experiments frequently do not show quantitative (and sometimes even qualitative) agreement. This is generally taken (fairly or unfairly) to reflect weaknesses on the side of the simulations, specifically due to force field errors and/or limited sampling. One solution to this challenge is the continued improvement of force fields (as mentioned) together with the development of more powerful supercomputers [232] [233] [234]. In parallel, a number of approaches for ensemble reweighting (also known as ensemble refinement) have emerged. These reweighting strategies alter the probability of each conformation in the ensemble to shift the expected values to better match the experiment. While a mismatch between simulations and experiments is generally taken to mean the simulation is at fault, this need not necessarily be the case, and possible experimental artifacts (e.g., fluorophore quenching altering transfer efficiencies) should also be scrutinized [144]. To summarize briefly, reweighting involves the process of redefining the probability of each conformation in an ensemble.
For clarity, we define conformation here in terms of a frame or snapshot of the simulation, i.e., in the case of a non-reweighted, correctly sampled set of n conformations taken from an NVT ensemble, it is assumed that any conformation i selected at random from the ensemble is present with probability $p_i = \frac{1}{n}$ (Eq. 25), where, as for any discretized probability distribution, $\sum_{i=1}^{n} p_i = 1$ (Eq. 26). As such, the ensemble-average value for any given observable with an instantaneous value (e.g., end-to-end distance, $\langle R_e \rangle$) can be computed as $\langle R_e \rangle = \sum_{i=1}^{n} p_i R_e^i$ (Eq. 27), where $R_e^i$ reflects the end-to-end distance of conformation i. There is nothing complex about Eq. 27; in fact, this is simply a reformatted version of the arithmetic mean. When we calculate the mean we inherently assume every element in that calculation is equally important, such that every element appears with the same probability of 1/n. Reweighting reflects a change in this assumption where we instead redefine the probabilities such that not every element is equally likely, under the constraint that the probabilities must sum to 1. Several key factors must be considered for ensemble refinement. Firstly, when reweighting a large ensemble of states, we typically wish to apply systematic changes that alter our observable to match some experiment while doing so in a manner that minimizes the loss of entropy. As such, maximum entropy-based methods have emerged as a key component of most reweighting schemes [235] [236] [237]. During maximum entropy reweighting, the collection of conformation-specific probabilities is altered such that the resulting probability distribution of an observable matches a prescribed distribution, or the reweighted ensemble average matches some experimentally observed value. This requirement is reached under a constraint in which probabilities must sum to 1 and the entropy S(p), defined as $S(p) = -\sum_{i=1}^{n} p_i \ln p_i$ (Eq. 28), is maximized. Entropy maximization does not explicitly take uncertainty into account.
This uncertainty can lie on the side of the experiment in terms of precision or accuracy but can also reflect uncertainty in the simulation. This uncertainty is often considered through some kind of Bayesian approach that allows fine-tuning of uncertainty in a variety of ways [238] [239] [240] [241] [242] [243]. Specifically, Bayesian inference provides a general framework through which a posterior model can be generated based on a prior model and the inclusion of newly observed data [235]. Several modern frameworks have emerged to facilitate simulation reweighting with large ensembles of disordered proteins in mind. These include Bayesian Inference of Ensembles (BioEn), Convex OPtimization for Ensemble Reweighting (COPER), Bayesian/Maximum Entropy (BME), and Extended Experimental Inferential Structure Determination (X-EISD) [238, 240, 241, 243]. Although these tools have been recently developed and applied to disordered proteins, a large number of additional tools have been developed over the years (as reviewed by Bonomi et al. [236]). An in-depth discussion of the theoretical and practical differences between these methods goes beyond the scope of this review. However, each approach offers distinct advantages and disadvantages, and in principle all are compatible with the integration of multiple different types of experimental data with distinct uncertainties. An important caveat with respect to reweighting strategies reflects the fact that these approaches are ultimately limited by the quality of the starting ensemble [244, 245]. Put another way: you cannot reweight what is never observed in the original simulations. Consequently, when starting ensembles are sufficiently large and sufficiently close to reality, reweighting can be a powerful approach to fine-tune simulation results to improve the signal-to-noise. However, if a starting ensemble is sufficiently incorrect, no amount of reweighting can rescue it.
The gold standard here is to include two orthogonal methods and show that reweighting simulation results with respect to one experimental dataset improves agreement with the other [135, 245]. In this context, small-angle X-ray scattering is a good complementary technique to verify ensembles that have been reweighted based on results from single-molecule spectroscopy. As a final note, rather than reweighting unbiased simulations to match experimentally measured distributions, an alternative set of methodologies involves applying restraints or bias terms directly to the simulation. In this approach, a cost function that penalizes conformational behavior that deviates from experimentally compatible results is applied [66, 246, 247, 248, 249]. The nature of the cost function, how it is applied over long-timescale simulations, and how experimental uncertainty is dealt with vary depending on the implementation. While this approach has been used extensively in the context of structure determination, it has been used less frequently in the context of integrating single-molecule spectroscopy with all-atom simulations. For a comparison between restraints and reweighting in molecular simulations see work by Rangan et al. [250]. The preceding section introduced maximum entropy and Bayesian inference as theoretical frameworks through which reweighting or restraining can be achieved. It is worth noting that complementary approaches, including maximum parsimony, maximum likelihood, and maximum caliber, provide alternative theoretical frameworks for ensemble selection and reweighting. These approaches can be applied either to bias simulations or as a post-facto reweighting strategy, as reviewed by Bonomi, Gaalswyk, and Ghosh, respectively [236, 246, 251].
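To make the maximum entropy reweighting discussed above concrete, the following minimal sketch reweights a set of frames so that the ensemble average of one observable matches a target value. It is a bare-bones illustration rather than an implementation of BioEn, COPER, BME, or X-EISD: it assumes a uniform starting ensemble, a single average constraint, and no experimental or simulation uncertainty. For that case the maximum entropy weights take the exponential form p_i ∝ exp(−λ o_i), and λ is found here by bisection. The function names, the synthetic end-to-end distances, and the target value are all hypothetical.

```python
import numpy as np

def maxent_reweight(obs, target, lam_bounds=(-50.0, 50.0), tol=1e-10):
    """Weights p_i proportional to exp(-lam * o_i) whose average of `obs` equals `target`."""
    obs = np.asarray(obs, dtype=float)
    if not obs.min() < target < obs.max():
        # You cannot reweight towards values never sampled in the ensemble.
        raise ValueError("target lies outside the range sampled by the ensemble")
    z = (obs - obs.mean()) / obs.std()  # standardize for numerical stability

    def weights(lam):
        a = -lam * z
        a -= a.max()                    # guard against overflow in exp
        w = np.exp(a)
        return w / w.sum()              # probabilities sum to 1

    lo, hi = lam_bounds
    if not np.dot(weights(hi), obs) < target < np.dot(weights(lo), obs):
        raise ValueError("target not bracketed; widen lam_bounds")
    # The reweighted average decreases monotonically as lam increases.
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if np.dot(weights(mid), obs) > target:
            lo = mid
        else:
            hi = mid
    return weights(0.5 * (lo + hi))

def entropy(p):
    """S(p) = -sum_i p_i ln p_i, skipping zero-probability frames."""
    p = p[p > 0]
    return float(-np.sum(p * np.log(p)))

# Synthetic stand-in for per-frame end-to-end distances (nm).
rng = np.random.default_rng(3)
r_e = np.linalg.norm(rng.normal(0.0, 3.5, size=(20_000, 3)), axis=1)

target = 5.0  # hypothetical experimentally derived average (nm)
p = maxent_reweight(r_e, target)

print(f"unweighted <R_e> = {r_e.mean():.3f} nm")
print(f"reweighted <R_e> = {np.dot(p, r_e):.3f} nm (target {target} nm)")
print(f"entropy: uniform = {np.log(r_e.size):.3f}, reweighted = {entropy(p):.3f}")
```

The drop in entropy relative to the uniform value ln(n) gives a rough sense of how strongly the ensemble had to be perturbed to match the target. The frameworks named in the text extend this basic idea by balancing agreement with experiment against uncertainty, which this sketch deliberately omits.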
Having established the key features of all-atom simulations, single-molecule spectroscopy, and the integration of these methods, we will finish by considering a number of examples in which integrative experimental and computational modeling has revealed synergistic insight. The impact of solvent quality on denatured proteins was already evident in early studies of protein denaturation with single-molecule FRET [109, 252] as a shift in the transfer efficiency population associated with the unfolded state. The work of Sherman & Haran directly implied a coil-to-globule transition in the conformations of the unfolded state [131]. In this context, important early work that integrated single-molecule spectroscopy and simulations was performed by Best, Gopich, Eaton, and Schuler [185, 253]. Using both all-atom MD simulations and simple coarse-grained Langevin simulations, Merchant et al. showed that the continuous transition in global dimensions of Protein L and cold shock protein CspTm observed by smFRET as a function of solvent quality is reproduced by simulations [253]. The integration of simulation and experiment here played a crucial role in helping to interpret smFRET data by demonstrating that the inferred radius of gyration ($R_g$) obtained from smFRET matched the $R_g$ values obtained from simulations. This study represents one of the earliest examples in which all-atom simulations and smFRET were combined and, in many ways, defined the template for this class of study. Subsequent work using coarse-grained models has arrived at similar conclusions and shows good agreement with extant smFRET data [68, 254]. The importance of solvent quality for disordered and unfolded proteins was again the topic of further study by Best and Schuler.
In a series of papers, a comprehensive investigation of chain dimensions in response to denaturant concentration combined several different disordered proteins and a collection of methods including all-atom simulations, FCS, smFRET [34, 78, 220] . In work by Zheng et al., unbiased all-atom simulations without explicit dyes were performed as a function of denaturant concentrations [220] . Using these ensembles, intermolecular distances were then back-calculated, revealing a modest but continuous expansion in IDR global dimensions as a function of denaturant concentration. These computational results compared favorably with analogous measurements made by smFRET and SAXS. In a separate study by Borgia & Zheng et al., smFRET and SAXS data were used to reweight ensembles generated from all-atom simulations using a Bayesian approach. The resulting ensembles were compared against changes in global dimension obtained by FCS and dynamic light scattering (DLS) [34] . This study also identified a modest but meaningful chain 'contraction' as denaturant concentration is decreased (Fig. 6 ). In parallel, analogous integrative biophysical studies made on several other systems came to similar conclusions, supporting a model in which the solvent quality tunes the dimensions of unfolded protein ensembles, but that these ensembles do remain relatively expanded [135, 255, 256] . This is in reasonable agreement with measurements made by SAXS that inferred that if any chain-compaction occurred at all, it would be modest [257, 258] . Taken together, these results have helped establish that as unfolded polypeptides transition from high concentrations of denaturant into native conditions, there is a sequence-dependent contraction in global and local chain dimensions. The magnitude of this contraction depends on the chemical nature of the denaturant and protein sequence. 
In many foldable proteins, this contraction appears to be in the range of 10-25% in global dimensions prior to bona fide folding [256] . For disordered proteins, the extent of contraction (or lack thereof) can range from a few percent to over 50%, depending on the amino acid sequence and denaturant [24, 34, 38, 133, 220, 259, 260] . Despite this substantial effort, quantitative and absolute agreement between SAXS and FRET-derived measurements remains contentious for at least some systems [211, 212, 256] . Despite the valid and important concerns regarding the impact of dyes, a general consensus that disordered/unfolded proteins are sensitive to changes in their solution environment seems undeniable [256] . These conclusions need not be at odds with the observation that foldable proteins undergo a sharp folding transition when solution conditions permit [261] . As a final point, the magnitude, modality, and physical origin of solution-dependent changes in IDR conformational behavior will depend on the amino acid sequence and the chemical nature of the co-solute [255] [262] [263] [264] [265] . This sequence-encoded sensitivity has been proposed to offer IDRs a mechanism to act as biological actuators and sensors of cellular state [36, 260] . The apparent discrepancy between SAXS and smFRET has an additional possible origin: residual structure leading to deviations from homopolymer models used to infer smFRET-derived distances [35, 69, 70, 266] . Analytical homopolymer models are remarkably good at quantitatively describing the conformational behavior of IDRs [20, 133, 136] . However, for IDRs with a substantial amount of residual structure or peculiar sequence patterning, there is an expectation that homopolymer models will become progressively less reliable [67, 266] . The possible impact of structural heterogeneity was examined simultaneously and independently in two studies. Song et al. 
applied simulations and theory to analyze extant smFRET data for unfolded proteins to argue that anisotropic biases in the underlying conformational ensemble could explain apparent discrepancies between SAXS and smFRET data [69] [70] [71] . Using coarse-grained simulations to construct transfer efficiency distributions, the authors show that even relatively small but persistent conformational biases can have a substantial impact on distances derived from transfer efficiencies. Fuertes & Ruff et al. reported a study that combined all-atom simulations, smFRET, and SAXS of both labeled and unlabelled IDRs under native and strongly denaturing conditions. In this work, a dye-reconstruction approach was applied in which clouds of dyes were rebuilt around simulations run in the absence of dyes. A key result from this study is that homopolymeric models are better equipped to describe conformational behavior under denaturing conditions. This result reflects the fact that in the limit of high denaturant concentration, the chain has, in effect, become a bona fide homopolymer. In contrast, under native conditions, sequence-dependent residual structure can lead to deviations from true homopolymeric behavior, limiting the accuracy when pairwise intra-chain distances are used to inform on global dimensions. An analogous study by Gomes et al. integrated smFRET with nuclear magnetic resonance (NMR) spectroscopy, SAXS, and simulations and came to similar conclusions [216] . Here, coarse-grained simulations with explicit dyes and modeled photophysics were used to construct realistic transfer efficiency histograms. In agreement with Fuertes & Ruff, the authors found that integrative modeling is necessary to fully reconcile seemingly discordant observations due to local conformational biases. The need for several distinct methods that report on unfolded-state behavior across distinct length-scales has also emerged in other systems [24, 135, 267] . 
Taken together, the application of homopolymer models remains a critical tool for the analysis and interpretation of IDRs. As it turns out, the specific choice of polymer model often introduces only small systematic variations in the root-mean-square distances extracted from single-molecule data [27, 120] . However, underlying assumptions baked into polymer models may not hold true across various interdye distances of the protein due to long-range anisotropic interactions or local residual structure [22, 35, 268] . It is therefore important to test whether the assumptions associated with a given model are robust across multiple interdye distances. Polymer models can be assessed by comparing the persistence length for a wormlike chain model or the Kuhn segment for a Gaussian chain. The origins of any observed deviations must then be examined. At the same time, heteropolymer theories often describe the local contribution of compositional heterogeneity over a specific inter-residue distance in terms of an effective bond segment that rescales the second moment of the ideal chain distribution. Different segments of the chain will adopt different effective bond lengths, such that no single effective bond length is expected to fit an entire chain. The expected heterogeneity in the effective bond lengths along a heteropolymeric protein provides a possible explanation for the empirical success of using freely jointed chain (or similar) homopolymer models on systems that are clearly far from theta-solvent conditions. As such, one should carefully consider the physical meaning of the extracted distance in the context of appropriate theories and models. In this respect, homopolymer models should be applied to heteropolymeric IDRs in the spirit of asking "What is the homopolymer that best describes my data?" as opposed to "Does my heteropolymer behave as a homopolymer?" [267] . 
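As an illustration of how a homopolymer model converts a measured mean transfer efficiency into a root-mean-square inter-dye distance, the following minimal Python sketch uses a Gaussian chain. It assumes the standard Förster relation E(r) = 1/(1 + (r/R0)^6), the Gaussian chain distance distribution P(r) = 4πr²(3/(2π⟨r²⟩))^(3/2) exp(-3r²/(2⟨r²⟩)), and the Gaussian chain relation R g = ⟨r²⟩^(1/2)/√6. The Förster radius of 54 Å, the example efficiency of 0.6, the integration settings, and all function names are hypothetical choices for illustration; no dye linker corrections or dynamic averaging effects are included.

```python
import math

R0 = 54.0  # assumed Förster radius in angstroms (illustrative value)

def forster(r, r0=R0):
    """Transfer efficiency for a fixed inter-dye distance r."""
    return 1.0 / (1.0 + (r / r0) ** 6)

def gaussian_chain_pdf(r, rms):
    """Gaussian chain distance distribution with root-mean-square distance rms."""
    a = 3.0 / (2.0 * rms * rms)
    return 4.0 * math.pi * r * r * (a / math.pi) ** 1.5 * math.exp(-a * r * r)

def mean_efficiency(rms, r0=R0, n=2000):
    """Ensemble-averaged efficiency <E> via trapezoidal integration over r."""
    r_max = 6.0 * rms  # the distribution is negligible beyond this point
    dr = r_max / n
    total = 0.0
    for i in range(n + 1):
        r = i * dr
        edge = 0.5 if i in (0, n) else 1.0
        total += edge * forster(r, r0) * gaussian_chain_pdf(r, rms)
    return total * dr

def rms_from_efficiency(e_mean, r0=R0, lo=1.0, hi=500.0):
    """Invert <E>(rms) by bisection; <E> decreases monotonically with rms.

    e_mean must lie between mean_efficiency(hi) and mean_efficiency(lo).
    """
    for _ in range(60):
        mid = 0.5 * (lo + hi)
        if mean_efficiency(mid, r0) > e_mean:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def single_distance(e_mean, r0=R0):
    """Distance implied by <E> if the chain were rigid (no distribution)."""
    return r0 * (1.0 / e_mean - 1.0) ** (1.0 / 6.0)

def rg_from_rms(rms):
    """Radius of gyration of a Gaussian chain from its rms end-to-end distance."""
    return rms / math.sqrt(6.0)

e_measured = 0.6  # hypothetical mean transfer efficiency
rms = rms_from_efficiency(e_measured)
print("Gaussian chain rms distance:", rms)
print("rigid single-distance estimate:", single_distance(e_measured))
print("apparent Rg (Gaussian chain):", rg_from_rms(rms))
```

Repeating the inversion for several labeling positions and comparing the implied segment lengths is one simple way to check whether a single homopolymer description is consistent across interdye distances.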
The ability of single-molecule spectroscopy to provide direct insight into the molecular dynamics of a given IDR has opened up additional avenues of experimental characterization and comparison between simulation and experiment. Soranno et al. combined simulations, single-molecule spectroscopy, and theory to build a complete molecular dissection of the determinants of internal friction in unfolded proteins [115, 134] . By combining smFRET and ns-FCS, the authors were able to probe how fast chain dynamics depend on the interdye sequence length and solvent viscosity, demonstrating that under native conditions protein dynamics are often dictated not only by solvent conditions but, more significantly, by internal friction effects, where internal refers to intrinsic properties of the protein, such as transient intramolecular interactions and dihedral angle constraints. These results were in remarkable agreement with extant simulation data from Piana et al. [73, 134] . Moreover, the conclusions drawn in this study were further confirmed via integrative analysis of alpha-synuclein dynamics using smFRET, NMR, and MD simulations [224] . The integration of simulation and experiments provided a comprehensive molecular readout that implicates non-local intramolecular interactions and a second contribution from the retardation of dihedral rotation, although these two effects may be inherently coupled. Integrating smFRET with simulations allowed Metskas & Rhoades to reconcile apparent discrepancies between published structures of the intrinsically disordered C-terminal domain of troponin-1 [269] . Multiple high-resolution structures lacked agreement with each other and with NMR-based measurements, highlighting the conformational heterogeneity that exists in the system. MD simulations performed with the discordant published structures as starting points allowed them to gain an understanding of the conformational landscape the protein adopted. 
Interestingly, although good agreement between smFRET measurements and MD simulations was obtained when comparing folded subregions, substantial disagreement was arrived at when smFRET measurements of the intrinsically disordered C terminal domain were compared with MD simulations. Hypothesizing that this discrepancy reflected a difference in the timescales of the techniques, the authors applied MC simulations to construct a large ensemble of conformations for the disordered region. This ensemble showed good agreement between smFRET, MD, and MC simulations, and the most populated conformations present in the MC simulations matched the three published structures that were 'incongruent.' This study elegantly demonstrates that if distinct timescales are probed, it is possible to obtain apparently contradictory yet entirely valid results. Zosel et al. integrated extensive single-molecule fluorescence data and all-atom simulations to assess complex binding kinetics between the disordered protein ACTR and its conformationally heterogeneous folded partner NCBD [23] . Single-molecule experiments revealed that an evolutionarily conserved proline in NCBD undergoes slow cis-trans isomerization. The binding affinity of NCBD for ACTR depends heavily on the isomerization state of this slow-switching proline. MD simulations provided a cogent molecular explanation for the proline-dependent affinities and demonstrated that the molecular structure of the bound complexes differs depending on the proline isomerization state. The ability to reconcile complex and counterintuitive kinetic behavior was entirely dependent on the ability to observe conformational rearrangement on a range of timescales and length scales. Similarly, the ability to offer a cogent structural explanation for this behavior rests on the application of molecular simulations to the binding event. 
Taken together, this study offers an example in which simulations and experiments offer complementary insights into the structure and dynamics of a complex molecular mechanism. simulations and single-molecule spectroscopy to determine the physical basis for slow protein folding in a small triple-helix designed protein [168] . By first analyzing photon trajectories from FRET histograms using a maximum likelihood method [271] to obtain relaxation rates, the authors reveal a sharp pH dependence on the folding rates, where folding is dramatically faster at low pH. A similar pH dependence on folding is also observed in all-atom molecular dynamics simulations. By strengthening or weakening the non-bonded interactions associated with salt bridges by altering the underlying forcefield, the authors are able to perform a computational experiment to decouple the observed rate effects on salt-bridge strength vs. net charge of the molecule. This ingenious analysis revealed that salt-bridge strength is the key determinant of the transition time, providing a clear example in which the types of theoretical experiments that simulations afford offers direct insight into a physical process that would otherwise be impossible to measure. The integration of single-molecule spectroscopy and simulations has more recently played key roles in providing a high-resolution window into dynamic protein:protein and protein:RNA complexes [65, 230, 272] . Ensemble methods typically hide the heterogeneous nature of IDPs, masking dynamic interactions that may underlie biological function. In a series of papers exploring polyelectrolytic complexes, the integration of smFRET, nsFCS, and MD simulations has been essential to deconvolve complex heterogeneous systems. In a landmark study, Borgia, Borgia, & Bugge et al. 
demonstrated that the negative polyelectrolyte prothymosin alpha (ProTα) and the positive polyelectrolyte linker histone H1.0 (H1) form a high-affinity binary complex in which both proteins remain fully disordered [65] . Using a bespoke coarse-grained model that is directly compared against 28 distinct intra- and inter-molecular distances measured by smFRET, the authors demonstrate remarkably good agreement and provide a comprehensive molecular picture of the resulting high-affinity complex. Importantly, on the experimental side, the authors compare results with two different sets of dye pairs, and on the computational side, simulations are run both with and without explicit dyes. In addition to smFRET and simulations, extensive NMR data corroborate the disordered nature of the complex and provide additional key insights. In two subsequent studies, Holmstrom and Heiðarsson & Mercadante probed the dynamic nature of intrinsically disordered proteins in the context of protein:protein, protein:RNA, and protein:DNA interactions [272, 273] . In both of these studies, single-molecule spectroscopy combined with coarse-grained MD simulations was able to capture the dynamic nature of the association of an IDP with another protein or nucleic acid. In the bound state, the IDP in question remains both disordered and dynamic, and this dynamic association underlies the biological function. [24] . More broadly, both smFRET and FCS offer a means to examine the conformational behavior of IDRs inside and around phase-separated droplets [274, 275] . The true power of integrating molecular simulations with single-molecule spectroscopies lies in the ability to uncover novel biophysical mechanisms. In our final results section, we consider a collection of studies in which specific molecular details have been unraveled through the combination of single-molecule fluorescence spectroscopy and simulations. 
A long-standing question in cell biology pertains to the molecular basis of recognition and translocation of nuclear transport receptors by the phenylalanine and glycine-rich (FG) disordered regions that line the interior of the nuclear pore complex [276] [277] [278] [279] . An integrative study by Milles & Mercadante et al. combined all-atom simulations with smFRET, NMR, and SAXS to offer a direct molecular picture of the nature of FG interactions with their associated cargo proteins [280] . This work revealed a degenerate network of transient molecular contacts between a nuclear pore protein and its corresponding nuclear transport receptors. These interactions were encoded by distributed adhesive phenylalanine residues in FG motifs where they interact in a multivalent fashion across the surface of the cognate transportin proteins. Despite the lack of specific binding sites and the microscopically weak binding affinities of individual motifs, the resulting macroscopic binding affinity is remarkably high. As such, nuclear transport receptors are tightly bound, yet relatively free to diffuse. This work provides a molecular explanation for the selective partitioning and rapid translocation of transportin-bound cargo proteins across the nuclear pore complex. The physical basis for temperature-induced collapse of disordered and unfolded proteins has been examined via smFRET interpreted via all-atom replica exchange molecular dynamics simulations, pointing to the role of sidechain solvation in driving compaction [183] . This observation was confirmed in subsequent work where temperature-dependent free energies of solvation were used with all-atom implicit-solvent Monte Carlo (MC) simulations to explain corresponding smFRET experiments for a number of different IDRs [281] . 
In both cases, unbiased simulations without explicit dyes were performed and the radius of gyration (R g ) from simulations was compared with the apparent R g calculated from smFRET-derived inter-dye distances. Beyond these classic examples, there are many cases in which single-molecule spectroscopy and simulations have been combined to address specific mechanistic questions. In the context of protein folding, all-atom MD simulations have been used to identify transient non-native salt bridges that are the dominant determinant of transition-path times along the folding barrier [168] . All-atom simulations have been used in conjunction with smFRET of aggregation-prone polyglutamine (polyQ) to demonstrate that, contrary to naive expectation, the biophysical behavior of polyglutamine tracts does not show a discontinuous transition as polyQ length extends between physiological and disease-associated lengths [87, 142] . In a similar vein, residual structure in the monomeric state of the aggregation-prone amyloid-beta peptide was examined through an in-depth study that combined smFRET with all-atom MD simulations where explicit dyes were included [147] . [20, 260] . IDRs are frequently involved in molecular recognition, and single-molecule spectroscopy and molecular simulations are well-poised to provide molecular detail on those interactions. A crucial aspect of microtubule function in axons is their ability to undergo dynamic instability, where they experience periods of elongation and depolymerization, a process that is highly regulated by a family of intrinsically disordered tau proteins [282, 283] . To better understand the first step of microtubule assembly, where tau protein binds soluble tubulin heterodimers, Melo et al. completed an extensive mapping of free and tubulin-bound tau conformations using smFRET [284] . 
Subsequently, they generated an ensemble of possible tau conformations using Monte Carlo simulations constrained by distances generated from their smFRET measurements. When these conformations were modeled in proximity to coarse-grained tubulin dimers, it was possible to visualize how tau binding to multiple dimers could be accomplished. Importantly, this gave insight into the dynamic nature of the interaction. Rather than tau adopting a fixed structure upon tubulin binding, a "fuzzy complex" is observed, in which the disordered nature of tau allows for the binding of multiple tubulin dimers. This result highlighted the significance of conformational flexibility upon binding, a phenomenon later seen with other IDP binding interactions as well [65, 272, 273] . Finally, in an integrative study that combined MD and MC simulations with single-molecule spectroscopy, Cubuk et al. performed a comprehensive dissection of the three disordered regions in the SARS-CoV-2 nucleocapsid protein [136] . This work revealed distinct structural features that provide a molecular explanation for several previously described binding interactions. In short, the ability to pair atomistic-level insight from simulations with analogous observations for a specific subset of intramolecular distances affords high-resolution physical descriptions of complex phenomena in a way that most other techniques do not. The integration of single-molecule spectroscopy and simulations has emerged as a fruitful approach to provide molecular insight into the complex and heterogeneous behavior of disordered proteins. A recurrent theme in many of the studies described above is the need to consider a range of length-scales and time-scales to construct a holistic understanding of IDR conformational behavior. While smFRET provides high spatial accuracy and precision with respect to specific pairs of distances, it is largely blind to conformational behavior that occurs distally to the labeling positions. 
In contrast, while simulations provide high-precision insight into both global and local conformational behavior, they are limited by possible force field or sampling inaccuracies. As such, the most comprehensive, and arguably most informative, studies are those in which smFRET empowers confidence in the simulations (either by confirming simulated results or providing a means to refine them), which in turn allows simulations to report on features that are not directly captured by smFRET [34, 35, 65, 134, 168, 253, 272] . When smFRET and simulations can be combined to make predictions that can be tested via orthogonal methods such as FCS, SAXS, NMR, DLS, or any additional method, the accuracy of inferences made through integrative studies can be directly assessed [65, 73, 216, 240, 280] . Despite substantial successes, several open challenges remain for the effective integration of single-molecule spectroscopy and simulations. A significant challenge is the need for better methods to describe dyes and their photophysics. A number of groups have pioneered work in this arena, yet despite notable successes, the inclusion of dyes in all-atom or coarse-grained simulations is by no means standard practice [65, 120, 161, 226] . As mentioned in the introduction, large heterocyclic dyes are inherently challenging for fixed-charge force fields due to their aromatic nature. The emergence of polarizable force fields offers a potential solution to this challenge [285] [286] [287] [288] . While in fixed-charge all-atom force fields each atom has a fixed partial charge, in polarizable force fields the local charge density is responsive and variable, depending on the local chemical environment. As a result, polarizable dye models may offer a more realistic route to describe the physicochemical effects of dyes and, potentially, help identify scenarios in which protein:dye interactions are likely. 
Beyond facilitating better interpretation of smFRET data, an accurate and transferable description of fluorescent dyes would allow experimental groups to computationally screen distinct pairs of dyes to help identify those that are least likely to interact with a given protein. While polarizable models (such as AMOEBA) have historically been viewed as substantially slower than fixed-charge models, recent major efforts to improve performance have yielded simulation speeds on the order of 10-30 ns/day in AMOEBA [289, 290] . As a result, timescales relevant for comparison with single-molecule spectroscopy are firmly within reach, suggesting that further application of polarizable forcefields is a promising future avenue. A more general challenge for simulations of disordered proteins is the need for robust methods for the quantification and assessment of conformational sampling. While limitations in standard molecular force fields persist with respect to disordered proteins, even if a perfect forcefield existed, it would not guarantee that accurate estimates of chain conformations and dynamics could be reached. Recent work from Lincoff et al. has argued that, while over-compaction of IDRs by standard force fields is a known problem, some of the force field limitations may be less severe than they appear if better conformational sampling were available [291, 292] . This is not to suggest that forcefield limitations are overblown, but simply to urge a critical assessment of local and global conformational heterogeneity when performing molecular simulations of disordered proteins. Simulations of a few hundred nanoseconds are rarely sufficient for even modestly sized disordered proteins. 
General best practices for assessing conformational sampling in IDRs are lacking but would help researchers understand whether poor agreement between simulation and experiment is due to forcefield weaknesses, insufficient conformational sampling, or a combination of the two.
The integration of single-molecule fluorescence spectroscopy and all-atom simulations has been instrumental in our modern understanding of sequence-encoded conformational behavior in disordered proteins. As more advanced methods for multi-dimensional data integration emerge, integrative studies in which multiple experimental techniques are used to better understand a specific system will likely become more commonplace and more effective. The ability to obtain insight over multiple length-scales and timescales is an essential feature that integrative studies provide. For disordered proteins especially, the need to consider a range of length scales and timescales reflects the inherently heterogeneous and stochastic nature of the conformational transition. Given that molecular simulations and single-molecule fluorescence spectroscopy offer comparable spatial and temporal resolution, they are an inherently complementary and powerful combination.
• Single-molecule spectroscopy offers a collection of methods that provide high-resolution insight into the conformational and dynamical behavior of disordered proteins
Proteins exist along a continuum of structural heterogeneity. While some proteins adopt well-defined tertiary structures (far right), intrinsically disordered protein regions (IDRs) lack a defined reference state (far left). Importantly, all proteins are defined by an ensemble, where function is ultimately determined by the combination of chain dynamics and preferential conformations [11, 12, 14] . 
IDRs are not fundamentally different from folded proteins but are distinguished by conformational fluctuations so large that a single native-state reference frame is no longer applicable or useful.
Examples of distinct levels of granularity of the representation schemes. As the number of degrees of freedom increases (from left to right), so does the computational cost. In principle, more degrees of freedom should yield a higher accuracy model, although this depends on the actual fidelity of the model. A model with many degrees of freedom is only more accurate if those degrees of freedom are described correctly.
Snapshots taken from a simulation trajectory of α-synuclein reveal a scenario in which a subregion of the protein is kinetically trapped while the N- and C-termini explore a diverse collection of conformational states.
Overview of single-molecule FRET experiment and data. (a) Schematic representation of disordered proteins with different mean end-to-end distances. (b) Histograms of photon bursts for the hypothetical ensembles in corresponding panels in (a). (c) The black curve represents the dependence of the mean transfer efficiency on the inter-dye distance as predicted by Förster's theory (eq. 3), shown with conformations annotated. The blue curve depicts the transfer efficiency of a fluctuating Gaussian chain as a function of the average root mean square inter-dye distance.
The conformational ensemble of the 71-residue ACTR as a function of denaturant, as obtained from smFRET and all-atom simulation by Borgia & Zheng et al. [34] .
A three-dimensional model of the myoglobin molecule obtained by x-ray analysis The structure of proteins; two hydrogen-bonded helical configurations of the polypeptide chain Configurations of polypeptide chains with favored orientations around single bonds: two new pleated sheets Structure of haemoglobin: a three-dimensional Fourier synthesis at 5.5-Å. 