Biological macromolecules are too small to be seen even with the best light microscopes. Current methods to determine their structures generally involve crystallizing the protein or measuring vast numbers of identical molecules at the same time. While crystallography provides information at the atomic level, it represents an artificial sample environment, given that most macromolecules are not present in crystalline form in the cell. Over the last few years, cryo-electron microscopy has delivered similarly high-resolution structures of large macromolecules and macromolecular complexes, but although the samples are closer to physiological conditions, they are still frozen, hence immobile and static. Bio-small angle X-ray scattering (BioSAXS) provides a structural measurement of the macromolecule under conditions that are relevant to biology. This state can be visualized as a low-resolution 3-D shape determined on the nanometer scale, and it captures the entire conformational space of the macromolecule in solution. BioSAXS experiments efficiently assess oligomeric state, domain and complex arrangements, as well as flexibility between domains1,2,3. The method is accurate, mostly non-destructive and usually requires only a minimum of sample preparation and time. However, for the best interpretation of the data, the samples need to be monodisperse. This is challenging; biological molecules are often susceptible to contamination, poor purification and aggregation, for example from freeze-thawing4. The development of inline chromatography followed by immediate SAXS measurement helps mitigate these effects. Size-exclusion chromatography (SEC) separates the sample components by size, thus excluding most contaminants and aggregates5,6,7,8,9,10.
However, in some cases even SEC-SAXS is not sufficient to produce a monodisperse sample, because the mixture may consist of components that are too close in size, or whose physical properties or fast dynamics lead to overlapping peaks in the SEC UV trace. In these cases, a software-based deconvolution step applied to the SAXS data can yield an idealized SAXS curve for each individual component5,11,12. As an example, in protocol section 2, we show the standard SEC-SAXS analysis of the vaccinia E9 DNA polymerase exonuclease-minus mutant (E9 exominus) in complex with DNA. Vaccinia is the model organism of the Poxviridae, a family containing several pathogens, for example the human smallpox virus. The polymerase was shown to bind tightly to DNA in biochemical approaches, and the structure of the complex was recently solved by X-ray crystallography13.
Most synchrotron facilities provide an automated data processing pipeline that performs data normalization and integration, producing a set of unsubtracted frames. Additional automation may be available that rejects radiation-damaged frames and performs the buffer subtraction14. The approach described in this manuscript can also be used with a lab source, provided SEC-SAXS is performed. In section 2, we show how to perform primary data analysis on pre-processed data and make the most of the available data.
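To make the buffer-subtraction step concrete, the following is a minimal sketch in Python, not the pipeline of any particular facility. It assumes the unsubtracted frames are already available as a 2D array (frames by q-points); the function name `subtract_buffer`, the synthetic elution peak and the choice of the first 20 frames as buffer are all illustrative assumptions.

```python
import numpy as np

def subtract_buffer(frames, buffer_idx):
    """Subtract the mean of the selected buffer frames from every frame.

    frames     : array of shape (n_frames, n_q), integrated intensities
    buffer_idx : slice or index array selecting pure-buffer frames (assumed)
    """
    buffer_mean = frames[buffer_idx].mean(axis=0)
    return frames - buffer_mean

# Synthetic SEC-SAXS run (illustrative only): flat buffer signal plus one
# Gaussian elution peak centred on frame 50.
rng = np.random.default_rng(1)
q = np.linspace(0.01, 0.3, 100)                 # scattering vector, 1/Angstrom
frame_no = np.arange(100)
profile = 5.0 * np.exp(-(q * 30.0) ** 2 / 3.0)  # Guinier-like particle signal
elution = np.exp(-0.5 * ((frame_no - 50) / 5.0) ** 2)
raw = 1.0 + np.outer(elution, profile)
raw += rng.normal(scale=1e-3, size=raw.shape)   # small measurement noise

subtracted = subtract_buffer(raw, slice(0, 20))

# Total scattered intensity per frame, analogous to a chromatogram.
total_intensity = subtracted.sum(axis=1)
peak_frame = int(np.argmax(total_intensity))
print("peak frame:", peak_frame)
```

In practice the buffer region has to be chosen by inspecting the chromatogram, and frames affected by radiation damage should be excluded before averaging.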
In section 3, we show how to deconvolute SEC-SAXS data and analyze the curves efficiently. Several deconvolution methods exist, such as Gaussian peak deconvolution, implemented in US-SOMO15, and the Guinier optimized maximum likelihood method, implemented in the DELA software16, but these generally require a model for the peak shape12. Because the individual peaks we are investigating have a finite size, evolving factor analysis (EFA), an enhanced form of singular value decomposition (SVD), can be used to deconvolute overlapping peaks without relying on the peak shape or scattering profile5,11. A SAXS-specific implementation can be found in BioXTAS RAW17. EFA was first used on chromatography data when 2D diode array data allowed matrices to be formed from absorbance against retention time and wavelength18. EFA excels because it focuses on the evolving character of the singular values, that is, how they change with the appearance of new components, with the caveat that the acquisition must have an inherent order10. Fortunately, SEC-SAXS data provide all the necessary ordered acquisition data in organized 2D data arrays and therefore lend themselves well to the EFA technique.
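The idea of tracking evolving singular values can be sketched with NumPy. This is not the BioXTAS RAW implementation and it omits the final rotation step that recovers the individual scattering profiles; it only shows how forward and backward SVDs of growing sub-matrices reveal where each component starts and ends. The synthetic two-component data set, the helper name `evolving_singular_values` and the ad hoc noise threshold are assumptions made for illustration.

```python
import numpy as np

def evolving_singular_values(data, n_keep=3, reverse=False):
    """Singular values of growing sub-matrices of `data` (n_q x n_frames).

    Forward mode uses the first k frames, backward mode the last k frames,
    for k = 1 .. n_frames. Row k-1 holds the leading `n_keep` singular values.
    """
    n_frames = data.shape[1]
    out = np.zeros((n_frames, n_keep))
    for k in range(1, n_frames + 1):
        block = data[:, n_frames - k:] if reverse else data[:, :k]
        s = np.linalg.svd(block, compute_uv=False)
        m = min(n_keep, s.size)
        out[k - 1, :m] = s[:m]
    return out

# Synthetic data (illustrative): two species with overlapping elution peaks.
rng = np.random.default_rng(0)
q = np.linspace(0.01, 0.3, 120)
frame_no = np.arange(100)
profiles = np.stack([np.exp(-(q * 40.0) ** 2 / 3.0),
                     np.exp(-(q * 25.0) ** 2 / 3.0)], axis=1)    # (n_q, 2)
elution = np.stack([np.exp(-0.5 * ((frame_no - 45) / 6.0) ** 2),
                    np.exp(-0.5 * ((frame_no - 58) / 6.0) ** 2)])  # (2, n_frames)
data = profiles @ elution + rng.normal(scale=1e-4, size=(q.size, frame_no.size))

# Plain SVD: count singular values well above the noise floor.
s_all = np.linalg.svd(data, compute_uv=False)
threshold = 10.0 * np.median(s_all)      # ad hoc noise estimate (assumption)
n_components = int(np.sum(s_all > threshold))

# EFA: forward pass finds where components appear, backward pass where they end.
fwd = evolving_singular_values(data)
bwd = evolving_singular_values(data, reverse=True)
starts = [int(np.argmax(fwd[:, i] > threshold)) for i in range(n_components)]
ends = [data.shape[1] - 1 - int(np.argmax(bwd[:, i] > threshold))
        for i in range(n_components)]
print("components:", n_components, "starts:", starts, "ends:", ends)
```

Pairing the first component to appear with the first to disappear gives the frame range for each species, which only works because the frames are acquired in elution order.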
In section 4, we demonstrate the basics of model-independent SAXS analysis of the buffer-subtracted SAXS curve. Model-independent analysis determines the particle’s radius of gyration (Rg), volume of correlation (Vc), Porod volume (Vp), and Porod-Debye exponent (PE). The analysis also provides a semi-quantitative assessment of the particle’s thermodynamic state in terms of compactness or flexibility via the dimensionless Kratky plot2,4,19.
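As an illustration of two of these quantities, the sketch below estimates Rg and I(0) from a Guinier fit, ln I(q) = ln I(0) - q²Rg²/3, and then builds the dimensionless Kratky plot, (qRg)²I(q)/I(0) against qRg. It is a simplified example under stated assumptions: the input is a noise-free, ideal Guinier curve, the fit range is limited by the common qRg ≤ 1.3 rule of thumb, and the function name `guinier_fit` is hypothetical. Vc, Vp and PE are not computed here.

```python
import numpy as np

def guinier_fit(q, intensity, qrg_max=1.3, n_start=10, max_iter=20):
    """Estimate Rg and I(0) by a linear fit of ln(I) against q^2.

    The fit range is grown or shrunk until it contains all points with
    q * Rg <= qrg_max. `q` must be sorted in ascending order.
    """
    n = n_start
    for _ in range(max_iter):
        slope, intercept = np.polyfit(q[:n] ** 2, np.log(intensity[:n]), 1)
        if slope >= 0:
            raise ValueError("non-negative Guinier slope; check the data")
        rg = np.sqrt(-3.0 * slope)
        n_new = max(n_start, int(np.count_nonzero(q * rg <= qrg_max)))
        if n_new == n:
            break
        n = n_new
    return rg, np.exp(intercept)

# Idealized test curve (assumption): pure Guinier decay, Rg = 30 A, I(0) = 100.
q = np.linspace(0.005, 0.25, 300)
intensity = 100.0 * np.exp(-(q * 30.0) ** 2 / 3.0)

rg, i0 = guinier_fit(q, intensity)

# Dimensionless Kratky plot coordinates.
x = q * rg
y = x ** 2 * intensity / i0
peak = int(np.argmax(y))
print(f"Rg = {rg:.2f}, I(0) = {i0:.2f}, "
      f"Kratky peak at qRg = {x[peak]:.2f}, height = {y[peak]:.2f}")
```

For a compact, globular particle the dimensionless Kratky plot peaks near qRg = √3 with a height of about 3/e ≈ 1.1; a peak shifted to higher qRg or a curve that fails to return to the baseline suggests elongation or flexibility.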
Finally, SAXS data are measured in reciprocal-space units, and we will show how to transform the SAXS data to real space to recover the pair-distance distribution function, P(r). The P(r)-distribution is the set of all distances found within the particle and includes the particle’s maximum dimension, dmax. Since this is a thermodynamic measurement, the P(r)-distribution represents the physical space occupied by the particle across its conformational space. Proper analysis of a SAXS dataset can provide solution-state insights that complement high-resolution information from crystallography and cryo-EM.
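To illustrate what the P(r)-distribution contains, the sketch below goes in the forward direction: it builds P(r) as a histogram of all pairwise distances in a simple point model and reads off dmax and Rg, using Rg² = ∫r²P(r)dr / (2∫P(r)dr). It does not perform the indirect Fourier transform applied to experimental data; the uniform sphere of radius 20 Å, the number of points and the bin count are assumptions chosen only for illustration.

```python
import numpy as np

# Hypothetical particle: 800 points drawn uniformly inside a sphere (R = 20 A).
rng = np.random.default_rng(2)
radius = 20.0
candidates = rng.uniform(-radius, radius, size=(4000, 3))
points = candidates[np.linalg.norm(candidates, axis=1) <= radius][:800]
n = points.shape[0]

# All unique pairwise distances within the particle.
diff = points[:, None, :] - points[None, :, :]
dist = np.linalg.norm(diff, axis=-1)[np.triu_indices(n, k=1)]

# P(r) as a normalized histogram of those distances.
d_max = dist.max()
counts, edges = np.histogram(dist, bins=50, range=(0.0, d_max))
r = 0.5 * (edges[:-1] + edges[1:])          # bin centres
width = edges[1] - edges[0]
p_r = counts / (counts.sum() * width)       # integrates to 1

# Radius of gyration from the second moment of P(r).
rg = np.sqrt(np.sum(r ** 2 * p_r) / (2.0 * np.sum(p_r)))
print(f"dmax = {d_max:.1f} A, Rg = {rg:.1f} A")
```

For a solid sphere the expected values are dmax = 2R and Rg = √(3/5)·R, so the sketch should return values close to 40 Å and 15.5 Å, with P(r) showing the roughly symmetric bell shape typical of compact particles.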