Single-throughput Complementary High-resolution Analytical Techniques for Characterizing Complex Natural Organic Matter Mixtures

Malak M. Tfaily; Rachel M. Wilson; Heather M. Brewer; Rosalie K. Chu; Heino M. Heyman; David W. Hoyt; Jennifer E. Kyle; Samuel O. Purvine

doi:10.3791/59035

Environment

Published: January 7, 2019 doi: 10.3791/59035

Malak M. Tfaily*¹^,², Rachel M. Wilson*³, Heather M. Brewer¹, Rosalie K. Chu¹, Heino M. Heyman⁴, David W. Hoyt¹, Jennifer E. Kyle⁵, Samuel O. Purvine¹

¹Environmental Molecular Sciences Laboratory, Pacific Northwest National Laboratory, ²Department of Soil, Water and Environmental Science, University of Arizona, ³Department of Earth Ocean and Atmospheric Sciences, Florida State University, ⁴Bruker Daltonics Inc., ⁵Biological Sciences Division, Pacific Northwest National Laboratory

* These authors contributed equally

Summary

This protocol describes a single throughput for complementary analytical and omics techniques culminating in a fully-paired characterization of natural organic matter and microbial proteomics in different ecosystems. This approach permits robust comparisons for identifying metabolic pathways and transformations important for describing greenhouse gas production and predicting responses to environmental change.

Abstract

Natural organic matter (NOM) is composed of a highly complex mixture of thousands of organic compounds which, historically, proved difficult to characterize. However, to understand the thermodynamic and kinetic controls on greenhouse gas (carbon dioxide [CO₂] and methane [CH₄]) production resulting from the decomposition of NOM, a molecular-level characterization coupled with microbial proteome analyses is necessary. Further, climate and environmental changes are expected to perturb natural ecosystems, potentially upsetting complex interactions that influence both the supply of organic matter substrates and the microorganisms performing the transformations. A detailed molecular characterization of the organic matter, microbial proteomics, and the pathways and transformations by which organic matter is decomposed will be necessary to predict the direction and magnitude of the effects of environmental changes. This article describes a methodological throughput for comprehensive metabolite characterization in a single sample by direct injection Fourier transform ion cyclotron resonance mass spectrometry (FTICR-MS), gas chromatography mass spectrometry (GC-MS), nuclear magnetic resonance (NMR) spectroscopy, liquid chromatography mass spectrometry (LC-MS), and proteomics analysis. This approach results in a fully-paired dataset which improves statistical confidence for inferring pathways of organic matter decomposition, the resulting CO₂ and CH₄ production rates, and their responses to environmental perturbation. Herein we present results of applying this method to NOM samples collected from peatlands; however, the protocol is applicable to any NOM sample (e.g., peat, forested soils, marine sediments, etc.).

Introduction

Globally, wetlands are estimated to contain 529 Pg of carbon (C), mostly as organic C buried in peat deposits¹. Currently, such peatlands act as a net C sink, sequestering 29 Tg C y^-1 in North America alone¹. However, environmental disturbance such as draining, fires, drought, and warmer temperatures can offset this C sink by increasing organic matter decomposition resulting in increased C losses via greenhouse gas (carbon dioxide [CO₂] and methane [CH₄]) production¹^,². Climate change may contribute to C loss if warmer temperatures or dryer conditions stimulate faster C decomposition by microorganisms. Alternatively, higher temperatures and air CO₂ concentrations may stimulate primary production to sequester more CO₂ as organic carbon (OC). To what extent and how fast that OC is then decomposed into CO₂ and CH₄ depends on the complex interactions between the electron donor substrates, the availability of electron acceptors, and the microorganisms that mediate the transformation. In many cases, the mechanisms are not well-characterized, thus their response to environmental perturbations is not well-constrained and it remains unclear what the net result of climate change will be on carbon balance in peatland ecosystems.

The complex nature of natural organic matter (NOM) has made even identifying the organic compounds present in the NOM mixtures historically difficult. Recent advances have greatly improved our ability to characterize compounds that traditionally were, and to some extent continue to be, regarded as recalcitrant humic or fulvic compounds³^,⁴^,⁵. We now understand that many of these compounds are actually microbially available and may be decomposed if a suitable terminal electron acceptor (TEA) is made available⁶^,⁷. Calculating the nominal oxidation state of the carbon (NOSC) for a compound provides a metric for predicting the potential for decomposition and the energy yield of the TEA required. However, it requires a molecular-level characterization of the organic matter⁷. NOSC is calculated from the molecular formula via the following equation⁷: NOSC = − ((−z + 4(#C) + (#H) − 3(#N) − 2(#O) + 5(#P) − 2(#S)) / (#C)) + 4, where z is the net charge. NOSC is correlated with the thermodynamic driving force⁸, wherein compounds with higher NOSC are easier to degrade, while compounds with lower NOSC require increasingly energetic TEAs in order to be oxidized. Compounds with NOSC less than −2 require high energy yielding TEAs such as O₂, nitrate or Mn^IV, and cannot be degraded by commonly occurring lower energy yielding TEAs such as Fe^III or sulfate⁷. This is an important consideration in the waterlogged anoxic conditions found in wetlands where O₂ and other high energy yielding TEAs are scarce⁹ and therefore the degradation of lower NOSC compounds under these conditions is thermodynamically limited. Environmental perturbation can influence the thermodynamic state of the ecosystem through hydrologic changes that influence O₂ (the most energetic electron acceptor), changes in organic substrates and electron acceptors made available by primary production, and to a smaller extent by temperature.
An important example of the temperature effects in wetland systems occurs with regard to the trade-off that occurs between homoacetogenesis (i.e., acetate production from CO₂ and H₂) and hydrogenotrophic methanogenesis (i.e., CH₄ production from CO₂ and H₂). At low temperatures it appears that homoacetogenesis is slightly favored, while warmer temperatures favor CH₄ production¹⁰. This temperature effect may have important implications for the response of ecosystems to changing climate, as CH₄ is a much stronger greenhouse gas than CO₂¹¹ and thus increasing production of CH₄ at the expense of CO₂ at warmer temperatures may contribute to a positive feedback with climate warming.
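
As a worked illustration of the NOSC equation given above, the following minimal Python sketch evaluates it for a few familiar compounds. The function name `nosc` and its argument names are illustrative choices and are not part of the published protocol.

```python
def nosc(c, h, n=0, o=0, p=0, s=0, z=0):
    """Nominal oxidation state of carbon from element counts and net charge z.

    Implements NOSC = -((-z + 4C + H - 3N - 2O + 5P - 2S) / C) + 4.
    """
    if c <= 0:
        raise ValueError("formula must contain at least one carbon atom")
    return -((-z + 4 * c + h - 3 * n - 2 * o + 5 * p - 2 * s) / c) + 4


if __name__ == "__main__":
    examples = {
        "glucose C6H12O6": nosc(c=6, h=12, o=6),
        "acetate C2H3O2-": nosc(c=2, h=3, o=2, z=-1),
        "palmitic acid C16H32O2": nosc(c=16, h=32, o=2),
        "methane CH4": nosc(c=1, h=4),
        "carbon dioxide CO2": nosc(c=1, h=0, o=2),
    }
    for name, value in examples.items():
        print(f"{name}: NOSC = {value:+.2f}")
```

Glucose and acetate evaluate to 0, palmitic acid to −1.75, CH₄ to −4, and CO₂ to +4, which spans the full range from the most reduced to the most oxidized carbon.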

Peatlands produce globally significant quantities of CO₂ and CH₄⁶ via microbial respiration of naturally occurring organic matter. The NOSC of the organic carbon substrates determines the relative proportion of CO₂:CH₄ produced, which is a critical parameter not only because of the higher radiative forcing of CH₄ compared to CO₂¹¹, but also because modeling efforts have identified this ratio as a critical parameter for estimating C flux in peatlands¹². In the absence of terminal electron acceptors other than CO₂, it can be shown by electron balance that organic C substrates with NOSC > 0 will produce CO₂:CH₄ > 1, organic C with NOSC = 0 produces CO₂ and CH₄ in equimolar ratio, and organic C with NOSC < 0 will produce CO₂:CH₄ < 1¹³. Decomposition of OC in natural ecosystems is mediated by microorganisms, so that even when degradation of a specific compound is thermodynamically feasible, it is kinetically limited by the activity of microbial enzymes and, under anoxic conditions, by the thermodynamic driving force (i.e., NOSC)⁷. Until now it has been challenging to fully characterize the organic matter because the diversity of compounds present requires different complementary techniques for their characterization. Recent advances have closed the gap; using a suite of analytical techniques we can analyze a large range of organic compounds, providing molecular-level characterization and, in some cases, quantification, from small primary metabolites like glucose up to 800 Da poly-heterocycles. Previously such large complex molecules would have been characterized simply as lignin-like or tannin-like and assumed to be recalcitrant. Molecular-level characterization, however, allows the calculation of NOSC for even these large complex molecules.
These NOSC values are linearly correlated with the thermodynamic driving force allowing for an assessment of the quality of organic matter available for decomposition, which in many cases reveals that these complex molecules may actually be microbially degradable even under the anoxic conditions that prevail in wetlands.
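
The electron-balance argument for the CO₂:CH₄ ratio can be checked numerically. The sketch below is an idealization rather than part of the protocol: it assumes that all substrate carbon ends up as either CO₂ (oxidation state +4) or CH₄ (oxidation state −4), with no other electron acceptors, no H₂ accumulation, and no carbon going to biomass. The function name is hypothetical.

```python
import math


def co2_to_ch4_ratio(nosc):
    """Idealized molar CO2:CH4 ratio for a substrate with the given NOSC.

    If a fraction x of the carbon becomes CO2 (+4) and 1 - x becomes CH4 (-4),
    conserving the average oxidation state gives 4x - 4(1 - x) = NOSC,
    so x = (4 + NOSC) / 8 and x / (1 - x) = (4 + NOSC) / (4 - NOSC).
    """
    if not -4 <= nosc <= 4:
        raise ValueError("NOSC must lie between -4 and +4")
    if nosc == 4:
        return math.inf  # fully oxidized carbon: no CH4 can be produced
    return (4 + nosc) / (4 - nosc)


if __name__ == "__main__":
    for value in (-2, -1, 0, 1, 2):
        print(f"NOSC {value:+d}: CO2:CH4 = {co2_to_ch4_ratio(value):.2f}")
```

Under these assumptions a substrate with NOSC = 0 (e.g., glucose or acetate) yields equimolar CO₂ and CH₄, positive NOSC shifts the ratio above 1, and negative NOSC shifts it below 1.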

Since introduction of O₂ allows organic matter of nearly all naturally observed NOSC values to be decomposed, herein we focus on changes in the organic matter and microbial proteomics which are likely to be the primary drivers in wetland (i.e., limited O₂) systems. However, all the techniques that we will discuss can be applied to organic matter from any ecosystem. Commonly, bulk measurements based on optical and fluorescence analyses have been used to assess organic matter quality³^,¹⁴. When using bulk measurements such as these, however, fine details are lost as large numbers of molecules are categorized together under generic terms like humics or fulvics. The definitions of these categories are not well-constrained and, in fact, may vary from study to study making comparisons impossible. Further, bulk measurements do not provide the molecular detail necessary for calculating the thermodynamics governing the system and therefore fall short of truly assessing organic matter quality¹⁵.

Individual techniques such as Fourier transform ion cyclotron resonance mass spectrometry (FTICR-MS), nuclear magnetic resonance (NMR) spectroscopy, gas chromatography mass spectrometry (GC-MS), and liquid chromatography mass spectrometry (LC-MS) do provide such molecular-level detail. While each of these techniques presents its own limitations, they also bring their own strengths that can be leveraged in an integrated approach to achieve the fine molecular detail necessary for quantifying organic matter quality in a rigorous thermodynamic sense. GC-MS is useful for identifying critical small metabolites that are likely to have proximal influence on CO₂ and CH₄ production (e.g., glucose, acetate, etc.); however, GC-MS requires verification against a standard and is therefore limited to already known compounds present in the database, preventing identification of novel compounds. Furthermore, GC-MS is a semi-quantitative technique, allowing inference about the changes in relative concentrations but not providing the actual concentration information necessary for calculating Gibbs free energies, for example. Finally, GC-MS requires derivatization of molecules prior to analysis, which limits resolution to compounds smaller than ~400 Da, and volatile alcohols are lost during the drying step.

One-dimensional (1D) ¹H liquid state NMR allows highly quantitative characterization of small metabolites (including primary small molecular weight metabolites and volatiles like alcohols, acetate, acetone, formate, pyruvate, succinate, short-chained fatty acids, as well as a range of carbohydrates notoriously absent or compromised from MS-based methods) and their concentrations are particularly useful for calculating thermodynamic parameters. Yet, like GC-MS, 1D NMR of complex mixtures requires standardization relative to a database and therefore does not alone allow easy identification of novel compounds that are likely to be abundant in complex natural and changing ecosystems. Additionally, NMR is less sensitive than the MS-based techniques and therefore, quantitative metabolite profiling is achieved only above 1 µM using NMR systems equipped with helium-cooled cold-probes. Although it is not widely appreciated, some NMR cold-probes are salt-tolerant and allow environmental mixture analysis in the presence of millimolar salt concentrations when used in smaller diameter (< 3 mm outer diameter) sample tubes¹⁶. However, a further complication of NMR is that high amounts of paramagnetic metals and minerals (e.g., Fe and Mn above 1 - 3 wt%), which can be abundant in upland soils, can broaden spectral features and complicate interpretation of the NMR spectra. Using solid phase extraction (SPE) can aid in the interpretation of both NMR and MS-based metabolomics methods by reducing the mineral salts and increasing spectral quality.

FTICR-MS by direct injection is a highly sensitive technique capable of detecting tens of thousands of metabolites from a single sample, but it does not capture the critical small metabolites such as acetate, pyruvate, and succinate and is notoriously difficult to use for sugars and other carbohydrates¹⁷, nor does it provide quantitative information. However, unlike the other techniques, FTICR-MS excels at identifying and assigning molecular formulae to novel compounds and therefore identifies the largest number of compounds, providing more molecular information than any of the other described techniques. This is useful because the molecular information provided by FTICR-MS (and other techniques) can be used to calculate NOSC, which is related to the thermodynamic driving force governing the likelihood of certain reactions⁸ and their rates under certain conditions⁷. Furthermore, by coupling FTICR-MS with separation techniques, such as LC together with tandem MS, quantitative structural information can be attained, offsetting some of the disadvantages of this technique. LC-MS is useful for identifying lipid-like compounds and other metabolites that are not well-characterized by any of the other methods. Coupling LC FTICR-MS or LC-MS with a fraction collector and collecting fractions of specific unknowns of interest for structural elucidation by two-dimensional (2D) liquid state NMR is the ideal situation for identifying and quantifying unknown compounds¹⁸^,¹⁹. However, this is a very time-consuming step that could be used if and when needed. Taken individually, each of these techniques provides a different snapshot of the organic matter, and by integrating them, we can achieve a more complete understanding than using any of the techniques in isolation.

While the thermodynamic considerations set the ultimate constraints on what transformations are possible in a system, organic matter decomposition is mediated by microorganisms whose enzyme activities control reaction rates. Thus, fully understanding the controls on organic matter decomposition and ultimately greenhouse gas (CO₂ and CH₄) production from wetlands requires an integrated omics approach to characterizing the microbial enzyme activities as well as the metabolites. In this article, we describe a method for achieving such a comprehensive analysis from a single sample using a sequential approach that results in a fully paired analysis. This approach expands on the metabolite, protein, and lipid extraction (MPLEx) protocol, in which proteomics was coupled with GC-MS and LC-MS²⁰ to identify small metabolites, proteins, and lipids, by incorporating quantitative metabolite information via NMR and identification of larger secondary metabolites via FTICR-MS. In a slight departure from MPLEx, we begin the protocol with a water extraction and then use sequential extraction with increasingly non-polar solvents. All extractions are done on a single sample, which conserves sample when volumes are limited or difficult to obtain and decreases experimental error introduced through variation among aliquots from heterogeneous sample matrices (e.g., soil and peat) or differences in storage conditions and duration.

Finally, by coupling the OM analyses with proteomics analyses of the microbial community, we can build metabolic networks that describe the pathways and transformation of organic matter decomposition. This allows us to test specific hypotheses about how perturbations to the system will influence ultimate CO₂ and CH₄ production through alteration to the available organic substrates, electron acceptors, and the microbial communities mediating the reactions via the activity of enzyme catalysts.

The overall goal of this method is to provide a single throughput protocol for analyzing metabolites, lipids, and microbial proteins from a single sample thereby creating a fully paired dataset for building metabolic networks while constraining analytical errors.

Protocol

1. Sequential Extraction of Organic Matter from Soil, Sediments, or Peat

Collect soil, sediments, or peat via coring and divide cores according to the hypothesis being tested (e.g., depth). Store samples in polytetrafluoroethylene coated containers and freeze at -80 °C for storage prior to analysis.
NOTE: Approximately 25 mg C is needed for this protocol. For peat (typically 45% C), 50 mg of dried peat is required. Larger amounts of sample (up to 5 g) may be needed for low organic samples like mineral or forested upland soils, depending on the C content. Because extraction with organic solvents will pull any polyethylene glycol (PEG) into the extracts, which will negatively affect ionization during step 2.4, it is important to avoid allowing samples to contact plastic at any point during collection, storage or extraction.
When ready to analyze the samples, freeze dry to constant weight, then grind the samples in a high-speed ball mill using stainless steel grinding balls to homogenize and break up any aggregates.
NOTE: The protocol can be paused at this point and the material stored at -80 °C.
Using an ethanol-washed stainless-steel utensil, aliquot 50 mg of each of the dried samples into individual 2 mL glass vials. These samples will be sequentially extracted using a series of solvents to consecutively extract increasingly non-polar metabolites from each sample. Add 1 mL of distilled, degassed water (H₂O) to each sample, cap the vials and shake for 2 h on a shaker table.
After shaking, allow the solutions to stand at room temperature (RT) for 20 min, then centrifuge at 15,000 x g for 30 min. Decant and save the supernatant from each.
NOTE: These solutions will be injected into the FTICR-MS by direct injection.
Conduct a Folch extraction²¹ (also known as MPLEx²⁰) on the now water extracted residues by repeating steps 1.3 and 1.4 substituting 1 mL of a -20 °C 4:3 chloroform:methanol mixture for the water in step 1.3.
CAUTION: Chloroform and methanol are both highly flammable and toxic. Use appropriate personal protective equipment (PPE) to avoid skin contact and avoid open flame.
Carefully separate the two resulting solvent layers, which will be visually distinguishable, for separate analysis by FTICR-MS, either by using a separatory funnel or simply by removing the top layer with careful pipetting.
NOTE: The chloroform-containing fraction will be at the bottom while the methanol-containing fraction is less dense and will be on the top.
Dilute the chloroform extract (step 1.6) 1:1 in methanol and the water extract (step 1.4) 2:1 in methanol to improve the electrospray ionization (ESI) efficiency for FTICR-MS by direct injection.
NOTE: The methanol-containing fraction from step 1.7 does not need to be diluted further in methanol. The methanol layer will be run by direct injection on the FTICR-MS.
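
The sample-mass guidance in this section (about 25 mg C per sample) can be turned into a quick calculation for materials of different carbon content. This is a minimal sketch; the function name and the 0.5% C example for a mineral soil are illustrative assumptions, not values prescribed by the protocol.

```python
def required_dry_mass_mg(carbon_fraction, target_carbon_mg=25.0):
    """Dry sample mass (mg) needed to supply target_carbon_mg of carbon.

    carbon_fraction is the mass fraction of C in the dried sample
    (e.g., 0.45 for a peat containing 45% C).
    """
    if not 0 < carbon_fraction <= 1:
        raise ValueError("carbon_fraction must be in the interval (0, 1]")
    return target_carbon_mg / carbon_fraction


if __name__ == "__main__":
    # 45% C peat, and a hypothetical low-carbon mineral soil at 0.5% C
    for label, fraction in (("peat, 45% C", 0.45), ("mineral soil, 0.5% C", 0.005)):
        print(f"{label}: {required_dry_mass_mg(fraction):.0f} mg dry sample")
```

For peat at 45% C this gives roughly 56 mg, in line with the 50 mg aliquot used here, and a soil at 0.5% C would need about 5 g.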

2. FTICR-MS Analysis

Calibrate the FTICR spectrometer by directly injecting 100 µL of a tuning solution (see Table of Materials) spanning a mass range of approximately 100 - 1,300 Da into the FTICR-MS.
Prepare the Suwannee River Fulvic Acid standard (see Table of Materials) by making a 1 mg mL^-1 solution in ultrapure filtered water and then diluting the resulting solution to 20 µg mL^-1 in methanol.
Directly inject 23 µL of this final solution into the ESI source coupled to the FTICR spectrometer through a syringe pump set to a flow rate of 3.0 μL min^-1. Set needle voltage to +4.4 kV, Q1 to 150 m/z and glass capillary at 180 °C. Inspect resulting spectra using the analysis software (see Table of Materials) to confirm the quality of the data.
Inject HPLC grade methanol before running samples and periodically between samples to monitor carryover. Introduce 23 μL of each extract via direct injection into the ESI source coupled to the FTICR spectrometer through a syringe pump set to a flow rate of 3.0 μL min^-1. Set needle voltage to +4.4 kV, Q1 to 150 m/z and glass capillary at 180 °C.
Adjust ion accumulation time (IAT) for each sample or group of samples to account for variation in C concentration. Typical values are between 0.1 and 0.3 s. Collect 144 scans for each sample, average the scans, and then conduct an internal calibration using a homologous CH₂ (i.e., 14 Da separation) series.
NOTE: After FTICR-MS analysis, the water and methanol extracted fractions can either be combined in order to streamline the remaining steps or kept separate throughout subsequent steps. The advantages and disadvantages of each approach are described at length in the Discussion. If combining the fractions, do so at this point.
Dry extracts using a concentrator and save the remainder of the extracts (~ 1 mL) for subsequent GC-MS (water, methanol or water + methanol), LC-MS (chloroform) and NMR (water + methanol) analysis.
NOTE: The protocol can be paused at this point and the material stored at -20 °C.
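
One common way to recognize the homologous CH₂ series used for internal calibration is Kendrick mass defect (KMD) analysis, in which peaks that differ only by CH₂ units share the same KMD. The sketch below illustrates that idea only; it is not a description of how the instrument software performs calibration. The function names, the tolerance, and the example m/z values are hypothetical, and singly charged ions are assumed.

```python
CH2_EXACT = 14.01565  # monoisotopic mass of CH2: 12.00000 + 2 * 1.007825


def kendrick_mass(mz):
    """Rescale m/z so that each CH2 unit weighs exactly 14."""
    return mz * 14.0 / CH2_EXACT


def kendrick_mass_defect(mz):
    """Nominal (rounded) Kendrick mass minus exact Kendrick mass."""
    km = kendrick_mass(mz)
    return round(km) - km


def same_ch2_series(mz_a, mz_b, tol=0.0005):
    """True if two peaks plausibly differ only by a whole number of CH2 units."""
    same_defect = abs(kendrick_mass_defect(mz_a) - kendrick_mass_defect(mz_b)) <= tol
    nominal_gap = round(kendrick_mass(mz_a)) - round(kendrick_mass(mz_b))
    return same_defect and nominal_gap % 14 == 0


if __name__ == "__main__":
    base = 301.0718  # hypothetical peak
    print(same_ch2_series(base, base + 2 * CH2_EXACT))  # two CH2 units apart
    print(same_ch2_series(base, 302.0751))              # unrelated peak
```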

3. FTICR-MS Data Processing

Co-align all sample peak lists for the entire dataset to reduce mass shifts and standardize peak assignments using Formularity software²² ahead of formula assignment.
Use the Formularity software²² to assign molecular formulae using a signal to noise ratio greater than 7, mass measurement error < 1 ppm, and allowing C, H, O, N, S, and P while excluding all other elements.
If multiple candidate formulae are returned for a given mass (frequent above 500 Da), impose constraints consistent with the material being sampled. For example, in peat, typical constraints include: lowest mass error, lowest number of heteroatoms (N, S, P), and when present, P must be in oxidized form (i.e., there must be at least 4 O atoms for each P atom in the formula).
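
The constraints above can be expressed as a small filter-and-rank routine. The sketch below is not Formularity's algorithm; it only illustrates the stated rules on a list of candidate formulae. The data layout (a dict of element counts plus a `ppm` error) and the tie-breaking order (fewest heteroatoms first, then smallest absolute mass error) are assumptions, since the protocol does not specify how the constraints are prioritized.

```python
HETEROATOMS = ("N", "S", "P")


def passes_constraints(candidate, max_ppm=1.0):
    """Apply the hard rules: |mass error| < max_ppm and at least 4 O per P."""
    within_error = abs(candidate["ppm"]) < max_ppm
    phosphorus_oxidized = candidate.get("O", 0) >= 4 * candidate.get("P", 0)
    return within_error and phosphorus_oxidized


def heteroatom_count(candidate):
    """Total number of N, S and P atoms in a candidate formula."""
    return sum(candidate.get(element, 0) for element in HETEROATOMS)


def pick_formula(candidates, max_ppm=1.0):
    """Return the preferred candidate, or None if none passes the hard rules.

    Assumed ranking: fewest heteroatoms, then lowest absolute mass error.
    """
    valid = [c for c in candidates if passes_constraints(c, max_ppm)]
    if not valid:
        return None
    return min(valid, key=lambda c: (heteroatom_count(c), abs(c["ppm"])))


if __name__ == "__main__":
    # Hypothetical candidates for one measured mass
    candidates = [
        {"C": 20, "H": 30, "O": 10, "ppm": 0.6},
        {"C": 19, "H": 31, "O": 8, "N": 1, "P": 1, "ppm": 0.1},
        {"C": 18, "H": 29, "O": 3, "P": 1, "ppm": 0.05},  # rejected: O < 4 P
    ]
    print(pick_formula(candidates))
```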

4. Chemical Derivatization for GC-MS

Prepare blank control samples of HPLC-grade hexane in GC-MS autosampler vials²³. Dissolve 100 mg fatty acid methyl esters (FAMEs: C8 - C28) mixture retention time standard in 200 µL of hexane.
To protect carbonyl groups, add 20 μL of 30 mg mL^-1 methoxyamine hydrochloride in pyridine to each of the methanol extracts and water extracts (or combined methanol/water extracts if using) from step 2.6, blanks and FAME calibration samples²³. Seal vials with caps.
CAUTION: Pyridine is both highly toxic and flammable. Wear appropriate PPE to prevent skin or eye contact and avoid open flame. In addition, pyridine volatilizes at RT causing harmful air contamination. Work only in well-ventilated spaces under a fume hood.
Vortex extracts for 20 s. Sonicate extracts for 60 s. Then, incubate extracts in a centrifuge at 37 °C for 90 min at 100 x g.
NOTE: An excessive amount of carbohydrates or salts can cause metabolites to crystallize after being dried down. Sonicating the samples will help to reconstitute the crystallized metabolites into the derivatizing reagent.
After incubation, add 80 μL of N-methyl-N-(trimethylsilyl)trifluoroacetamide with 1% trimethylchlorosilane to each sample, vortex extracts for 20 s. Sonicate extracts for 60 s and incubate extracts again in a centrifuge at 37 °C for 30 min at 100 x g.
Cool extracts to RT (20 - 24 °C), then transfer into GC-MS autosampler vials.

5. GC-MS Analysis

Tune and calibrate the MS according to vendor recommendations (see Table of Materials) before analysis to ensure that the instrument records MS data correctly. Check that helium gas pressure is within specified tolerance.
Separate polar metabolites on a GC column (30 m × 0.25 mm × 0.25 μm). Set the oven temperature protocol as follows: (1) initial temperature of 60 °C for 1 min, (2) ramp to 325 °C at a rate of 10 °C min^-1, and (3) hold at 325 °C for 5 min.
Analyze extracts using a GC coupled to a single quadrupole MS. Set injection port temperature to a constant 250 °C. Inject 1 μL of each derivatized extract in the splitless mode.
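
For scheduling purposes, the oven program above implies a fixed run time per injection, which the following minimal sketch computes. The function and parameter names are illustrative, and the result excludes oven cool-down and re-equilibration between injections.

```python
def oven_run_time_min(initial_c, final_c, ramp_c_per_min, initial_hold_min, final_hold_min):
    """Total oven program time: initial hold + linear ramp + final hold."""
    if ramp_c_per_min <= 0:
        raise ValueError("ramp rate must be positive")
    ramp_time = (final_c - initial_c) / ramp_c_per_min
    return initial_hold_min + ramp_time + final_hold_min


if __name__ == "__main__":
    # 60 °C for 1 min, ramp at 10 °C/min to 325 °C, hold for 5 min
    print(oven_run_time_min(60, 325, 10, 1, 5), "min per injection")
```

With the settings given in this section, the program lasts 32.5 min per injection.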

6. GC-MS Data Processing

Inspect all the data files to ensure that they were correctly captured. Pay attention to potential shifts in internal standard retention times and intensities to confirm that the data were captured consistently throughout the analysis.
Convert the vendor specific MS data format to a general MS format if required. Process raw data files using MetaboliteDetector²⁴ calibrating retention indices based on the FAME internal standards. After aligning the retention times of all data files, continue with deconvolution and finally metabolite identification by matching retention indices and GC-MS spectra against the FiehnLib polar metabolite library²⁵.
Cross-check remaining unidentified metabolites against the NIST14 GC-MS library using spectral matching. Validate identifications individually to eliminate false identifications and reduce deconvolution errors.
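
Retention index calibration against the FAME markers can be pictured as interpolation between the two markers that bracket each peak. The sketch below uses simple piecewise-linear interpolation as an illustration; MetaboliteDetector's own calibration may differ, and the marker retention times and index values shown are hypothetical placeholders rather than FiehnLib values.

```python
from bisect import bisect_right


def retention_index(rt, marker_rts, marker_ris):
    """Piecewise-linear retention index for a peak at retention time rt.

    marker_rts must be sorted ascending and paired with marker_ris.
    Peaks outside the marker range are extrapolated from the nearest segment.
    """
    if len(marker_rts) != len(marker_ris) or len(marker_rts) < 2:
        raise ValueError("need at least two paired markers")
    # Index of the upper bracketing marker, clamped to a valid segment
    upper = min(max(bisect_right(marker_rts, rt), 1), len(marker_rts) - 1)
    t0, t1 = marker_rts[upper - 1], marker_rts[upper]
    r0, r1 = marker_ris[upper - 1], marker_ris[upper]
    return r0 + (rt - t0) * (r1 - r0) / (t1 - t0)


if __name__ == "__main__":
    marker_rts = [5.0, 10.0, 15.0]       # hypothetical marker retention times (min)
    marker_ris = [800.0, 1000.0, 1200.0]  # hypothetical retention index values
    print(retention_index(7.5, marker_rts, marker_ris))
```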

7. Liquid State NMR Analysis

Dilute the remainder of the water extracts (~ 300 µL) by 10% (vol/vol) with a 5 mM 2,2-dimethyl-2-silapentane-5-sulfonate-d6 internal standard. Alternatively, combine the water and methanol extracts from step 1 and then reconstitute in water. In doing so, however, some of the volatile compounds may be lost during the freeze-drying step. Typical final sample volumes are in the range 180 - 300 µL.
Transfer the mixture into a high-quality 3 mm outer diameter (O.D.) borosilicate glass NMR tube.
Collect spectra using an NMR spectrometer (ideally at least 600 MHz) equipped with a 5 mm triple resonance salt-tolerant cold probe and a cold-carbon pre-amplifier.
Collect 1D ¹H spectra using a 1D nuclear Overhauser effect spectroscopy (NOESY) presaturation experiment at 298 K with a 4 s acquisition time, 1.5 s recycle delay, 65,536 complex points, and 512 scans.
Collect 2D ¹H-¹³C heteronuclear single-quantum correlation (HSQC) and ¹H-¹H total correlation (TOCSY) spectra to aid metabolite assignment and validation.
Process, assign, and analyze all spectra using the NMR analysis software (see Table of Materials) to quantify intensities relative to the internal standard. Identify metabolites by matching chemical shift, J-coupling and intensity information in the samples against the library.
NOTE: The library is further enhanced by custom targeted metabolite standards and metabolite additions made on a regular basis. The protocol can be paused at this point and the material stored at -20 °C.
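
The principle behind quantifying against the internal standard is that a ¹H peak area is proportional to concentration times the number of equivalent protons contributing to that peak. The sketch below shows only this proportionality; the NMR analysis software fits whole spectral signatures rather than single integrals. The function name is hypothetical, the standard peak is assumed to be the nine-proton trimethylsilyl singlet, and the 0.5 mM final standard concentration assumes that the 10% (vol/vol) dilution means one part 5 mM standard in ten parts final volume.

```python
def metabolite_concentration_mm(area_metabolite, protons_metabolite,
                                area_standard, protons_standard=9,
                                standard_conc_mm=0.5):
    """Concentration (mM) of a metabolite in the NMR tube from peak areas.

    Each area is normalized by the number of equivalent protons giving
    rise to that peak before taking the ratio to the internal standard.
    """
    if min(area_standard, protons_metabolite, protons_standard) <= 0:
        raise ValueError("areas and proton counts must be positive")
    per_proton_metabolite = area_metabolite / protons_metabolite
    per_proton_standard = area_standard / protons_standard
    return per_proton_metabolite / per_proton_standard * standard_conc_mm


if __name__ == "__main__":
    # Hypothetical areas: acetate methyl singlet (3 H) against the standard (9 H)
    print(metabolite_concentration_mm(area_metabolite=1.2, protons_metabolite=3,
                                      area_standard=9.0))
```

To report concentrations in the original extract, the value in the tube would additionally need to be corrected for the dilution introduced by adding the standard.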

8. LC-MS Lipidomics Analysis

Rewet the dried chloroform extract generated during step 2.5 with 200 μL of methanol.
Inject 10 μL of each extract into an ultra-performance liquid chromatograph coupled to an Orbitrap mass spectrometer using a reversed phase charged surface hybrid column (3.0 mm × 150 mm × 1.7 μm particle size). Set a 34-min gradient (mobile phase A: acetonitrile/H₂O (40:60) containing 10 mM ammonium acetate; mobile phase B: acetonitrile/isopropanol (10:90) containing 10 mM ammonium acetate) at a flow rate of 250 µL/min. Use both negative and positive ionization modes with higher-energy collision dissociation and collision induced dissociation.
CAUTION: Acetonitrile is an eye, skin, and respiratory irritant. Wear appropriate PPE. In addition, acetonitrile is combustible. Do not allow it to contact oxidizing agents; toxic fumes may be produced during burning.
Upload the raw LC-MS/MS data files into LIQUID along with the target file (i.e., list of > 25,000 lipid species) for the respective ionization mode (positive or negative). Process the raw file. Manually validate the resulting identifications by examining the MS/MS spectra for presence of diagnostic ions, if applicable, matching fragment ions (e.g., fatty acyl chains), isotopic separation, mass ppm error of the precursors, and the retention time. Export resulting list of confidently identified lipids as a .tsv file.
NOTE: The protocol can be paused at this point and the material stored at -20 °C.

9. Proteomics Analysis

Extract proteins according to the MPLEx protocol²⁰ from the remainder of the methanol phase resulting from step 2.4, by washing the extract with 20 times the extract volume of additional cold (-20 °C) methanol.
Use a 1 mL/50 mg silica-based sorbent (see Table of Materials) to condition C18 SPE columns with 3 mL methanol, 2 mL of 0.1% trifluoroacetic acid (TFA) in water, followed by the addition of the extract from step 9.1 at a rate no greater than 1 mL/min.
Following the addition of the sample, wash the column with 4 mL of 95:4.9:0.1 water:acetonitrile:TFA, then allow to dry. Place a 1.5 mL collection tube under the SPE column, and elute the sample with 1 mL of 80:19.9:0.1 methanol:water:TFA.
Concentrate the extracts to 100 µL using a vacuum-assisted freeze dryer, then measure the protein concentration by bicinchoninic acid (BCA) colorimetric assay²⁶ at a wavelength of 562 nm.
Centrifuge extracts at 10,000 x g for 10 min at 4 °C. Discard the resulting supernatant and dry the remaining pellet under vacuum for 5 min. Resuspend the protein pellet in water to a final concentration of 0.1 µg peptide per µL.
Add dithiothreitol to a final concentration of 5 mM and incubate at 60 °C for 30 min. Dilute 10-fold with 100 mM NH₄HCO₃/8M urea solution and incubate at 37 °C for 3 h in the presence of 1 mM CaCl₂ and porcine trypsin at a 1:50 enzyme to protein ratio.
NOTE: The protocol can be paused at this point and the material stored at -20 °C.
Separate extracts via liquid chromatography using an exponential gradient of a 0.1% formic acid in water mobile phase (A) and a 0.1% formic acid in acetonitrile mobile phase (B) at 10 kpsi and 500 nL min^-1.
Introduce resulting eluent into an ESI-coupled mass spectrometer collecting spectra from 400-2,000 m/z with 100,000 resolution at m/z 400 in a linear ion trap (LTQ) Orbitrap mass spectrometer.
Convert RAW spectra files to the mzML format using msConvert from ProteoWizard, accepting all default parameters. Use the universal database search tool MSGFPlus to search resulting proteomes against a targeted protein database of protein sequences predicted from relevant metagenome assembled genomes.
Append common contaminants (e.g., trypsin, keratin, albumin) and remove all exactly duplicated protein sequences to improve peptide-to-mass-spectrum match statistics. Evaluate the resulting MSGF spectral probability scores to determine which peptide-to-mass-spectrum match is best. Use the Q-Value from MSGFPlus to filter the entire dataset to a 1% false discovery rate (FDR).
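
The final filtering step amounts to keeping only those peptide-to-spectrum matches whose Q-value is at or below 0.01. The sketch below shows this on tab-separated search results using only the Python standard library. The column names `Peptide` and `QValue` are assumptions about the exported table and should be checked against the actual file header; the example rows are made up.

```python
import csv
import io


def filter_by_qvalue(tsv_text, threshold=0.01, column="QValue"):
    """Return the rows of a tab-separated results table with Q-value <= threshold."""
    reader = csv.DictReader(io.StringIO(tsv_text), delimiter="\t")
    return [row for row in reader if float(row[column]) <= threshold]


if __name__ == "__main__":
    # Hypothetical three-row results table
    example = (
        "Peptide\tQValue\n"
        "AAAK\t0.001\n"
        "BBBR\t0.010\n"
        "CCCK\t0.050\n"
    )
    for row in filter_by_qvalue(example):
        print(row["Peptide"], row["QValue"])
```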

10. Metabolomics Analysis and Metabolic Network Building

Compile all metabolite molecular formulae identified in steps 3, 6, 7, and 8 into a single database of metabolites present in the samples. Combine these metabolites with the Enzyme Commission (EC) or KEGG Orthology (KO) numbers of enzymes identified in step 9. Search this combined dataset against the metabolic pathways section of the KEGG²⁷ database.
Manually evaluate results to identify most probable pathways and integrate into a metabolic model.

Representative Results

We performed the described complementary analysis protocol and compared peat with depth in the S1 bog in the Spruce and Peatlands Response Under Changing Environments (SPRUCE) site in Minnesota, USA. These results are compared to those from a permafrost bog and fen from northern Sweden to show how sites may vary in metabolite and enzyme activities. We identified 3,312 enzymes in the proteomics analysis. An analysis of the enzyme activities with depth reveals that the number of enzymes declines sharply between 15 cm and 45 cm in the SPRUCE bog (Figure 1).

While the proteomics results indicate which proteins are expressed, the metabolic data show which reactions are actually occurring. Overall, we identified 67,040 metabolites (including lipids) in all of the peat samples from the combination of FTICR-MS, NMR, GC-MS, and LC-MS analyses. Of these, we were able to assign molecular formulae to 15,385 compounds (Figure 2). The combined metabolic data span a range of oxidation states, masses, and compound classes.

This range is typically visualized with a van Krevelen diagram, in which the atomic H/C ratios of the identified formulae are plotted against their atomic O/C ratios²⁸. We have added a dimension to the typical 2D format by color coding the symbols representing individual formulae according to their NOSC (Figure 3 and Figure 4). With increasing depth in the SPRUCE bog, there is an increase in the total number of large secondary metabolites identified by FTICR-MS, specifically in highly condensed formulae (lower left of the van Krevelen plot) and hydrogenated formulae (top of the plot) (Figure 4). Over the same depths there is a decrease in the number of small, highly energetic compounds identified by GC-MS and in the lipids (Figure 5). This could suggest either that the lipids and small metabolites are consumed in the surface peat before reaching the deeper depths or that decomposition rates in the deep peat are faster than in the surface such that downward-advecting compounds are rapidly consumed. Differentiating between these two competing hypotheses requires a process-level understanding of the C cycling at the site. Such a process-level understanding can only be gained by coupling the metabolomics and proteomics datasets. We accomplish this by cross-validating the combined FTICR-MS, GC-MS, LC-MS, and NMR identified metabolites against the KEGG database. In doing so, we find that the compounds we identified are involved in common metabolic pathways such as the tricarboxylic acid (TCA) cycle, glycolysis, and sugar metabolism. Since individual metabolites can be involved in multiple pathways, confirmation with the enzymes increases our confidence in assigning pathways (Figure 6).
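The coordinates used in the van Krevelen diagrams, together with the NOSC value used for color coding, can be derived directly from each molecular formula. The sketch below is an illustration only: it assumes the widely used definition of LaRowe and Van Cappellen, NOSC = 4 - (4C + H - 3N - 2O + 5P - 2S - Z)/C, where Z is the net charge, and the function names are hypothetical. The simple parser handles plain formulae such as C6H12O6 but not brackets or isotopes.

```python
import re
from typing import Dict


def parse_formula(formula: str) -> Dict[str, int]:
    """Count atoms in a plain molecular formula such as 'C6H12O6'."""
    counts: Dict[str, int] = {}
    for element, number in re.findall(r"([A-Z][a-z]?)(\d*)", formula):
        counts[element] = counts.get(element, 0) + (int(number) if number else 1)
    return counts


def van_krevelen_point(formula: str, charge: int = 0) -> Dict[str, float]:
    """Return atomic H/C, atomic O/C, and NOSC for one formula."""
    n = parse_formula(formula)
    carbon = n.get("C", 0)
    if carbon == 0:
        raise ValueError("formula contains no carbon: " + formula)
    hydrogen = n.get("H", 0)
    oxygen = n.get("O", 0)
    # Assumed NOSC definition (see lead-in); elements other than
    # C, H, N, O, P, and S are ignored by this sketch.
    electrons = (4 * carbon + hydrogen - 3 * n.get("N", 0) - 2 * oxygen
                 + 5 * n.get("P", 0) - 2 * n.get("S", 0) - charge)
    return {
        "H/C": hydrogen / carbon,
        "O/C": oxygen / carbon,
        "NOSC": 4 - electrons / carbon,
    }


glucose = van_krevelen_point("C6H12O6")         # a sugar
palmitic_acid = van_krevelen_point("C16H32O2")  # a fatty acid

print(glucose)
print(palmitic_acid)
```

For glucose this gives H/C = 2, O/C = 1, and NOSC = 0, matching the sugar region described in the text, while palmitic acid gives NOSC = -1.75, within the range quoted for lipids.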

Through this mapping we find evidence of sucrose and starch metabolism in the surface peat, while in the deeper peat pyruvate and other fermentation products build up (Figure 6). These results are consistent with the first hypothesis that sugars and other energetic metabolites are degraded in the surface peat and do not reach the deeper peat depths. As can be seen in Figure 5, sugars (NOSC = 0) and amino acids (0 < NOSC < 1) are consumed in the surface peat, while lipids (-2 < NOSC < -1) appear to accumulate with depth. This is consistent with expectations based on NOSC values that higher NOSC compounds are more readily degraded while lower NOSC compounds persist in the highly anaerobic (i.e., TEA-limited) conditions of subsurface peat. This approach also successfully distinguishes between soil types from different environments. For example, soil organic matter from the boreal peatland appears to be compositionally different from that of the permafrost peatland, and the fen and bog within the permafrost region also differ from each other. These results are consistent with a previous study that showed differences in site geochemistry between these habitats²⁹, suggesting that site geochemistry has a strong effect on microbial degradation of C below ground.

Figure 1: Proteomics analysis indicates strong depth stratification in the number of enzymes. Bars indicate averages and error bars indicate one standard deviation of abundances. This suggests that microorganisms are most active in the peat surface.

Figure 2: The number of compounds identified by each technique in the surface peat (< 30 cm) of each habitat as well as the mid (45 cm) and deep (87 cm) peat from the SPRUCE bog.

Figure 3: The van Krevelen diagram (atomic H/C vs. O/C of each identified molecular formula) for the surface (15 cm) SPRUCE bog depth to demonstrate the coverage of compounds characterized by each technique. FTICR-MS allows us to identify the largest number of compounds (small circles), while GC-MS (triangles) is good for differentiating and identifying sugars (H/C = 2 and O/C = 1). NMR (squares) spectroscopy provides quantitative information on energetically important compounds such as sugars, amino acids, pyruvate, etc. LC-MS (diamonds) provides information on lipids.

Figure 4: The van Krevelen diagram (atomic H/C vs. O/C of each identified molecular formula) for the deep (87 cm) SPRUCE bog depth to demonstrate the coverage of compounds characterized by each technique.

Figure 5: The relative fraction of different chemical classes identified via the various techniques at the different depths. Bars are plotted as averages for each depth ± one standard deviation. Classes plotted include amino acids, sphingolipids (Cer), glycerophosphocholines (PC), phosphoethanolamines (PE), diacylglyceroltrihomoserine (DGTSA), diacylglycerol (DG), triacylglycerols (TG), and sugars.

Figure 6: Combining results to create a metabolic network. Depth distributions of metabolites identified by NMR (a) and GC-MS (b), and a simplified metabolic map (c) showing a select number of transformations identified at the SPRUCE bog. Green boxes indicate metabolites that decrease with depth; brown boxes indicate metabolites that increase with depth. Green connecting arrows indicate enzymes identified in our dataset that mediate the indicated transformation (enzyme EC numbers are shown next to the arrows). Gray connecting arrows indicate transformations for which enzymes were not identified in our dataset.


Discussion

The single-throughput, fully-coupled analysis stream used to characterize metabolites and the proteome provides insights into the pathways by which C cycling is occurring in a complex ecosystem. Soil and peat are heterogeneous matrices; therefore, one of the critical steps of this method comes at the very beginning: ensuring that the starting peat or soil material is well homogenized. It is preferable to grind the sample well because aggregates can reduce extraction efficiency. This is a particular problem for aggregated soils and soils with low C and high mineral contents, which may require the use of a stainless-steel ball grinder to homogenize adequately. Because spatial heterogeneity of soils and peat within an experimental site may be high, biological replicates are highly recommended.

This method utilizes three solvents (water, methanol, and chloroform), which bias the types of compounds that can be extracted. In principle, this sequential extraction method should solubilize compounds covering a wide polarity range. However, these solvents are not optimized for extracting organo-mineral complexes or highly stabilized complexes. If such compounds are of interest, harsher solvents such as strong acids and bases are preferred, but harsher extraction processes could alter the chemistry of the samples. Incorporating such solvents at the end of the extraction sequence may minimize this effect. Additionally, chloroform will inactivate clinically important soil and peat bacterial and viral pathogens, and other disease-causing biological agents, by dissolving cell membrane lipids. Thus, incorporating chloroform in the extraction protocol will reduce the risks involved in biological studies of samples potentially contaminated with pathogens from different regions of the world. If there are concerns about dead microbial cells, debris, or particulates following extraction, filtering the extracts through a 0.2 µm glass fiber filter is recommended.

To streamline the process, the water and methanol extracts could be combined for GC-MS analysis during step 4.1. However, these extracts must remain separated for FTICR-MS analysis due to ionization efficiency issues with the ESI source. The advantage of running the combined extract on the GC-MS is that more metabolites (covering a larger range of polarity) will be identified. The disadvantage of combining the extracts at this point is that methanol is itself one of the major metabolites important in microbial processing of organic matter. Traces of methanol will always remain in the combined sample, even after drying, so if methanol is a metabolite of interest, the water extract must be run separately. Including blank controls that contain no analytes will also help in identifying potential contamination in the samples.

In step 7.1, when preparing extracts for NMR analysis, either the water extract alone or the combined water and methanol extracts can be used, as in the GC-MS analysis. The disadvantages of combining the extracts are similar to those enumerated in the preceding paragraph. The advantage of combining the methanol and water extracts is that some of the less polar metabolites that are solubilized in the methanol fraction will be identified. However, because of the freeze-drying step, many of the more volatile compounds will be lost. This is a particular disadvantage if quantifying volatile fatty acids is a critical component of the experiment.

In this procedure for identifying proteins, only fully tryptic peptides are searched; thus, peptides produced by endogenous peptidase activity and in-source fragmentation will be missed. On the other hand, oxidized methionine may be considered as a post-translational modification for the peptide candidates, as this modification commonly occurs during sample processing and handling. Quantification of enzymes using peptide elution areas can be done but is outside the scope of this project.

Many recent advances in the analysis of NOM and microbial parameters are providing a wealth of techniques for understanding organic C cycling. By combining these techniques in a single streamlined protocol, we gain a novel view of the processes that are taking place. Coupling the proteomics analysis with the metabolic analyses provides compelling corroborative evidence that a reaction is actually occurring. Metabolic analysis alone is limited in that it informs us which metabolites are higher or lower in concentration in comparisons among sites or after a treatment effect, but it cannot discern the causes of those changes. For example, we identified declining sugar concentrations with depth in the bog, but without further information it is unclear whether the decline with depth is due to slow inputs or fast degradation in the deep peat. Analysis of the proteome and the declining enzyme expression with depth allows us to reject the second hypothesis. Rather, our results are consistent with fast sugar degradation in the surface peat limiting inputs to the deeper depths. These particular insights are only possible when all techniques are considered in tandem, as each one provides a unique but critical piece of the C cycling puzzle.


Disclosures

The authors have nothing to disclose.

Acknowledgments

We would like to thank J.P. Chanton, J.E. Kostka, and M.M. Kolton for assistance with collecting peat samples. Portions of this work were conducted at the Environmental Molecular Sciences Laboratory, a DOE Office of Science User Facility sponsored by the Office of Biological and Environmental Research. PNNL is operated by Battelle for the DOE under Contract DE-AC05-76RL01830. This work was supported by the U.S. Department of Energy, Office of Science, Office of Biological and Environmental Research (grants: DE-AC05-00OR22725, DE-SC0004632, DE-SC0010580, DE-SC0012088, and DE-SC0014416).

Materials

Name	Company	Catalog Number	Comments
methoxyamine hydrochloride	Sigma Aldrich	226904	derivatization agent
5 mm triple resonance salt-tolerant cold probe	Bruker		instrumentation
capillary GC column HP-5MS column (30 m × 0.25 mm × 0.25 μm)	Agilent	AG19091S-433	instrumentation
reversed phase charged surface hybrid column (3.0 mm × 150 mm × 1.7 μm particle size)	ThermoFisher		instrumentation
2 mL glass vials	VWR International	46610-722	sample vials
autosampler vials	VWR International	97055-324; 9467671	sample vials
Chloroform	VWR International	JT9174-3	solvent
Ethanol	VWR International	BDH67002.400	solvent
methanol	VWR International	BDH85681.400	solvent
pyridine	VWR International	BDH67007.400	solvent
2,2-dimethyl-2-silapentane-5-sulfonate-d6	Sigma Aldrich	178837	standard
C8-C24 fatty acid methyl ester	Sigma Aldrich	CRM18918	standard
N-methyl-N-(trimethylsilyl)trifluoroacetamide	Sigma Aldrich	24589-78-4	standard
Suwannee River Fulvic Acid standard	International Humic Substances Society	2S101F	standard
trimethylchlorosilane	Sigma Aldrich	89595	standard
Tuning Solution	Agilent
FTICR-MS analysis software	Bruker	Compass DataAnalysis 4.1
Formularity Software	Pacific Northwest National Laboratory	Formularity	available for download at: https://omics.pnl.gov/software/formularity
GC-MS	Agilent	Agilent GC 7890A with MSD 5975C
silica-based sorbent	Phenomenex (Torrance, CA)	Strata C18-E (PN 8E-S001-DAK)
NMR TUBE 3MM 8 150 CS5	VWR International	KT897820-0008	NMR tube
Varian Direct Drive 600-MHz NMR spectrometer	Varian Inova	Varian Direct Drive 600-MHz	NMR spectrometer
Chenomx NMR Suite 8.3	Chenomx	Chenomx NMR Suite	NMR software
ultra-performance liquid chromatograph	Waters	Acquity UPLC H	liquid chromatograph
Velos-ETD Orbitrap mass spectrometer	ThermoFisher	Thermo Scientific LTQ Orbitrap Velos	mass spectrometer