Research Article
This article presents a protocol for data processing of influenza viruses imaged using cryo-electron tomography and subsequent subtomogram averaging of the hemagglutinin glycoprotein. This protocol covers step-by-step data processing, from image preprocessing to final model refinement.
Cryo-electron tomography is a powerful tool to visualize heterogeneous samples, with one major application being structural characterization of pleomorphic viruses. In recent years, subtomogram averaging of viral glycoproteins has emerged as a method to directly visualize these crucial proteins on the surface of intact virions. One important target is the hemagglutinin (HA) glycoprotein of influenza virus, which densely covers the viral envelope and is responsible for influenza receptor binding and membrane fusion. While subtomogram averages of influenza HA have been reported, their resolutions have been limited by the low signal-to-noise ratio inherent to cryoET as well as the manual effort required to analyze heterogeneous influenza virions. Presented here is a cryoET analysis pipeline that integrates several software packages to analyze tomographic data of influenza virions efficiently and robustly. This protocol describes the structural determination of HA from influenza virions, through steps from initial motion correction to final model building. Following this pipeline, an HA reconstruction at 6.0 Å resolution was obtained from two cryoET datasets collected from the A/Puerto Rico/8/34 (PR8) influenza strain.
Cryo-electron tomography (cryoET) has been applied over the past decades to capture snapshots of protein complexes, viruses, cells, and organisms. A modality of cryo-electron microscopy (cryoEM), cryoET is a structural biology method in which a biological sample is flash frozen and then imaged at a series of orientations by tilting1,2,3. Images taken at each orientation are then computationally aligned to their common tilt axis and reconstructed into a tomogram to provide a three-dimensional view4.
Whereas X-ray crystallography and single particle cryoEM demand purified, structurally homogeneous molecules, cryoET can image a molecule directly within its native context4. Therefore, one main advantage of cryoET is its ability to visualize pleomorphic samples, such as membranous viruses, including influenza5,6,7. Another promise of cryoET is its ability to image across scales. While tomograms are not typically resolved past 5-10 nm8, the integration of subtomogram averaging, where copies of the same particle are identified, aligned, and averaged, can result in near-atomic resolution for some biological molecules such as ribosomes9,10. However, only limited types of molecules can reach this resolution; subtomogram averages do not typically surpass 10-15 Å resolution. In contrast, single particle cryoEM has routinely achieved resolutions of 3-4 Å since the resolution revolution11. Recent advancements in both the throughput of cryoET data acquisition and analysis software have allowed for subnanometer resolution structure determination of additional biological molecules within their native context12,13,14,15,16,17,18.
One common use of cryoET is to visualize virus morphology, organization, and structure. Despite the lower resolution afforded by this technique compared to single particle cryoEM or X-ray crystallography, cryoET combined with subtomogram averaging can provide information on how viral proteins behave in their native environment and provide crucial details on their organization in the context of the virion. Common targets for cryoET of viruses are the surface glycoproteins that mediate host cell attachment and fusion, as they are often the main antigens and targets for therapeutics or vaccines. With recent advancements in cryoET processing packages, it has become increasingly feasible to achieve subnanometer resolution averages of these glycoproteins19,20,21,22. One such example is hemagglutinin (HA), the major protein on the surface of influenza virions. Not only does this protein conduct both receptor binding and membrane fusion, but it also covers the virion densely, with hundreds to thousands of HAs on a single virion5. The protocol presented here (Figure 1) integrates several commonly used packages with in-house scripts to delineate stages from pre-processing to model refinement for a subtomogram average of influenza HA.
NOTE: Sample datasets used for this protocol can be accessed at EMPIAR-12864, which includes the two sets of tilt series used for this protocol. The tilt series are collected from manually plunged grids of purified influenza A virus at a physical pixel size of 2.09 Å/pixel to ensure a large enough field of view so that each tilt series contains several virions, and also to render the highest resolution reconstructions possible. For users' own datasets, it is recommended to start the workflow with raw tilt movies. These datasets were processed and visualized using high-performance workstations. The Table of Materials lists the hardware and software used for this protocol. All software packages used in this protocol are open source and available for download; installation links and instructions are listed in the Table of Materials. The recommended workstation for processing cryoET datasets should be equipped with at least an 8-core processor, a dedicated GPU card with 6 GB of VRAM, 64 GB of RAM, and 2 TB of local storage.
1. Data preprocessing of tilt movies and reconstruction of cryo-electron tomograms in Warp 23 and IMOD 24
conda activate warp_environment
WarpTools create_settings --folder_data path/to/.tif --folder_processing warp_frameseries --output warp_frameseries.settings --extension "*.tif" --angpix 1.04 --gain_path gain_file.mrc --exposure 3.07
WarpTools fs_motion_and_ctf --settings warp_frameseries.settings --m_grid 1x1x5 --c_grid 2x2x1 --c_range_max 7 --c_defocus_max 10 --c_defocus_min 4 --c_use_sum --out_averages
WarpTools ts_import --mdocs path/to/.mdoc --frameseries /path/to/frameseries --tilt_exposure 3.07 --min_intensity 0.3 --output tomostar
WarpTools create_settings --folder_data tomostar --folder_processing warp_tiltseries --output warp_tiltseries.settings --extension "*.tomostar" --angpix 1.04 --gain_path gain_file.mrc --exposure 3.07 --tomo_dimensions NxNxN
WarpTools ts_stack --settings warp_tiltseries.settings --angpix 8.35
WarpTools ts_import_alignments --settings warp_tiltseries.settings --alignments warp_tiltseries/tiltstack/ --alignment_angpix 8.35
WarpTools ts_aretomo --settings warp_tiltseries.settings --angpix 8.35 --alignz 1000 --axis_iter 3 --exe AreTomo_executive
WarpTools ts_ctf --settings warp_tiltseries.settings --range_high 7 --defocus_min 2 --defocus_max 10 --auto_hand 4
export WARP_FORCE_MRC_FLOAT32=1
WarpTools ts_reconstruct --settings warp_tiltseries.settings --input_data input file names --angpix 8.35 --dont_invert

2. Tomogram preprocessing and particle picking
conda activate isonet_env
mkdir tomo_folder
mv tomograms*.mrc tomo_folder/
isonet.py prepare_star tomo_folder --output_star tomograms.star --pixel_size 8.35
isonet.py deconv tomograms.star --snrfalloff 0.7 --deconv_folder deconvolve
conda activate eman_env
e2projectmanager.py
cd path/to/tomograms
e2spt_boxer_convnet.py --label label_name
e2projectmanager.py

3. Particle curation
4. Iterative subtomogram averaging and classification
WarpTools ts_export_particles --settings warp_tiltseries.settings --input_star pts2star.star --coords_angpix 8.35 --output_star bin4_export.star --output_angpix 8.35 --box 48 --diameter 140 --3d
relion_convert_star --i bin4_export.star --o bin4_conv.star
head -n 30 bin4_conv.star >> subset.star && tail -n +31 bin4_conv.star | shuf -n 2000 >> subset.star
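The `head`/`tail`/`shuf` command above keeps the STAR header and draws a random subset of particle rows for generating an initial reference. A minimal Python sketch of the same idea is given here. The 30-line header and 2,000-row subset are taken from that command; the function name `subsample_star`, the fixed random seed, and the synthetic input are illustrative assumptions. The header length depends on the number of metadata columns, so it may need to be checked for each file.

```python
import random


def subsample_star(lines, header_lines=30, n_particles=2000, seed=0):
    """Keep the first `header_lines` lines plus a random sample of the rest."""
    header = lines[:header_lines]
    # Skip blank lines so they are never drawn as "particles".
    rows = [ln for ln in lines[header_lines:] if ln.strip()]
    rng = random.Random(seed)  # fixed seed (assumption) for reproducibility
    picked = rng.sample(rows, min(n_particles, len(rows)))
    return header + picked


if __name__ == "__main__":
    # Synthetic stand-in for bin4_conv.star: 30 header lines, 5000 data rows.
    demo = [f"header {i}\n" for i in range(30)]
    demo += [f"row {i}\n" for i in range(5000)]
    subset = subsample_star(demo)
    print(len(subset), "lines kept")
```

Unlike `shuf`, the sketch uses a seeded generator, so the same subset is drawn on every run; drop the seed if a fresh subset is preferred.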
mpiexec -n 3 relion_refine_mpi --o init_ref/job001/run --auto_refine --split_random_halves --i subset.star --firstiter_cc --ini_high 20 --dont_combine_weights_via_disc --pool 3 --pad 2 --ctf --particle_diameter 300 --flatten_solvent --zero_mask --oversampling 1 --healpix_order 2 --auto_local_healpix_order 4 --offset_range 14 --offset_step 4 --sym C1 --low_resol_join_halves 40 --norm --scale --j 12 --gpu 0:1 --pipeline_control init_ref/job001
mpiexec -n 3 relion_refine_mpi --o Refine3D/job001/run --auto_refine --split_random_halves --i bin4_conv.star --ref init_ref/job001/run_class001.mrc --firstiter_cc --ini_high 20 --dont_combine_weights_via_disc --pool 3 --pad 2 --ctf --particle_diameter 400 --flatten_solvent --zero_mask --oversampling 1 --healpix_order 2 --auto_local_healpix_order 4 --offset_range 16 --offset_step 4 --sym C1 --low_resol_join_halves 40 --norm --scale --j 12 --gpu 0:1 --pipeline_control Refine3D/job001
relion_refine --o Class2D/job003/run --grad --class_inactivity_threshold 0.1 --grad_write_iter 200 --iter 200 --i Refine3D/job002/run_data.star --dont_combine_weights_via_disc --pool 3 --pad 2 --ctf --tau2_fudge 2 --particle_diameter 300 --K 20 --flatten_solvent --zero_mask --strict_highres_exp 14 --center_classes --oversampling 1 --norm --scale --j 24 --skip_align --pipeline_control Class2D/job002
relion_star_handler --i input_file2.star --o output_good_class.star --select rlnClassNumber --minval goodclassnumber --maxval goodclassnumber
relion_star_handler --i "output_good_class1.star output_good_class2.star … output_good_classn.star" --o bin4_keep.star --combine
relion_star_handler --i input_file.star --o output_file.star --combine
relion_image_handler --i bin1_ref.mrc --o bin1_c3.mrc --sym c3
MTools create_population --directory refine_m --name ha_final
MTools create_source --name source_1 --population refine_m/ha_final.population --processing_settings warp_tiltseries.settings
MTools create_species --population refine_m/ha_final.population --name ha_todaysdate --diameter 160 --sym c3 --temporal_samples 1 --half1 last_relion_refine/run_half1_class001_unfil.mrc --half2 last_relion_refine/run_half2_class001_unfil.mrc --particles_relion last_relion_refine/run_data.star --mask mask.mrc
MCore --population refine_m/ha_final.population --refine_particles
MCore --population refine_m/ha_final.population --refine_particles --ctf_cs

5. Model refinement
To demonstrate the utilization of this processing protocol (Figure 1), the previously outlined workflow was applied to two datasets of 25 tomograms combined, obtained from an H1N1 influenza A virus strain (A/Puerto Rico/8/1934). Data collection parameters are outlined in Table 1. Figure 2 illustrates a representative tomogram and zoomed-in views of pleomorphic influenza virions. Varied morphologies are captured in this tomogram, as virions range from spherical to oval/elongated in shape. While most influenza particles contain well-organized M1 and vRNP assemblies, some virions appear to be more disorganized and lack crucial structural components.
From this dataset, an initial set of 40,995 subtomograms was used for bin4 (8.35 Å/pix) reconstruction after particle picking and curation. The two datasets were initially processed independently in RELION4 with C1 symmetry and a wide spherical mask that encompassed the overall HA array. Three refinement cycles were carried out for these subtomograms, followed by 2D classification. Post-classification, poorly resolved subtomograms and junk particles were discarded; remaining subtomograms were extracted at bin2 (4.17 Å/pix) and the two datasets were combined. Bin2 subtomograms were first aligned together in a round of subtomogram averaging, then a cylindrical mask around the central HA was applied to focus alignment. At this stage, trimeric symmetry can be clearly visualized for the HA reconstruction. Additional rounds of refinement were carried out at bin2 and at 2.8 Å/pix. One last round of 2D classification was conducted with a small mask covering just the central HA trimer with aligned subtomograms. The major class, consisting of ~94% of remaining subtomograms, was extracted to unbinned particles and subjected to 3D refinement with C3 symmetry applied. Lastly, these subtomograms were exported to M, where particle poses and spherical aberration refinement cycles were carried out (Supplemental Figure 1).
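The pixel sizes quoted for each stage follow directly from the binning factor. As a worked example, the sketch below assumes an unbinned pixel size of 2.0875 Å, inferred here from 8.35 Å/pix at bin4 (the data collection table reports the rounded value of 2.09 Å). The helper names are illustrative and not part of any package used in the protocol.

```python
UNBINNED_ANGPIX = 2.0875  # assumption: 8.35 Å/pix divided by a binning factor of 4


def binned_pixel_size(binning, unbinned=UNBINNED_ANGPIX):
    """Pixel size in Å after binning by the given factor."""
    return unbinned * binning


def binning_for(target_angpix, unbinned=UNBINNED_ANGPIX):
    """Binning factor that yields a target pixel size in Å."""
    return target_angpix / unbinned


if __name__ == "__main__":
    for label, factor in [("bin4", 4), ("bin2", 2), ("bin1", 1)]:
        print(f"{label}: {binned_pixel_size(factor):.3f} Å/pix")
    # The intermediate 2.8 Å/pix stage is a non-integer binning step.
    print(f"2.8 Å/pix is binning by {binning_for(2.8):.2f}")
```

Under this assumption, bin2 works out to 4.175 Å/pix, which the text reports as 4.17 Å/pix.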
The final subtomogram average (Figure 3A), consisting of 15,970 HA particles, reached a global resolution of 6.0 Å and a local resolution range of 5-7 Å (Figure 4). A model of PR8 HA was flexibly refined into the density; at FSC=0.5 and FSC=0.143, the map-to-model resolution was 8.1 Å and 6.6 Å, respectively. The architecture of the HA reconstruction greatly resembled previous cryoEM and cryoET maps. At this resolution, alpha-helices and beta sheets can be distinguished (Figure 3B); moreover, glycans can begin to be identified at four glycosylation sites on the HA head and stem (Figure 3C).
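Resolution values at FSC = 0.5 and FSC = 0.143 are read off as the spatial frequency at which the curve first drops below the threshold. The sketch below illustrates this with linear interpolation between the two bracketing points. The curve is synthetic and made up purely for illustration; it is not the FSC data from this study, and `resolution_at_threshold` is a hypothetical helper rather than a function from RELION, M, or PHENIX.

```python
def resolution_at_threshold(freqs, fsc, threshold):
    """Return the resolution (Å) where the FSC first falls below `threshold`.

    freqs: spatial frequencies in 1/Å, ascending.
    fsc:   FSC values at those frequencies.
    Returns None if the curve never crosses the threshold.
    """
    for i in range(1, len(fsc)):
        if fsc[i - 1] >= threshold > fsc[i]:
            # Linear interpolation between the two bracketing points.
            t = (fsc[i - 1] - threshold) / (fsc[i - 1] - fsc[i])
            f_cross = freqs[i - 1] + t * (freqs[i] - freqs[i - 1])
            return 1.0 / f_cross
    return None


if __name__ == "__main__":
    # Synthetic curve for illustration only.
    freqs = [0.02, 0.05, 0.10, 0.15, 0.20]
    fsc = [1.00, 0.95, 0.60, 0.20, 0.05]
    for thr in (0.5, 0.143):
        res = resolution_at_threshold(freqs, fsc, thr)
        print(f"FSC={thr}: {res:.2f} Å")
```

Because the 0.143 threshold is crossed at a higher spatial frequency than 0.5, it always gives the numerically smaller (better) resolution value, consistent with the two map-to-model figures reported above.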
Our results show the suitability of cryoET in the reconstruction of HA from native influenza virions. Through the protocol, clear density for the cylindrical glycoprotein was observed perpendicular to the viral membrane, and the resolution improved through each step. For one's own data, it is suggested that the subtomogram averaging should start from a binned particle stack, and the results should be closely monitored at each refinement cycle. If glycoprotein density is not apparent from the beginning stages, it is recommended to map the subtomogram locations back onto the tomogram to verify accurate positioning. Otherwise, one can tweak alignment parameters or implement additional classification stages to achieve optimal results.

Figure 1: Overall pipeline for subtomogram averaging of HA from cryoET of influenza virus. The top panel represents a summary workflow for the two datasets used to demonstrate the protocol. The second panel corresponds to Section 1 of the protocol, the third panel corresponds to Sections 2-3, and the fourth panel corresponds to Sections 4-5.

Figure 2: Representative tomogram of PR8 influenza virus. (A) Slice through reconstructed tomogram. Scale bar is 100 nm. (B-D) Zoomed-in views of (B) spherical, (C) cylindrical, and (D) M1-less PR8 virions. All scale bars in B-D correspond to 50 nm.

Figure 3: Subtomogram average of PR8 HA. (A) Top and side view of the HA reconstruction at two contour levels flexibly fitted with a PR8 HA single particle cryoEM structure. (B) Clipped views through the HA reconstruction. Colored arrows correspond to colored boxes. (C) HA reconstruction shown at lower contour to unveil glycan density. Close-up views of glycans in stick representation in the cryoET map are also shown.

Figure 4: Resolution estimate of HA subtomogram average. (A) Local resolution estimate of the HA subtomogram average mapped onto the reconstruction. Color bar depicts 5-7 Å on the blue-white-red palette. (B) FSC curves of half maps and of map-to-model resolution. The blue curve is of the unmasked reconstruction, and the red is of the masked.
| Parameter | Dataset 1 | Dataset 2 |
| Pixel size (Å/pixel) | 2.09 | 2.09 |
| Tilt range | 0 to ±54° | 0 to ±66° |
| Tilt step | 3° | 3° |
| Collection year | 2024 | 2021 |
| Defocus range | 4-8 μm | 4-8 μm |
| Total dose | 120 e⁻/Å² | 120 e⁻/Å² |
| # Subframes | 6 | 5 |
| # Tilt series used | 15 | 11 |
| # Particles | 3278 | 12692 |
Table 1: Data collection parameters for cryoET datasets of PR8 influenza virus.
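As a quick consistency check between Table 1 and the text, the per-dataset particle counts sum to the 15,970 particles in the final average, which is roughly 39% of the 40,995 subtomograms used at the start of averaging. A short sketch of that arithmetic follows; the variable and function names are illustrative.

```python
particles = {"Dataset 1": 3278, "Dataset 2": 12692}  # "# Particles" row of Table 1
initial_picks = 40995  # subtomograms entering bin4 reconstruction


def retention(final_count, initial_count):
    """Fraction of the initial particle set that survives curation."""
    return final_count / initial_count


total = sum(particles.values())
print(f"{total} particles retained ({retention(total, initial_picks):.1%})")
```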
Supplemental Figure 1: Subtomogram averaging workflow for HA. Iterative alignment, averaging, and classification are conducted for HA subtomograms through gradual unbinning.
Better structural understanding of critical viral proteins can accelerate the discovery of novel treatments against these viruses. In the past decade, the resolution revolution has accelerated the determination of high-resolution viral structures using single particle cryoEM, but this method is limited to either purified proteins or non-enveloped viruses with icosahedral symmetry. In contrast, cryoET is capable of visualizing morphologically diverse membranous virions in 3D, but is of limited resolution. The pipeline presented here builds on the widely used Warp-RELION-M pipeline9,32 and integrates a set of custom scripts with the semi-automated convolutional neural network-based particle picker from EMAN2, in order to establish an end-to-end protocol and generate a subnanometer resolution reconstruction of influenza HA.
One of the more difficult problems in cryoET is choosing the appropriate suite of analysis tools after data collection, as numerous software programs exist for each step of tomographic analysis, from initial preprocessing to model building. The pipeline was adapted from the Warp-RELION-M9,32 pipeline as it is both well-documented, with several established protocols contributed by the developers and the community, and has also produced numerous high-resolution subtomogram averages in the past years, including our own work19,32,33,34. Additionally, this pipeline limits the transfer of metadata between software packages during the various processing stages, as different packages use varied conventions that may result in additional errors. Warp and M can connect the initial preprocessing stages with the final multi-particle refinement, which ensures the initial calculations of motion and CTF parameters are integrated with final refinement and map reconstruction. To this end, RELION4 was used for the majority of the subtomogram averaging stages, as Warp can generate particle stacks and metadata for refinement and classification in RELION. Post subtomogram refinement, results from RELION can be directly imported into M for multi-particle refinement. Lastly, while our particle alignment workflow was relatively straightforward, RELION is capable of more extensive classification and flexible refinement protocols if needed.
A particularly challenging aspect of influenza structural biology is the dense layer of glycoproteins that covers the surface of extensively pleomorphic virions. Compared to common cryoET targets such as the ribosome, these glycoproteins are too small to be accurately identified using 3D template matching. Moreover, hundreds to thousands of HA glycoproteins cover the viral surface; therefore, manual identification of particles of interest is time-consuming and tedious. This aspect of influenza is unlike other viruses of a similar size, like SARS-CoV-26 or HIV-17,35 that only have a few dozen glycoproteins on each particle. Lastly, as these virions are not uniformly spherical or oval but of mixed morphology, the oversampling approach (most commonly implemented using the Dynamo package36,37,38) that is commonly employed for evenly shaped virions or virus-like particles may require additional effort to create a surface model for each virion. This aspect is especially relevant when working with large datasets or when combining datasets, as the amount of time spent on particle picking would scale with the number of tomograms analyzed. Of course, if working with a virion that is more uniform in morphology with a known radius, the oversampling approach can be more efficiently applied compared to this influenza dataset. To simplify manual input and decrease the time spent on particle picking, the CNN-mediated particle picking approach employed in this protocol aims to recognize the shape of HA in its local environment and can be applied to entire datasets of tomograms with limited training. Only several dozen particle images for both 'good' and 'bad' references are required to train a neural network; this process typically takes only a few hours for a user familiar with HA. Using this approach, an initial dataset of more than 40,000 subtomograms was curated.
While particle number alone does not determine the final resolution, a large particle set is required if the goal is to obtain a subnanometer resolution reconstruction.
Another factor that improves resolution is particle set curation, which was implemented in this protocol through initial coordinate-based filtering as well as subsequent rounds of classification. For influenza virions, not only was a neural net trained to recognize HA, the protein of interest, but a separate neural net was also trained to recognize the M1 protein layer beneath the membrane. This step ensures that the HA subtomograms integrated into the dataset come from intact virions. To further enhance the resolution of the final reconstruction, several 2D classification steps were included in this protocol, as 2D classification is a computationally efficient way to parse and classify several tens of thousands of subtomograms. If a single dataset does not generate adequate numbers of subtomograms, particles derived from several collections can be combined. To best ensure the compatibility of different datasets, particles were combined after preliminary rounds of subtomogram alignment and averaging. Each dataset was first pre-processed independently through Warp, as collection parameters across sessions can be different. However, these parameters are taken into account during final particle alignment in M, when final iterations of particle alignment are conducted to optimize for image deformation and electron-optical aberration parameters.
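The coordinate-based filtering can be pictured as keeping only those HA picks that lie close to an M1 pick. The sketch below is a hypothetical illustration of that idea and is not the in-house script: the function name `filter_by_m1`, the brute-force distance search, and the 20 nm cutoff are all assumptions chosen for clarity.

```python
import math


def filter_by_m1(ha_coords, m1_coords, max_dist_nm=20.0):
    """Keep HA coordinates within `max_dist_nm` of any M1 coordinate.

    Coordinates are (x, y, z) tuples in nm. The cutoff is an assumed value.
    Brute-force search, O(N * M); adequate for a small illustration.
    """
    kept = []
    for ha in ha_coords:
        if any(math.dist(ha, m1) <= max_dist_nm for m1 in m1_coords):
            kept.append(ha)
    return kept


if __name__ == "__main__":
    # Toy coordinates: one HA pick sits above an M1 pick, one is far away.
    ha_picks = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
    m1_picks = [(0.0, 0.0, 10.0)]
    print(filter_by_m1(ha_picks, m1_picks))
```

For tens of thousands of picks, a spatial index such as a k-d tree would be a more practical choice than the nested loop used here.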
While our proposed protocol greatly improved subtomogram averaging resolution for HA, the Nyquist limit of ~4.2 Å was not reached. This limitation might be due to the final number of subtomograms incorporated into our reconstruction. While a dataset of ~15,000 subtomograms was curated, it remains possible that increasing the number of total tilt series would further improve the quality and resolution of the HA subtomogram average. Moreover, these data were acquired at a high defocus of 4-8 µm to ensure tomographic contrast. Tilt series acquisition at a lower defocus range would better retain high-frequency information that could improve the resolution of subtomogram averages.
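The Nyquist limit quoted above is twice the pixel size. A minimal check is sketched here, using the unbinned pixel size of 2.09 Å from the data collection table and the bin4 pixel size of 8.35 Å; the function name is illustrative.

```python
def nyquist_resolution(angpix):
    """Finest resolution (Å) that a given pixel size (Å/pixel) can sample."""
    return 2.0 * angpix


if __name__ == "__main__":
    print(f"unbinned: {nyquist_resolution(2.09):.2f} Å")
    print(f"bin4:     {nyquist_resolution(8.35):.2f} Å")
```

The same relation explains why refinement proceeds through gradual unbinning: at bin4 the data cannot support resolutions finer than about 16.7 Å, so finer pixel sizes are needed before subnanometer detail can emerge.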
This protocol shows an example of using cryoET and integrated analysis to determine a subtomogram average of influenza HA.
The authors have nothing to disclose.
The authors would like to acknowledge helpful discussions with the Schiffer Lab. We would also like to thank the UMass Chan cryoEM Core facility for their help with data acquisition and for providing us with support and advice. This work was supported by the National Institute of General Medical Sciences R01GM143773 to M.S. and R35GM151996 to C.A.S.
| AMD Ryzen Threadripper PRO 5965WX | AMD | https://www.amd.com/en/support/downloads/drivers.html/processors/ryzen-threadripper-pro/ryzen-threadripper-pro-5000wx-series/amd-ryzen-threadripper-pro-5965wx.html | |
| AreTomo 1.3.4 | UC San Francisco | https://drive.google.com/drive/folders/1Z7pKVEdgMoNaUmd_cOFhlt-QCcfcwF3_ | |
| EMAN2 2.99.52 | Baylor College of Medicine | https://blake.bcm.edu/emanwiki/EMAN2 | |
| IMOD 4.12.27 | University of Colorado at Boulder | https://bio3d.colorado.edu/imod/ | |
| Influenza analysis scripts | UMass Chan Medical School | https://github.com/jqyhuang/influenza-analysis | |
| IsoNet 0.3 | UCLA | https://github.com/IsoNet-cryoET/IsoNet | |
| M 2.0.0 | Genentech | https://warpem.github.io/warp/home/m/ | |
| NVIDIA A4000 | NVIDIA | https://www.nvidia.com/en-us/products/workstations/rtx-a4000/ | |
| Open3D | Intel Labs | https://www.open3d.org/ | |
| PHENIX 1.21-5207 | Lawrence Berkeley National Laboratory | phenix-online.org | |
| RELION 4.0 | MRC Laboratory of Molecular Biology | https://relion.readthedocs.io/en/release-4.0/ | |
| Ubuntu 20.04 | Ubuntu | https://releases.ubuntu.com/focal/ | |
| UCSF ChimeraX 1.6.1 | UC San Francisco | https://www.cgl.ucsf.edu/chimerax/ | |
| Warp 2.0.0 | Genentech | http://warpem.github.io/warp/ |