November 7th, 2025
This article presents a protocol for data processing of influenza viruses imaged using cryo-electron tomography and subsequent subtomogram averaging of the hemagglutinin glycoprotein. This protocol covers step-by-step data processing, from image preprocessing to final model refinement.
Our lab uses structural biology to study viruses and the evolution of drug resistance. In the current work, we are characterizing how influenza hemagglutinin changes conformation upon receptor binding. We have leveraged cryo-electron tomography to visualize viral morphology, organization, and structure.
Specifically, we have used this modality of cryo-electron microscopy to image the pleomorphic virus influenza. Our protocol allows us to study the organization and structure of influenza hemagglutinin in situ on intact viruses at sub-nanometer resolution. This higher resolution allows us to visualize formerly hidden interactions.
To begin, launch a Linux terminal and activate the Conda environment in which IsoNet is installed. Create a subfolder named tomo folder, then move all tomogram files into that folder.
Next, use the code to generate a star file in the project folder. Using a text editor, open the generated star file and enter the approximate defocus value for the zero-degree-tilt images into the fourth column. Now run the CTF deconvolution command in the terminal.
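The defocus entry described above can also be scripted rather than typed by hand. The following is a minimal sketch, assuming the generated star file is a single whitespace-separated loop block whose fourth column holds the defocus. The column labels, file names, and defocus values in the example are illustrative assumptions, not taken from the protocol.

```python
def set_defocus(star_text, defocus_by_row, column=4):
    """Return star_text with one column overwritten on selected data rows.

    defocus_by_row maps a 0-based data-row index to the value to write.
    column is the 1-based column number (the protocol names the fourth).
    """
    out = []
    row = 0
    for line in star_text.splitlines():
        stripped = line.strip()
        # Blank lines, comments, and header lines (data_, loop_, _labels)
        # are copied through unchanged. Assumes no data row starts with "_".
        if not stripped or stripped.startswith(("data_", "loop_", "_", "#")):
            out.append(line)
            continue
        fields = stripped.split()
        if row in defocus_by_row and len(fields) >= column:
            fields[column - 1] = f"{defocus_by_row[row]:.1f}"
        out.append("\t".join(fields))
        row += 1
    return "\n".join(out) + "\n"


# Hypothetical star file contents and defocus values, for illustration only.
EXAMPLE = """data_

loop_
_rlnIndex #1
_rlnMicrographName #2
_rlnPixelSize #3
_rlnDefocus #4
1 tomo/ts_001.mrc 10.0 0.0
2 tomo/ts_002.mrc 10.0 0.0
"""

updated = set_defocus(EXAMPLE, {0: 35000.0, 1: 42000.0})
print(updated)
```

Check the units and column order against the star file that the preparation step actually writes before relying on a script like this.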
After deconvolving, activate the EMAN2 GUI environment to begin preprocessing for particle picking. Under tomography, left-click the arrow next to raw data, select import tomograms from the dropdown menu, and choose the files. Then, left-click the arrow next to segmentation and choose preprocess tomograms.
Use suitable default parameters. After preprocessing, an info directory containing blank JSON files whose base names match the processed tomogram files will be created automatically. Add the pixel size in angstroms to these files before proceeding.
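Adding the pixel size to many JSON files is easy to script. The sketch below assumes the info files are plain JSON that is either empty or holds a single object. The key name is deliberately left as a required argument because the protocol does not state which key EMAN2 expects; confirm it against a file that already works in your project.

```python
import json
from pathlib import Path


def add_pixel_size(info_dir, apix, key):
    """Write apix under `key` in every .json file in info_dir.

    `key` is an assumption supplied by the caller, not a documented EMAN2 name.
    Returns the names of the files that were updated.
    """
    touched = []
    for path in sorted(Path(info_dir).glob("*.json")):
        text = path.read_text().strip()
        # A completely blank file is treated as an empty JSON object.
        data = json.loads(text) if text else {}
        data[key] = float(apix)
        path.write_text(json.dumps(data, indent=2) + "\n")
        touched.append(path.name)
    return touched
```

A call such as `add_pixel_size("info", 10.5, key="apix")` would update every file in the info directory; both the value and the key shown here are placeholders.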
To train a convolutional neural network, or CNN, to recognize the HA glycoprotein, change the working directory to the location of the tomograms to be used. Then open the CNN training interface using the terminal. Confirm that four windows are open:
one containing the information on CNN parameters and tomograms in the directory, and the other three containing images of good references, bad references, and particles selected by the CNN. In the CNN GUI, click on the new option to initialize a CNN with a default learning rate of 0.0001 and a box size of eight. Under the file name column, left-click on a representative tomogram to open it.
Hover the cursor over the open tomogram and click the mouse scroll wheel to open a new window. Left-click and drag the N# slider to traverse the Z axis. Next, on the open tomogram, use the good references panel to select 10 to 20 features corresponding to HA. Positive images appear in the positive window.
To remove one, hold shift and left click it. Now switch to the bad references panel and select 10 to 20 references corresponding to features such as empty membrane patches, the RNP density, fiducial markers, and debris. Set the number of iterations to 50 and click on the train icon to begin CNN training.
After training is completed, press apply. Selected particles will be displayed on the tomogram as blue circles and can also be seen in the particles window. Continue refining by selecting additional positive and negative references, periodically clicking save to preserve progress.
Once the CNN performs satisfactorily, click apply all to run particle picking. Then evaluate results on several tomograms and retrain as needed. To save coordinates for each tomogram, reopen the EMAN2 GUI using the command e2projectmanager.py.
Click the arrow under subtomogram average and select manual boxing. Now, input the tomogram's name and click launch. Then, sequentially click on file, save box coord, and tomogram_ha.txt to save the coordinates.
Navigate to the neural net and info subfolders to back up nnet_save.hdf, trainouts.hdf, segouts.hdf, and boxes3Dref.hdf.
Train a second CNN to detect the M1 protein layer and vRNP presence by changing the training box size from 8 to 14.
To curate particles, first download the notebooks from the GitHub repository. Then open the CNN particle cleaning .ipynb notebook and load the required modules. Load the HA and M1 coordinate text files into the notebook.
View them as 3D point clouds using the Open3D library. Next, filter out HA coordinate outliers using Open3D's statistical outlier removal method. Then calculate the distances between the HA and M1 point clouds, flag HA points more than 20 pixels from M1 as outliers, and save the remaining coordinates to an output text file.
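The two filters described above can be illustrated without any third-party dependency. The sketch below is a brute-force re-implementation of the idea in plain Python, not the notebook's code. In Open3D, the corresponding calls are `PointCloud.remove_statistical_outlier` and `PointCloud.compute_point_cloud_distance`. The neighbour count and standard-deviation ratio are illustrative assumptions; only the 20-pixel cutoff comes from the protocol.

```python
import math
import statistics


def knn_mean_distances(points, k):
    """Mean distance from each point to its k nearest neighbours (brute force)."""
    means = []
    for i, p in enumerate(points):
        dists = sorted(math.dist(p, q) for j, q in enumerate(points) if j != i)
        nearest = dists[:k]
        means.append(sum(nearest) / len(nearest))
    return means


def remove_statistical_outliers(points, k=8, std_ratio=2.0):
    """Drop points whose mean neighbour distance is unusually large.

    A point is kept if its mean k-NN distance is at most
    mean + std_ratio * standard deviation over the whole cloud.
    k and std_ratio are assumed values, not taken from the protocol.
    """
    if len(points) < 3:
        return list(points)
    means = knn_mean_distances(points, k)
    cutoff = statistics.fmean(means) + std_ratio * statistics.stdev(means)
    return [p for p, m in zip(points, means) if m <= cutoff]


def keep_near_reference(points, reference, max_dist=20.0):
    """Keep points whose nearest reference point is within max_dist pixels."""
    if not reference:
        return []
    return [
        p for p in points
        if min(math.dist(p, r) for r in reference) <= max_dist
    ]
```

Here `points` would be the HA coordinates and `reference` the M1 coordinates, each as a list of (x, y, z) tuples in pixels. The brute-force search scales quadratically, so a library with a spatial index is preferable for full tomograms.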
Concatenate all particles and save them as star files using the pts2 star file .ipynb notebook. Using WarpTools, extract particles at binning factor four and perform initial subtomogram averaging in RELION-4 using the code. Convert the Warp star file to RELION-4 format.
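The concatenation step above amounts to reading each curated coordinate file and writing one table. The following is a minimal sketch of that idea, not the repository notebook. It assumes each text file holds whitespace-separated x, y, z values per line, and it uses standard RELION coordinate labels; the exact labels that the extraction step expects are an assumption to verify.

```python
def read_xyz(path):
    """Read x, y, z from a whitespace-separated text file, skipping comments."""
    coords = []
    with open(path) as handle:
        for line in handle:
            fields = line.split()
            if len(fields) >= 3 and not line.lstrip().startswith("#"):
                coords.append(tuple(float(v) for v in fields[:3]))
    return coords


def collect_rows(files_by_tomogram):
    """Build (tomogram_name, x, y, z) rows from a {name: path} mapping."""
    rows = []
    for name, path in files_by_tomogram.items():
        rows.extend((name, x, y, z) for x, y, z in read_xyz(path))
    return rows


def write_particle_star(rows, path):
    """Write rows as a single-loop star file; return the particle count."""
    # Label choice is an assumption; adjust to what the downstream tool reads.
    labels = [
        "_rlnMicrographName",
        "_rlnCoordinateX",
        "_rlnCoordinateY",
        "_rlnCoordinateZ",
    ]
    lines = ["", "data_", "", "loop_"]
    lines += [f"{label} #{i}" for i, label in enumerate(labels, start=1)]
    count = 0
    for name, x, y, z in rows:
        lines.append(f"{name}\t{x:.2f}\t{y:.2f}\t{z:.2f}")
        count += 1
    with open(path, "w") as handle:
        handle.write("\n".join(lines) + "\n")
    return count
```

Coordinates are written as read, so any rescaling between the picking and extraction pixel sizes would need to be applied separately.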
Then generate an initial reference using relion_refine_mpi on a subset of particles. Perform 3D auto-refinement on the subtomograms with relion_refine_mpi. Upon convergence, repeat auto-refinement with the translation offset range set to eight and the offset step set to two.
Use 2D classification with relion_refine to discard junk particles. Create star files with the good classes using the code, then combine the individual star files. Representative tomograms revealed pleomorphic virions with shapes ranging from spherical to elongated.
The final subtomogram average reached a global resolution of 6.0 angstroms, with local resolution ranging from five to seven angstroms. Alpha helices and beta sheets were resolved in the final HA reconstruction, supporting the structural quality of the refined map. Glycosylation sites were identifiable at four positions on the HA head and stem, confirming visibility of glycan density in the final reconstruction.
This article presents a protocol for studying the conformation of influenza hemagglutinin upon receptor binding using cryo-electron tomography. The protocol enables visualization of viral morphology and structure at sub-nanometer resolution.