Tests of Cognition (ToC) were first popularized in the 20th century to investigate and characterize normal and abnormal or pathological cognitive behavior. Since their emergence, these tests have become widely adopted in research and clinical settings1. Many ToC were developed with simple response formats, such as speaking or writing/drawing using a pen and paper. As an example of the latter category, the Trail-Making Test (TMT) is a widely used representative ToC that is favored due to its sensitivity to cognitive impairment2. Comprised of two parts, TMT-A (numbers only) and TMT-B (numbers and letters), the test requires participants to use a pen to connect (link) 25 characters that are pseudo-randomly arranged on the page, in ascending sequential (and in the case of TMT-B, also alternating) order (i.e., TMT-A: 1-2-3-4-5-6…; TMT-B: 1-A-2-B-3-C…). To assess cognitive performance on the TMT, time to completion and errors are tabulated and compared to normative values, based on age range and education status2. The TMT is thought to recruit and assess complex cognitive processes, including task switching, visual search, memory, visuomotor control, and attention—all of which are important aspects of executive frontal lobe function1,3.
The TMT exhibits high sensitivity among ToC, but in terms of diagnoses, its poor specificity is well recognized as a limitation4. In general, sensitivity and specificity concerns are a drawback to the application and validity of ToC, particularly in clinical settings4. The traditional recourse to alleviate this concern has been to administer ToC in “test batteries” (often including the TMT) to improve discrimination between cognitively impaired and cognitively intact groups. However, test batteries are time-consuming, costly, and require considerable expertise to administer and analyze5. These logistical concerns, in turn, led to the development of “cognitive assessment” tools: substantially streamlined (and increasingly, computerized) test batteries for rapid administration in resource-limited settings (e.g., medical clinics), at the cost of some of the sensitivity and specificity gain. One example of such a tool is the Montreal Cognitive Assessment (MoCA)6.
Computerized assessments, such as the adapted MoCA, have been successfully validated through comparison to pen and paper analogs7, and to test batteries of ToC8. Yet fundamental limitations remain with all of these behavioral testing tools, including insufficient differentiation between appropriate and erroneous performance, focus on test scores for the entire test rather than intra-test effects, and limited insight into the various behavioral strategies and associated brain activity that underpin ToC performance4,9. However, these limitations may be overcome through research that combines detailed behavioral recordings, intra-task behavioral evaluation10, and functional neuroimaging (e.g., electroencephalography10, functional near-infrared spectroscopy11, and functional magnetic resonance imaging12).
Functional magnetic resonance imaging (fMRI) generates high-resolution images of brain activity by mapping hemodynamic response as a proxy for neural activation. Although expensive, the superior spatial resolution of fMRI over electroencephalography (EEG) and functional near-infrared spectroscopy allows for the localization of activity throughout the whole brain. Accordingly, the present work describes a novel administration method for ToC using the TMT as a representative example, which pairs fMRI with detailed, continuous, and simultaneous behavioral recording using computerized MRI-compatible tablet and eye-tracking systems. This multi-modal protocol offers greatly enhanced evaluation of the relationship between cognitive task performance and neural activity estimated by fMRI, useful to improve understanding of existing ToC and possibly providing insight for the development of enhanced ToC in the future.
Before providing a detailed description of the experimental setup to acquire tablet, eye-tracking, and fMRI data simultaneously, it is helpful to summarize the conceptual layout and approach (Figure 1). For MRI-compatibility and ergonomic reasons, the tablet system is slightly different from commercially available tablets. Popular tablets have a transparent touch-sensitive screen mounted on top of a computer display, enabling the user to look directly at the tablet and to receive visual input that seamlessly includes their stylus-based writing and drawing responses. In the present scenario, there is no computer display under the touch-sensitive screen. This design avoids the need for complex computer display electronics to operate safely in the intense magnetic field at the center of the magnet bore and without negatively impacting MR images. From an ergonomic perspective, space in the magnet bore is also rather limited, making it impractical for a research participant to view their hand directly while writing and drawing.
The experimental setup thus has participants perform tablet interactions on a support stand at their waist, while all visual information (test stimuli, stylus responses, video of their hand manipulating the stylus) is integrated together for viewing at the rear opening of the magnet bore through a mirror. The visual information is displayed on a rear projection screen using a commercially available, MRI-compatible projector (details provided below). Similarly, a commercially available eye-tracking system (details also provided below) is mounted in the rear magnet bore for rapid video recording of eye movements through the same mirror. The projector, screen, and eye-tracking apparatus must be arranged carefully so that they do not physically interfere with one another. Last, power and data connections to and from the tablet, projector, and eye-tracking system are made using various shielded cables, passing through the “penetration panel” of the radiofrequency shield that protects the magnet room and MRI system from surrounding electromagnetic interference. The data cables are under computer control, shown conceptually in Figure 1 as a single device under operator control in the MRI console area (distinct from the computer console used to operate the MRI system). As described below, multiple computers are involved in the present experimental setup.
Tablet system
The custom-built, computerized tablet system is comprised of MRI-compatible components (touch-sensitive surface, adjustable elevated support platform, force-sensitive stylus, projector system)12, including a video camera with a 4.3 mm lens (designated the “TabletCam” in the lab) and a custom light-emitting diode (LED) illuminator13, enabling administration of ToC and recording of naturalistic writing or drawing responses within the magnet bore during fMRI (Figure 2A,B). Located in the console area, two linked computers are used for system control: one associated with receiving and processing video data from the video camera (“Tablet Video Camera computer”) and the other for test administration, delivery of visual stimuli, logging of tablet data, and creation of a video file consisting of the time-dependent administered visual stimuli superimposed with stylus writing and drawing responses (“Stimulus/Response computer”; Figure 2C). The two-computer approach is chosen for unimpeded real-time performance of each set of latency-sensitive functions; modularity for research requiring different configurations (e.g., different tablet-based behavioral tasks, optional use of the video camera); and ease of compatibility (the only requirement is a compatible video output format).
The tablet system has been used previously in several fMRI studies of ToC, which all suggest its strong ecological validity14. The optional video camera is added to the original tablet configuration to provide the participant with visual feedback of hand position (VFHP) during task performance, in an interactive augmented reality (AR) environment, enabling viewing of task stimuli as well as stylus responses and hand movements superimposed in real time13 (Figure 2D). In the original implementation of the video camera data processing13, the hand and stylus were isolated from each video frame using a skin color detection algorithm, with the stylus implemented in red to fall within the red-green-blue (RGB) distribution for skin color. More recently, a “blue screen” approach has been adopted for its simplicity and other advantages. A blue backdrop is created by covering the touch-sensitive surface of the tablet with blue painter’s tape. It is then possible to segment the hand and stylus from the backdrop in each video frame based on the substantially different color distribution of the tape. This process also yields a binary mask with a value of “one” at every location occupied by the hand or stylus, and “zero” elsewhere. The stimulus/response video and camera video are then superimposed by creating frames consisting of a) stimulus/response video data everywhere that a given mask equals zero, and b) camera (hand and stylus) video data everywhere that the given mask equals one. The painter’s tape has the additional benefit of introducing extra friction when the stylus tip is moved across the tablet surface, closer to the experience of writing with a pen or pencil on paper, in comparison to the low-friction “plastic on plastic” feel when the tape is removed.
Overall, the resulting interactive AR environment further enhances the ecological validity of the tablet design, while reducing reliance on proprioception to execute fine motor movements (as occurs when VFHP is absent)13,15.
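The mask-based superposition described above can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the representation of frames as nested lists of RGB tuples, and the simple "blue dominance" rule with its `margin` value are all assumptions chosen for illustration, and a real-time system would likely use an optimized image-processing library.

```python
def is_backdrop(pixel, margin=40):
    """Hypothetical rule: treat a pixel as blue tape when its blue channel
    exceeds both red and green by more than `margin` (assumed value)."""
    r, g, b = pixel
    return b - max(r, g) > margin


def hand_mask(camera_frame, margin=40):
    """Binary mask: 1 where the hand or stylus is seen, 0 on the blue backdrop."""
    return [[0 if is_backdrop(p, margin) else 1 for p in row]
            for row in camera_frame]


def composite(stimulus_frame, camera_frame, mask):
    """Take camera pixels where mask == 1 and stimulus/response pixels where mask == 0."""
    return [
        [cam if m == 1 else stim for stim, cam, m in zip(s_row, c_row, m_row)]
        for s_row, c_row, m_row in zip(stimulus_frame, camera_frame, mask)
    ]


# Tiny 2 x 2 example with made-up colors
BLUE, SKIN, WHITE, BLACK = (20, 40, 200), (200, 150, 120), (255, 255, 255), (0, 0, 0)
camera = [[BLUE, SKIN], [BLUE, BLUE]]
stimulus = [[WHITE, WHITE], [WHITE, BLACK]]
mask = hand_mask(camera)
frame = composite(stimulus, camera, mask)
```

In this toy example only the top-right pixel is classified as hand, so the composite frame shows the camera pixel there and the stimulus/response pixels everywhere else.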
The tablet setup is used in conjunction with an MRI-compatible projector (Figure 2E) and a custom rear projection screen at the rear of the magnet bore. Participants view the screen through an angled mirror mounted on the head coil. Using a fingertip or stylus (which also includes a sensor to record contact force), the participant interacts with the touch-sensitive surface mounted on the support platform, which is positioned at the waist and is adjustable for each individual. Analog tablet signals pass through an electromagnetic interference (EMI) filter at the radiofrequency penetration panel and are transformed to touch data (surface location and force data) by a tablet interface box outside the magnet room. The touch data are logged and interpreted for graphical representation of touch responses on the Stimulus/Response computer, merged with the visual stimuli and the segmented hand and stylus video, and presented to the participant using the projector.
TMT block design
The TMT is administered in a fixed block design consisting of alternating periods of TMT-A and TMT-B task performance, and of visual fixation on a central, black crosshair displayed on a white background. The overall task design was adapted from existing TMT literature1,16,17,18, where TMT-A involves linking circled numbers (1 to 25) pseudo-randomly distributed across the screen, in ascending order. Similarly, TMT-B involves linking circled numbers (1–13) and letters (A–L) in an alternating and ascending fashion. The visual fixation condition is included so that brain activity associated with TMT-A, and separately with TMT-B, can be analyzed as a statistical contrast between the activations of interest and that of a simple, stable condition with low cognitive demand. Due to the inherently low signal-contrast-to-noise ratio observed in fMRI experiments, each behavioral condition (TMT-A, TMT-B, visual fixation) is repeated in multiple trials, enhancing the statistical power to detect brain activity when the collective fMRI data are analyzed. The TMT plots for each trial are adapted from standard TMT layouts by either rotating the stimulus distribution by 180°, swapping number-only stimuli and number-letter stimuli, or both, thus minimizing visual and motor confounds due to differences in character and number distribution on the TMT-A and TMT-B plots18.
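As an illustration of the target sequences and the 180° layout rotation described above, the following sketch generates the expected link order for each test part and rotates a set of stimulus coordinates about the screen center. The function names, the dictionary representation of a layout, and the screen dimensions in the example are hypothetical and are not taken from the stimulus presentation software used in the study.

```python
import string


def tmt_a_sequence(n=25):
    """Expected link order for TMT-A: 1, 2, ..., n."""
    return [str(i) for i in range(1, n + 1)]


def tmt_b_sequence(n_numbers=13, n_letters=12):
    """Expected link order for TMT-B: 1, A, 2, B, ..., ending on the last number."""
    letters = string.ascii_uppercase[:n_letters]
    sequence = []
    for i in range(n_numbers):
        sequence.append(str(i + 1))
        if i < n_letters:
            sequence.append(letters[i])
    return sequence


def rotate_180(layout, width, height):
    """Rotate stimulus positions by 180 degrees about the screen center.

    `layout` maps each label to an (x, y) position in pixels (assumed units).
    """
    return {label: (width - x, height - y) for label, (x, y) in layout.items()}


# Hypothetical 800 x 600 display with two stimulus positions
example_layout = {"1": (100, 50), "A": (640, 480)}
rotated = rotate_180(example_layout, width=800, height=600)
```

Both sequences contain 25 characters, and applying the rotation twice returns the original layout, which makes it a convenient way to derive a second plot with the same inter-character distances.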
The present experimental and training tasks are implemented in commercially available stimulus presentation software for behavioral and neuroimaging research, for execution on the Stimulus/Response computer. Practically, the TMT is administered in two “runs”, each 4 min:50 s in duration. Each run consists of an initial 10 s block of resting fixation, followed by two repetitions of the following sequence: TMT-A task (40 s), resting fixation (20 s), TMT-B task (60 s), and resting fixation (20 s) (Figure 3). At the beginning of each run, participants are given instructions that mirror those used in standardized paper TMT testing16,17,18,19: connect the circles from “Begin” to “End” as fast and as accurately as possible, without lifting the stylus from the touch-sensitive surface. Unlike conventional paper TMT administration, the test administrator (a member of the research lab) does not stop and subsequently re-initiate TMT performance in the event that the participant makes errors. Instead, participants are instructed simply to continue to the next corresponding character link in the sequence. This modification eliminates any data analysis confounds associated with stopping and restarting eye-tracking and fMRI data collection within a given TMT trial. However, it necessitates the implementation of error detection and categorization methods after the data are collected (see the protocol and discussion sections). In addition, the test administrator visually monitors the stylus responses in real time during TMT performance to record whether any errors were made, and to ensure that the touch-sensitive surface remains well-calibrated. In cases of tablet calibration errors and other hardware errors (e.g., power or equipment failure), the test administrator also decides whether to repeat the current TMT data acquisition run, possibly including recalibration of the touch-sensitive surface, or to stop and exclude the participant data from the subsequent analysis.
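The run timing can be checked with a short sketch that builds the block schedule from the durations given above. The function name and the tuple layout (label, onset, duration) are assumptions for illustration; the actual task is implemented in the stimulus presentation software.

```python
def run_schedule(initial_fixation=10, n_cycles=2):
    """Return (blocks, total_duration) for one TMT run, all times in seconds.

    Each block is a (label, onset, duration) tuple.
    """
    cycle = [("TMT-A", 40), ("fixation", 20), ("TMT-B", 60), ("fixation", 20)]
    blocks = [("fixation", 0, initial_fixation)]
    t = initial_fixation
    for _ in range(n_cycles):
        for label, duration in cycle:
            blocks.append((label, t, duration))
            t += duration
    return blocks, t


blocks, total = run_schedule()
minutes, seconds = divmod(total, 60)
print(f"{total} s = {minutes} min:{seconds} s")  # 290 s = 4 min:50 s
```

With these durations, each cycle lasts 140 s, so the run totals 10 + 2 × 140 = 290 s, consistent with the stated run length of 4 min:50 s.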
Eye tracking
When the human visual system processes a scene, such as during TMT performance, ballistic eye movements (saccades) are preceded and followed by periods of temporal stability (fixations)20. An MRI-compatible high-speed eye-tracking system is thus used in the present context to perform long-range monocular eye tracking of fixations and saccades with infrared illumination (910 nm wavelength) and 1 kHz sampling frequency (Figure 4A). From the position of the eye-tracking camera under the projection display, the eye of the participant is localized in the head coil mirror (Figure 4B-D). Note that the product head-coil mirror shipped with the MRI system was replaced by a front-surface mirror provided by the eye-tracker manufacturer, to enable high-quality tracking. The pupil is detected using a standard centroid-fitting algorithm that tracks corneal reflection (Figure 4D), and the following metrics are measured: fixations, saccades, as well as blink rate and pupil size, two additional quantities associated with cognitive processing (see Discussion). A trigger pulse emitted by the MRI system at the start of fMRI is used to time-synchronize a) the brain activation recordings with the TMT task stimulus delivery and stylus responses (as controlled by the Stimulus/Response computer); and b) the eye-tracking data with TMT performance. To facilitate data analysis, the eye-tracking data are additionally “time-stamped” to provide labels associated with key events during the experiment, including the start and end times of each TMT-A and TMT-B block in a given run.
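One way to picture the trigger-based synchronization is to re-reference every eye-tracking timestamp to the time of the MRI trigger pulse and then label each sample by the task block in which it falls. The sketch below is illustrative only: the function names, the millisecond timestamps, and the example block list are assumptions and do not describe the eye-tracker's file format or the authors' analysis code.

```python
def to_scan_time(timestamps_ms, trigger_ms):
    """Convert eye-tracker timestamps (ms) to seconds elapsed since the MRI trigger."""
    return [(t - trigger_ms) / 1000.0 for t in timestamps_ms]


def label_sample(t_s, blocks):
    """Return the label of the block containing time t_s, or None if outside all blocks.

    `blocks` is a list of (label, onset_s, duration_s) tuples.
    """
    for label, onset, duration in blocks:
        if onset <= t_s < onset + duration:
            return label
    return None


# Hypothetical example: trigger received at 5,000 ms on the eye-tracker clock
example_blocks = [("fixation", 0, 10), ("TMT-A", 10, 40),
                  ("fixation", 50, 20), ("TMT-B", 70, 60)]
times = to_scan_time([5000, 15000, 80000], trigger_ms=5000)
labels = [label_sample(t, example_blocks) for t in times]
```

Here the three samples fall at 0 s, 10 s, and 75 s after the trigger and are labeled as fixation, TMT-A, and TMT-B, respectively.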
An additional lab member is primarily responsible for the eye-tracking setup with the participant, eye-tracking calibration, and real-time visual inspection of eye-tracking data acquisition. Calibration and validation of the eye-tracking system are performed prior to the first TMT run (Figure 4E), and in a “drift-checking” procedure between the first and second TMT runs, to ensure consistency of results while accounting for possible slight changes in head position (see Protocol below for exact specifications and sequence). The calibration consists of a nine-point eye-tracking test, with the participant required in each case to fixate at a target in the center of the display, followed successively by eight different peripheral targets, in pseudo-random order. For validation, the participant tracks the same nine targets again, and the calibration model is used to estimate the gaze position. This enables a set of error measurements to be collected, constituting the difference between the estimated gaze and the actual target location. Spatial error is reported in degrees of visual angle on test completion. The initial calibration and validation are acceptable if the average error is <0.5° and the maximum error is <1.0°, corresponding to the “GOOD” grading provided by the eye-tracking software. Other categories with successively worse errors are graded as, for example, “FAIR”, “POOR”, or “FAILED”, necessitating recalibration and validation. The lab member can also check for outlier errors, which may indicate a mis-fixation at one point, or systematic error patterns that suggest a setup issue with the eye tracker. Between runs, the drift-checking procedure consists of performing a validation test with fixation at the central target only. A successful check (maximum error <2.0°) permits the second TMT run to proceed; otherwise, the lab member must perform calibration followed by validation until the average error is <1.0° and the maximum error is <2.0°.
All error values are logged for later evaluation. The standard settings of the eye-tracking system software are used to categorize the eye-tracking data into saccades and fixations. Saccades are classified by the following detection thresholds: motion 0.1°; velocity 30°/s; and acceleration 8,000°/s². All other eye-tracking data are classified as fixations.
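The acceptance criteria and thresholds above can be summarized in a small sketch. The function names are hypothetical, and the saccade rule is a deliberate simplification: it assumes a sample is flagged when either the velocity or the acceleration threshold is exceeded, and it does not model the 0.1° motion threshold or any other internal details of the eye-tracking software's event parser.

```python
def calibration_ok(errors_deg, avg_limit=0.5, max_limit=1.0):
    """Initial calibration/validation check: average error < 0.5 deg, maximum < 1.0 deg."""
    average = sum(errors_deg) / len(errors_deg)
    return average < avg_limit and max(errors_deg) < max_limit


def drift_check_ok(center_error_deg, max_limit=2.0):
    """Between-run drift check at the central target: error < 2.0 deg."""
    return center_error_deg < max_limit


def is_saccade_sample(velocity_deg_s, accel_deg_s2, v_thresh=30.0, a_thresh=8000.0):
    """Simplified per-sample rule (assumption): saccade if either threshold is exceeded."""
    return velocity_deg_s > v_thresh or accel_deg_s2 > a_thresh


# Hypothetical validation errors (degrees of visual angle)
good = calibration_ok([0.3, 0.4, 0.6, 0.2])   # average 0.375, maximum 0.6
outlier = calibration_ok([0.3, 1.2, 0.2])     # maximum exceeds 1.0
```

In the first example the set passes both limits; in the second, a single outlier of 1.2° fails the maximum-error criterion and would prompt recalibration.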
Neuroimaging
A 3-Tesla MRI system is used with a 64-channel head coil to obtain high-quality neuroimaging data. Anatomical acquisition begins with a high resolution, three-dimensional, sagittal T1-weighted magnetization-prepared rapid gradient echo (MPRAGE) sequence (repetition time/echo time/inversion time/flip angle TR/TE/TI/FA = 2,500 ms/4.37 ms/1,100 ms/7°, generalized auto-calibrating partially parallel acquisitions (GRAPPA) factor 2, 256 × 256 matrix, 192 slices, 1 mm isotropic voxels, 3 min:45 s imaging time). An indirect measurement of brain activity is then obtained by fMRI of blood oxygenation level-dependent (BOLD) signal contrast arising from neurovascular coupling21. For fMRI, the typical T2*-weighted BOLD acquisition uses echo-planar imaging (EPI, TR/TE/FA = 1,750 ms/30 ms/40°, slice acceleration 2, phase acceleration 2, 80 × 80 matrix, 60 slices, 2.5 mm isotropic voxels, 165 time points, 4 min:49 s imaging time). Two such fMRI runs are conducted for TMT (described above).
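As a quick consistency check on the fMRI parameters, the imaging time follows from the repetition time multiplied by the number of time points. The helper names below are hypothetical and the sketch only reproduces this arithmetic.

```python
def scan_duration_s(tr_ms, n_volumes):
    """Total acquisition time in seconds for n_volumes acquired at a repetition time of tr_ms."""
    return tr_ms * n_volumes / 1000.0


def format_min_s(duration_s):
    """Format a duration, rounded to the nearest second, in the 'M min:SS s' style."""
    minutes, seconds = divmod(round(duration_s), 60)
    return f"{minutes} min:{seconds:02d} s"


fmri_time = scan_duration_s(tr_ms=1750, n_volumes=165)
print(fmri_time, format_min_s(fmri_time))  # 288.75 4 min:49 s
```

The result, 165 × 1.75 s = 288.75 s, rounds to the stated imaging time of 4 min:49 s.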