Open-Source Pipeline of Cursor-Tracking for Likert-Scale Data Collected in Qualtrics

Nellie Siemers; Zachary Jamieson; Xinran Gao; Mira Saad; Bärbel Knäuper

doi:10.5334/jors.690

(1) Overview

Introduction

The software pipeline presented here provides an open-source, end-to-end framework for collecting and processing cursor-tracking data in Likert-type rating scales administered via Qualtrics. It integrates a preconfigured survey template with embedded JavaScript for cursor data capture, Python scripts for deterministic preprocessing and generation of cursor-informed scoring outputs, and R scripts for visualization and analysis. The pipeline was developed in the course of an empirical feasibility study examining cursor-tracking during Likert-scale responding [1]. While the theoretical rationale and empirical application of the approach are described in the feasibility study, the present paper focuses on the software itself: its design, implementation, and reuse potential as a reproducible research tool.

Existing open-source cursor-tracking software for online surveys has primarily been designed for binary choice paradigms [2]. This limits its applicability given that many psychological measures rely on Likert-scale formats rather than binary responses [3, 4]. Prior research has also demonstrated cursor-tracking data during Likert-scale survey responding and has used cursor metrics to characterize group-level movement patterns, response styles, or condition differences (e.g., Cepeda et al., Cheetham et al., Dias et al., Weisgarber et al. [5, 6, 7, 8]). However, these approaches rely on aggregated cursor data and do not integrate cursor trajectories into item-level scoring or response reinterpretation. By item-level scoring, we refer to the integration of cursor movement data into the scoring of individual questionnaire items, such that each Likert-scale response can be adjusted based on the decision-making process captured during responding. For example, deviations between a participant’s cursor trajectory and their final selected response (e.g., initially moving toward a lower response option such as ‘2’ (disagree) and then shifting toward a higher response option such as ‘4’ (agree) before making a final selection ‘5’ (strongly agree) on a 5-point Likert scale ranging from 1 = strongly disagree to 5 = strongly agree) can be used to generate a corrected score for that specific item (e.g., a cursor trajectory from 2 to 4 to 5, resulting in a corrected score of 4). This is particularly important because traditional self-report responses capture only the final selection, potentially overlooking meaningful response dynamics that demonstrate uncertainty (e.g., moving the cursor back and forth between two response options) that occur during the response process [9, 10]. These corrected item-level scores can then be aggregated across items to produce revised scale-level scores, allowing for direct comparison with the original self-reported scores. 
Incorporating these cursor movement patterns at the item level may therefore improve measurement accuracy by capturing uncertainty during responding that is not reflected in the final selected option alone (i.e., the cursor movements for each item may indicate how certain or uncertain respondents are of their selection) [1, 10]. The present pipeline extends existing survey-style cursor-tracking implementations by providing reusable infrastructure that supports cursor tracking at the item level within standard survey workflows (i.e., using Likert rating scales), enabling researchers to incorporate cursor-derived information alongside conventional self-report data in a transparent and reproducible manner.

Implementation and architecture

Our open-source cursor-tracking software builds on Mathur and Reichling’s [2] binary paradigm cursor-tracking version by adapting their Qualtrics template and JavaScript/CSS structure to accommodate Likert-type rating scales. Insights from Dias et al. [7] and related work [5, 11, 12] further informed our approach, particularly in incorporating metrics to capture features such as velocity, acceleration, and angular displacement. The resulting software provides a user-friendly data collection and analysis pipeline for cursor tracking for Likert scales in online surveys on Qualtrics. The Dataflow Architecture (Figure 1) provides a comprehensive pipeline for collecting, processing, and analyzing data related to cursor movements during responses to items or questions with Likert scales. All the sample materials are available on OSF (https://osf.io/ydwhn/) and the dataflow architecture is available on GitHub (https://github.com/nSiemers/Likert_CursorTracking). The data architecture integrates multiple tools and platforms, including Qualtrics, Python, and R, to ensure a smooth and reproducible workflow. The purpose of our data architecture is to capture and analyze Likert-scale cursor tracking metrics collected during participant interactions with survey questions. The purpose of these metrics is to gain insights into decision-making processes during responding. This architecture is designed to:

Facilitate Data Collection: Using Qualtrics surveys embedded with JavaScript, participants’ interactions are tracked and recorded in real-time, ensuring consistent and accurate data collection.
Streamline Data Processing: Python scripts process the raw data exported from Qualtrics, analyze the cursor movement metrics, compute corrected scores, and output a structured dataset.
Support Comprehensive Analysis: Processed datasets are imported into R, where statistical analyses can be conducted using a rebuilt data frame.
Enable Reproducibility: All resources including the Qualtrics survey template with the embedded JavaScript code, Python scripts with necessary packages, and R scripts are hosted on GitHub. All necessary information to utilize our cursor tracking method from survey setup to data analysis is provided here and on GitHub.

Integration and Accessibility

All components of the dataflow architecture are centralized in a publicly accessible GitHub repository, ensuring transparency, ease of use, and reproducibility for researchers. The repository contains: Survey Template: Pre-designed Qualtrics survey template embedded with the JavaScript for cursor tracking; JavaScript Code for Qualtrics: Integrated scripts that enable cursor tracking, including grid-based x and y position logging, response time captures, and interaction metrics; Python Scripts: Comprehensive scripts for processing raw survey data, analyzing cursor movements, and computing corrected scores; and R Analysis Script: R script to transform the processed JSON data into analysis-ready data frames and conduct statistical analyses.

The GitHub repository provides the entire dataflow process, and its README file provides the information needed to load collected data into Python for processing the cursor-tracking data. By following the README steps, users obtain the necessary environment setup, Python and R package dependencies, and installation steps. Moreover, the OSF link contains sample outputs from the feasibility study at each stage (e.g., raw CSV and processed JSON files, generated cursor-tracking images) to guide users in verifying their results.

Data collection via Qualtrics

The survey template is designed and implemented on the Qualtrics platform, using embedded JavaScript to capture detailed cursor-tracking data [2]. For optimal functionality, the survey must be built and run in Google Chrome, as other browsers may alter certain functionalities and compromise consistency. The JavaScript integration positions the Likert scale on a grid and records the x- and y-coordinates of the cursor throughout each interaction. Cursor position data are recorded at fixed, regular intervals. Specifically, cursor coordinates (x, y) and corresponding timestamps are collected every 25 milliseconds, as defined by the interval variable within the JavaScript function ‘getMousePosition’. This parameter can be adjusted as needed to increase or decrease the temporal resolution of the recorded data. Automated alerts are included to enhance data quality by notifying participants of potential issues: a ‘Too Early’ alert prompts participants who attempt to proceed too quickly, a ‘Too Late’ alert encourages timely engagement when responses are delayed, and a ‘Screen Size’ alert notifies participants if their display dimensions do not meet the requirements for accurate tracking. Additional variables recorded include the time at which each question was fully loaded and ready for interaction (onreadytime), the exact timestamp when the ‘Next’ button was clicked (buttonclicktime), and the number of times response options were clicked. Please see Table 1 for all included variables. The ‘Loop and Merge’ function in Qualtrics is employed to control the sequence of survey items, facilitating consistent navigation and integrating cursor tracking across all items. To maintain alignment and presentation consistency, longer questions are separated into individual blocks so that the ‘Next’ button remains centered beneath the Likert scale.

Table 1

Definition and location of variables.

VARIABLE	UNIT	LOCATION	MEANING
Mid	int	Python	Location of where in the array the answer was clicked
xPos	px	Javascript	x-coordinate of cursor relative to upper left-hand corner of browser
yPos	px	Javascript	y-coordinate of cursor relative to upper left-hand corner of browser
time	ms	Javascript	time elapsed from beginning of survey
onLoadTime	ms	Javascript	time where the specific question started loading
onReadyTime	ms	Javascript	time at which the page for each trial was loaded
buttonClickTime	ms	Javascript	time at which a specific question was answered
pageSubmitTime	ms	Javascript	time at which subject proceeded to next trial by clicking ‘Next’
windowWidth	px	Javascript	Width of subject’s browser window at beginning of trial
windowHeight	px	Javascript	Height of subject’s browser window at beginning of trial
Latency	ms	Javascript	Time between OnReadyTime and first cursor movement
alerts	int	Javascript	Alerts received during each trial:
			0 = None
			1 = Started too early
			2 = Started too late
			3 = Surpassed time limit for trial
			4 = Window too small to fully display experiment
browser_Browser	String	Javascript	Internet Browser
browser_Version	String	Javascript	Browser version
browser_Operating System	String	Javascript	Operating system
browser_Resolution	String	Javascript	Browser resolution
UserLanguage	String	Javascript	Language used by participant
NextButtonLeft	px	Javascript	Location of the leftmost border of the Next button
NextButtonRight	px	Javascript	Location of the rightmost border of the Next button
NextButtonTop	px	Javascript	Location of the topmost border of the Next button
NextButtonBottom	px	Javascript	Location of the bottommost border of the Next button
Question Variables	px	Javascript	Location of each button in the x,y plane of the browser
QuestionCounter	int	Javascript	The number of times a person presses an answer to a specific question
AmbivalenceScore (Axis Crossing)	String	Python	Whether the cursor remains on one side of the Likert-scale or crosses over response option 3 (Undecided)
Speed	ms	Python	How long a person took to answer a specific question
Velocity	px/ms	Python	How fast one is traveling through the browser at any given time
Acceleration	px/ms^2	Python	How much one’s velocity has changed when traveling through the browser at any given time
Angular Displacement	int	Python	How much one’s direction changes when traveling through the browser at any given time
CorrectedScore	String	Python	The score that is corrected according to regression
StartDate	String	Javascript	Date and time of when the person started
EndDate	String	Javascript	Date and time of when the person ended
Duration	ms	Javascript	Duration that the participant took to answer the survey
Finished	Boolean	Javascript	Whether or not the participant finished the survey
ID	String	Python	Unique ID for each Participant

Note. px = pixel; ms = millisecond; int = integer; Boolean = true or false; String = list of characters.

Additionally, to ensure that the cursor-tracking software works as intended, participants are provided with a step-by-step instructional video (see the OSF link for video materials or Siemers et al. [1] for more information) to clarify the cursor-tracking procedure and ensure they understand how to interact with their cursor. This video also provides a visual demonstration of the cursor-tracking interface as implemented within Qualtrics. Moreover, a practice set of items is included at the beginning of the survey to familiarize participants with the cursor tracking interface and ensure they understand the procedure. Following the practice round, the main items (i.e., questionnaire) are presented, where the actual data is collected. This combination of JavaScript integration, browser, and survey design aims to ensure consistent data collection and participant comprehension.

Data export and initial processing

Once participants complete the survey, the data are exported directly from Qualtrics as a raw CSV file. This file contains all recorded metrics and participant responses, preserving the integrity and structure of the data for subsequent processing. The raw CSV file includes the comprehensive set of variables captured during the survey, specifically those related to cursor tracking and participant interactions. Cursor tracking metrics include the x- and y-coordinates of the cursor representing movement across the screen, the total number of clicks on the Likert-scale response options (Question Counter), and time-related variables such as ‘onreadytime’. Response-related variables as well as survey logic data such as triggered alerts (e.g., ‘Too Early’ or ‘Too Late’ warnings) or screen size adjustments are also included. Additional metadata include block and question identifiers, browser type, and participant identification codes (see Table 1). These raw data serve as the foundation for further analysis, enabling detailed examination of cursor tracking data and responses. The next step involves processing the data in Python to compute corrected scores and other derived variables essential for downstream analysis.

Data processing with Python

The data processing begins by loading the raw CSV file exported from Qualtrics into Python. Detailed, step-by-step instructions for this process are provided in the GitHub README file. Users are guided on how to install the required Python environment and necessary packages listed in the README, navigate to the appropriate script in the repository, and load the CSV file by specifying its path in the script or using provided commands. Once the raw data is loaded, our Python script (‘analysis.py’) processes and transforms the dataset. Cursor tracking metrics are analyzed by extracting and examining the x- and y-coordinate data to map cursor movements, measuring the number of clicks (i.e., how many times someone selected a response option on the Likert-scale), and calculating response times based on ‘onreadytime’ and ‘buttonclicktime’ for each item. A corrected score variable is also computed to reflect decision-making dynamics inferred from cursor movements (i.e., uncertainty).
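As a rough illustration of the response-time step, the sketch below subtracts ‘onReadyTime’ from ‘buttonClickTime’ for each row of a small CSV excerpt. It assumes that response time is the plain difference between the two timestamps and that the export contains columns with exactly these names; the helper name and the two-column excerpt are hypothetical and are not taken from ‘analysis.py’. The sample values are those reported for participant C1 in Table 3.

```python
import csv
import io

# Hypothetical two-column excerpt; a real Qualtrics export has many more
# columns and may label them differently.
SAMPLE_CSV = "onReadyTime,buttonClickTime\n71111,77487\n"


def response_times_ms(csv_text: str) -> list[int]:
    """Return buttonClickTime - onReadyTime (in ms) for every row."""
    reader = csv.DictReader(io.StringIO(csv_text))
    times = []
    for row in reader:
        ready = int(row["onReadyTime"])
        clicked = int(row["buttonClickTime"])
        if clicked < ready:
            # A click cannot precede the moment the item became ready.
            raise ValueError("buttonClickTime precedes onReadyTime")
        times.append(clicked - ready)
    return times


print(response_times_ms(SAMPLE_CSV))  # [6376]
```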

Additionally, the script generates 20 examples of cursor tracking visuals (found in the Results/Generated_Images folder) based on randomly selected survey takers from an available directory (i.e., an uploaded data file). It assigns each participant a random survey date and a randomly chosen question they answered. Using Matplotlib, the code plots their real cursor trajectory in brown, a regression-generated path in red (from which the corrected score is derived), and the ideal cursor movement path in pink. This visualization helps analyze how users interact with the cursor-tracking procedure and compare their movement patterns to an expected trajectory (i.e., the ideal movement path). Additional transformations, such as calculating angular displacement and other derived variables (see Table 1), are also provided for future analyses.

Moreover, the JavaScript in our Qualtrics survey and our Qualtrics template setup are configured for the current number of blocks and items in the provided survey template. If these numbers change, both the Qualtrics JavaScript and the ‘Loop and Merge’ function will need to be updated. The Python code adapts to changes in the Qualtrics template and JavaScript as long as the total number of items remains the same. The ‘Loop and Merge’ function is where the questionnaire items are listed in Qualtrics; if the number of items changes, the ‘Loop and Merge’ section must be edited accordingly. In the JavaScript, the number of items is defined in the line ‘var howManyRealImages = 39’. This value must be updated if the number of practice or questionnaire items changes. At the top of the Python script, the variable ‘Total_Question_Num = 38’ specifies the total number of items to be analyzed. This value is one less than the JavaScript setting (39) because Python indexing starts at 0. Currently, with seven practice questions and 32 main questionnaire items (39 total), this results in ‘var howManyRealImages = 39’ in JavaScript and ‘Total_Question_Num = 38’ in Python.

The current setup includes one block in Qualtrics with seven questions for the practice section, followed by the main questionnaire, which is divided into two blocks: Block 1 contains 31 shorter questions, and Block 2 contains one longer question. The longer question is placed in a separate block to maintain consistent placement of the ‘Next’ button below the Likert-scale response options.

If the number of blocks in the Qualtrics template is modified, the JavaScript must be updated accordingly (i.e., the cursor-tracking JavaScript must be added to the additional blocks); all other steps in the workflow remain unchanged. If the total number of items changes (e.g., from 7 practice items to 8, and from 32 questionnaire items to 28), the Python script requires a minor edit to the total number of items to be analyzed. Changes to the number of Likert-scale response options (e.g., from 5 to 8), however, require major modifications in both the Qualtrics configuration and the Python code.
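The relationship between the two item-count settings can be sketched as follows. The helper ‘item_count_settings’ is hypothetical and is not part of the repository; it only restates the rule described above, namely that the JavaScript value equals the total number of practice and questionnaire items and the Python value is one less.

```python
def item_count_settings(n_practice: int, n_main: int) -> dict[str, int]:
    """Derive both item-count settings from the number of items."""
    if n_practice < 0 or n_main < 1:
        raise ValueError("need a non-negative practice count and at least one main item")
    total = n_practice + n_main
    return {
        "howManyRealImages": total,       # value for the Qualtrics JavaScript
        "Total_Question_Num": total - 1,  # value for the Python script (zero-based)
    }


# Current template: 7 practice items and 32 questionnaire items.
print(item_count_settings(7, 32))
# {'howManyRealImages': 39, 'Total_Question_Num': 38}

# Modified template: 8 practice items and 28 questionnaire items.
print(item_count_settings(8, 28))
# {'howManyRealImages': 36, 'Total_Question_Num': 35}
```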

After processing, the script saves the transformed data as a JSON file, which includes the corrected scores for each item response, the aggregated cursor-tracking metrics (e.g., average time on each item, total clicks), and the additional metadata. This JSON file is designed to be well-structured and easily compatible with R for further analysis while maintaining the integrity of the processed data. It ensures that the data are organized and ready for use in hypothesis testing, model building, or further exploratory analyses.
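To give a sense of what an item-level JSON record might look like, the sketch below serializes one record and reads it back. The record layout is an assumption: the key names mirror variables listed in Table 1, and the values are those reported for participant C1 in Table 3, but the actual structure written by the pipeline may differ. The helper names are hypothetical.

```python
import json

# Hypothetical record layout for a single item response.
record = {
    "ID": "C1",
    "Answer": "1 = Strongly Disagree",
    "CorrectedScore": "2 = Disagree",
    "QuestionCounter": 1,
}


def dump_records(records: list[dict]) -> str:
    """Serialize item-level records to a JSON string."""
    return json.dumps(records, indent=2, ensure_ascii=False)


def load_records(text: str) -> list[dict]:
    """Parse a JSON string back into a list of records."""
    return json.loads(text)


json_text = dump_records([record])
print(json_text)
```

A JSON array of flat records like this one is typically straightforward to read into an R data frame with jsonlite.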

Further analysis in R

The processed JSON file, generated by the Python script, can then be imported into R for further analysis. The GitHub repository provides an R script (under the R Analysis folder) that specifies the necessary R packages (e.g., jsonlite for JSON file handling) to load the JSON file into an R environment and to verify the successful import of data by inspecting the structure and content of the JSON file in R. The rebuilt data frame is structured to support advanced statistical analyses and exploratory data visualization. In addition, an accompanying R script provides step-by-step guidance for conducting key analyses (e.g., assessing normality and comparing uncorrected and corrected scores) and is available on the OSF link alongside the sample raw and processed data. By leveraging R’s analytical capabilities and familiarity for psychological researchers, this step facilitates the extraction of meaningful information from the cursor tracking and survey response data.

Likert-Scale cursor-tracking variables and formulas

Primary variable: Corrected scores

As detailed in the feasibility study [1], the cursor-tracking software records participants’ cursor movements as sequences of 𝑥- and 𝑦-coordinates while they respond to items on a 5-point Likert scale, where each response option of the Likert scale is mapped to a fixed location on the screen. A regression line is then fit to the observed trajectory, summarizing the overall path of the cursor. The response option whose position lies closest to the point where this regression line intersects the Likert axis is assigned as the corrected score. This allows participants’ final responses to be adjusted based on their cursor movements (e.g., moving towards ‘2 = Disagree’ before selecting ‘4 = Agree’), thereby providing a score that reflects the decision-making processes captured during responding. The calculation of the corrected score is as follows:

First, the slope coefficient is calculated:

Y_{i} = B_{0} + B_{1} X_{i} + e_{i}

Y_{i} = B_{1} X_{i}

B_{1} = \frac{Y_{i}}{X_{i}}

B_{1} = \frac{1}{n} \sum_{t = 0}^{n} \frac{y_{t}}{x_{t}}

Once B₁ is calculated, it defines the regression line summarizing the cursor trajectory (y = B₁𝑥). The Likert-scale response options are represented as a horizontal line at (y = Yf). The corrected score is determined by finding the intersection of these two lines:

x_{f} = \frac{Y_{f}}{B_{1}}, y_{f} = Y_{f}

The corrected score is then assigned to the Likert-scale response option whose x-coordinate lies closest to this intersection point:

Corrected\ Score = \arg\min_{k} | x_{f} - X_{LikertScale, k} |

Y_{f} = \text{y-axis location shared by all Likert-scale response options}

x_{f} = \text{x-coordinate of the intersection between the regression line and the horizontal line through the Likert-scale response options}

X_{LikertScale, k} = \text{x-coordinate of response option } k \text{ for the item}
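The steps above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the code in ‘analysis.py’: it assumes the coordinates are already expressed relative to the trajectory’s starting point (so the line passes through the origin), it follows the mean-of-ratios form of B₁ given above rather than an ordinary least-squares fit, and it skips samples with x = 0 to avoid division by zero. The function names, the option positions, and the value of Y_f are hypothetical.

```python
def slope_mean_of_ratios(xs: list[float], ys: list[float]) -> float:
    """B1 as the mean of y_t / x_t, skipping samples where x_t is 0."""
    ratios = [y / x for x, y in zip(xs, ys) if x != 0]
    if not ratios:
        raise ValueError("no samples with non-zero x")
    return sum(ratios) / len(ratios)


def corrected_score(xs: list[float], ys: list[float], y_f: float,
                    option_xs: dict[int, float]) -> int:
    """Return the response option whose x-position is closest to x_f."""
    b1 = slope_mean_of_ratios(xs, ys)
    if b1 == 0:
        raise ValueError("a horizontal line never reaches the Likert axis")
    x_f = y_f / b1  # intersection of y = B1 * x with y = Y_f
    return min(option_xs, key=lambda k: abs(x_f - option_xs[k]))


# Illustrative layout: five options spaced 100 px apart on the line y = 100.
OPTIONS = {1: -200.0, 2: -100.0, 3: 0.0, 4: 100.0, 5: 200.0}

print(corrected_score([10, 20, 40], [10, 20, 40], 100.0, OPTIONS))  # 4
print(corrected_score([-10, -20], [5, 10], 100.0, OPTIONS))         # 1
```

In the first call every ratio equals 1, so the line meets the Likert axis at x = 100 and option 4 is returned; in the second the slope is -0.5, the intersection lies at x = -200, and option 1 is returned.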

Overall, the corrected score provides a measurable and applied way to leverage cursor tracking data to reflect participants’ decision-making processes during Likert-scale responses. This makes it the central variable in our cursor tracking analysis for Likert-scale data. While our software also collects additional variables related to cursor movements (see the next section for more details), these variables are not used for further analysis as we currently lack a validated reference point to interpret them meaningfully [13, 14].

Additional variables for future use

Axis crossing

The Axis Crossing, reflected by the ‘Ambivalence Score’ in the Python script and ‘Ambivalence Value’ in the R script, represents whether a participant’s response option selection crosses the midpoint of the Likert scale. The ambivalence score is derived from the JavaScript array that stores the x-coordinate each time the participant clicks a response option. For each question, every x-coordinate is classified relative to the scale midpoint (1 = Strongly Disagree and 2 = Disagree: negative side (i.e., disagreement); 3 = Undecided: neutral; 4 = Agree and 5 = Strongly Agree: positive side (i.e., agreement)). The score is ambivalent if clicks occur on both sides of the midpoint (at least one negative and one positive), regardless of any neutral clicks. Otherwise, the score is positive if clicks are confined to the positive side (including neutral), negative if confined to the negative side (including neutral), and neutral if all clicks are at the midpoint only.

For example, if individuals click the following response option combinations:

2, 4 \to \text{ambivalent}

4, 5, 3 \to \text{positive (i.e., agreement)}

2, 2, 3 \to \text{negative (i.e., disagreement)}

3, 3 \to \text{neutral}
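The classification rule can be written as a short function. This sketch assumes the clicked x-coordinates have already been mapped to response option numbers (1 to 5), a step the pipeline performs on the JavaScript click array; the function name and the returned labels are illustrative.

```python
def ambivalence_value(clicks: list[int], midpoint: int = 3) -> str:
    """Classify a sequence of clicked response options for one item."""
    if not clicks:
        raise ValueError("at least one click is required")
    has_negative = any(c < midpoint for c in clicks)
    has_positive = any(c > midpoint for c in clicks)
    if has_negative and has_positive:
        return "ambivalent"  # clicks on both sides of the midpoint
    if has_positive:
        return "positive"    # agreement side only, neutral clicks allowed
    if has_negative:
        return "negative"    # disagreement side only, neutral clicks allowed
    return "neutral"         # every click was on the midpoint


for sequence in ([2, 4], [4, 5, 3], [2, 2, 3], [3, 3]):
    print(sequence, "->", ambivalence_value(sequence))
```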

Speed

Speed is calculated as the distance between the start and endpoint of a question divided by the time taken to answer it. This calculation provides an average speed for a single question. Note that this approach does not include a standard deviation, as it is based on a single data point. To calculate a standard deviation, speed would need to be measured across multiple questions, allowing for an average and variability to be computed. Following the transformation done in the previous example, the equation is as follows:

Speed = S = \frac{\Delta D}{\Delta t} = \frac{\sqrt{(x_{n} - x_{0})^{2} + (y_{n} - y_{0})^{2}}}{t_{f} - t_{in}}
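As a worked example of the speed formula, the sketch below computes the straight-line distance between the first and last recorded positions and divides it by the elapsed time. The function name and the sample coordinates are hypothetical; times are assumed to be in milliseconds and positions in pixels, so the result is in px/ms.

```python
import math


def average_speed(xs: list[float], ys: list[float],
                  t_start: float, t_end: float) -> float:
    """Start-to-end distance divided by elapsed time, in px/ms."""
    if t_end <= t_start:
        raise ValueError("t_end must be later than t_start")
    distance = math.hypot(xs[-1] - xs[0], ys[-1] - ys[0])
    return distance / (t_end - t_start)


# A cursor that ends 300 px right and 400 px up after 1000 ms
# has covered a straight-line distance of 500 px.
print(average_speed([0, 120, 300], [0, 90, 400], 0, 1000))  # 0.5
```

Because only the endpoints enter the calculation, intermediate detours in the trajectory do not change the result.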

Velocity

Unlike speed, velocity is calculated at every point along the participant’s path. Like speed, velocity is determined by dividing the distance traveled by the time taken to travel that distance. However, velocity is represented in vector form, providing both magnitude (speed) and direction. This allows us to capture how fast the participant was moving at each point along their path to a response. As velocity is calculated at multiple points for each question, both the average velocity and its standard deviation can be computed for each respective question.

Velocity = V_{n} = \frac{\Delta D_{n}}{\Delta t_{n}} = \frac{\sqrt{(x_{n} - x_{n - 1})^{2} + (y_{n} - y_{n - 1})^{2}}}{t_{n} - t_{n - 1}}

To average these velocities, they are summed and divided by the number of samples n in the dataset, as shown below:

V_{avg} = \frac{1}{n} \sum_{t = 0}^{n} V_{t}

The standard deviation can be calculated as follows:

V_{std} = \sqrt{\frac{\sum (V_{t} - V_{avg})^{2}}{n - 1}}
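The per-sample velocity, its average, and its standard deviation can be sketched as follows. The function name and sample values are hypothetical; the sketch assumes strictly increasing timestamps and averages over the computed velocity values. ‘statistics.stdev’ uses the n − 1 denominator, matching the formula above.

```python
import math
import statistics


def velocities(xs: list[float], ys: list[float], ts: list[float]) -> list[float]:
    """Distance between consecutive samples divided by their time gap."""
    result = []
    for i in range(1, len(ts)):
        dt = ts[i] - ts[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        result.append(math.hypot(xs[i] - xs[i - 1], ys[i] - ys[i - 1]) / dt)
    return result


# Three samples taken 25 ms apart; the cursor moves 5 px, then rests.
v = velocities([0, 3, 3], [0, 4, 4], [0, 25, 50])
v_avg = statistics.fmean(v)   # mean of the per-sample velocities
v_std = statistics.stdev(v)   # sample standard deviation (n - 1)
print(v, v_avg, v_std)
```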

Acceleration

Measuring acceleration along the chosen path indicates how much a participant speeds up or slows down on their way to selecting a response option. By calculating acceleration at multiple points, both the average acceleration and its standard deviation can be determined for each response. See the equation below:

Acceleration = A_{n} = \frac{\Delta V}{\Delta t} = \frac{V_{n} - V_{n - 1}}{t_{n} - t_{n - 1}}

To average these accelerations, they are summed and divided by the number of samples n in the dataset, as shown below:

A_{avg} = \frac{1}{n} \sum_{t = 0}^{n} A_{t}

As seen with velocity, the standard deviation can be calculated as follows:

A_{std} = \sqrt{\frac{\sum (A_{t} - A_{avg})^{2}}{n - 1}}
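A corresponding sketch for acceleration takes a list of velocities and the timestamps at which they apply, and differences consecutive values. The function name and the input values are hypothetical, and the timestamps are assumed to be strictly increasing.

```python
import statistics


def accelerations(vs: list[float], ts: list[float]) -> list[float]:
    """Change in velocity between consecutive samples divided by the time gap."""
    if len(vs) != len(ts):
        raise ValueError("vs and ts must have the same length")
    result = []
    for i in range(1, len(vs)):
        dt = ts[i] - ts[i - 1]
        if dt <= 0:
            raise ValueError("timestamps must be strictly increasing")
        result.append((vs[i] - vs[i - 1]) / dt)
    return result


# Velocities (px/ms) at three timestamps 25 ms apart: speeding up, then slowing.
a = accelerations([0.0, 0.2, 0.1], [0, 25, 50])
a_avg = statistics.fmean(a)   # mean acceleration
a_std = statistics.stdev(a)   # sample standard deviation (n - 1)
print(a, a_avg, a_std)
```

Positive values indicate that the cursor is speeding up and negative values that it is slowing down.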

Angular displacement

To evaluate how sharply the cursor changes direction, the angular displacement is calculated by using each point as a reference for the next point after the transformation. This is accomplished by measuring the displacement in the x- and y-coordinates and applying simple trigonometry, as shown below:

\tan \theta = \frac{\Delta x}{\Delta y}

\theta = \tan^{- 1} \frac{\Delta x}{\Delta y}

\theta_{n} = \tan^{- 1} \frac{| x_{n} - x_{n - 1} |}{| y_{n} - y_{n - 1} |}

To average the angular displacements, they are likewise summed and divided by the number of samples n in the dataset, as shown below:

\theta_{avg} = \frac{1}{n} \sum_{t = 0}^{n} \theta_{t}

As seen with velocity and acceleration, the standard deviation can be calculated as follows:

\theta_{std} = \sqrt{\frac{\sum (\theta_{t} - \theta_{avg})^{2}}{n - 1}}
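The angle computation can be sketched as below. The sketch uses ‘math.atan2’ on the absolute displacements, which gives the same value as the inverse tangent of |Δx| / |Δy| while also handling Δy = 0; reporting the angle in radians and the function name are assumptions of this illustration.

```python
import math
import statistics


def angular_displacements(xs: list[float], ys: list[float]) -> list[float]:
    """Angle (radians) of each step, measured from the y-axis."""
    angles = []
    for i in range(1, len(xs)):
        dx = abs(xs[i] - xs[i - 1])
        dy = abs(ys[i] - ys[i - 1])
        # atan2(dx, dy) equals atan(dx / dy) for dy > 0 and returns
        # pi / 2 for a purely horizontal step (dy == 0, dx > 0).
        angles.append(math.atan2(dx, dy))
    return angles


# A diagonal step followed by a purely vertical step.
theta = angular_displacements([0, 1, 1], [0, 1, 3])
theta_avg = statistics.fmean(theta)   # mean angle
theta_std = statistics.stdev(theta)   # sample standard deviation (n - 1)
print(theta, theta_avg, theta_std)
```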

These additional variables generated by the software are not currently used for further analysis because we lack reference (i.e., comparable) data to interpret their meaning in relation to decision-making processes in Likert-scale data. Specifically, we do not have established benchmarks to determine what particular velocities, accelerations, or angular displacements signify in the context of decision-making processes for Likert-scale cursor-tracking data [13, 14].

Quality control

The software underwent staged testing prior to the empirical feasibility study [1]. Initial pre-pilot testing was conducted by lab members of the McGill Health Psychology Laboratory to verify successful deployment of the pipeline, including survey setup in Qualtrics, cursor data capture, data export, and execution of the Python and R scripts. The software was subsequently tested in the empirical feasibility study in which the full pipeline was deployed and exercised end to end on live survey data collected from a random sample of participants. This included survey administration in Qualtrics, real-time cursor data capture via embedded JavaScript, export of raw survey data, Python-based preprocessing of cursor trajectories, and downstream visualization and analysis in R. Because all components were used together under realistic data collection conditions, the feasibility study functioned as both functional and integration testing of the software in a real-world survey context. This process allowed the authors to verify that the pipeline executed reliably across its components and produced expected outputs when applied to realistic survey data.

Having outlined the data architecture, variables, and formulas of the cursor-tracking software, and its testing, we now turn to examples of the output it generates. The examples shown below are drawn from the feasibility study [1] and are provided here solely to illustrate the type and structure of data produced by the software, as well as to allow users to verify that the pipeline is functioning as intended. Table 2 summarizes technical parameters such as browser information, window size, and alerts; Table 3 shows the detailed variable output for a single questionnaire item, including regression slope, corrected score, ambivalence value, velocity, acceleration, and angular displacement; Figure 2 presents visualizations of our additional variables: velocity, acceleration, and angular displacement, offering an additional way to view the cursor movement data generated by the software.

Table 2

Technical characteristics collected.

PARTICIPANT	BROWSER TYPE	BROWSER VERSION	OPERATING SYSTEM	ALERTS - TOO EARLY	ALERTS - TOO LATE	WINDOW WIDTH	WINDOW HEIGHT
C1	Chrome	109.0.0.0	Windows NT 10.0	0	2	1440	757
C2	Chrome	125.0.0.0	Windows NT 10.0	0	0	1592	723
C3	Chrome	126.0.0.0	Windows NT 10.0	4	1	1680	889
C4	Chrome	125.0.0.0	Windows NT 10.0	1	0	1536	730
C5	Chrome	125.0.0.0	Windows NT 10.0	1	1	1280	897
C6	Chrome	125.0.0.0	Windows NT 10.0	1	0	1920	919
C7	Chrome	96.0.4664.110	Linux x86_64	7	1	1821	890
C8	Chrome	96.0.4664.110	Linux x86_64	3	3	1821	890
C9	Chrome	125.0.0.0	Windows NT 10.0	0	1	1373	773
C10	Chrome	124.0.0.0	Windows NT 10.0	5	0	1920	911
V1	Chrome	116.0.0.0	Macintosh	2	1	1440	815
V2	Chrome	125.0.0.0	Windows NT 10.0	1	0	1536	730
V3	Chrome	124.0.0.0	Macintosh	0	1	2160	1056
V4	Chrome	125.0.0.0	Macintosh	1	2	1920	1084
V5	Chrome	125.0.0.0	Windows NT 10.0	2	2	1536	825
V6	Chrome	125.0.0.0	Windows NT 10.0	3	0	2049	910
V7	Chrome	125.0.0.0	Macintosh	1	1	1440	728
V9	Chrome	125.0.0.0	Windows NT 10.0	3	4	1920	869
E2	Chrome	125.0.0.0	Windows NT 10.0	4	0	1707	748
E3	Chrome	125.0.0.0	Windows NT 10.0	0	0	1920	953
E4	Edge	125.0.0.0	Windows NT 10.0	1	1	2048	1048
E6	Chrome	125.0.0.0	Windows NT 10.0	0	0	1707	801
E7	Chrome	124.0.0.0	CrOS x86_64 14541.0.0	3	0	1707	791
E8	Chrome	125.0.0.0	Windows NT 10.0	0	1	1680	849
E9	Chrome	125.0.0.0	Windows NT 10.0	1	3	1536	730
E10	Chrome	125.0.0.0	Windows NT 10.0	0	0	1920	911

Note. C1-E10 are de-identified participant identifiers.

Table 3

Overview of collected variable output for a single questionnaire item.

VARIABLE NAME	DATA TYPE	EXAMPLE VALUE
Answer	character [1]	1 = Strongly Disagree
onLoadTime	integer [1]	71103
onReadyTime	integer [1]	71111
buttonClickTime	integer [1]	77487
NextClickTime	integer [1]	78592
windowWidth	integer [1]	1440
windowHeight	integer [1]	757
alerts	character [1]	NONE
latency	integer [1]	2684
NextLeft	integer [1]	670
NextRight	integer [1]	753
NextTop	integer [1]	63
NextBottom	integer [1]	29
AmbivalenceScore	integer [1]	264
Origin	integer [2]	–5
(Question Variables) mid	integer [1]	48
(Question Variables) xPos	integer [62]	0
(Question Variables) yPos	integer [62]	7
(Question Variables) time	integer [62]	73795
(Question Variables) AmbivalenceValue	character [1]	Negative
(Question Variables) QuestionCounter	integer [1]	1
(Question Variables) Regression Slope	double [1]	–0.502
(Question Variables) Corrected Score	character [1]	2 = Disagree
(Question Variables) Speed	double [1]	–0.1331
Velocity (Real)	list [48]	0.0000
Velocity (Normalized)	double [48]	–0.861
Velocity (Average)	double [1]	0.047
Velocity (SD)	double [1]	0.9591
Velocity (Max)	double [1]	3.8172
Velocity (Min)	double [1]	–0.8608
Acceleration (Real)	double [48]	0.0000
Acceleration (Normalized)	double [48]	–0.0719
Acceleration (Average)	double [1]	0.0851
Acceleration (SD)	double [1]	1.4292
Acceleration (Max)	double [1]	4.3457
Acceleration (Min)	double [1]	–5.7688
Angular Displacement (Real)	double [48]	0.0000
Angular Displacement (Normalized)	double [48]	–0.892
Angular Displacement (Average)	double [1]	–0.1819
Angular Displacement (SD)	double [1]	0.8541
Angular Displacement (Max)	double [1]	2.359
Angular Displacement (Min)	double [1]	–0.8918

[i] Note. These data are from participant C1. SD = Standard Deviation; Max = Maximum; Min = Minimum. The data type double is a numerical type used to represent real numbers with decimal points. Bracketed numbers in the data type column give the number of values stored for that variable.
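To make the relationship between the raw position samples (xPos, yPos, time) and the derived movement variables in Table 3 more concrete, the following is a minimal sketch of how velocity, acceleration, and angular displacement could be computed by finite differences. The function names `cursor_kinematics` and `z_normalize`, the exact definitions, and the use of z-scoring for the normalized values are assumptions made for illustration; they are not taken from the pipeline's Python scripts, which may define these quantities differently.

```python
import math


def cursor_kinematics(x, y, t):
    """Finite-difference movement metrics from cursor samples.

    Assumed definitions (illustrative, not the pipeline's own):
      velocity      distance between consecutive samples / elapsed time
      acceleration  change in velocity / elapsed time
      angular       change in heading (radians) between consecutive segments
    """
    velocity, heading, dts = [], [], []
    for i in range(1, len(t)):
        dt = t[i] - t[i - 1]
        if dt <= 0:  # skip duplicate or out-of-order timestamps
            continue
        dx, dy = x[i] - x[i - 1], y[i] - y[i - 1]
        velocity.append(math.hypot(dx, dy) / dt)
        heading.append(math.atan2(dy, dx))  # 0.0 when the cursor did not move
        dts.append(dt)

    acceleration = [
        (velocity[i] - velocity[i - 1]) / dts[i] for i in range(1, len(velocity))
    ]
    angular = []
    for i in range(1, len(heading)):
        turn = heading[i] - heading[i - 1]
        # wrap the heading change into the range [-pi, pi]
        angular.append(math.atan2(math.sin(turn), math.cos(turn)))
    return velocity, acceleration, angular


def z_normalize(values):
    """Z-score a list of values (one assumed form of normalization)."""
    if len(values) < 2:
        return [0.0] * len(values)
    mean = sum(values) / len(values)
    sd = math.sqrt(sum((v - mean) ** 2 for v in values) / (len(values) - 1))
    return [(v - mean) / sd for v in values] if sd > 0 else [0.0] * len(values)


# Hypothetical samples: positions in pixels, time in milliseconds
x = [0, 3, 6, 6]
y = [0, 4, 8, 12]
t = [0, 5, 10, 15]
velocity, acceleration, angular = cursor_kinematics(x, y, t)
print(velocity)  # [1.0, 1.0, 0.8]
```

Each differencing step shortens the series by one value, so the derived vectors are shorter than the raw position vectors.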

Visualizations of the additional cursor movement variables.
*Note*. These are example outputs from the optional ‘2_Visualizations’ script found in the Python folder on GitHub, which enables users to visually inspect cursor trajectories for each participant and questionnaire item. Plotted here are the real (non-normalized) values for velocity (left), acceleration (middle), and angular displacement (right) for a single participant’s response to one questionnaire item (participant C1, question ‘8_URICA’). The script can also display normalized values if desired. These visualizations are supplementary and do not affect the data pipeline; they are provided as an optional tool for users who wish to explore individual-level cursor dynamics in more detail.

(2) Availability

Operating system

Compatible with Windows 10 or later; macOS 10.15 or later; Linux distributions supporting Google Chrome.

Browser requirement: Google Chrome 100 or later.

Programming language

JavaScript (Qualtrics embedded)

Python 3.12

R 4.4.1

Additional system requirements

Input device: computer mouse or trackpad.

Web browser: Google Chrome (latest stable version recommended).

Internet connection for survey deployment.

Sufficient memory and disk space to run Python and R; there are no special hardware requirements. When installing Python on Windows, be sure to select ‘Add Python.exe to PATH’.

Dependencies

Python libraries and R packages required to run the pipeline are listed in the GitHub repository documentation, including version specifications (Python 3.12; matplotlib 3.8; numpy 1.26; pandas 2.1; scikit-learn 1.4). All data processing and analysis components rely on open-source software. The survey component requires access to the Qualtrics platform.

List of contributors

Nellie Siemers, Zachary Jamieson, and Bärbel Knäuper.

Software location

Additional archive

Name: Open Science Framework (OSF)
Persistent identifier: https://doi.org/10.17605/OSF.IO/YDWHN
License: MIT License
Publisher: Nellie Siemers
Version published: v1.0.0
Date published: 20/01/2026

Code repository

Name: GitHub
Identifier: https://github.com/nSiemers/Likert_CursorTracking
License: MIT License
Date published: 24/08/2025

Emulation environment

N/A

Language

English

(3) Reuse Potential

The software can be reused by researchers working with Likert-scale self-report data across a wide range of fields, including psychology, behavioral medicine, decision science, political science, and other disciplines that rely on survey methodology. Within these research contexts, the pipeline can be used to examine decision uncertainty, ambivalence, or response dynamics at the item level in questionnaires assessing motivations, intentions, attitudes, beliefs, or preferences. The software may also be applied in fields such as human-computer interaction and marketing research to study response behavior and cursor movement patterns during survey-based decision-making. Because the pipeline integrates with standard Qualtrics workflows, it can be readily incorporated into online surveys by altering questionnaire content, allowing researchers to adapt it to their specific subject matter.

Extensibility and future development

While the current implementation focuses on regression-based item-level correction, the software is designed to support future extensions using additional cursor-derived metrics such as velocity, acceleration, and angular displacement. Currently, these additional cursor movement variables are not incorporated into corrected score calculations, as we lack comparable reference data to establish their meaning in the context of Likert-scale responses [13, 14]. As a result, our method for calculating the average slope (i.e., the slope of the best-fit regression line on which the corrected score is based) weights all data points equally, whether they come from periods of active movement or near stillness. This can distort the average slope, particularly if a user remains largely stationary near the ‘NEXT’ button, where even small cursor shifts to the right or left of the button can create steep slopes that skew the overall trajectory.

Future work may address this limitation by incorporating velocity, acceleration, and angular displacement into the slope calculation, thereby both reducing the potential distortions and further refining corrected scores through richer use of cursor metrics. For instance, higher velocity, combined with higher acceleration and lower angular displacement, may be associated with greater decisional certainty (e.g., a quick, consistent, and direct cursor movement toward a single option). However, this relationship may not be linear: both excessively high and excessively low velocities may indicate uncertainty (e.g., very fast movements may reflect rushing through items without deliberate consideration, while very slow movements with multiple shifts or detours may reflect indecision), with a midrange velocity potentially reflecting certainty. Approaches such as weighted regression methods [15, 16] and machine learning models trained on larger datasets to identify optimal ways of weighting different cursor movement features [17] could be used to dynamically adjust the slope coefficient. Incorporating such methods could further refine the generation of corrected scores and expand the range of cursor-tracking metrics available for interpretation.
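As an illustration of the weighted regression idea, the sketch below compares an equally weighted least-squares slope with one in which each sample is weighted by the cursor speed leading into it, so that samples recorded while the cursor is nearly still contribute little. This is one possible weighting scheme under stated assumptions, not the method implemented in the pipeline: the function names `weighted_slope` and `speed_weights`, the choice of speed as the weight, and the choice of which cursor variables enter the regression (here, horizontal position against time) are all hypothetical.

```python
import math


def weighted_slope(x, y, weights=None):
    """Slope b of the least-squares line y = a + b*x.

    With weights=None every sample counts equally (ordinary least squares);
    otherwise each squared residual is multiplied by its weight.
    """
    if weights is None:
        weights = [1.0] * len(x)
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    mean_x = sum(w * xi for w, xi in zip(weights, x)) / total
    mean_y = sum(w * yi for w, yi in zip(weights, y)) / total
    sxx = sum(w * (xi - mean_x) ** 2 for w, xi in zip(weights, x))
    sxy = sum(w * (xi - mean_x) * (yi - mean_y) for w, xi, yi in zip(weights, x, y))
    if sxx == 0:
        raise ValueError("x has no weighted variance; slope is undefined")
    return sxy / sxx


def speed_weights(x_pos, y_pos, t):
    """Weight each sample by the speed of the segment leading into it.

    The first sample has no preceding segment and receives weight 0.
    """
    weights = [0.0]
    for i in range(1, len(t)):
        dt = t[i] - t[i - 1]
        dist = math.hypot(x_pos[i] - x_pos[i - 1], y_pos[i] - y_pos[i - 1])
        weights.append(dist / dt if dt > 0 else 0.0)
    return weights


# Hypothetical trajectory: steady movement (10 px per tick), then jitter
# while the cursor hovers in one place.
t = list(range(10))
x_pos = [0, 10, 20, 30, 40, 40, 41, 40, 41, 40]
y_pos = [0] * 10

equal = weighted_slope(t, x_pos)
by_speed = weighted_slope(t, x_pos, speed_weights(x_pos, y_pos, t))
print(round(equal, 2), round(by_speed, 2))
```

In this toy trajectory the equally weighted slope is pulled well below the 10 px per tick of the active movement phase by the hovering samples, and the speed-weighted slope lies closer to it. Whether speed is an appropriate weight for Likert-scale cursor data is an open empirical question.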

Additionally, the current version of the software requires the survey structure to remain largely consistent to function as intended. Minor adjustments, such as changing the total number of items, can be accommodated with small edits to the Python script. However, more substantial modifications, such as altering the number of Likert-scale response options (e.g., moving from the current 1–5 scale to a 1–3 or 1–8 scale), would require significant reprogramming of the Python code. At present, the software does not handle these types of changes automatically. Future development should therefore focus on generalizing the Python code so that it adapts flexibly to different survey lengths and response scales (e.g., by refactoring the code into composable functions) and on integrating unit testing, thereby improving usability and making the tool more accessible for researchers irrespective of their programming skills.
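A minimal sketch of what such a refactoring might look like is given below: the response scale is passed in as data rather than hard-coded, so the same functions serve a 3-point, 5-point, or 8-point scale, and each function is small enough to be unit tested. The names `LikertScale`, `parse_answer`, and `shift_score`, the middle label ‘Neutral’, and the idea of expressing a correction as an integer shift are hypothetical and do not describe the current scripts.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class LikertScale:
    """A response scale described by its ordered labels (score 1 first)."""

    labels: tuple

    @property
    def n_points(self):
        return len(self.labels)

    def label_for(self, score):
        """Format a score in the 'number = label' style of the survey output."""
        if not 1 <= score <= self.n_points:
            raise ValueError(f"score {score} is outside 1..{self.n_points}")
        return f"{score} = {self.labels[score - 1]}"


def parse_answer(answer):
    """Extract the numeric score from a string such as '1 = Strongly Disagree'."""
    return int(answer.split("=", 1)[0].strip())


def shift_score(score, shift, scale):
    """Move a score by `shift` points, clamped to the range of the scale."""
    return max(1, min(scale.n_points, score + shift))


# The same functions work for scales of different lengths.
five_point = LikertScale(
    ("Strongly Disagree", "Disagree", "Neutral", "Agree", "Strongly Agree")
)
three_point = LikertScale(("Disagree", "Neutral", "Agree"))

original = parse_answer("1 = Strongly Disagree")
print(five_point.label_for(shift_score(original, 1, five_point)))  # 2 = Disagree
print(three_point.label_for(shift_score(3, 2, three_point)))       # 3 = Agree
```

Keeping the scale definition separate from the scoring logic means that changing the number of response options would require only a new `LikertScale` instance rather than edits throughout the code.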

Furthermore, the integration of the software within Qualtrics introduces platform-specific constraints, such as the inability to modify the appearance of navigation elements like the ‘Next’ button. While these limitations are minor, they could be eliminated by developing a standalone platform. However, Qualtrics was chosen for its widespread use and accessibility, ensuring the software’s utility for researchers familiar with the platform [18]. Moreover, the software is currently compatible only with Google Chrome, which may limit accessibility for users reliant on other browsers. While this restriction aligns with common research practices requiring specific browser use [19, 20], expanding compatibility in future iterations could enhance its usability.

In sum, the contribution of this paper is the provision of a transparent, open-source software pipeline that makes Likert-scale cursor-tracking practical and reproducible for researchers across disciplines. The software is supported through publicly available documentation and example materials provided in the GitHub repository and on the Open Science Framework. Users may report issues or request features by contacting the first author (NS) or the second author (ZJ). While no formal user support or maintenance guarantees are provided, the open-source nature of the project allows researchers to inspect, adapt, and extend the code as needed.

Author Contributions

Nellie Siemers: Conceptualization, Methodology, Programming, Writing, Editing;

Zachary Jamieson: Methodology, Programming, Writing;

Xinran Gao: Pilot-testing, Writing, Editing;

Mira Saad: Pilot-testing, Review, Editing;

Bärbel Knäuper: Conceptualization, Methodology, Editing, Supervision.