CORESTA Guidelines for Descriptive Consumer-Reported Outcome Measures in Tobacco and Nicotine Research

Lai Wei; Emilie Clerc; Stacey McCaffrey; Mohamadi Sarkar; Christelle Chrea; Krishna Prasad

doi:10.2478/cttr-2026-0001


INTRODUCTION

The wide variety and diversity of available tobacco- and nicotine-containing products (TNPs) and the continued development of new products require a constant evolution of self-report measures to adequately assess TNP use and related constructs (1, 2). Many of the currently available measures are based on or adapted from questionnaires/items initially developed for people who smoke conventional cigarettes (3). New consumer-reported outcome measures (CROM ⁽¹⁾), which are data collected by self-report from research participants pertaining to perceived states, behavior, and/or understanding of messages, are needed to build the foundation for standards of measurement in research on TNPs in the ever-changing TNP landscape. For example, the use of electronic nicotine delivery systems (ENDS) has increased in the United States (U.S.) and internationally over time. ENDS are a diverse product category with variation across products in aerosol production and nicotine levels and flavorings contained within the liquids (4). Similarly, nicotine-containing oral products, such as nicotine pouches and lozenges, and heated tobacco products (HTPs) also exhibit substantial variability in product design, nicotine delivery, and usage patterns. A lack of common definitions of patterns of use and user types, in addition to high variability across these product categories, limits consistency in measurement and complicates comparisons among research studies (4, 5).

Some measurement and standardization initiatives have been proposed to assess consumer perception and behavior associated with TNPs (2, 3, 6), allowing for a comparison of results across similar studies in research on TNPs. For example, the PhenX Tobacco Regulatory Research Toolkit, funded by the U.S. National Institutes of Health (NIH) and the U.S. Food and Drug Administration (FDA) Center for Tobacco Products, was developed to expand the breadth and depth of tobacco product-related measures to enhance cross-study analysis in large-scale research (2). Similarly, the Assessment of Behavioral Outcomes related to Tobacco and nicotine products (ABOUT) Toolbox was developed to allow for comparisons of consumer perceptions and behaviors across various TNPs and across research studies, to facilitate informed decision-making regarding the regulation of TNPs, and to improve surveillance on the impact of TNPs on public health (3). Additionally, in 2022, the U.S. FDA published final guidance for designing and conducting tobacco product perception and intention (TPPI) studies that may be submitted as part of a modified risk tobacco product application (MRTPA), a premarket tobacco product application (PMTA), or a substantial equivalence (SE) report (7). This guidance provides general recommendations related to the selection, development, and adaptation of perception, comprehension, and behavioral intention CROM, but does not provide specific guidance on the optimal CROM to assess these domains, nor does it provide recommendations related to other types of Descriptive CROM commonly used in TNP research, such as tobacco product use patterns. Despite these ongoing efforts, as TNPs continue to evolve, reaching consensus on survey measures in research on TNPs remains challenging, particularly for emerging TNP categories.

Descriptive CROM are self-reported survey outcome measures that are intended to measure observable characteristics and behaviors. Examples of Descriptive CROM include sociodemographic variables, product use behaviors, and transitions between use states. The purpose of this document is to address gaps in existing toolboxes and guideline documents by providing more comprehensive guidance to researchers with respect to Descriptive CROM. Specifically, the guidelines provide recommendations on foundational definitions and recommended Descriptive CROM, as well as guidance related to the development, modification, and application of Descriptive CROM, and Descriptive CROM data collection, analysis, and reporting. These recommendations are intended to reduce sources of measurement error, bolster the validity of data collected with Descriptive CROM, and promote data comparability and cross-study analyses.

METHODS

The development of the guidelines to provide recommendations on the selection, development, implementation, and analysis of Descriptive CROM in TNP research was initiated by the Cooperation Centre for Scientific Research Relative to Tobacco (CORESTA) Consumer-Reported Outcome Measures Task Force (CROM TF) ⁽²⁾ Descriptive CROM working group in 2021. The CROM TF consists of members from seven contributing manufacturers, and its primary objectives are 1) to guide the development, modification, and application of CROM, and 2) to facilitate the identification of and access to recommended CROM.

The CORESTA Scientific Commission oversees the consortium to ensure the work conforms to CORESTA standards. The best practices and guidelines developed by the CROM TF focus on CROM for adult consumers above the legal age to purchase TNPs. A consortium approach, with contributions from manufacturers and industry partners, has been taken to develop a scientific framework based on the following shared vision:

To work together to create a paradigm shift in the way CROM are conceptualized and implemented in research on TNPs,
To work with subject matter experts (SMEs) to establish guidance for developing and validating new measures,
To establish consensus on existing survey measures and research methods,
To use a core set of concepts and tools to facilitate sharing, comparing, and replicating findings, and integrating data from multiple sources.

The best practices and guidelines on Descriptive CROM were proposed by core team members of the Descriptive CROM working group (WG) and reviewed by advisory board members (8). The CROM TF Descriptive CROM WG also collaborated with other CORESTA subgroups, including the CORESTA Product Use Behavior subgroup, the In Vitro Toxicity Testing subgroup, and the Tobacco and Tobacco Products Analysis subgroup, to align product category-specific measures and definitions. Additional SMEs were invited to provide written feedback or to participate in discussions via virtual meetings. These individuals brought diverse expertise from academia, industry, and public health, including behavioral science, epidemiology, toxicology, regulatory science, and product assessment. We acknowledge the reviewers in the Acknowledgement section of this manuscript. Consensus among SMEs was reached through iterative discussions during virtual meetings and written feedback exchanges. Agreement was documented through meeting summaries and collaborative revisions to the guideline drafts. This approach was deemed appropriate given the focused scope of the working group and the specialized expertise of the participants. During the guideline development phase, the CROM team also actively engaged with SMEs from academic and public health communities through scientific meetings to solicit feedback for refining and updating the guidelines. This collaborative approach fostered consensus on survey measures and research methodologies.

RESULTS

Descriptive CROM best practices and guidelines include four main sections:

1) Foundational definitions,
2) Recommendations based on existing Descriptive CROM,
3) Development and modification of Descriptive CROM, and
4) Descriptive CROM data collection, analysis, and reporting.

This article focuses exclusively on the recommendations provided in Sections 1, 2, and 3. For recommendations on data collection, analysis, and reporting, readers can refer to the Descriptive CROM guideline published on the CORESTA website (8).

Section 1: Foundational definitions

In this section, we introduce TNP classifications, with detailed descriptions of each category, and TNP use state definitions to facilitate survey instrument development and secondary analysis of survey data. Clear classification of TNP categories and TNP use states is essential for accurate assessment of product use behavior and for comparison of results across surveys. Additionally, the definitions of TNP use states affect the development of survey conditional branching (i.e., skip logic). Consistent definitions are also essential for data analysis and reporting, as they improve the harmonization and comparability of research findings.

Table 1 shows a classification system developed for TNPs that separates products into two main categories: combustible and non-combustible. This classification aims to provide a clear overview of existing TNPs on the market and to support the development of CROM that assess TNP use behavior by categorizing and describing each product. Including a brief description of each product category, with example product images, in questionnaires can further improve survey clarity.

Table 1. Classification of tobacco- and nicotine-containing products (TNPs).

Category		Subcategory	Category/subcategory description
Combustible products	Cigarette	Manufactured cigarette Roll-your-own cigarette	A cigarette is a tube-shaped tobacco product that is made of finely cut, cured tobacco leaves wrapped in thin paper. A cigarette is lit on one end, and the smoke is inhaled.
			Roll-your-own cigarettes are made of loose tobacco that is placed inside rolling paper. As with manufactured cigarettes, one end is lit, and the smoke is inhaled.
			(Source: Cigarettes \| NCI (content as of Apr 11, 2022), Cigarettes \| FDA (content as of Apr 29, 2021, accessed Dec 27, 2021), and Roll-Your-Own Tobacco \| FDA (content as of Dec 21, 2019, accessed Feb 1, 2026))
	Cigar/cigarillo	Traditional cigar Cigarillo Little filtered cigar	A cigar is a roll of tobacco wrapped in leaf tobacco or in a substance that contains tobacco. They vary in size—from smaller cigars, such as little filtered cigars or cigarillos, to larger ones, such as large so-called premium cigars. The cigar is lit on one end and smoked, but the smoke is usually not inhaled into the lungs.
	Cigar/cigarillo	Traditional cigar Cigarillo Little filtered cigar	(Source: Cigars, Cigarillos, Little Filtered Cigars \| FDA (content as of Jun 11, 2021, accessed Dec 27, 2021), Cigars \| NCI (content as of October 23, 2023))
	Pipe	—	Pipe tobacco is generally loose-leaf tobacco burned in a traditional smoking pipe with a bowl. A pipe is a device with a mouthpiece at one end of a tube, and a small bowl at the other end that is filled with tobacco, which is lit and smoked. The smoke from a pipe is usually not inhaled into the lungs.
	Pipe	—	(Source: Pipe Tobacco \| FDA (content as of Oct 06, 2020, accessed Dec 27, 2021), Pipe (NCI) (accessed Dec 27, 2021))
	Hookah (shisha or waterpipe tobacco)	—	Hookah tobacco (also known as waterpipe tobacco, maassel, shisha, narghile, or argileh) is smoked with a hookah (waterpipe). A form of moist tobacco is placed in the head of the hookah with charcoal placed on top (often separated by perforated aluminum foil) to provide a heat source.
			The heated air, passing over the charcoal, contains charcoal combustion products, passes through the tobacco, and the mainstream smoke aerosol is produced. The smoke then passes through the waterpipe body, bubbles through the water in the bowl, and is carried through the hose and inhaled or puffed by users via a mouthpiece.
			(Source: Hookah Tobacco (Shisha or Waterpipe Tobacco) \| FDA (content as of Jan 03, 2020, accessed Feb 1, 2026), Water pipe \| NCI (accessed Dec 27, 2021), Sutfin et al. (78), Waterpipe Tobacco Smoking \| WHO (accessed Jun 07, 2022))

Non-combustible products	Electronic nicotine delivery systems (ENDS)	E-cigarette E-cigar E-pipe E-hookah	ENDS are battery-powered devices that are designed to electrically heat a liquid (may also be called an e-liquid), to produce an inhalable aerosol. The most common ENDS are ‘electronic cigarettes’, also known as ‘e-cigarettes’. There are currently four major types of ENDS products: disposable ENDS products, ENDS products with replaceable pre-filled cartridges or pods, tank systems that can be filled with liquids, and modular systems that can be filled with liquids. Several terms and acronyms are used to describe this product category, including e-vapor, vapes, vaporizers, vape pens, etc. Other subcategories of ENDS could include e-cigar, e-pipe and e-hookah. Some ENDS products are manufactured with non-tobacco nicotine (i.e., synthetic nicotine) ^a. Additionally, the WHO refers to electronic non-nicotine delivery systems as ENNDS ^b.
	Electronic nicotine delivery systems (ENDS)	E-cigarette E-cigar E-pipe E-hookah	(Source: Vaporizers, E-Cigarettes, and other Electronic Nicotine Delivery Systems (ENDS) \| FDA (content as of Sep 17, 2020, accessed Dec 27, 2021), World Health Organization (79) and CORESTA (80))
	Heated tobacco products (HTPs)	—	HTPs contain a tobacco substrate that is designed to be heated and not combusted by a separate source (e.g. electrical, aerosol, carbon, etc.) to produce a nicotine-containing aerosol. Regulatory agencies, researchers, and manufacturers use a variety of terms and acronyms to describe this product category, such as tobacco heating systems (THS), heat-not-burn tobacco products (HnB), etc.
	Heated tobacco products (HTPs)	—	(Source: CORESTA Product Use Behavior Subgroup Heated Tobacco Products (HTPs): Standardized Terminology and Recommendations for the Generation and Collection of Emissions (content as of Oct 24, 2023, accessed Oct 24, 2023), Heated Tobacco Products \| CDC (content last reviewed, 2020 Dec 16, accessed Dec 27, 2021))
	Smokeless tobacco products	Chewing tobacco	Chewing tobacco is cured tobacco in the form of loose leaf, plug, or twist. The product is chewed during use and subsequently discarded. Loose-leaf chewing tobacco typically consists of loosely packed, cut, or granulated stem-free tobacco leaf to which additional ingredients may be added.
			Plug chewing tobacco typically contains flaked tobacco leaves to which additional ingredients may be added. The product has the appearance of a compressed tobacco brick wrapped inside a natural tobacco leaf. Twist chewing tobacco has the appearance of thick rope-like twists of tobacco.
			(Source: Smokeless Tobacco Products, Including Dip, Snuff, Snus, and Chewing Tobacco \| FDA (content as of Jun 23, 2020, accessed Dec 27, 2021), CORESTA Tobacco and Tobacco Products Analysis Sub-group (i.e., Smokeless Tobacco Sub-group))
		Moist snuff/Dip	Moist snuff/Dip is cut tobacco that can be loose or pre-portioned (i.e., pouched), placed in the mouth, and discarded after use. Moist snuff/Dip is finely ground tobacco packaged in cans or pouches. It may have flavorings added. Moist snuff is commonly placed between the cheek and gum during use and discarded after use.
		Moist snuff/Dip	(Source: Smokeless Tobacco Products, Including Dip, Snuff, Snus, and Chewing Tobacco \| FDA (content as of Jun 23, 2020, accessed Dec 27, 2021), CORESTA Tobacco and Tobacco Products Analysis Sub-group (i.e., Smokeless Tobacco Sub-group))
		Dry snuff	Dry snuff is loose, finely cut, or powdered dry tobacco that is typically sniffed through the nostrils.
		Dry snuff	(Source: Smokeless Tobacco Products, Including Dip, Snuff, Snus, and Chewing Tobacco \| FDA (content as of Jun 23, 2020, accessed Dec 27, 2021), CORESTA Tobacco and Tobacco Products Analysis Sub-group (i.e., Smokeless Tobacco Sub-group))
		Snus	Snus is cut tobacco that is processed into fine particles. The products are usually placed between the upper lip and gum and are discarded after use. Products are available as loose tobacco or as individually portioned pouches.
		Snus	(Source: CORESTA Tobacco and Tobacco Products Analysis Sub-group (formerly known as Smokeless Tobacco Sub-group))
		Dissolvable tobacco products	Dissolvable tobacco products are finely ground tobacco pressed into shapes such as tablets, sticks, or strips. Dissolvable tobacco products can be sold as lozenges, orbs, strips, or sticks. Lozenges resemble pellets or tablets, orbs resemble small mints, sticks have a toothpick-like appearance, and strips are thin sheets that work like dissolvable breath strips or medication strips. Dissolvable tobacco products are placed in the mouth and allowed to dissolve during use.
		Dissolvable tobacco products	(Source: Dissolvable Tobacco Products \| FDA (content as of Jun 14, 2018, accessed Dec 27, 2021), Smokeless Tobacco \| CDC (content as of May 14, 2021, accessed Dec 27, 2021), CORESTA Tobacco and Tobacco Products Analysis Sub-group (i.e., Smokeless Tobacco Sub-group))

	Nicotine-containing oral products	Nicotine pouches Gums Lozenges	Nicotine-containing oral products contain a base substrate, nicotine, and added flavors, but not tobacco leaf. The nicotine can be either derived from tobacco or synthetic. These products are exclusively intended for oral use. The products come in a variety of forms, such as pouches, gums, and lozenges. Regulatory agencies, researchers, and manufacturers use a variety of terms to describe this product category, such as ONPs (oral nicotine pouches) and MOPs (modern oral products).
	Nicotine-containing oral products	Nicotine pouches Gums Lozenges	(Source: Pouwels et al. (81), CORESTA Product Use Behavior Subgroup)

^a FDA has begun to regulate tobacco products containing nicotine from any source, including tobacco products containing non-tobacco nicotine (NTN), that is, nicotine not made or derived from tobacco, such as synthetic nicotine (Effective Date: April 14, 2022).

^b ENNDS: The “WHO report on the global tobacco epidemic 2021: addressing new and emerging products” explains that ENNDS are included in WHO reports because they are almost indistinguishable from ENDS. They often have enhanced flavors that appeal to young people and may be perceived as being a safer, less addictive option. Although ENNDS are marketed to not contain nicotine, many e-liquids have been found to contain nicotine when tested. Furthermore, depending on the device used, users may be able to select e-liquids that contain nicotine or not (79).

The CORESTA CROM TF classifies survey respondents into various TNP use states based on lifetime, past, and current TNP use. While it may not be possible to list every combination of tobacco use states, key transitions are addressed in a simplified conceptual flow diagram showing changes in use behaviors within a single TNP category (Figure 1). The conceptual framework includes the following use states:

Never or Experimental Use (blue),
Current Use (yellow),
Established Use (orange), and
Former Use (orange).

Please refer to the Definition of Terms table in the Descriptive CROM Guideline for detailed definitions of the terms mentioned below (8). The flow chart begins with a ‘Never Use’ state. After trying a TNP once, the individual moves to the ‘Ever Use’ state (i.e., becomes an ever user of a TNP), which is often called initiation. TNP initiation generally refers to the first use of a given TNP. When the individual starts using the product, they move to a ‘Current Use’ state. While ‘Current Use’ is often defined as using the product ‘every day’ or ‘some days’, it can also be defined based on past 30-day usage, specifically, ‘having used the TNP in the past 30 days’. ‘Current Use’ can be further categorized as ‘Current Experimental Use’ and ‘Current Established Use’. If the individual has not reached the predefined criterion for ‘Lifetime Established Use’ (e.g., 100 cigarettes), the individual may be classified into the ‘Current Experimental Use’ state. If an individual reaches the lifetime established use criterion while using the product, the individual could be classified into the ‘Current Established Use’ state. The ‘Current Use’ state can be further characterized as daily versus non-daily use, based on whether the product is reported as being used every day or some days (or on 30 of the past 30 days), or as frequent versus infrequent use, based on the number of days the product was used in the past 30 days (e.g., greater than versus less than 20 days). The CORESTA CROM TF recommends using the suggested lifetime established use criterion (Table 2) to distinguish individuals who are experimental users from those who are established users, as distinct differences have been found between these two groups for various TNP categories (9,10,11,12). Additionally, dual and poly-use states can be met if an individual is classified as a current user of two or more TNPs from different TNP categories or subcategories.
Context should be provided for studies that focus on dual usage, including the TNP categories that are being assessed, the definition of use (e.g., current, past 30-day, past year, or ever use), and inclusion or exclusion of other TNPs. Some existing research has further categorized individuals who are dual users into four segments depending on their use frequency (daily or frequent use) of the two products due to the heterogeneity found within these individuals (13,14,15,16,17).
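The characterization of current use described above can be sketched in code. The following Python example is only an illustration and is not part of the guideline: the function names and dictionary layout are hypothetical, exactly 20 days is treated as frequent use (the text gives 20 days only as an example cut-off and does not say how the boundary is handled), and 'poly use' is assumed to mean three or more currently used categories.

```python
from typing import Dict

# Assumed cut-off: the text cites 20 days only as an example and does not
# state how exactly 20 days is handled; this sketch counts it as frequent.
FREQUENT_USE_MIN_DAYS = 20


def characterize_current_use(days_used_past_30: int) -> Dict[str, bool]:
    """Characterize past 30-day use of a single TNP category."""
    if not 0 <= days_used_past_30 <= 30:
        raise ValueError("days_used_past_30 must be between 0 and 30")
    return {
        "current": days_used_past_30 >= 1,   # used in the past 30 days
        "daily": days_used_past_30 == 30,    # used on 30 of the past 30 days
        "frequent": days_used_past_30 >= FREQUENT_USE_MIN_DAYS,
    }


def multi_product_state(days_by_category: Dict[str, int]) -> str:
    """Label a respondent by the number of TNP categories currently used."""
    n_current = sum(
        characterize_current_use(days)["current"]
        for days in days_by_category.values()
    )
    if n_current == 0:
        return "no current use"
    if n_current == 1:
        return "exclusive use"
    # Assumption: exactly two categories is 'dual', three or more is 'poly'.
    return "dual use" if n_current == 2 else "poly use"


# Hypothetical respondent: daily cigarette use plus occasional ENDS use.
respondent = {"cigarette": 30, "ENDS": 6, "snus": 0}
print(characterize_current_use(respondent["cigarette"]))
print(multi_product_state(respondent))
```

A study would replace the cut-off and the category keys with the definitions stated in its own protocol, and would report them alongside the results as recommended above.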

Table 2. Adult lifetime established use criterion for tobacco- and nicotine-containing product (TNP) categories.

Category	Threshold type	Suggested criterion for established use	References
Cigarette	Numerical	Having smoked 100 cigarettes	(82,83,84,85)
Cigar	Numerical	Having smoked 50 cigarillos/traditional cigars/filter cigars	(86,87,88,89)
Cigar	Non-numerical	Having smoked cigarillos/traditional cigars/filter cigars fairly regularly	(87, 89)
Pipe	Numerical	Having smoked 50 bowls filled with pipe tobacco	(12, 82)
Pipe	Non-numerical	Having smoked pipe tobacco products fairly regularly	(90)
Hookah	Numerical	Having smoked hookah 20 times ^a	(89)
Hookah	Non-numerical	Having smoked hookah products fairly regularly	(12, 90, 91)
Electronic nicotine delivery systems (ENDS)	Numerical	Having used ENDS products 20 times ^a	(89)
Electronic nicotine delivery systems (ENDS)	Non-numerical	Having used ENDS products fairly regularly	(5, 12, 92)
Heated tobacco products (HTPs)	Numerical	Having used 100 or more heatsticks	(93)
Heated tobacco products (HTPs)	Non-Numerical	Having used HTP fairly regularly	—
Smokeless	Numerical	Having used smokeless tobacco 20 times ^a	(12, 94, 95)
Smokeless	Non-numerical	Having used smokeless tobacco fairly regularly	(12, 96)
Snus	Numerical	Having used snus 20 times ^a	(12, 89, 97)
Snus	Non-numerical	Having used snus tobacco fairly regularly	(12, 96)
Dissolvable	Numerical	Having used dissolvable TNPs 20 times ^a	(89)
Dissolvable	Non-numerical	Having used dissolvable TNPs fairly regularly	(12, 90)
Nicotine-containing oral products	Numerical	Having used nicotine-containing oral products 20 times ^a	—
Nicotine-containing oral products	Non-numerical	Having used nicotine-containing oral products fairly regularly	—

^a ‘One time’ refers to a typical session when the participant picks up the product to use it. For example, the description in the PATH Wave 5 questionnaire is “the participant picks up the ENDS product to use it. Multiple puffs can be taken within one session.”
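As an illustration of how the numerical criteria in Table 2 might be applied during data processing, the Python sketch below separates experimental from established current users. The dictionary keys, function names, and the use of "greater than or equal to" for the comparison are assumptions made for this example; the threshold values are transcribed from Table 2.

```python
from typing import Optional

# Numerical lifetime established-use criteria transcribed from Table 2.
# Units differ by category (cigarettes, cigars, bowls, heatsticks, or
# times used); the short keys are hypothetical labels for this sketch.
ESTABLISHED_USE_THRESHOLD = {
    "cigarette": 100,
    "cigar": 50,
    "pipe": 50,
    "hookah": 20,
    "ends": 20,
    "htp": 100,
    "smokeless": 20,
    "snus": 20,
    "dissolvable": 20,
    "nicotine_oral": 20,
}


def is_established(category: str, lifetime_units: int) -> bool:
    """True if reported lifetime use meets the numerical criterion."""
    return lifetime_units >= ESTABLISHED_USE_THRESHOLD[category]


def current_use_state(category: str, lifetime_units: int,
                      used_past_30_days: bool) -> Optional[str]:
    """Return the current-use state, or None if there is no current use."""
    if not used_past_30_days:
        return None
    if is_established(category, lifetime_units):
        return "Current Established Use"
    return "Current Experimental Use"


print(current_use_state("cigarette", 40, True))
print(current_use_state("ends", 25, True))
```

The non-numerical criteria ("fairly regularly") would need a separate self-report item and are not covered by this sketch.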

Cessation is defined as stopping the use of the TNP after having used the product to at least its lifetime established use criterion. Individuals who are former users must report not having used the product for a predefined timeframe and can be further classified into ‘Recent Former’ and ‘Long-Term Former’ states based on when the TNP was last used (e.g., < 1 year versus ≥ 1 year, depending on the predefined timeframe). If the individual is classified into the ‘Recent Former’ state and reports ‘having completely quit’ the TNP, the product use state is characterized as ‘Recent Quitting’. Individuals initially characterized as ‘Long-Term Former’ are further characterized as ‘Successful Quitting’ if they remain abstinent for a predefined period (e.g., 1, 2, or 3 years) based on research objectives. Finally, relapse/re-initiation has been used in the literature to define restarting the TNP after a period of abstinence (e.g., 1 year) and may occur during the former use states. Existing research demonstrates that such relapse/re-initiation is less likely after being abstinent for longer than 1 year (18, 19).
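The former-use states can be sketched in the same way. In the Python example below, the function name and parameters are hypothetical, the 30-day minimum abstinence window is an assumption standing in for the "predefined timeframe", and 365 days is used for the example 1-year cut-off between ‘Recent Former’ and ‘Long-Term Former’.

```python
from datetime import date
from typing import Optional


def former_use_state(met_lifetime_criterion: bool,
                     last_use: date,
                     survey_date: date,
                     min_abstinence_days: int = 30,    # assumed window
                     long_term_days: int = 365) -> Optional[str]:
    """Return 'Recent Former', 'Long-Term Former', or None."""
    if last_use > survey_date:
        raise ValueError("last_use cannot be after survey_date")
    if not met_lifetime_criterion:
        # Cessation applies only after the lifetime criterion was met.
        return None
    days_since_last_use = (survey_date - last_use).days
    if days_since_last_use < min_abstinence_days:
        # Too recent to count as former use under the assumed window.
        return None
    if days_since_last_use >= long_term_days:
        return "Long-Term Former"
    return "Recent Former"


survey = date(2024, 6, 1)
print(former_use_state(True, date(2024, 1, 1), survey))
print(former_use_state(True, date(2022, 1, 1), survey))
```

Both cut-offs are exposed as parameters because the text leaves the timeframes to be predefined according to research objectives.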

Section 2: Recommendations based on existing Descriptive CROM

The Descriptive CROM conceptual domain framework (Figure 2) was developed based on a comprehensive review of the extant literature and existing TNP surveillance surveys conducted at a national and international level and is used to guide the development of Descriptive CROM recommendations. We selected fifteen national or international surveys, covering a wide range of Descriptive CROM for adults who use TNPs. An overall summary of the fifteen surveys is shown in Appendix Table 1 (see also Appendix Table 2 for a summary of survey methodology). Surveys were selected based on their representativeness, diversity in domains, geographic coverage, and accessibility of questionnaire data. All selected surveys are ongoing, with a regular data collection schedule. The review focused on the most recent survey questionnaires being administered among adult populations. Most of the selected surveys were designed to monitor TNP use trends and participant health status at a national level, and hence, were conducted among nationally representative samples of the respective adult populations. Only publicly available surveys were included, and some surveys were not included due to overlap in Descriptive CROM domains. Survey items from the most recent survey questionnaires were discussed and grouped into three main domains based on the CROM Conceptual Domain Framework:

1) Population-level,
2) Product category-level, and
3) Poly-/cross-category-level.

The population-level domain includes survey measures that are typically posed to all survey respondents to evaluate demographics and socioeconomic status (SES, e.g., income, education, etc.). The product category-level domain includes survey measures that evaluate TNP consumption, brand and flavor preferences, initiation and cessation of use, and reasons for use for each TNP category among users of a particular TNP. Lastly, the poly-/cross-category-level domain includes survey measures that evaluate dual/poly use and switching between TNP categories.

2.1. Recommendations based on existing Descriptive CROM (population-level domain)

2.1.1. Demographics and socioeconomic status (SES)

Key socio-demographic concepts of interest for research on TNPs, which may influence product use patterns, include demographic and socioeconomic variables such as age, sex at birth, gender, race, ethnicity, level of education, and income. Other socio-demographic concepts that may be considered according to study objectives and endpoints include occupation, work status, nationality, and/or residency. Tobacco use among minority groups and people of various sexual orientations and/or religions may also be of interest in some studies.

2.1.1.1. Assessing age

Age may be evaluated in several ways. In US-based surveys, participants are often asked to provide their date of birth (e.g., MM/DD/YYYY), which should be considered as Protected Health Information (20). Participants may also be given the option to refuse to answer or to respond with “I don’t know”. If either option is chosen, a participant may be asked to provide age in a numerical format. Confirmation of participant age may also be added after the date of birth is given (21, 22). A response range (e.g., 1 to 120 for age) has also been used (23). In European and international surveys, participants may be more likely to be asked about their age or year of birth (24, 25) because the exact date of birth may be considered sensitive information or may be perceived as highly confidential.
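The age assessment options described above (date of birth, a numerical age as a fallback, and a permitted response range) could be combined during data processing roughly as follows. This Python sketch is illustrative only: the function names are hypothetical, and the 1 to 120 range is the example range cited in the text.

```python
from datetime import date
from typing import Optional

MIN_AGE, MAX_AGE = 1, 120  # example response range cited in the text


def age_in_years(date_of_birth: date, on: date) -> int:
    """Completed years between date_of_birth and the reference date."""
    years = on.year - date_of_birth.year
    # Subtract one year if the birthday has not yet occurred in that year.
    if (on.month, on.day) < (date_of_birth.month, date_of_birth.day):
        years -= 1
    return years


def resolve_age(date_of_birth: Optional[date],
                reported_age: Optional[int],
                on: date) -> Optional[int]:
    """Prefer date of birth; fall back to self-reported numerical age."""
    if date_of_birth is not None:
        age = age_in_years(date_of_birth, on)
    else:
        age = reported_age
    if age is None or not MIN_AGE <= age <= MAX_AGE:
        return None  # missing or out-of-range response
    return age


survey_date = date(2024, 6, 14)
print(resolve_age(date(1990, 6, 15), None, survey_date))
print(resolve_age(None, 45, survey_date))
```

When both a date of birth and a confirmed age are collected, a study could additionally compare the two values and flag discrepancies for review.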

2.1.1.2. Assessing sex at birth and/or gender

Sex at birth and gender are two distinct constructs. The recommendations below are based on several existing guidelines (26, 27). Sex at birth refers to the sex recorded on a person’s birth certificate. Sex at birth is based on biological attributes, commonly external genitalia, and typically consists of two categories: male and female. Intersex is a third potential category, which corresponds to people born with biological characteristics diverging from the male and female categories (27, 28). However, this category is not commonly assessed in surveys. Recommended survey items include “What is your sex?” (22, 29) or “What sex were you assigned at birth?” (30). In addition to the typical “Male” and “Female” response options, “I don’t know” and “I prefer not to answer” should also be included. Participants should choose only one answer for this question.

Gender is a multidimensional construct that has psychological, social, and behavioral dimensions, including gender identity and gender expression. Gender identity refers to a person’s internal sense of gender, and gender expression refers to how a person expresses their identity through appearance and behavior (28). The most common gender identities are man and woman, which for most people match the sex they were assigned at birth. In contrast, the gender identity of people who are transgender does not match the sex they were assigned at birth (31). Transgender is also an umbrella term that includes a wide range of gender identities, including transgender men, transgender women, queer, gender variant, transsexual, and cross-dresser (30). Recommended survey items for gender identity include “How would you describe your gender?” (32), or “How do you describe yourself?” (27, 28). The recommended response options include “Male”, “Female”, “Transgender”, or “I do not identify as female, male, or transgender” (27, 28). An “I prefer not to say” response option may also be included. Participants should choose only one answer for either of these questions.

Sex at birth and/or gender may be assessed in a TNP use survey depending on the specific interests of the study. In a two-step approach, both questions may be posed to participants, starting with the sex at birth question, followed by the gender identification question (28). For studies interested in evaluating the beliefs, perceptions, and behaviors associated with the use of tobacco and/or nicotine products in sexual minorities, additional questions aimed at identifying gender identity and gender expression may be considered. The Gay and Lesbian Alliance Against Defamation (GLAAD) Association provides a comprehensive glossary of terms that can be used to help understand and differentiate gender identities and expressions (31), and the European Union Lesbian Gay Bisexual Transgender (EU LGBT) Survey - Technical Report is a resource for methodology and survey questions that can be used to properly identify and express transgender identities (30).

2.1.1.3. Assessing sexual orientation

Sexual orientation has three main dimensions: sexual attraction, sexual behavior, and sexual identity (27, 33). The CORESTA CROM TF recommends that sexual identity be assessed using a single item, i.e., “Do you consider yourself to be...” with the following response options: “Straight”, “Lesbian or Gay”, “Bisexual”, “Something else” (22). Additional response options may include “Not sure” and “I prefer not to say”. Participants should choose only one answer to this question. If sexual attraction and/or sexual behavior are areas of interest, the Federal Interagency Working Group on Measuring Sexual Orientation Gender Identity provides an overview of current measures of sexual orientation in U.S. federal surveys (33) and is a relevant resource to identify survey items that can be used to further investigate sexual orientation. In European and other countries, sexual orientation may be considered highly confidential information.

2.1.1.4.

Assessing race and/or ethnicity

Race and ethnicity are two distinct concepts that have evolved over time (34) and have been extensively defined and re-defined in social science and epidemiologic research. While ethnicity is defined as “a group of people that identify with each other based on shared ancestry”, race is more ambiguous and has been described as “a sociopolitical construct used to categorize individuals into social groups” (27, 35). Different racial categories are anchored in a historical context of colonialism, in which individuals sharing a common race are perceived as a homogeneous group with respect to biological inheritance. However, biological inheritance is not observable; therefore, it would be more accurate to refer to the assessment of race as a perception based on external features and phenotypes (34).

Methods to evaluate race and ethnicity may vary by geographic location, which may be due to perceived sensitivity associated with survey items about race and/or ethnicity and/or differences in reporting requirements by country, traditions, and customs within each country, and/or national census classifications. For example, in U.S. national surveys, participants may be asked about race directly using a single item (e.g., “What is your race?”) that is followed by a second item asking about ethnicity, in addition to asking if the participant is Hispanic (Latino/Latina), or of Spanish origin (22). However, race may also be assessed using less direct phrasing (e.g., “Which of these groups describe you?” (21)). In U.S. national surveys, racial response categories are commonly asked as multiple-choice questions with responses that include “White”, “Black or African American”, “American Indian or Alaska Native”, “Asian”, and “Native Hawaiian or other Pacific Islander”, while ethnicity response categories include Spanish origin categories such as “Mexican”, “Puerto Rican”, “Cuban”, etc. Surveys in countries with large Asian populations can also differentiate Asian populations, such as “Asian Indian”, “Chinese”, “Filipino”, “Japanese”, etc. In the U.S., a high non-response rate has been observed among Hispanic respondents when questions about race and ethnicity are separated into two survey items. As a result, Weinberger et al. suggested combining questions about race and ethnicity into one survey item (27). They also emphasized that standard categories may not be sufficient to comprehensively capture racial/ethnic backgrounds and may be supplemented by additional categories relevant to the population of interest and the study outcomes.

In European-based surveys, race and ethnicity are not assessed under these terms. Instead, survey items ask how the respondent would describe their ancestry (24). Response categories vary according to the prevalence of race/ethnicity in the country of interest. Most European-based surveys reviewed for this project do not assess race and/or ethnicity. In international-based surveys, race and ethnicity are merged into one item that asks about the respondent’s racial/ethnic background, in which response categories are country-specific (32).

2.1.1.5.

Assessing socioeconomic status (SES)

SES is typically assessed by asking about the level of education, occupation, and income and can be measured at the individual or household level. Additional indicators may include wealth and savings (36). Indexes have also been developed and further revised to assess SES, including Duncan’s Socioeconomic Index and the economic, social, and cultural status index. The description hereafter focuses on the assessment of the three most commonly used indicators (income, level of education, and work status/occupation).

2.1.1.5.1.

Income

Income is typically assessed over the past 12 months and may be measured at the individual or household level. In U.S. surveys, response categories cover income ranges (e.g., less than $10,000, $10,000–$14,999, $15,000–$19,999, etc. (22)), and it should be specified whether the respondent should consider taxes and compulsory deductions in their response. Responses to this item may be inaccurate, depending on who in the household is answering the survey. An option to answer with either “I don’t know” or “I prefer not to say” should be included. In addition to income range, other items may include the source(s) of income (e.g., salary, self-employment, pension, etc.), the number of people living in the household, and how many family members contribute to the household income. Assessments of income may be tailored to specific countries to account for region-specific variability in economy, salary, and cost of living. Methods for developing items to assess income for harmonization across countries are available and have been applied in multinational surveys (24, 37, 38). Recommended survey questions include “Which of the following categories best describes your total household income in the past 12 months?” (22) or “Please tell me which letter describes your household’s total income, after tax and compulsory deductions, from all sources? If you don’t know the exact figure, please give an estimate.” (24). The European Social Survey (ESS) also asks respondents how they feel about their income (e.g., “living comfortably” to “finding it very difficult”) to allow for comparability among surveys.

2.1.1.5.2.

Level of education

Education systems and degrees vary considerably among countries. Therefore, the assessment of education level needs to be tailored based on regional differences. Typically, the CORESTA CROM TF recommends assessing education level using a single-choice item asking about the highest level of education achieved. Response options are usually based on a particular country’s educational system, are ordered from lowest level to highest level, and include an “I prefer not to say” option. This item may be complemented by a second item asking about the number of years of education, which would be provided in a numerical format. The level of education may be difficult to assess in the context of multi-national research because it requires a harmonization step to make the data comparable.

2.1.1.5.3.

Work status and occupation

Work status, type of work, and work environment are correlated with TNP use status. Social or cultural effects related to occupation are important determinants of smoking (39), and substantial differences in smoking prevalence have been observed across industry and occupation groups (40). Survey items assessing work status may ask about employment status (22) and alternatives to working for profit status categories, such as student, unemployed, and in compulsory military service (41). Surveys may also include categories indicating other work-related activities, such as currently seeking employment (24). Some surveys have an employment module, which includes additional questions about work status, while others include questions about the workplace environment and policy to assess the use and exposure to smoking at the workplace. The 2017–18 U.S. National Health and Nutrition Examination Survey Occupation Questionnaire also collects data on employment and variables related to the work environment (42). The survey recall period may vary from the past year to the past 7 days. Survey items assessing current occupation may be organized by labor groups. The International Standard Classification of Occupations (ISCO-08) provides a four-level hierarchically structured system for classifying and aggregating occupational information, which allows all jobs in the world to be classified into 436 unit groups (43). These unit groups form the most detailed level of the classification structure, which are then aggregated into 130 minor groups, 43 sub-major groups, and 10 major groups, based on their similarity in skill level and skill specialization required for each job. The CORESTA CROM TF recommends using this classification system, which allows for the compilation of detailed and internationally comparable data and provides summary information for only 10 groups at the highest level of aggregation.
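The hierarchical aggregation described above can be illustrated with a minimal Python sketch. It assumes that occupation responses have already been coded to four-digit ISCO-08 unit group codes and relies on the ISCO-08 convention that each higher level is identified by the leading digits of the unit group code. The function name and return keys are illustrative and are not part of the CORESTA recommendations.

```python
def isco08_levels(unit_code: str) -> dict:
    """Split a four-digit ISCO-08 unit group code into its four hierarchical levels.

    The code is handled as a string so that a leading zero is preserved.
    """
    if len(unit_code) != 4 or not unit_code.isdigit():
        raise ValueError("expected a four-digit ISCO-08 unit group code")
    return {
        "major": unit_code[:1],      # 1 of 10 major groups
        "sub_major": unit_code[:2],  # 1 of 43 sub-major groups
        "minor": unit_code[:3],      # 1 of 130 minor groups
        "unit": unit_code,           # 1 of 436 unit groups
    }


levels = isco08_levels("2221")
print(levels["major"], levels["sub_major"], levels["minor"], levels["unit"])
```

Storing the unit group code and deriving the higher levels at analysis time keeps the detailed data while still allowing reporting at the 10 major groups.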

2.1.1.5.4.

Residency

Residency is often used as an eligibility criterion. Residency may be assessed to confirm the place of residence in a specific country using a dichotomous response option (yes/no) or may suggest a list of countries from which the respondent should select one option. Country-specific items may be developed to establish the state and/or region of residence.

2.1.1.5.5.

Religion

Responses to items about religion may be country-specific. If assessment of religion is of interest to the study outcomes, the most prevalent religion(s) in the country in which the study takes place should be identified and listed as response options. Other response options should include “Other”, with the possibility to enter text only if the study budget and timeline allow for qualitative data analysis, and an “I prefer not to say” response option.

Lastly, items that assess demographics and SES allow for the identification of “vulnerable populations” (44) with regard to the use of TNPs, such as minoritized sex, non-traditional gender, and sexual orientation identities; persons with minoritized racial and ethnic backgrounds; persons with lower SES; persons with lower health literacy; and persons with mental health concerns. These groups present a higher prevalence of TNP use, are under-represented in TNP research, and may experience tobacco-related health disparities (27, 45). Therefore, if the research plans to study tobacco-related health disparities, the CORESTA CROM Task Force recommends conducting a subgroup analysis for these groups.

2.1.2.

TNP use prevalence

TNP prevalence (or product use rate) is the proportion of individuals in a population of interest who use the TNP at a specified point in time or over a specified period. Prevalence can be evaluated at a product category level (e.g., ENDS) or at a product subcategory level (e.g., e-cigarette, e-hookah, e-cigar, etc.) (see Table 2 for classification of TNPs). TNP prevalence can be measured in a variety of ways depending on the timeframe, use frequency, and other use behaviors.

2.1.2.1.

Lifetime use prevalence

Lifetime use prevalence is the proportion of individuals in a population of interest who at some point in their lives have ever used TNPs. Among survey respondents or subpopulations, examples of lifetime use prevalence include ‘ever use’ and ‘lifetime established use’ (see Figure 1). ‘Ever use’ prevalence is usually determined based on the question “Have you ever used [TNP] even one or two times?”. For emerging TNP categories, it is common to ask an awareness question first, such as “Have you seen or heard of [TNP] before this study?” (22). For lifetime established use, a typical question for cigarette smoking is “Have you smoked at least 100 cigarettes in your ENTIRE LIFE?” (23). In some surveys, the cigarette lifetime established use criterion (i.e., having smoked 100 or more cigarettes) is used to define ‘ever smoking’ (23), which should be viewed as ever-established cigarette smoking based on our definition of ‘ever use’ and ‘established use’. The Population Assessment of Tobacco and Health (PATH) Study also includes a non-numerical threshold measure of ‘Have you ever used [TNP] fairly regularly?’ to define ‘ever established use’. In addition to this non-numerical threshold of established use, suggested numerical thresholds to define ‘lifetime established use’ for other TNP categories are summarized in Table 2. CORESTA CROM Task Force recommends including lifetime established use questions with numerical thresholds or with non-numerical criteria in surveys to distinguish experimental users and established users.
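The distinction between experimental and established users can be sketched as a simple derived variable. The example below is a minimal illustration, assuming a numerical threshold such as the 100-cigarette criterion quoted above; thresholds for other TNP categories would come from Table 2 and are not reproduced here. The function name and category labels are hypothetical.

```python
def lifetime_use_status(ever_used: bool, lifetime_units: int = 0, threshold: int = 100) -> str:
    """Classify a respondent as 'never', 'experimental', or 'established'.

    threshold is the numerical lifetime established use criterion
    (100 is the cigarette example from the text; other values are assumptions).
    """
    if threshold <= 0:
        raise ValueError("threshold must be positive")
    if not ever_used:
        return "never"
    return "established" if lifetime_units >= threshold else "experimental"


print(lifetime_use_status(ever_used=True, lifetime_units=5))
print(lifetime_use_status(ever_used=True, lifetime_units=100))
```

In a survey that uses a non-numerical criterion instead (e.g., a “fairly regularly” item), the comparison against the threshold would be replaced by the yes/no response to that item.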

2.1.2.2.

Point use prevalence

Point use prevalence is the proportion of individuals in a population of interest who use a TNP at a specific point in time. An example is ‘current use’, which is usually assessed by asking ‘Do you currently use the [TNP]?’ with ‘every day’, ‘some days’, and ‘not at all’ (22, 23) or ‘daily’, ‘less than daily’, and ‘not at all’ (32) as response options.

2.1.2.3.

Period use prevalence

Period use prevalence is the proportion of individuals in a population of interest using a TNP during a given period of interest. Examples of period prevalence include TNP use over the past 7 days, past 30 days, and past 12 months. Typical questions to assess period prevalence include ‘In the past [time period], have you used [TNP] even one or two times?’ (22) and ‘How long has it been since you last smoked part or all of a cigarette?’ (21) with response options, such as ‘Within the past 30 days’, ‘More than 30 days ago but within the past 12 months’, etc. CORESTA CROM Task Force recommends measuring usage in the past 30 days as an indicator of recent use to increase harmonization in research on TNPs. However, measures to determine recent TNP use should also be selected based on the study objectives. For example, TNP use in the past 7 days would be an important measure for studies that include biomarker assessments.

Prevalence estimates for a TNP category may vary due to differences in product category descriptions, survey measures (current vs. past 30-day), and mode of survey administration (46,47,48,49). Point and period prevalence estimates are usually assessed among those who report ‘ever use’ of a TNP. ‘Current (every day or some days) use’ and ‘past 30-day use’ are the most commonly used measures for current TNP use prevalence. Studies have also shown considerable variability in prevalence when it is estimated by different frequencies of use in the past 30 days (5, 50,51,52). Amato et al. showed that a threshold of current ‘≥ 5 days during the past 30 days’ for ENDS could restrict prevalence estimates to non-experimenters because experimenters are more likely to use the product infrequently and to discontinue use (51, 52). Either a lifetime established use criterion (Figure 1) or a threshold of current use frequency can be used to exclude experimenters from individuals who are currently using the TNP. The CORESTA CROM Task Force recommends applying the lifetime established use criteria shown in Table 2 to identify experimental and established users of TNPs for use prevalence estimations.
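To illustrate how the choice of measure changes a prevalence estimate, the following minimal Python sketch computes three estimates from the same invented responses: current (every day or some days) use, any past 30-day use, and use on at least 5 of the past 30 days. The data, field names, and helper function are hypothetical, and the proportions are unweighted; a real survey analysis would normally apply sampling weights.

```python
# Invented example responses; field names are hypothetical.
respondents = [
    {"current_use": "every day", "days_used_past_30": 30},
    {"current_use": "some days", "days_used_past_30": 3},
    {"current_use": "some days", "days_used_past_30": 12},
    {"current_use": "not at all", "days_used_past_30": 0},
    {"current_use": "not at all", "days_used_past_30": 1},
]


def prevalence(sample, is_case):
    """Unweighted proportion of respondents who meet the case definition."""
    if not sample:
        raise ValueError("sample must not be empty")
    return sum(1 for r in sample if is_case(r)) / len(sample)


current = prevalence(respondents, lambda r: r["current_use"] in ("every day", "some days"))
past_30_day = prevalence(respondents, lambda r: r["days_used_past_30"] >= 1)
past_30_day_5plus = prevalence(respondents, lambda r: r["days_used_past_30"] >= 5)

print(current, past_30_day, past_30_day_5plus)
```

With these invented responses, the three definitions give different estimates for the same sample, which is the source of variability described above.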

2.2.

Recommendations based on existing descriptive CROM (Product category level domain)

Domains and subdomains in this section are usually evaluated at a product category or subcategory level. Participants should be provided with a clear description of each category of TNPs, which should be supported by product category images. The description and image(s) allow participants to differentiate among TNP categories and to prevent potential measurement errors, which may occur if respondents lack knowledge about product attributes and confuse a given product with other TNP categories. This may occur in the assessment of use behaviors associated with novel TNPs, such as heated tobacco products (HTPs), which may be confused with e-cigarettes.

2.2.1.

Consumption

The primary subdomains under consumption are the number of days used in a pre-specified time frame (e.g., in the past 30 days), units used per day on days used, and type/form of products used (e.g., disposable ENDS products, ENDS products with replaceable pre-filled cartridges or pods, tank or modular systems that can be filled with liquids). Questions about consumption can be asked for each subtype within each TNP category. The questions corresponding to subcategories, subtypes, or type/form of TNPs used are discussed for each TNP category.

2.2.2.

Number of days used in the past 30 days

The number of days used in the past 30 days measure is used to evaluate TNP use frequency. Responses to this question typically range from 0 to 30 days among current or past 30-day users. To avoid inconsistency in responses when the question is asked in conjunction with the current use question, it can be assumed that the answer is ‘30 days’ for respondents who report using a TNP ‘every day’. Therefore, only respondents who report the use of a TNP on ‘some days’ should be asked about the number of days used in the past 30 days.
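The skip logic described above can be sketched as follows. This is a minimal illustration, assuming the current use item has the response options ‘every day’, ‘some days’, and ‘not at all’; the function name is hypothetical, and returning `None` for respondents who are not asked the item is a design choice of this sketch rather than a recommendation from the text.

```python
def days_used_past_30(current_use, reported_days=None):
    """Derive the number of days used in the past 30 days.

    'every day' respondents are not asked the item and are assigned 30.
    'some days' respondents must provide a value between 0 and 30.
    Other respondents are not asked the item in this sketch (None).
    """
    if current_use == "every day":
        return 30
    if current_use == "some days":
        if reported_days is None or not 0 <= reported_days <= 30:
            raise ValueError("a value between 0 and 30 is required")
        return reported_days
    return None


print(days_used_past_30("every day"))
print(days_used_past_30("some days", 12))
```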

2.2.3.

Type/form of the TNP(s) used and units used per day on days used

The number of units used per day on the days used measure can be asked in conjunction with the number of days used in the past 30 days to obtain the past 30-day use of a TNP. The unit used in the measure is based on the subcategory or type of TNP.

Implementation of an upper limit for response options that use an interval scale (i.e., a range of units) should be considered to prevent an invalid response. For example, in the National Health Interview Survey (NHIS), a response option of ‘95’ is coded for smoking 95 or more cigarettes per day on days smoked (23). Additionally, instead of asking for numerical inputs of units used, the response option could be changed into a categorical scale. For example, response options in the National Survey on Drug Use and Health (NSDUH) include “less than one cigarette per day”, “1 cigarette per day”, “2 to 5 cigarettes per day”, “6 to 15 cigarettes per day (about ½ pack)”, “16 to 25 cigarettes per day (about 1 pack)”, “26 to 35 cigarettes per day (about 1½ packs)”, and “more than 35 cigarettes per day (about 2 packs or more)” (21).
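The two preceding paragraphs can be sketched in code: combining days used with units per day to obtain past 30-day consumption, applying an upper limit to a numerical response, and mapping a numerical response onto the categorical options quoted above. This is a minimal illustration for cigarettes; the function names are hypothetical, whole-number inputs are assumed apart from values below 1, and the default cap of 95 mirrors the NHIS example.

```python
def past_30_day_units(days_used, units_per_day):
    """Estimated units used in the past 30 days (days used x units per day on days used)."""
    if not 0 <= days_used <= 30 or units_per_day < 0:
        raise ValueError("days_used must be 0-30 and units_per_day non-negative")
    return days_used * units_per_day


def top_code(units_per_day, cap=95):
    """Cap a numerical response at an upper limit (95 mirrors the NHIS example)."""
    if units_per_day < 0:
        raise ValueError("units per day cannot be negative")
    return min(units_per_day, cap)


def cigarettes_per_day_category(units_per_day):
    """Map a cigarettes-per-day value to the categorical options quoted from NSDUH."""
    if units_per_day < 0:
        raise ValueError("units per day cannot be negative")
    if units_per_day < 1:
        return "less than one cigarette per day"
    if units_per_day == 1:
        return "1 cigarette per day"
    if units_per_day <= 5:
        return "2 to 5 cigarettes per day"
    if units_per_day <= 15:
        return "6 to 15 cigarettes per day (about ½ pack)"
    if units_per_day <= 25:
        return "16 to 25 cigarettes per day (about 1 pack)"
    if units_per_day <= 35:
        return "26 to 35 cigarettes per day (about 1½ packs)"
    return "more than 35 cigarettes per day (about 2 packs or more)"


print(past_30_day_units(days_used=12, units_per_day=4))
print(top_code(120))
print(cigarettes_per_day_category(3))
```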

As the product measurement unit would differ per product category, the CORESTA CROM TF summarizes the commonly used measures for each category below. The selection of measurement unit may depend on study objectives.

2.2.3.1.

Conventional cigarettes

Manufactured cigarettes and roll-your-own cigarettes are two common cigarette types included in survey questionnaires. The unit commonly used is the individual cigarette stick. Many surveys also remind respondents how many cigarettes are in a pack (20 in most jurisdictions) because people who smoke sometimes think of their cigarette consumption in packs.

2.2.3.2.

Cigars

Cigar subcategories include traditional cigars, cigarillos, and filter cigars. Blunts, which are modified cigars of any type in which the tobacco is removed and replaced with marijuana, are sometimes of interest when studying cigar usage. The unit commonly used is the individual cigar or cigarillo.

2.2.3.3.

ENDS products

Development of standardized self-report survey measures of ENDS product consumption is challenging due to the several forms of ENDS products that are available (5, 6, 53,54,55). The type of ENDS product used is critical to understanding how ENDS products are consumed (6). There are currently three major types of ENDS products: disposable ENDS products, ENDS products with replaceable pre-filled cartridges or pods, and tank or modular systems that can be filled with liquids. In addition to these three major types, there are rechargeable and non-rechargeable devices. Other relevant and important measures for ENDS consumption include the use of products with or without nicotine and nicotine concentration, if applicable (6), as different product use behaviors have been observed when comparing nicotine-containing and non-nicotine-containing products (56).

Currently, there is little research to demonstrate the reliability and validity of the various unit measurements for ENDS products. Liu et al. conducted a qualitative assessment of e-cigarette use and concluded that ‘number of times and/or puffs taken in a day’ is the most common approach to describe quantity used compared to device-specific terms (i.e., replacement of disposable devices, cartridges/pods, use of e-liquid) and perceived equivalence to a quantity of traditional cigarettes (55). In the PATH Wave 4 Survey Questionnaire, the ‘number of times’ used measure was defined as the ‘number of times one picks up the ENDS product to use it’ (22). The questions in PATH Wave 4 were asked as follows: “On average, on the days that you use, how many times each day do you pick up your electronic nicotine product to use it, whether you take one puff or several?” and “Each time you pick up your electronic nicotine product to use it, about how many puffs do you take?”. The combination of responses to these two questions accounts for potential differences in daily use patterns. Additional device-specific unit measures for ENDS products include number of disposable ENDS products used, number of replaceable prefilled ENDS cartridges used, the frequency of filling the ENDS product with e-liquid, and the number of milliliters of e-liquid the device holds. CORESTA CROM Task Force recommends the number of times or use occasions and puffs per time or use occasion measure if the research objective is to study overall ENDS category-level consumption and/or to report usage patterns. However, it is worth noting that a recent study has shown that the number of puffs per use occasion may be underestimated. The number of puffs may indicate relative heaviness of use across individuals but may not be a reliable measure to quantify the amount of nicotine taken over the course of several days (57). Device-specific unit measures should be considered if the research objective is to study a specific ENDS product or subcategory.
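As a minimal sketch of how the two PATH-style responses could be combined, the example below multiplies the number of times the product is picked up per day by the number of puffs taken each time. The function name is hypothetical, and, as noted above, self-reported puffs may be underestimated, so the result is better read as a relative indicator of heaviness of use than as a measure of nicotine intake.

```python
def estimated_puffs_per_day(times_per_day, puffs_per_time):
    """Combine the two self-reported items into an estimated number of puffs per day of use."""
    if times_per_day < 0 or puffs_per_time < 0:
        raise ValueError("responses cannot be negative")
    return times_per_day * puffs_per_time


daily_puffs = estimated_puffs_per_day(times_per_day=12, puffs_per_time=4)
print(daily_puffs)
```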

2.2.3.4.

Smokeless tobacco products

Smokeless tobacco product subcategories include moist snuff/dip, dry snuff, snus, and chewing tobacco. Two general types of smokeless tobacco include loose smokeless tobacco products and smokeless tobacco products in pouches. The units commonly used for smokeless tobacco products include number of times used, the number of pouches used for smokeless tobacco in pouches, and the number of cans used for loose smokeless tobacco products. CORESTA CROM Task Force recommends the number of times or occasions of use measure if the research objective is to study smokeless category-level consumption and report general usage. However, subcategory-specific units should be specified if the research objective is to study a specific smokeless product or a subcategory, such as use of a pouched product.

2.2.3.5.

Nicotine-containing tobacco-free oral products

Tobacco-free oral nicotine products are available in pouches and other forms. CORESTA CROM Task Force recommends the number of times/occasions measure if the research objective is to study general usage of tobacco-free nicotine-containing oral products. However, form-specific units (numbers of pouches, chewable pieces, lozenges, etc.) should be used if the research objective is to study a specific product or a subcategory.

2.2.3.6.

Heated tobacco products (HTPs)

The unit commonly used for HTPs is the tobacco stick, which is specifically engineered to be heated to temperatures below the point of combustion by a battery-powered holder (58, 59).

2.2.4.

Brand usage

TNP users who have a regular brand or who own a TNP are typically asked about the brand of TNP they use most often or used last. The CORESTA CROM Task Force recommends asking about the brand used most often in the past 30 days to assess brand usage. When applicable, product images can be provided to facilitate the selection.

2.2.5.

Flavor usage

As with brand usage, respondents can be asked about the first flavor used, the flavor(s) used most often, the usual flavor, or the last flavor used, in addition to the flavor(s) they used in the past 30 days. First flavor(s) used upon initiation of use of a TNP (e.g., ENDS) has been studied to evaluate trends associated with TNP experimentation, subsequent tobacco use, and TNP use progression (60, 61). The CORESTA CROM Task Force recommends asking about flavor(s) used in the past 30 days as individuals using TNPs are likely using multiple flavors, especially for emerging TNP categories. When applicable, product images can be provided to facilitate the selection of flavor usage. When conducting secondary analysis of survey data, researchers should be aware that flavors may be misclassified depending on the brand selected by survey participants. For example, Villanti et al. (62) observed inconsistencies in how people who smoke cigarettes reported use of menthol vs. non-menthol products: individuals could report that their usual brand was non-menthol even though at least 99% of sales of the brand they selected were menthol.

2.2.6.

Initiation

Initiation of use of a TNP generally refers to the first use of that product. Commonly used survey measures to study TNP initiation include age/year of first use, age/year of first daily/regular use, and length of time as a daily/regular user of the product. Information on the first TNP a respondent tried can be used to understand an individual’s TNP use trajectory. Current and established use, subcategories, types, and flavors as mentioned in previous sections are also relevant measures to understand TNP use initiation.

2.2.7.

Cessation

TNP cessation occurs when an individual stops using a TNP after having used the product to at least its lifetime established use criterion (i.e., after established use). Former established users are often asked how long it has been since they last used the product and if they have completely stopped using it. Based on when the product was last used, respondents could be categorized as ‘recent former users’ or ‘long-term former users’ (Figure 1). Respondents could be asked about TNP use patterns before cessation and alternate TNPs used to better understand events that may lead up to the cessation of TNPs.

2.2.8.

Quit-related measures

Attempts to quit the use of a TNP (“quit attempts”) refer to instances in which an individual stopped using the product for > 1 day during a specified time frame (e.g., past 12 months) because they were trying to quit using the product (63). Quit attempts are considered an important intermediate step in TNP cessation (64). Additional relevant survey items include ever tried to quit, interested in quitting, number of quit attempts over a timeframe, duration and recency of quit attempt(s), methods used in quit attempts or in successful quitting, and attempts to decrease consumption during the (recent) quit attempt.

Various instruments have been developed to measure intention to quit smoking, such as the Stages of Change measure (65), the Motivation to Stop Scale (MTSS), and a Likert scale measure (66). The MTSS developed by Kotz, Brown, and West has been shown to provide a strong prediction of attempts to quit smoking and is a candidate to monitor a user’s level of intention to quit smoking (67). The MTSS has also been shown to have comparable construct and predictive validity compared to other instruments (66). Additionally, readiness to quit can be studied using a pre-specified time frame of planning to quit the product or based on the stages of change in the process of quitting (68).

2.2.9.

Relapse/re-initiation

Relapse and re-initiation are terms often used to refer to use of a TNP after a period of abstinence (e.g., 1 year). In general, use in this context refers to current use of the TNP.

2.3.

Recommendations based on existing Descriptive CROM (poly/cross-category level domain)

2.3.1.

Dual/poly usage

The growing diversity of TNPs available in the market has led to an increase in prevalence of concurrent use of two or more TNPs (69, 70). In epidemiological studies, dual and poly use are typically derived variables and are often operationalized based on measures of the current use of TNPs. Dual use is typically defined as concurrent use of two TNPs from different TNP categories or subcategories (e.g., dual use of cigarettes and ENDS, dual use of ENDS and HTPs, etc.). Similarly, poly use is usually defined as using three or more TNPs concurrently. Due to the heterogeneity found among dual users, dual users can be further categorized into subgroups, such as ‘Dual Daily’, ‘Predominant A’, ‘Predominant B’, and ‘Concurrent Non-Daily’ states (13,14,15,16,17) (Figure 1).
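Because dual and poly use are typically derived variables, their derivation can be sketched in a few lines. The first function below follows the definitions above (two categories for dual use, three or more for poly use). The second function is an assumption of this sketch: the text names the four dual use subgroups but does not define them here, so the mapping from daily/non-daily frequency of each product to a subgroup is one common reading and should be checked against Figure 1 and the cited sources. All function names and the ‘exclusive use’ and ‘no current use’ labels are hypothetical.

```python
def concurrent_use_status(current_products):
    """Classify a respondent by the number of distinct TNP categories currently used."""
    n = len(set(current_products))
    if n == 0:
        return "no current use"
    if n == 1:
        return "exclusive use"
    if n == 2:
        return "dual use"
    return "poly use"


def dual_use_subgroup(freq_a, freq_b):
    """Assign a dual user to a subgroup from the use frequency of products A and B.

    Assumed mapping (not defined in the text): both daily -> 'Dual Daily';
    only A daily -> 'Predominant A'; only B daily -> 'Predominant B';
    neither daily -> 'Concurrent Non-Daily'.
    """
    valid = {"daily", "non-daily"}
    if freq_a not in valid or freq_b not in valid:
        raise ValueError("frequency must be 'daily' or 'non-daily'")
    if freq_a == "daily" and freq_b == "daily":
        return "Dual Daily"
    if freq_a == "daily":
        return "Predominant A"
    if freq_b == "daily":
        return "Predominant B"
    return "Concurrent Non-Daily"


print(concurrent_use_status(["cigarettes", "ENDS"]))
print(dual_use_subgroup("daily", "non-daily"))
```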

2.3.2.

Switching and transitions

‘Transition’ refers to a change in a use state based on the TNPs used before and currently. ‘Switching’ refers to completely transitioning from the current TNP to another TNP. Individuals who switch may be a subpopulation of quitters who no longer use the product that they used before. Transition or switching behaviors are sometimes directly evaluated using retrospective measures in cross-sectional surveys or they may be derived based on use of one or more TNPs at each time point in longitudinal surveys.
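For longitudinal surveys, a transition can be derived by comparing the set of TNPs used at two time points. The sketch below is a minimal illustration of that idea: ‘switched’ is assigned only when none of the products used before are still in use and at least one other product is used now, in line with the definition of switching above. The function name and all category labels other than ‘switched’ are hypothetical.

```python
def classify_transition(before, after):
    """Derive a transition label from the TNPs used at two survey waves."""
    before, after = set(before), set(after)
    if before == after:
        return "no change"
    if not before:
        return "initiated use"
    if not after:
        return "stopped all use"
    if before.isdisjoint(after):
        # None of the earlier products is still used: a complete transition.
        return "switched"
    return "other transition"


print(classify_transition({"cigarettes"}, {"ENDS"}))
print(classify_transition({"cigarettes"}, {"cigarettes", "ENDS"}))
```

A respondent who adds a second product while continuing the first is labeled ‘other transition’ here; a study would normally define finer categories (e.g., a move into dual use) to match its objectives.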

2.4.

Recommendations for selecting an existing Descriptive CROM

The selection of CROM should be based on the target study population, study objectives, and research hypothesis, and/or driven by regulatory requirements. When considering the most appropriate Descriptive CROM for a study, it is always recommended that researchers start by clearly defining what needs to be measured to facilitate CROM selection. The researcher should be as specific as possible when defining what needs to be measured, considering the study in which the Descriptive CROM will be used. For example, a researcher may need to measure the number of days participants who smoke cigarettes in an actual use study used the candidate product during the past week. This definition refers to several study-specific factors, including:

1) The target population, or end-users of the Descriptive CROM (“participants who smoke cigarettes”, which would be further defined in the study protocol);
2) The timeframe (recall period) (“during the past week”);
3) The behavior to be measured (“use of the candidate product”); and
4) The units of measurement (“number of days”).

Some constructs (e.g., health and functioning ⁽³⁾) may require either Psychometric or Descriptive CROM, depending on what the researcher intends to measure. If Psychometric CROM are needed, the reader is referred to the “Consumer-Reported Outcome Measure (CROM) Best Practices and Guidelines with Respect to Psychometric CROM for Use in Research on TNPs” (71). If the recommended Descriptive CROM is not appropriate for the purposes of the study, it is recommended that the researcher select Descriptive CROM that have evidence of validity from peer-reviewed literature or national/international surveys.

Section 3: Development and modification of Descriptive CROM

If the researcher is not able to identify an existing Descriptive CROM appropriate for the study, an existing CROM may need to be modified or a new Descriptive CROM may need to be developed to fit the study’s requirements (e.g., modifying the recall period, modifying the CROM to reference a different product category, etc.). The next sections discuss recommendations and best practices for modifying existing descriptive CROM and recommendations for developing and validating a new Descriptive CROM.

3.1.

Modification of an existing CROM

Although it may seem inconsequential to modify a Descriptive CROM and use the modified CROM without testing, modifications can, and are often intended to, influence participant responses and the validity of the data being collected. Modifying existing CROM is often necessary (e.g., adapting an item to reference a new product category, changing the product image that is included in a product use item, etc.), and researchers are strongly encouraged to consider if and how the modification might impact the validity of the CROM before implementing the CROM in research. Depending on the extent of the modification and whether the modification could reasonably be expected to impact participant responding, researchers may utilize qualitative and/or quantitative approaches to evaluate the modified item before use. This section discusses the types and the extent of modifications that can be made to CROM, as well as strategies that can be used to gather evidence to support the modification. Prior to modifying any CROM, copyright clearance should be obtained if applicable.

The researcher may modify an existing Descriptive CROM in various ways to make it fit the needs of their study, such as modifying the CROM content, administration, and/or application. These types of CROM modifications are further defined in Table 3. In practice, a CROM modification may impact multiple areas. For example, modifying a Descriptive CROM pertaining to consumption of cigarettes to reference consumption of ENDS and administering it to people who use ENDS (as opposed to people who smoke) involves modifications to both content and application. Depending on the type and extent of the modifications and the content of the CROM, it may be helpful for the researcher to consult the literature, SMEs, or individuals representing the end-users of the CROM (the intended population of respondents to whom the CROM will be administered) when revising the CROM.

Table 3. Types of Descriptive CROM modifications.

Type of modification	Illustrative examples (non-exhaustive)
Content: Modifying the instructions, items, and/or response options	Removing or introducing a response option of “I don’t know”; Changing the number of response categories (e.g., increasing the granularity of an item asking about household income); Changing response category labels; Changing instructions and/or item content to reference a different product category (e.g., “Electronic Nicotine Delivery Systems (ENDS)” instead of “cigarettes”) or a specific brand; Changing language/terminology (e.g., changing “e-vapor” to “e-cigarettes”); Adding product images to items asking about use of that product; Changing the recall period (e.g., “in the past 30 days” to “in the past 7 days”)
Administration: Changing the mode, method, and/or format of administration	Electronically administering a Descriptive Consumer-Reported Outcome Measure (CROM) developed for paper-and-pencil administration; Changing the method of administration from self-completed to interviewer-administered; Changing the order of item administration (e.g., items asking about the use of different tobacco- and nicotine-containing products (TNPs) are presented in a random order instead of a fixed order)
Application: Applying the CROM in a new way, such as to a new population or product (other than that for which it was originally developed/validated)	Modifying and applying measures of cigarette consumption to the consumption of a new TNP category ^a; Translating a Descriptive CROM into a different language and administering it to a new population (i.e., individuals whose primary language differs from the languages for which the CROM has been validated); Administering a CROM to individuals from another culture (i.e., individuals whose cultural background differs from that of the individuals for whom the CROM was originally validated)

^a This would be an example of modifying CROM content as well; as stated above, it is not uncommon for different types of modification to occur in tandem.

CROM modifications also differ with respect to the extent of the modification. In theory, modifications fall on a continuum and range from very minor to substantial, with some substantial modifications departing so far from the original CROM that the modified CROM should be considered a new CROM (in these circumstances, the researcher is advised to follow the guidance in Section 3.2. Development and validation of a new or substantially modified CROM). Within the context of these guidelines, we adopt definitions of “Minor” and “Substantial” modifications taken from “Consumer-Reported Outcome Measure (CROM) Best Practices and Guidelines with Respect to Psychometric CROM for Use in Research on TNPs” (71). Minor modifications are modifications that are not reasonably likely to impact end-users’ interpretation of CROM content and response to the CROM, above and beyond changes to interpretation and response that result from improving clarity/reducing measurement error. Substantial modifications could reasonably change end-users’ interpretation of the CROM content and response to the CROM items. Illustrative, non-exhaustive examples of Minor and Substantial modifications that could be made to Descriptive CROM are presented in Table 4. Distinguishing between Minor and Substantial modifications is important because these classifications are linked to different recommendations pertaining to the need for empirical evidence to support the modification(s). For Minor modifications, additional evidence is generally not needed to support the modification. Additional qualitative evidence may still be helpful in some circumstances, such as evidence from cognitive interviews (discussed below in Section 3.2. Development and validation of a new or substantially modified CROM). Conversely, qualitative evidence, such as evidence of content validity from individual cognitive debriefing interviews, is typically recommended to support Substantial modifications. In some instances, the researcher may also decide to collect quantitative evidence, such as evaluating convergent and discriminant validity, predictive validity, known-groups validity, test-retest reliability, etc., to help support a substantially modified CROM. However, in most cases, quantitative evidence is not necessary, and qualitative evidence is sufficient to support the modification.

Table 4. Recommendations pertaining to CROM modifications.

Modification	Minor	Substantial
Examples	Making the text bold and underlining the recall period in the instructions (“In the past 7 days”) for visibility and emphasis; Changing font size or font style; Adding clarifying language to an item or instruction; Adding an image of the product being referenced; Adding an “I don’t know” response option; Administering a paper-and-pencil consumer-reported outcome measure (CROM) electronically without changing the presentation of the CROM; Changing the item to reference a different brand	Substantially changing the content of the CROM (e.g., changing the consumption measure of moist smokeless tobacco from times per day to cans per day); Applying the CROM to tobacco- and nicotine-containing products (TNPs) for which it was not developed; Modifying and administering the CROM to a population for which it was not developed; Translating a CROM into a new language and administering it to this new cultural population
Recommended approach(es) to support modification	Generally, no evidence is needed; In certain circumstances, qualitative evidence may be helpful (e.g., to ensure that new clarifying language added to instructions is clear); Usability testing may be helpful in some circumstances, such as when modifying a paper-and-pencil CROM for electronic administration	Qualitative evidence would likely be helpful and is generally recommended to support the modification. Qualitative evidence may be particularly helpful in the following circumstances: (i) if CROM content is substantially changed ^a; (ii) if responses from two versions of a CROM are being directly compared in a study; (iii) when administering a CROM to a new population and/or applying the CROM to a new product, and such changes could reasonably impact respondents’ interpretation of the CROM and response to the CROM; (iv) when translating a CROM into a new language ^b. In some instances, quantitative evidence may also help support the modification.

^a Depending on the modification, qualitative evidence is generally helpful to ensure that participants understand the new content. For example, response options may be too granular for participants to respond accurately (e.g., the exact number of cigarettes smoked in the past 30 days, the household income from last year), or recall periods may be inappropriate (asking participants to recall their product use from several years ago may yield inaccurate responses due to limitations with memory).

^b It is generally recommended that the researcher work in close collaboration with an expert or organization specialized in linguistic services to determine and execute the most appropriate linguistic and cultural validation strategy for developing or modifying an existing CROM.

The responsibility of determining the classification of a modification as Minor or Substantial and defending the decision to collect or not to collect evidence to support the modification ultimately falls on the researcher and should be justified. While some modifications may seem to fall between Minor and Substantial, into a “Moderate” category, the researcher will need to decide whether to take the more conservative approach and collect evidence to support the modification (consistent with the recommendations in Table 4 for Substantial modifications), or decide that additional evidence to support the modification is not warranted (following the recommendations for Minor modifications in Table 4). It is also worth noting that if modifying an existing CROM from the literature or elsewhere that does not have any evidence of validity, collecting qualitative evidence to support the modified CROM is generally recommended.
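As an illustration only, the decision logic described above and in Table 4 can be sketched in Python. The names `Extent` and `recommended_evidence`, the `conservative` flag, and the returned labels are hypothetical conveniences introduced here and are not part of these guidelines; the sketch is a simplification and does not replace the researcher’s judgment or the need to justify the decision.

```python
from enum import Enum


class Extent(Enum):
    """Hypothetical labels for the extent of a CROM modification."""
    MINOR = "minor"
    SUBSTANTIAL = "substantial"
    MODERATE = "moderate"  # grey zone between Minor and Substantial


def recommended_evidence(extent, original_has_validity_evidence=True,
                         conservative=True):
    """Return the evidence types generally recommended (illustrative only)."""
    if extent is Extent.MODERATE:
        # The researcher decides which set of recommendations to follow.
        extent = Extent.SUBSTANTIAL if conservative else Extent.MINOR

    if extent is Extent.SUBSTANTIAL:
        evidence = ["qualitative", "quantitative (optional)"]
    else:
        evidence = []  # Minor: generally no evidence is needed

    # If the original CROM has no evidence of validity, qualitative
    # evidence is generally recommended whatever the extent of the change.
    if not original_has_validity_evidence and "qualitative" not in evidence:
        evidence.insert(0, "qualitative")
    return evidence


print(recommended_evidence(Extent.MODERATE))
```

The sketch deliberately omits the case-by-case considerations in Table 4 (e.g., usability testing for electronic administration), which would need to be weighed separately.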

3.2. Development and validation of a new or substantially modified CROM

3.2.1. Drafting CROM content

In some instances, the researcher may determine that, after clearly defining what they intend to measure, a new Descriptive CROM is needed to meet the needs of their study. Depending on the CROM content, consulting the literature, SMEs, and/or individuals representing the end-users of the CROM (the intended population of respondents to whom the CROM will be administered) may be helpful when drafting the new CROM content. Recommendations to consider when drafting a new CROM are described in Table 5 (71, 72, 73).

Table 5. Recommendations to consider when drafting a new CROM.

Global recommendations	CROM instructions/item content	Response options
Use simple language (be cognizant of reading level ^a,b) and avoid technical terminology, slang, idiomatic expressions, or colloquialisms (if possible); Use direct, unambiguous language; Avoid leading questions and biasing language; Use of images can be helpful to aid comprehension/reduce confusion	Each item should communicate a single concept (no “double-barreled” items); Avoid hypothetical questions; Recall period should be relevant and appropriate	Response options should be appropriately labeled and relevant; Response options should be appropriately granular (to meet the study’s needs, while also balancing limitations in participants’ memory, for example); Response options should cover the full range of potential responses (a response of “another reason not listed” [or another similar response option] may be helpful); Avoid response option labels that may bias the direction of the responses; Use of “not applicable” should be avoided when possible (items should be applicable for participants, and skip patterns can be used to avoid administering items to participants for whom they are truly not applicable); “I don’t know” (or other similar response options) should be visually distinct from the other response options, and should be placed last in the response set

^a The researcher can assess reading level (Flesch-Kincaid grade level) using a feature in Microsoft Word or another program.

^b FDA TPPI Guidance 2022 (7) recommends that the reading level be “appropriate for those with less than a high school education” (p. 14).
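For researchers who wish to screen draft items programmatically, the Flesch-Kincaid grade level mentioned in the Table 5 footnote can be approximated with a short script. The following is a minimal sketch in Python: it uses the standard formula (0.39 × words per sentence + 11.8 × syllables per word − 15.59), but the syllable counter is a crude vowel-group heuristic assumed here for illustration, so results may differ from those reported by Microsoft Word or other programs. The function names and the two example items are hypothetical.

```python
import re


def count_syllables(word: str) -> int:
    """Rough syllable estimate: count groups of consecutive vowels."""
    word = word.lower()
    count = len(re.findall(r"[aeiouy]+", word))
    # Crude adjustment for a trailing silent "e" (assumption, not exact).
    if word.endswith("e") and not word.endswith("le") and count > 1:
        count -= 1
    return max(count, 1)


def flesch_kincaid_grade(text: str) -> float:
    """Approximate Flesch-Kincaid grade level of a piece of text."""
    sentences = [s for s in re.split(r"[.!?]+", text) if s.strip()]
    words = re.findall(r"[A-Za-z']+", text)
    if not sentences or not words:
        raise ValueError("text must contain at least one sentence and one word")
    syllables = sum(count_syllables(w) for w in words)
    return (0.39 * len(words) / len(sentences)
            + 11.8 * syllables / len(words)
            - 15.59)


# Hypothetical draft items, for illustration only.
simple_item = "Have you ever used this product? Think about the past 7 days."
complex_item = ("Considering your habitual utilization of electronic nicotine "
                "delivery systems, indicate the approximate frequency of "
                "consumption.")

print(round(flesch_kincaid_grade(simple_item), 1))
print(round(flesch_kincaid_grade(complex_item), 1))
```

A score of this kind is only a screening aid; whether respondents actually understand an item is better established through cognitive debriefing interviews.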

3.2.2. Qualitative and quantitative strategies to collect validity evidence of a new CROM

Once the CROM is drafted, several qualitative and quantitative strategies can be used to evaluate the psychometric functioning of the new Descriptive CROM. There is no single “correct” approach for validating Descriptive CROM. The most appropriate approach for collecting validity evidence of a new CROM will depend on the CROM content and should be determined in consultation with experts in measurement. It is not always necessary to formally evaluate the psychometric functioning of a new Descriptive CROM. Descriptive CROM that use simple, direct, unambiguous language to measure straightforward characteristics or behaviors that are not likely to be misunderstood by the respondent may not need evidence of validity. For example, a question asking whether the participant has ever used “X” product even once (with an image of the product) is likely “face valid”, and collecting validity evidence for this item is likely unnecessary. Conversely, it is best to collect validity evidence when an item asks about characteristics or behaviors that could potentially be misinterpreted or interpreted differently across participants (e.g., whether participants have used a product “fairly regularly” or some other qualitative descriptor of product use behavior). Researchers are encouraged to consult measurement experts when deciding whether qualitative and/or quantitative psychometric validation is necessary for a new Descriptive CROM, and this decision will need to be justified. When in doubt, the more conservative approach is to collect qualitative and/or quantitative data to support the new Descriptive CROM.

Cognitive debriefing interviews are a commonly used approach to evaluate the content validity of a new CROM (or a substantially modified CROM). These interviews are conducted to verify that the CROM language is clear and is understood as intended (e.g., for a Descriptive CROM that asks about using a product “fairly regularly”, is the phrase “fairly regularly” understood as intended?), that the recall period is appropriate, that the response options are appropriate, etc. Issues or opportunities for improvement identified during cognitive testing can also be used to refine the CROM. The participants included in cognitive testing should represent the end-users of the CROM, and should be appropriately diverse with respect to demographics, TNP use history, etc. The researcher may choose to oversample specific groups of interest (e.g., individuals with low health literacy and individuals with TNP use histories) to ensure that the perspective of individuals from these groups is captured during cognitive interviews. For example, a Descriptive CROM may be accurately understood among those with adequate health literacy but may be misunderstood among those with limited health literacy, who represent an important segment of end-users of the CROM. The mode and method of administration should also be considered, and the new or modified CROM should be tested using all available modalities to avoid needing to modify it for alternative administration methods in the future. Many books, articles, and guidance documents exist to provide interested readers with greater detail regarding the conduct of these interviews and the analysis of cognitive interviewing data (74, 75).

In some circumstances, the researcher may decide to collect quantitative validity evidence of a new Descriptive CROM. There is no “standard” approach that is appropriate for all Descriptive CROM, and the psychometric properties to be evaluated and the analyses chosen to evaluate these properties will depend on various factors, such as the content and purpose of the CROM. For example, a participant’s response to a Descriptive CROM asking whether the participant has ever been diagnosed with asthma should not vary over a brief period (e.g., 1 week), unless the participant received a diagnosis between assessment points. Conversely, a participant’s response as to the number of cigarettes smoked each day may indeed vary over time. In the first example, evaluating test-retest reliability of this asthma CROM over a 3-day period would be appropriate, as the participants’ responses would largely be expected to be stable over this brief period and high consistency between responses would support the reliability of the new CROM. Conversely, in the latter example (cigarettes smoked per day), the researcher would not be interested in evaluating test-retest reliability of this CROM because participants’ responses are expected to fluctuate. Other examples of psychometric properties of Descriptive CROM that might be evaluated include convergent validity (evidence that the new Descriptive CROM is related to other measures that it should theoretically be related to) and discriminant validity (evidence that the new Descriptive CROM is not related to other measures that it should not theoretically be related to). Convergent validity includes both concurrent validity (evidence that the new Descriptive CROM is related to another measure that it should be related to, and these two measures are completed at the same time [e.g., within the same survey]) and predictive validity (evidence that the new Descriptive CROM is related to another measure that it should be related to, assessed at a later point in time). Known-groups validity (when responses to the new Descriptive CROM vary as expected between groups of respondents) is another type of quantitative validity evidence that may be helpful to evaluate to support a new Descriptive CROM.
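To make the test-retest example concrete, consistency between two administrations of a yes/no item (such as the asthma item above) can be summarized with simple agreement statistics. The sketch below, in Python, computes percent agreement and Cohen’s kappa (agreement corrected for chance). The choice of Cohen’s kappa is an assumption made here for illustration, not a statistic prescribed by these guidelines, and the response data are invented.

```python
from collections import Counter


def percent_agreement(time1, time2):
    """Proportion of participants giving the same response at both times."""
    if len(time1) != len(time2) or not time1:
        raise ValueError("inputs must be non-empty and of equal length")
    return sum(a == b for a, b in zip(time1, time2)) / len(time1)


def cohens_kappa(time1, time2):
    """Chance-corrected agreement between two administrations."""
    n = len(time1)
    observed = percent_agreement(time1, time2)
    counts1, counts2 = Counter(time1), Counter(time2)
    # Agreement expected by chance, from the marginal response frequencies.
    expected = sum(counts1[k] * counts2[k]
                   for k in set(counts1) | set(counts2)) / (n * n)
    if expected == 1.0:
        raise ValueError("kappa is undefined when chance agreement is 1")
    return (observed - expected) / (1 - expected)


# Invented responses from 10 participants, 3 days apart.
day0 = ["yes", "no", "no", "yes", "no", "no", "no", "yes", "no", "no"]
day3 = ["yes", "no", "no", "yes", "no", "no", "no", "no", "no", "no"]

print(percent_agreement(day0, day3))
print(round(cohens_kappa(day0, day3), 2))
```

The appropriate statistic, sample size, and retest interval depend on the CROM and should be chosen in consultation with measurement experts.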

As another strategy to support the validity of a new Descriptive CROM, participants’ responses to the new Descriptive CROM can be compared against “real-world” data (results from medical tests, diagnoses from medical charts, amount of product that the participant was directly observed to use during an in-clinic product use assessment, etc.). Of course, whether this is relevant or appropriate will depend on the new CROM, and this approach is not always feasible for many reasons (e.g., the confidentiality of medical records, the additional cost that is often associated with collecting or accessing real-world data, etc.). Although not a strategy to collect validity evidence, usability testing may be helpful in certain circumstances, such as when modifying CROM formatting developed for a full-sized computer screen or a paper survey to fit a small-screen electronic device (smartphone).
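Where directly observed data are available, the comparison of self-reported and “real-world” values described above can be summarized numerically. The Python sketch below reports the mean difference (bias) between self-reported and observed amounts together with approximate 95% limits of agreement, in the spirit of a Bland-Altman analysis. This is one possible approach assumed here for illustration; the function name and the data are hypothetical.

```python
from statistics import mean, stdev


def agreement_summary(self_reported, observed):
    """Mean difference and approximate 95% limits of agreement."""
    if len(self_reported) != len(observed) or len(self_reported) < 2:
        raise ValueError("need two equal-length inputs with at least 2 values")
    diffs = [s - o for s, o in zip(self_reported, observed)]
    bias = mean(diffs)        # negative values indicate under-reporting
    spread = stdev(diffs)     # sample standard deviation of the differences
    return {
        "bias": bias,
        "lower_limit": bias - 1.96 * spread,
        "upper_limit": bias + 1.96 * spread,
    }


# Invented data: units of product reported vs. observed in clinic.
reported = [10, 8, 12, 5, 9]
observed = [11, 8, 14, 6, 9]

summary = agreement_summary(reported, observed)
print(summary)
```

How much disagreement is acceptable depends on the purpose of the CROM and would need to be justified by the researcher.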

3.3. Multinational, multiregional, multicultural (3MC) research

Large-scale surveys are sometimes conducted at multinational, multicultural, and/or multiregional levels. The main objective of 3MC research is to compare data and outcomes among countries, “cultures”, and/or regions. A primary challenge of 3MC research lies in maximizing the comparability, quality, and equivalence of items and constructs across surveys, because languages, social systems, “cultures”, and societies are complex and differ both across and within countries.

Ensuring measurement equivalence ⁽⁴⁾ in 3MC survey research is necessary to produce data that are comparable across contexts (e.g., countries and cultures). Comparison errors arise when comparing survey data collected in several countries/regions if the questionnaire content (instruction, item stem, items, response options) and/or methods of conducting the survey differ. This affects data quality and the interpretability of the results and related research conclusions.

Ensuring linguistic equivalence is needed to make data comparable by ensuring that each aspect of a questionnaire (i.e., questionnaire name, instructions, item stems, items, response options) when presented in two or more languages conveys the same meaning and can be understood equally by respondents from each language. Simple literal translations, even when correct, may not be enough to ensure equivalence. The basic principles of linguistic equivalence include:

ensuring consistency in question formulation across different languages;
using simple vocabulary;
conveying the same meaning in questions across different languages;
being consistent in the structure and layout of the questionnaire across different languages;
ensuring consistency in the structure of response scales and continuums.

If the participants in 3MC research are asked to provide data that may differ across groups, it is necessary to use crosswalks to ensure comparability. See references (38, 76, 77) for best practices for ensuring data comparability.
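As a simple illustration of a crosswalk, country-specific response categories can be mapped onto a shared set of harmonized categories before data are pooled or compared. The Python sketch below uses educational attainment as an example; the country codes, category labels, and harmonized levels are all hypothetical and are not drawn from these guidelines or from the cited references.

```python
# Hypothetical crosswalk: country-specific labels -> harmonized level.
EDUCATION_CROSSWALK = {
    "A": {
        "Did not complete secondary school": "low",
        "Secondary school certificate": "medium",
        "University degree": "high",
    },
    "B": {
        "Primary schooling only": "low",
        "Upper secondary diploma": "medium",
        "Tertiary qualification": "high",
    },
}


def harmonize(country: str, response: str) -> str:
    """Map a country-specific response to its harmonized category."""
    try:
        return EDUCATION_CROSSWALK[country][response]
    except KeyError:
        # Fail loudly rather than silently dropping unmapped responses.
        raise KeyError(
            f"no crosswalk entry for {country!r} / {response!r}"
        ) from None


records = [
    ("A", "Secondary school certificate"),
    ("B", "Upper secondary diploma"),
    ("B", "Tertiary qualification"),
]
harmonized = [harmonize(country, response) for country, response in records]
print(harmonized)
```

Raising an error for unmapped responses is a design choice in this sketch: it makes gaps in the crosswalk visible during data processing instead of producing missing values that could bias comparisons.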

DISCUSSION

The guideline provides a conceptual introduction to Descriptive CROM, focusing on providing consistent definitions and descriptions of product and exposure categories and patterns of use. The Descriptive CROM WG engaged consultants from various sectors with subject matter expertise related to research methods in general and research methods specific to TNPs. With these engagements, the CORESTA CROM TF developed recommendations for Descriptive CROM based on existing Descriptive CROM from surveys that include modules to assess the use of TNPs. The consensus Descriptive CROM, along with consistent definitions, could facilitate comparisons across studies, aggregation of data sets, and eventually, improve harmonization in research findings.

With the changing TNP landscape, developing and identifying optimal Descriptive CROM becomes challenging, especially for emerging TNPs. The CORESTA CROM Task Force recommends modifying an existing Descriptive CROM, when suitable, before developing a new Descriptive CROM. When modifying an existing Descriptive CROM, depending on the type and extent of modifications needed, qualitative and/or quantitative evidence should be gathered to support the modification. The development and validation of Descriptive CROM should follow a well-established framework with an initial qualitative assessment phase followed by a quantitative phase to assess the validity, reliability, and other properties of the CROM as deemed necessary. Lastly, proper survey instrument design and post-survey data processing are also essential components to ensure Descriptive CROM data quality.

CONCLUSIONS

These recommendations on the development, modification, and application of Descriptive CROM are grounded in scientific rationale and developed with consensus from the CORESTA TNP research community. To properly implement this guideline, readers are advised to obtain appropriate technical training or to engage technical consultants and to refer to guidance documents referenced in this report. Finally, as the best practices and guidelines may evolve over time, the CORESTA CROM Task Force will continue to update best practices and guidelines based on the dynamic TNP landscape and regulatory requirements to advance TNP research.

⁽¹⁾ The CROM Task Force has decided that CROM stands for “consumer-reported outcome measures” (plural) but may also be used in the singular.

⁽²⁾ Consumer Reported Outcome Measures Consortium | CORESTA (https://www.coresta.org/groups/consumer-reported-outcome-measures-consortium)

⁽³⁾ For example, if the researcher intends to assess the presence or absence of cough during the past week (yes/no), a Descriptive CROM would be used. Conversely, if a researcher is looking to assess severity of respiratory symptoms over the past week, this would be estimated using a Psychometric CROM because the researcher is trying to estimate an underlying construct (respiratory symptomatology).

⁽⁴⁾ Measurement equivalence implies that the instrument measures the same concept in the same way, across various subgroups of respondents […] it does not mean that there are no differences between the populations regarding a measured construct. Rather, it implies that respondents from different groups that have the same position on a trait of interest should provide a similar response (89).
