1. INTRODUCTION
Innovation has undergone a fundamental transformation over the past decades, shifting from closed processes towards more collaborative models that integrate external knowledge sources and user input (De Vita & De Vita 2021).
Within this paradigm, living labs (LLs) have emerged as a practice-oriented framework that operationalises open innovation in real-life settings. They are open innovation ecosystems that bring together multiple stakeholders to co-create, develop and test innovations in real-world environments (Gualandi & Romme 2019). Beyond the methodological dimension, LLs are often described as socio-technical networks or platforms that foster continuous learning and iterative experimentation among participating actors. Thus, by blending user-centric design, stakeholder engagement and contextual testing, LLs enable the transition from abstract ideas to deployable solutions, effectively serving as intermediaries between knowledge creation and application (De Vita & De Vita 2021).
Empirical studies demonstrate that firms participating in LL initiatives report measurable innovation benefits such as incremental product improvements, enhanced knowledge flows and access to external expertise. However, many of these outcomes tend to be short-term and incremental, while evidence on radical or long-term innovation impacts remains limited (Alexandrakis et al. 2022). By integrating real-life experimentation, LLs aim to enhance user responsiveness and foster democratic innovation processes. However, when initial funding ceases, LLs often struggle to exist autonomously, which raises concerns about the durability and systemic impact of their innovations. Studies further highlight the difficulties in embedding co-created solutions into mainstream institutional or commercial practices, thereby limiting the transformative potential of LLs (Gualandi & Romme 2019). Essentially, without robust mechanisms for evaluating and sustaining their outcomes, LLs risk becoming transient pilot initiatives rather than enduring innovation infrastructures.
Thus, it is evident that there is a gap regarding the assessment of the effectiveness of LLs. While numerous case studies and practitioner accounts highlight LLs as enablers of innovation, rigorous evaluations of their actual outcomes remain scarce (De Vita & De Vita 2021; Kalinauskaite et al. 2021; Saleme et al. 2023; Schrevel et al. 2020). As Paskaleva & Cooper (2021) note, LL research often focuses on process descriptions and stakeholder roles rather than on outcomes or impact, while standardised metrics and evaluation frameworks tend to be overlooked. This makes it difficult to compare results across LL initiatives or to determine their success. The issue is compounded by practical challenges regarding the long-term viability and scalability of LLs: many initiatives rely on public or project-based funding and lack sustainable business models (Gualandi & Romme 2019). Consequently, the purported benefits of LLs rest on anecdotal rather than systematic evidence. This paper addresses this lack of systematic operationalisation of LLs. The research question being investigated is:
What parameters are used in the pertinent literature to assess the efficacy of LLs in supporting innovation?
A systematic literature review is used to focus on two core dimensions:
Success parameters attributed to LLs, related to both tangible (e.g. products, services, knowledge) and intangible (e.g. user satisfaction, policy impact) outcomes.
The evaluation frameworks underlying the evidence presented in support of these outcomes.
The paper thus contributes to a more structured understanding of LL performance, providing insights that are crucial for enhancing the design of LLs, guiding policy and funding decisions, and ensuring that LLs evolve into sustainable, high-impact innovation ecosystems.
2. THEORETICAL BACKGROUND
In theoretical terms, an LL is a framework for fostering collaborative research and development in real-world contexts, particularly in the domains of sustainability, societal challenges and public sector innovation. LLs are defined as user-centred ecosystems that integrate research and innovation processes within real-life communities and settings. They enable co-creation among different stakeholders such as users, researchers and businesses (Ballon & Schuurman 2015; Guzmán et al. 2013; Leminen 2015; Santonen et al. 2022; Steen & Van Bueren 2017). These approaches emphasise active engagement and the incorporation of diverse perspectives and solutions.
Nyström et al. (2014) propelled academic research and literature on LLs by developing theoretical frameworks for LLs as innovation intermediaries. Notably, they introduced so-called actor roles, such as enablers, providers, utilisers and users, underscoring the dynamic orchestration and governance of innovation processes (Leminen et al. 2012, 2016; Leminen & Westerlund 2008, 2016, 2019; Nyström et al. 2014). This emphasises the role of LLs as facilitators of co-creation among stakeholders in real-life environments and highlights an actor-centric view of LLs.
Especially noteworthy are urban living labs (ULLs), which represent a special category of LLs that focus on urban environments and their complex challenges. These innovative platforms have emerged in response to the pressing need for sustainable urban development and governance. ULLs integrate diverse stakeholders, including municipal governments, academic institutions, businesses and citizens, into collaborative processes aimed at co-creating and testing solutions for urban sustainability challenges. The geographical embeddedness characteristic of ULLs allows for localised experimentation directly relevant to the specific urban context and capable of highlighting the unique social dynamics and environmental circumstances of the area under study (Bulkeley et al. 2016; Scholl & Kraker 2021; Veeckman & Temmerman 2021). This localisation enables ULLs to act effectively as arenas for social innovation where ideas can be rapidly piloted and evaluated within the community (Steen & Van Bueren 2017; Von Wirth et al. 2019).
Furthermore, ULLs serve as critical infrastructures for participatory governance. Through their iterative experimentation and civic engagement practices, ULLs facilitate a deeper understanding of the urban fabric and enable inclusive decision-making that can contribute to more resilient cities (Sharp & Salter 2017; Voytenko et al. 2016). However, despite their potential, ULLs often face hurdles in sustaining their initiatives beyond initial funding periods, which raises concerns about the long-term viability of the innovations they cultivate (Du 2020; Steen & Van Bueren 2017). As a result, rigorous evaluation frameworks are necessary to assess not only the immediate outputs of such initiatives but also their lasting impacts on urban transformation and sustainability (Van Geenhuizen 2019; Von Wirth et al. 2019). These characteristics make ULLs an especially insightful lens through which to study both the potential and limitations of LLs in effecting systemic innovation—even though the present study does not focus exclusively on them. Instead, ULLs are considered within a broader typology of LLs to inform a comparative and more generalisable understanding of success and evaluation across LL contexts.
To summarise, LLs are predominantly conceptualised as instruments for supporting transformation and user-centric open innovation. However, despite significant scholarly interest in the concept as such, the performance assessment of LLs remains largely under-researched. This is because studies tend to focus on the methodology and composition of LLs and their stakeholders: they focus either on physical spaces for co-creation and learning, driving exploration and forming collective ideas (Hossain et al. 2019), or on the processes and actors involved in LLs, while the question of success and impact remains understudied (Paskaleva & Cooper 2021). Pertinent research approaches LLs from different perspectives, and an inclusive perspective is crucial for systematically operationalising evaluation criteria, understanding LLs as innovation intermediaries and defining parameters of success (Ballon et al. 2018; Fauth et al. 2024; Mastelic et al. 2015; Paskaleva & Cooper 2021; Schuurman et al. 2019). Drawing on a systematic literature review, the next sections identify and structure LL types, evaluation frameworks and success parameters in order to conceptualise their role in enabling innovation.
3. METHODS
The literature on LLs was systematically reviewed to extract performance and success parameters. To gain insights into contemporary research directions and perceptions, the review covered publications that contained either (1) systematic, comprehensive, comparative or critical literature reviews on LLs and their utilisation; or (2) analyses of the effectiveness, impact, performance or success of LLs. The databases used were primarily Google Scholar, JSTOR and the online catalogue (OPAC) of the Bayerische Staatsbibliothek (Bavarian State Library) in Munich, one of the largest public libraries in Germany. The search was limited to articles published from 2005 onwards, because the European Union adopted the LL concept in 2006.
Searches were conducted in English using combinations of keywords targeting LL evaluation logics, such as:
‘living lab’ AND evaluation
‘living lab’ AND (impact OR effectiveness OR success)
‘urban living lab’ AND (‘evaluation framework’ OR assessment)
‘living lab performance’
‘living lab’ AND ‘outcome measurement’.
Search terms were applied to titles, abstracts and author-provided keywords. No restrictions were placed on study design, allowing the inclusion of conceptual papers, literature reviews, empirical case studies and mixed-method analyses (Table 1).
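The Boolean logic of the search strings above can be illustrated with a minimal sketch. It is not the procedure used in the review, which relied on the databases' own search interfaces; the record structure, the `QUERIES` encoding and the function names are assumptions introduced purely for illustration, and simple substring matching only approximates how a database tokenises text.

```python
# Hypothetical encoding of the five search strings as
# (required phrase, list of alternative terms joined by OR).
QUERIES = [
    ("living lab", ["evaluation"]),
    ("living lab", ["impact", "effectiveness", "success"]),
    ("urban living lab", ["evaluation framework", "assessment"]),
    ("living lab performance", []),
    ("living lab", ["outcome measurement"]),
]


def searchable_text(record):
    """Join the three searched fields (title, abstract, keywords) in lower case."""
    parts = [record.get("title", ""), record.get("abstract", "")]
    parts.extend(record.get("keywords", []))
    return " ".join(parts).lower()


def matching_queries(record):
    """Return the indices of all search strings that the record satisfies."""
    text = searchable_text(record)
    hits = []
    for index, (required, any_of) in enumerate(QUERIES):
        # AND: the required phrase must occur; OR: at least one alternative term.
        if required in text and (not any_of or any(term in text for term in any_of)):
            hits.append(index)
    return hits


# Illustrative record (invented for this sketch, not taken from the dataset).
example = {
    "title": "Assessing the impact of an urban living lab",
    "abstract": "We propose an evaluation framework for co-creation projects.",
    "keywords": ["living labs", "assessment"],
}
print(matching_queries(example))
```

A record that satisfies at least one search string would enter the screening stage; the inclusion and exclusion criteria are then applied manually.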
Table 1
Inclusion and exclusion criteria of living labs (LLs).
| INCLUSION CRITERIA | EXCLUSION CRITERIA |
|---|---|
| | |
The initial search yielded over 2000 records across all platforms. As Google Scholar produced more than 30,000 hits for several search strings and indexed non-curated sources, it was used only as a supplementary tool. Thus, only the top 150 results per search string were screened, which aligns with common practice in systematic reviews using Google Scholar (Gusenbauer & Haddaway 2020). After removing duplicates, titles and abstracts were screened according to the abovementioned inclusion and exclusion criteria, resulting in 120 publications retained for full-text review. During the full-text assessment, publications were excluded if they did not contain evaluative content, lacked methodological detail relevant to LL outcomes, or did not contribute to understanding success or impact parameters. A total of 51 publications met all criteria and were included in the final synthesis (Figure 1).

Figure 1
Living labs’ (LLs) success parameters and outcomes: selection of publications.
Given the terminological heterogeneity in LL research, several core concepts guiding the analysis were defined (Table 2).
Table 2
Definitions of the key analytical concepts of living labs (LLs).
| PERFORMANCE | EFFICACY | IMPACT | SUCCESS |
|---|---|---|---|
| Performance denotes the degree to which LL processes, methods or practices function effectively to achieve goals and deliver expected outputs (Neely et al. 2005) | Efficacy refers to the capacity of an LL to generate its intended effects under real-world conditions, distinct from mere outputs or activity counts (Rossi et al. 2004) | Impact refers to the extent to which observed changes can be causally attributed to LL activities, including medium- and long-term outcomes (White 2009) | Success parameters are measurable criteria used to assess whether a given initiative has achieved its intended outcomes, e.g. metrics or benchmarks against which success is judged (Shenhar et al. 2001) |
These definitions served as sensitising concepts in the coding process and facilitated analytical consistency across diverse sources. A total of 51 publications were analysed with the help of MAXQDA software. To capture both explicit evaluation constructs and emergent conceptual patterns, the data extraction procedure followed an iterative, inductive coding approach in three stages (Table 3).
Table 3
Coding process.
| STAGE 1: OPEN CODING | STAGE 2: CATEGORY DEVELOPMENT | STAGE 3: CONSOLIDATION |
|---|---|---|
| Relevant segments were identified based on their connection to LL evaluation, success parameters, performance indicators or impact mechanisms | Recurring patterns were grouped into broader analytical categories. Initial codes were refined, merged or differentiated through constant comparison | The final coding framework consisted of three major categories: |
[i] Note: LL = living lab.
Across all codes, a total of 2015 segments were coded, forming the conceptual base for the analysis that follows. The coded material was mapped against LL types in order to identify recurring evaluation patterns, which form the basis of the conceptual matrix. Owing to the broad field of literature, the matrix should be understood as an initial synthesis open to empirical refinement.
4. ANALYSIS AND RESULTS
4.1 TYPES OF LLs
This section summarises the results of the analysis and the overarching patterns that became evident; these are discussed and contextualised in the subsequent section. Leminen et al. (2012) derived four key types of LLs based on a comprehensive qualitative study: provider-, utiliser-, user- and enabler-driven LLs. Their typology follows the actor perspective, which focuses on who is driving the activities of a respective LL. Building on their work, this paper validates LL typologies by examining their presence in case studies, reviews and theoretical literature, and adds a complementary type of LL described in recent literature: network-driven LLs (Table 4).
Table 4
Overview of living lab (LL) types.
| LL TYPE | CHARACTERISTICS | OUTCOMES | REFERENCES |
|---|---|---|---|
| Provider-driven LLs | | | Berniak-Woźny & Szelągowski (2023); Nyborg et al. (2024); Rogers et al. (2023); Satria et al. (2023) |
| Utiliser-driven LLs | | | Cigir (2018); Lin et al. (2013) |
| User-driven LLs | | | Chen et al. (2010); Mulder & Stappers (2009); Pascu & van Lieshout (2009) |
| Enabler-driven LLs | | | Galbraith et al. (2008); Grüneis et al. (2020) |
| Network-driven LLs | | | Del Vecchio et al. (2017); Nguyen et al. (2021); Merino-Barbancho et al. (2023) |
Provider-driven LLs are typically initiated by institutions such as universities, educational organisations or consultancy firms. For example, Berniak-Woźny & Szelągowski (2023) discuss the university-based LL model that nurtures experiential learning and theoretical advancements through systematic co-creation, underpinning the provider-driven approach. Such LLs not only facilitate cutting-edge research but also contribute to pedagogical innovation, as seen in higher education settings.
BarLaurea is an example of a provider-driven LL. Run by Laurea University in Finland, this educational LL integrates a restaurant with pedagogical and research objectives. Students co-create services while engaging with real users in a semi-public learning space, generating experiential knowledge and service innovations. It typifies a provider-driven LL, as the initiating and controlling actor is an academic institution whose primary goal is research, knowledge creation and pedagogical development (Mäkäräinen-Suni 2008).
Utiliser-driven LLs are established by private-sector companies seeking to enhance their business development processes. These labs serve as platforms for the iterative testing and refinement of products and services in real-world contexts. By leveraging user data and experiential feedback, they function as strategic tools for innovation and market alignment. Notably, industrial collaborations under LL methodologies promote the co-development of smart service systems, further reinforcing the utiliser-driven perspective (Lin et al. 2013).
An example of a utiliser-driven LL is LabCampus in Germany, an innovation district at Munich Airport where companies co-locate to co-develop technologies and services in a curated, innovation-friendly environment. LabCampus exemplifies a utiliser-driven model as corporate tenants and stakeholders initiate and shape the innovation activities primarily to derive strategic or commercial benefits from applied collaboration and proximity-based synergies (LabCampus GmbH 2025).
User-driven LLs emerge primarily from grassroots initiatives, with users themselves serving as the primary instigators of innovation. These labs focus collaboratively on everyday challenges and are characterised by informal governance structures and bottom-up coordination that relies on voluntary engagement and community-driven problem-solving. For example, Mulder & Stappers (2009) report on co-creation practices in European LLs, underscoring the active participation of users as co-creators rather than passive participants. Other publications describe urban community initiatives that leverage everyday user practices and encourage voluntary engagement (Chen et al. 2010).
An example of a user-driven LL is Kommune Niederkaufungen in Germany, a long-standing intentional community focused on sustainable living, social experimentation and cooperative decision-making. As a user-driven LL, residents themselves initiate, steer and implement experiments in areas such as mobility, housing and energy, with the lived experience of users being both the source and subject of innovation (Notz 2006).
Enabler-driven LLs are generally initiated by public sector bodies or non-governmental organisations. Their goal is to support the development of regional innovation ecosystems by fostering collaboration among diverse stakeholders. These labs emphasise inclusivity and long-term societal impact over immediate commercial outcomes. For instance, Grüneis et al. (2020) and Jiang et al. (2020) highlight enabler-driven initiatives that align long-term societal impacts with regional development strategies by engaging policymakers, citizens and non-profit organisations in co-creation processes.
LiCalab focuses on user-centric testing of health and care innovations by connecting businesses with older adult users and care professionals in Flanders, Belgium. As an enabler-driven lab, it was initiated by the city of Turnhout and is coordinated by an intermediary institution (Thomas More University of Applied Sciences) that facilitates co-creation and evaluation processes without being the main user or utiliser, thereby fostering collaboration among diverse actors (Vermeylen et al. 2023).
Network-driven LLs integrate a multi-actor model beyond traditional stakeholder roles. This type represents an original contribution of this paper. Based on the idea of innovation through the quintuple helix (Carayannis & Campbell 2010), this type emphasises the inclusion of environmental and societal dimensions alongside academia, industry and government. LLs thus function as dynamic ecosystems to orchestrate value creation through bidirectional interactions and reciprocal knowledge flows (Merino-Barbancho et al. 2023). This perspective complements the traditional four types by underscoring that the success of LLs depends on complex actor networks where roles are not fixed but evolve over time through continuous co-management and mutual learning.
An example of a network-driven LL is Josephs in Germany, a publicly accessible, interdisciplinary innovation space in Nuremberg that brings together companies, researchers and citizens to jointly explore early-stage product and service ideas. Operating at the interface of science, business and civil society, Josephs exemplifies a network-driven LL: no single actor dominates its activities; instead, innovation emerges through orchestrated multi-stakeholder interactions and a flexible, rotating set-up of thematic modules. Its open structure, combined with real-time user engagement and continuous feedback loops in a physical downtown space, positions Josephs as a prototypical network-driven lab (Srinivasan 2023).
4.2 LL EVALUATION FRAMEWORKS
This research identifies distinct yet interrelated categories that collectively capture the diversity of evaluation approaches within LLs. In this review, outcomes refer to tangible product and process-related results, impact refers to long-term, systemic changes and sustainability transitions, and effectiveness refers to the operational and procedural dimensions that underpin their overall performance. Each dimension is anchored in the findings from the literature (Table 5).
Table 5
Evaluation framework type overview.
| EVALUATION FRAMEWORK TYPE | EVALUATION FRAMEWORK VALUE | REFERENCES |
|---|---|---|
| Outcome-oriented frameworks | Focus on assessing immediate, tangible results, such as innovations, prototypes, user satisfaction and learning enhancements | Ballon et al. (2018); Emblen-Perry (2019); Leminen et al. (2016); Santally et al. (2014); Ståhlbröst & Holst (2017); Van Geenhuizen (2018, 2019); Veeckman et al. (2013) |
| Impact-oriented frameworks | Assess broader, longer-term effects and societal contributions beyond the immediate outputs | Bouwma et al. (2022); Bronson et al. (2021); Ceseracciu et al. (2023); Ciliberti et al. (2022); Grüneis et al. (2020); Paranunzio et al. (2023) |
| Effectiveness-oriented frameworks | Provide insights into the operational mechanics of living labs (LLs) and evaluate whether the intended goals are achieved and whether LLs deliver on their claims of promoting innovation | Banerjee (2022); Berniak-Woźny & Szelągowski (2023); Guzmán et al. (2013); Huang et al. (2024); Kalinauskaite et al. (2021); Logghe & Schuurman (2017) |
| Hybrid frameworks | Attempt to bridge the gaps between the above approaches but highlight persistent challenges related to methodological integration and standardisation | Bronson et al. (2021); Paskaleva & Cooper (2021); Rosa et al. (2024); Schafer et al. (2024); Toffolini et al. (2021) |
Outcome evaluation frameworks typically focus on performance indicators that quantify the immediate, tangible results produced by LLs and measure progress against predefined benchmarks. Several publications highlight the importance of evaluating learning outcomes and co-created innovations. For example, Van Geenhuizen (2018) offers a framework that emphasises system-level performance by including questions on user feedback absorption, actor satisfaction and the openness of the LL to external networks, thereby providing a set of key performance factors with an outcome-oriented focus. Leminen et al. (2016), in turn, examine how variations in network structure within LLs correlate with the production of radical versus incremental innovations, suggesting that outcome measures should also account for qualitative differences in innovation novelty.
Impact-oriented evaluation frameworks focus on the broader, long-term changes in the social, economic and environmental domains triggered by LLs. These frameworks extend beyond immediate outputs to address sustainability transitions, policy shifts and systemic transformations. For example, by centring user needs within real-life settings, LLs contribute to the formation of sustainable public policies and cross-sector collaborations (Ciliberti et al. 2022). Grüneis et al. (2020) further demonstrate that LLs in rural regions serve as platforms for innovating sustainable business models, with their evaluation frameworks incorporating long-term stakeholder partnerships and socio-economic development metrics. Additionally, Paranunzio et al. (2023) illustrate the transformative potential of LLs in coastal cities by evaluating participatory approaches that enhance climate resilience.
Furthermore, there are effectiveness evaluation frameworks concerned with the operational aspects that determine how LLs function as innovation ecosystems. These frameworks cover aspects such as stakeholder alignment, operational processes, iterative feedback loops and value creation. Their focus is on understanding and fine-tuning the mechanisms for innovation outcomes and impacts. Frameworks aimed at evaluating operational effectiveness typically stress the importance of multi-stakeholder co-creation and network analysis. Kalinauskaite et al. (2021) propose a conceptual framework that integrates co-creation methods at both macro- and meso-levels, emphasising iterative feedback loops and stakeholder alignment as critical for effective collaboration. Another approach found in the literature is the use of action research as a robust methodology to evaluate the day-to-day operations of LLs, highlighting aspects such as adaptation speed and responsiveness to user input (Logghe & Schuurman 2017). Additionally, Berniak-Woźny & Szelągowski (2023) show that embedding LL methodologies in educational contexts can enhance process thinking and thereby contribute to overall operational effectiveness.
Finally, hybrid frameworks attempt to bridge the gaps between these approaches. Schafer et al. (2024) provide a critical discussion of the challenges of employing standardised evaluation frameworks in LLs. They examine the inherent tension between qualitative and quantitative measures in assessing both short-term outcomes and long-term impact, advocating for hybrid methodologies that capture the multidimensional nature of LLs. Similarly, Toffolini et al. (2021) present a case study that distinguishes the roles within LLs (e.g. informants and contributors), merging operational insights with assessments of systemic innovation outcomes. Rosa et al. (2024) contribute to this by proposing good practices in healthcare LLs that stress not only immediate value creation but also the sustainability of projects over time.
4.3 LL SUCCESS PARAMETERS
The literature also describes a variety of success parameters. Although the assessment parameters are diverse, no consensus on standardised metrics to measure success is evident from the literature. Additionally, the understanding of success, and the parameters used to assess it, can vary depending on the context and goals of LLs. For example, some LLs focus more on social innovation or policy impacts, while others prioritise economic outcomes. The analysed dataset provides a comprehensive overview of the parameters used in both theory and actual use cases (Table 6).
Table 6
Overview of the identified living lab (LL) success parameters.
| ANALYTICAL DIMENSION | SUCCESS PARAMETERS | REFERENCES |
|---|---|---|
| Economic and business | | Ballon et al. (2018); Banerjee (2022); Fuglsang et al. (2021); Guzmán et al. (2013); Jernsand (2019) |
| User centricity | | Bronson et al. (2021); Dell’Era & Landoni (2014); Eriksson et al. (2005); Guzmán et al. (2013); Konstantinidis et al. (2021); Mastelic et al. (2015); Ståhlbröst (2012); Ståhlbröst & Holst (2017); Svensson et al. (2010) |
| Innovation | | Ballon et al. (2018); Greve et al. (2021); Kemeç (2023); Leminen et al. (2016, 2023); Leminen & Westerlund (2012); Schuurman et al. (2016); Ståhlbröst (2012); Veeckman et al. (2013); Yilmaz & Ertekin (2023) |
| Knowledge and learning | | Archibald et al. (2021); Berniak-Woźny & Szelągowski (2023); Ceseracciu et al. (2023); Eriksson et al. (2005); Mastelic et al. (2015); Nyström et al. (2014); Purcell et al. (2019); Smith et al. (2022); Ståhlbröst (2012) |
The economic viability and strategic business value of LLs have been foregrounded across multiple studies. For instance, Jernsand (2019) highlights that in the tourism sector, LLs are not only innovation arenas but also incubators for workforce training, demonstrating a dual benefit in enhancing business performance while fostering sustainable economic practices. Similarly, Banerjee (2022) discusses a human-centric approach to innovation that directly links user engagement to commercial outcomes by streamlining the transition from design thinking to product prototyping. Despite these positive associations, critical voices argue that while economic benefits are frequently assumed, robust mechanisms for quantifying economic impact are still underdeveloped, suggesting the need for standardised evaluation metrics (Yilmaz & Ertekin 2024).
The centrality of the user in the LL approach is consistently emphasised. For instance, Dell’Era & Landoni (2014) delineate the conceptual evolution of LLs from mere user participation to deep co-creation, where end-users actively influence the design and implementation of solutions. This shift towards a more user-centric paradigm is underlined by Svensson et al. (2010), who propose innovative techniques to harness user contributions effectively during digital innovation processes. Critical analysis of these findings, however, indicates that while user involvement enriches outcomes, there is significant heterogeneity in user engagement practices. This variation calls for systematic frameworks to ensure that user insights are efficiently captured and uniformly applied across different LL contexts (Dell’Era & Landoni 2014; Konstantinidis et al. 2021).
Innovation is arguably the most prominent parameter and has been widely explored across studies. Greve et al. (2021) argue that transitioning from niche research settings to mainstream innovation management demonstrates the transformative capacity of LLs. This potential is further elaborated by Leminen et al. (2023), who propose a framework where the inherent uncertainty and interactions within LLs accelerate the development of market-ready products. Additionally, the co-creation process, as discussed by Leminen & Westerlund (2012), challenges traditional innovation models by incorporating end-users as co-producers, thereby unveiling latent needs and innovative opportunities. Notwithstanding these contributions, a more critical strand of literature argues that such high expectations sometimes outstrip the practical capabilities of LLs, particularly in the absence of clear operational guidelines that bridge ideation and commercialisation (Greve et al. 2021; Leminen & Westerlund 2012).
Beyond serving as mere innovation hubs, LLs function as knowledge innovation ecosystems where continuous learning and feedback loops enhance overall system performance. Archibald et al. (2021) emphasise that LLs facilitate multidirectional knowledge translation, where stakeholder interactions contribute to a deeper mutual understanding and iterative improvement of ideas. This knowledge-centric approach is also manifested in academic initiatives, where universities use LL methodologies to prepare students through experiential learning, thereby integrating academic research with practical problem-solving (Berniak-Woźny & Szelągowski 2023; Purcell et al. 2019). Nonetheless, a critical gap identified in the literature is the lack of comprehensive frameworks that measure learning outcomes in a standardised manner. Researchers conclude that while qualitative accounts abound, quantitative metrics to assess incremental knowledge gains within these ecosystems remain underdeveloped (Archibald et al. 2021; Berniak-Woźny & Szelągowski 2023; Purcell et al. 2019).
LLs extend their influence beyond commercial innovation to address broader societal challenges, embodying a commitment to sustainable community development. In urban contexts, (U)LLs are instrumental in developing resilient solutions to climate and environmental challenges, as discussed by Quadros Aniche et al. (2024) and further reinforced by studies on sustainable development (Molnar et al. 2023). The societal impact is also evident in educational and healthcare settings where LLs foster community engagement and democratise access to innovative solutions, as emphasised by Archibald et al. (2021) and Satria et al. (2023). Such initiatives not only stimulate community resilience but also facilitate a symbiotic relationship between academic institutions and local communities, thereby bridging the gap between high-level policy goals and localised action (Molnar et al. 2023). Critical reflections suggest that, despite these positive implications, there is a persistent need for longitudinal studies to robustly document the social impacts of LLs and address challenges related to sustainability, governance and stakeholder alignment (Molnar et al. 2023; Quadros Aniche et al. 2024; Satria et al. 2023).
4.4 DISCUSSION: THE CONCEPT OF LL REVISITED
This paper systematises how LLs are evaluated in terms of their efficacy and thus contributes to a more structured understanding of their innovation potential. It addresses an evident gap in the LL literature: a lack of standardised frameworks and consistent parameters for evaluating outcomes, effectiveness and long-term impacts. Pertinent scholarship has explored the conceptual underpinnings and typologies of LLs, yet evaluation mechanisms have largely remained anecdotal. This study contributes to advancing the field in two critical ways: (1) by proposing a differentiated view of evaluation frameworks that reflect the diverse ambitions of LLs; and (2) by consolidating and classifying success parameters across multiple analytical dimensions.
Historically, LLs have been conceptualised either as methodological testbeds (Eriksson et al. 2006; Schuurman et al. 2013) or as actor-driven platforms for stakeholder co-creation (Leminen et al. 2012; Nyström et al. 2014). While these perspectives remain foundational, the findings of this research suggest that LLs are increasingly evolving into meta-frameworks for systemic innovation. The emergent typology—including the introduction of the network-driven LL—reflects this evolution by emphasising cross-actor co-ownership and ecosystemic innovation over traditional, hierarchical innovation management. This refined perspective contributes to the literature conceptually by redefining LLs not only as intermediaries but also as orchestrators of value co-creation across organisational, sectoral and societal boundaries (Fauth et al. 2024). In doing so, LLs take on a new role as boundary objects that enable experimentation, coordination and adaptation in complex environments, resonating with ideas from transition management and systemic design.
A key finding of this paper is the identification and classification of four main types of evaluation frameworks: outcome-, impact-, effectiveness- and hybrid-oriented. By mapping these approaches, the research offers a synthesis of the fragmented field of LL evaluation. It builds on previous critiques (De Vita & De Vita 2021; Paskaleva & Cooper 2021) but goes beyond them by providing a clear taxonomy that aligns evaluative focus with LL type and objective. This framework is both descriptive and prescriptive, supporting scholars and practitioners in selecting appropriate evaluation models based on LL configuration and desired outcomes.
Essentially, distinguishing outcome-, impact- and effectiveness-oriented frameworks, together with hybrid models, provides a comprehensive overview of the diverse approaches to LL evaluation that can inform both practice and research. Addressing challenges such as the standardisation of metrics and the integration of multiple evaluative dimensions will be key to advancing LL research and ensuring that these innovative environments continue to deliver measurable benefits for society.
The structured categorisation of success parameters across five analytical dimensions—economic/business, user-centricity, innovation, knowledge/learning and society/community—represents another substantial contribution. Other studies (Ballon et al. 2018; Ståhlbröst 2012) have hinted at these categories, but often as isolated observations. This study systematically maps and integrates these dimensions, thus enabling a multidimensional operationalisation of success in LL contexts. This is particularly useful given the diverse goals and governance models of LLs. By clarifying how success should be assessed in different configurations, this study supports tailored performance assessments and lays the groundwork for the future development of context-sensitive evaluation tools.
This structured categorisation of success parameters provides clarity. By grouping outcomes into distinct analytical dimensions, it is possible to identify the underlying priorities and rationales that shape how success is defined and measured across different contexts. This separation highlights the multidimensional nature of LL impacts—spanning economic, social, user-centric, innovation-driven and knowledge-related outcomes—and also facilitates the selection of context-appropriate evaluation metrics. It supports both researchers and practitioners in aligning assessment approaches with specific LL goals, while simultaneously revealing underexplored areas such as learning outcomes or long-term societal effects. Ultimately, this comprehensive overview contributes to the development of more transparent, comparable and context-sensitive evaluation strategies.
While typologies and evaluation metrics are foundational, the research also reveals deeper epistemological tensions in the LL discourse. These include the friction between standardisation and contextual flexibility, and between tangible outcomes and transformational impacts. The hybrid frameworks identified in this study attempt to reconcile these tensions, but they also highlight the need for inter- and transdisciplinary methodologies capable of capturing the complexity of LL environments. Moreover, the observed absence of longitudinal studies and of standardised measurement of LL impact suggests clear avenues for future research. This study contributes by articulating these gaps and proposing a consolidated vocabulary for discussing success, which can enable comparative research, benchmarking and policy alignment (Table 7).
Table 7
Initial overview of living lab (LL) types, associated evaluation frameworks and success parameters serving as a foundation for future research and practical validation.
| TYPE OF LIVING LAB (LL) | MOST RELEVANT EVALUATION FRAMEWORK | MOST RELEVANT SUCCESS PARAMETER | CONCEPTUAL JUSTIFICATION |
|---|---|---|---|
| Provider-driven LL | | | Initiated by academia or research institutes, provider-driven LLs prioritise structured experimentation, method testing and pedagogical outcomes. Thus, evaluating operational effectiveness (e.g. knowledge flows, process quality) and short-term outcomes (e.g. prototypes, learning) is most suitable. |
| Utiliser-driven LL | | | Driven by private firms aiming for product or service optimisation, utiliser-driven LLs benefit from outcome frameworks that track commercial results and iterative innovation cycles. The primary concern is value realisation, making business-focused and innovation-related outputs the main evaluation interest. |
| User-driven LL | | | Rooted in grassroots or civic initiatives, user-driven LLs focus on needs-based innovation, empowerment and community impact. Long-term social change and democratic innovation demand impact frameworks, but hybrid models can help trace the bottom-up engagement process and its translation into outcomes. |
| Enabler-driven LL | | | Enabler-driven LLs pursue societal transformation, regional strategies or inclusive policymaking. Impact-oriented frameworks capture these ambitions, and hybrid models help combine short-term responsiveness with long-term systemic change, including user integration. |
| Network-driven LL | | | Network-driven LLs involve multi-stakeholder co-ownership, promoting ecosystemic value creation, radical innovation and continuous learning. Hybrid frameworks are best suited to capture multidimensional impacts, while effectiveness-oriented elements help evaluate governance and coordination. |
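The alignment between LL types and evaluation frameworks described in the justification column can be read as a simple lookup. The following Python sketch is purely illustrative: the names `FRAMEWORKS_BY_LL_TYPE` and `suggest_frameworks` are hypothetical, and the assignments are taken from the wording of the conceptual justifications above rather than from a validated instrument.

```python
# Illustrative sketch only: all names are hypothetical, and the assignments
# are read from the conceptual-justification column of Table 7.
FRAMEWORKS_BY_LL_TYPE = {
    "provider-driven": ("effectiveness", "outcome"),
    "utiliser-driven": ("outcome",),
    "user-driven": ("impact", "hybrid"),
    "enabler-driven": ("impact", "hybrid"),
    "network-driven": ("hybrid", "effectiveness"),
}


def suggest_frameworks(ll_type: str) -> tuple[str, ...]:
    """Return the framework orientations suggested for a given LL type."""
    key = ll_type.strip().lower()
    if key not in FRAMEWORKS_BY_LL_TYPE:
        raise ValueError(f"Unknown LL type: {ll_type!r}")
    return FRAMEWORKS_BY_LL_TYPE[key]
```

An evaluator could, for instance, call `suggest_frameworks("user-driven")` as a starting point for selecting an evaluation model, which would then need to be adapted to the goals and governance of the specific LL.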
In practical terms, the findings equip LL organisers, funders and evaluators with tools to design, manage and assess LL initiatives more precisely. By aligning specific LL types with appropriate evaluation strategies and efficacy parameters, practitioners can increase the transparency, accountability and scalability of their initiatives. For policymakers and funding bodies, this offers a more evidence-based foundation for strategic investments and the design of supporting infrastructures for LLs.
Thus, the research question that has guided this paper (what parameters are used in the pertinent literature to assess the efficacy of LLs in supporting innovation?) can be answered as follows:
LL types, i.e. provider-, utiliser-, user-, enabler- and network-driven, which direct the focus of measuring innovation towards the actor who is driving activities. On this basis, the factors that impede successful innovation can be identified and addressed.
Evaluation frameworks for systemic innovation, including outcome-, impact-, effectiveness- and hybrid-oriented ones. These enable practitioners and scholars to qualify and quantify the innovation of LLs based on their outcomes.
Success parameters that direct attention to the content and goals of the respective LLs and cover the dimensions of economy and business, user-centricity, innovation, knowledge and learning, and society and community.
Some limitations should be noted. First, the study is based on a literature review and does not incorporate primary empirical data, which might have added further depth to the synthesis of evaluation frameworks. Second, the scope of the included literature, although extensive, was limited to publications primarily from Europe and North America, potentially excluding valuable insights from other cultural contexts. Third, while the study distinguishes between types of LLs and evaluation dimensions, it stops short of operationalising these into actionable tools or metrics, which would require further empirical testing and validation.
5. CONCLUSIONS
A key shortcoming in the scholarly discourse on living labs (LLs) has been the lack of structured frameworks and consistent parameters to evaluate their efficacy in fostering innovation. Through a comprehensive literature review and thematic synthesis, this research identified a range of LL types, categorised evaluation frameworks into outcome-, impact-, effectiveness- and hybrid-oriented types, and systematised success parameters into five analytical dimensions: economic and business value, user-centricity, innovation capacity, knowledge and learning, and societal contribution. The findings contribute to a differentiated understanding of LLs, not only as methodological tools but also as evolving meta-frameworks for collaborative, systemic innovation.
Future research can build upon this foundation in several ways:
Empirical testing of frameworks
Applying the typology and evaluation criteria to concrete LL cases through longitudinal or comparative studies would help validate and refine the findings.
Development of standardised metrics
There is a need for practical, standardised evaluation tools that can be adapted to different LL types while ensuring comparability.
Policy and governance implications
Exploring how policy frameworks and funding mechanisms shape the effectiveness and sustainability of LLs could inform more strategic public investment and governance models.
DATA ACCESSIBILITY
All data generated or analysed during this study are included in this article and the supplemental data online.
COMPETING INTERESTS
The author declares that there are no financial or non-financial competing interests relevant to the content of this manuscript.
ETHICAL APPROVAL
This study did not involve human participants, human data or human tissue, and therefore did not require ethical approval or participant consent.
