Crowd Assessment of the Military Utility of Future Technologies

Marcus Dansarie; Kent Erik Andersson; Stefan Silfverskiöld

doi:10.31374/sjms.339

Introduction

Inexpensive do-it-yourself drones … do not have the explosive power of artillery, cruise missiles, or loitering munitions. However, these recent attacks demonstrate that small drones can still have asymmetric strategic impacts. Only a fraction of drone attacks need to be successful, and even small explosives can have outsized effects. Parked aircraft are uniquely vulnerable. Observers should wonder what the advent of these weapons means for the future of warfare. (Jacobsen, 2023)

The quote highlights how Ukraine has leveraged cheap technology to achieve results in asymmetric action on Russian-controlled territory. In a single week, attacks using groups of relatively cheap cardboard drones reportedly destroyed four military aircraft on Russian airfields and two advanced ground-based air defence systems in Crimea (Jacobsen, 2023). This is a recent example of successful military innovation. A resource-poor player can still hope to deter an aggressor, or to win an armed conflict, by being more creative in the use of technology and methods of warfare. Some previous studies of the phenomenon focus on changes in doctrine, while others focus on changes in structure or organization; Horowitz and Pindyck (2023) recently published an excellent review of this important field of research and its definitions. In this study, we acknowledge technology as an important component.

The impact of technological advancements on warfare can be variously evolutionary, surprising, or revolutionary (Handel, 1987; Hundley, 1999). Significant impacts are sometimes termed “disruptive” in their capacity to render certain capabilities outdated, necessitating the development of new ones (Bower & Christensen, 1995; Christensen, 1997). The historical examples of gunpowder, railways, radar, and information technology evidence that a deep understanding of technology can offer military advantages (van Creveld, 1991). The probability of successful military innovation is the greatest when organizational conditions are right and the need for a new solution appears to be existentially important (Hundley, 1999) – a theory supported by initial observations of Russia’s war against Ukraine. Consequently, in order to circumvent the danger of being surprised on a future battlefield, most Western nations invest considerable effort in peacetime to increase their understanding of developing technologies. This effort requires reliable technology forecasting.

Silfverskiöld et al. (2021) have reported on the evaluation of a low-cost method for military utility assessment of future technologies (MUAFT). Even if the method does not offer perfect predictions per se, it has played a central part in the Swedish technology forecasting process since 2012, and as such in the long-term capability development process of the Swedish Armed Forces (SwAF). To avoid embarking on unnecessarily “long, uncertain, and often expensive paths to develop new military materiel”, the new method aims to assess the potential contribution of certain selected technologies to military capability: “Previous recommendations, based purely on potential technical performance, should be abandoned” (Silfverskiöld et al., 2021). The MUAFT method, as evaluated, is described in a separate report (Andersson et al., 2019). Covering the use of the method from 2012 to 2018, the evaluation shows the approach to be feasible, and that the method should therefore be of interest to other small- and medium-size countries. Its cost-effectiveness permits ongoing assessments of technologies, initially deemed to have uncertain military value, as they progress. The MUAFT method, however, needs to be properly applied.

For Silfverskiöld et al. (2021), the validity is weak in two aspects: the composition of the expert group and the application of the Delphi method, described below (Dalkey & Helmer, 1963). First, for a holistic view on capability, the expert group should represent the perspectives of the most important forces for change acting on the development of a military capability, such as organization, programs, technology, capital, market, and regulations (Clark, 1996; Silfverskiöld et al., 2021). In practice, however, the expert seminars are often conducted with experts from the military and engineering fields alone (P1). Second, the most important aspect of the Delphi method is anonymity. The fundamental idea is that the discussion and assessments should not be affected by prestige and preconceptions. Participants should also perceive that they run no risk in introducing new ideas or criticizing ideas advocated by the majority. In the physical seminars, these ideals were not sufficiently met (P2).

Following the framework described by Johannesson and Perjons (2014), this study takes a design science approach with a view to improving the validity of the MUAFT method in a new development iteration. The idea is to introduce a web-based application permitting participants in the expert group to discuss and assess the impact of future technology on military capability development, asynchronously and independent of the experts’ locations. Drawing on research on the phenomenon of productive collective judgement, “the wisdom of the crowd” (Brabham, 2008), understood here as a gradual emergence of effective solutions to problems produced by a large group of problem solvers, a broader range of knowledge and experience should be expected. Additionally, the updated MUAFT method is intended to create conditions for truly anonymous discussions.

The research question was thus formulated: “How is it possible to modify the established MUAFT method to allow for experts to participate asynchronously and anonymously?”

The remainder of the paper is set out as follows. After an account of previous research on technology forecasting, crowdsourcing, and the Delphi method, the paper then follows the structure proposed by Johannesson and Perjons (2014). The following section describes the design science approach and our operationalization of it for this study. After this comes an elaboration of the problems introduced above and what is required if they are to be solved, before a presentation of the developed crowd assessment method along with its development process and demonstrated use. After this comes (first) a section on the documentation of the results from the evaluation and (second) a discussion of the study results. The final section sets out conclusions.

Previous Research

Forecasting involves the making of systematic estimations, projections, or predictions of highly probable events (Kuosa, 2014). Although technology forecasting has been criticized for its inherent inaccuracy (Martin, 2010) and although quick recovery from technological surprise has been proposed as an alternative approach (Finkel, 2011), ongoing research focuses on the development of effective methods. Forecasting technological advancements over a 20-year timeframe is a complex task, but studies indicate that, even with reasonable precision expectations, it can have significance (Kott & Perconti, 2018; Quinn, 1967). Timely and accurate assessments of the utility of future technology (Andersson et al., 2015; Modig & Andersson, 2022) can provide military decision-makers with a deterrent edge on potential aggressors, since transforming new technologies into new military capabilities in peacetime may take decades. The MUAFT method studied in this work is part of the Swedish Armed Forces’ technology forecasting process.

Presented in the introduction above, the first issue of the validity of the current MUAFT method relates to the composition of the expert group. The modified method is expected to involve a larger and more diverse expert group by leveraging crowdsourcing methods and crowd wisdom. A good summary of these methods and their possibilities is provided by Brabham (2008), who refers to several examples of the wisdom of crowds, originally provided by Surowiecki (2005). “The web is the necessary technology that can realize the four-pronged specifications of crowd wisdom and flex a mass of users into productive laborers”, states Brabham; “crowdsourcing … is a model capable of aggregating talent, leveraging ingenuity while reducing the costs and time formerly needed to solve problems” (Brabham, 2008, pp. 81, 87).

Other possibilities with large assessment groups have emerged in trials with similar methods described by Mellers et al. (2014) within a study sponsored by the Intelligence Advanced Research Projects Agency (IARPA). Certain participants in the study turned out to be superforecasters, i.e., people that consistently make accurate predictions within a broad range of subjects. Four mutually influencing factors are considered to be important for the superforecasters’ superior abilities: cognitive abilities and styles, task-specific skills, motivation and engagement, and suitable environments (Mellers et al., 2015). The latter factor refers to the creation of environments that facilitate effective cooperation between multiple high-performing individuals. While it is probably not possible to arrive at the performance described by Mellers et al. (2014) in an updated version of the MUAFT method studied in this work, it might be possible to collect data that enables the identification and cultivation of suitable individuals in the future.

Other potentially relevant publications from the IARPA project include Horowitz et al. (2019), Katsagounos et al. (2021), and Prelec et al. (2017). However, the MUAFT method and the questions it aims to address differ significantly from those within the IARPA project. As a result, only portions of the findings are directly applicable. A key difference is that the IARPA project allows for the objective evaluation of results. In contrast, the predictions made using the MUAFT method extend several decades into the future, making objective evaluation significantly more challenging.

The second issue regarding validity discussed in the introduction relates to limitations in the application of the Delphi method in the current MUAFT system. The original Delphi method was developed by Dalkey and Helmer (1963) at the RAND Corporation in the 1950s. In an experiment, they demonstrated that by carefully managing interactions among subject matter experts – concealing opinions and gradually revealing only the factors each expert independently deems relevant – the group can reach a well-considered and often convergent opinion. They concluded that the method seems “conducive to independent thought” as opposed to conventional round-the-table use of experts. Thus, anonymity constitutes a very important aspect in the use of expert opinions.

Linstone and Turoff (2011) point out several aspects of the use of the Delphi method relevant to the application studied in this work. First, they note that “a key benefit of participation was the ability of individuals to engage in a group communication process asynchronously, at times and places convenient to them”. Second, they emphasize the common misconception that the method is conducted with the specific goal of achieving consensus; the number of discussion rounds should be determined by the point at which response stability is reached – in fact, disagreements within the group can be a valuable outcome.
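The distinction between response stability and consensus can be illustrated with a small sketch. The source does not specify a stability measure; the one used here (the share of participants whose rating moved by no more than a tolerance between two consecutive rounds), the names `stable_share` and `should_stop`, the 0.8 threshold, and the sample ratings are all assumptions made purely for illustration.

```python
def stable_share(previous, current, tolerance=1):
    """Share of participants whose rating changed by at most `tolerance`.

    `previous` and `current` map a participant number to that participant's
    rating in two consecutive rounds. Only participants who answered in
    both rounds are compared.
    """
    common = previous.keys() & current.keys()
    if not common:
        raise ValueError("no participant answered in both rounds")
    unchanged = sum(1 for p in common
                    if abs(current[p] - previous[p]) <= tolerance)
    return unchanged / len(common)


def should_stop(previous, current, threshold=0.8, tolerance=1):
    """Stop when enough ratings are stable, whether or not they agree."""
    return stable_share(previous, current, tolerance) >= threshold


# Invented ratings: participants 1 and 2 disagree strongly but are stable.
round_2 = {1: 2, 2: 9, 3: 5, 4: 8}
round_3 = {1: 2, 2: 9, 3: 6, 4: 4}
share = stable_share(round_2, round_3)
stop = should_stop(round_2, round_3)
```

In this invented example, three of four ratings are stable, so the assumed threshold is not yet met; note that the two stable but opposing ratings count towards stability, which is consistent with disagreement being a legitimate outcome.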

Recently, dynamic argumentative Delphi (DAD) surveys have gained traction in social studies. The primary goal of the DAD survey is to facilitate online Delphi consultations with numerous participants while preserving the interactive, argumentative nature of traditional Delphi studies. Expert participants remain anonymous and can change their opinions at any time, while new arguments are always visible to all (Cuhls, 2024). A key takeaway from this field of research is that the facilitator in our modified MUAFT method should actively ensure that participants follow guidelines and refrain from improperly influencing others.

Therefore, in this study the aim is to draw on earlier research in the wisdom of the crowd and the Delphi method respectively, to improve the validity of the MUAFT method.

Method

The Design Science Approach

While traditional scientific research emphasizes understanding phenomena through systematic observation, experimentation, and theoretical explanations, design science aims at creating practical and innovative solutions to real-world problems. Design science is used most frequently, perhaps, in the field of information systems. The method has also found applications in fields such as operations management (Holmström et al., 2009). It is used to systematically design, test, and refine artefacts that solve real world problems. Kernel theories, sourced from suitable scientific fields, are used to guide artefact design and ensure that the process is scientifically grounded. Types of artefacts produced by design science research include constructs, models, methods, and instantiations (Hevner et al., 2004; Johannesson & Perjons, 2014).

Many problems suitable for the design science approach are so-called wicked problems. Wicked problems are characteristically hard to define; there may often even be disagreement about whether a problem exists at all. Their solutions cannot be enumerated and are generally of the type “good or bad” or “better or worse”. Accordingly, there are no simple rules for evaluating or identifying potential solutions. The problem domain is typically complex with many interacting subcomponents and effective solutions being dependent on human social abilities (Hevner et al., 2004; Rittel & Webber, 1973). Our problem, forecasting the military utility of future technologies, shares many of these properties – an indication that it is a good candidate for the design science approach. Because of the difficulty in defining the problem and evaluating solutions, artefacts produced are unlikely to offer a perfect solution. For this reason, design science can be viewed as an iterative process, where each designed and evaluated artefact contributes more knowledge about the problem; this can then be used as input for another application of the framework.

Although design science may be criticized for framing routine software development as scientific research, especially when used to produce instantiations in information systems, the two may be differentiated by noting that legitimate research generates new knowledge about the problem domain. To achieve this, scientific rigour must be applied throughout the design science process; this requires the design to have a scientific grounding, the produced artefact to be a material contribution to knowledge, and the results to be testable (Hevner et al., 2004). To this end, Hevner (2007) presents a view of design science as composed of three cycles. Parallel with the design cycle, which produces the artefact, a relevance cycle ensures it remains relevant for the application domain; a rigour cycle ensures it remains grounded in science. In Gregor’s (2006) taxonomy, design science produces theory for design and action – i.e., the produced artefact says how to do something and its claim to solve a particular class of problems is a testable proposition.

Our description of the design science approach is adopted from Johannesson and Perjons (2014). They divide the research process into five stages: explicate the problem; define the requirements; design and develop the artefact; demonstrate the artefact; and evaluate the artefact. Figure 1 demonstrates how the result of each stage provides input to the next.

Illustration of the research design process with references to detailed content and result of each stage.

Operationalization

The aim of this study is to improve the MUAFT method. We consider previous iterations complete after the evaluation reported by Silfverskiöld et al. (2021); this study thus constitutes a new development iteration. The initial problem providing input to the design science process arises from the Swedish Armed Forces’ dissatisfaction with previous prognosis methods. Development and demonstration have been repeatedly performed since 2012, meaning the method has now been iteratively refined. To summarize the problem, as detailed more comprehensively in the introduction, this study aims to improve the process of military utility assessment of future technologies. It does this by addressing two issues relating to validity: lack of breadth of competence (P1) and lack of anonymity (P2). We refer to Silfverskiöld et al. (2021) for further details on the evaluation of the previous version of the MUAFT method.

In Stage II, the problems identified in Stage I were elaborated in discussions with the technology assessment team that typically conducts technology assessments commissioned by the Swedish Defence Materiel Administration (Försvarets materielverk, or FMV). Following this, a conceptual solution was outlined. Requirements, defined below, were derived by consulting theories on crowdsourcing and the Delphi method.

Most of the project time was dedicated to Stage III. It focused on designing, developing, and testing the web-based application, and on the design and testing of the new method. We considered adopting commercially available or open-source software but decided to develop a custom application to ensure its functionality adhered closely to our method. This also ensured control over the variables in the experiment.

An experiment was then conducted to demonstrate (Figure 1, Stage IV) and evaluate (Figure 1, Stage V) the artefact – that is, the developed assessment method. The experiment was divided into two parts. The first part, serving as a reference, was conducted as part of the regular technology forecast process, commissioned by the sponsor. The second part, referred to as the post-development part of the experiment, was conducted largely using the method described below (“Demonstration of MUAFT v2”). Minor adjustments were made to the method during the experiment, following consultations with the technology assessment team. This post-development part of the experiment fulfilled the demonstration purposes of the design science approach.

Invitations to the first seminar of the reference part of the experiment were distributed in May 2022. The first seminar was conducted in June and the second in September the same year, prior to the development of the web-based application; for details on the outcome see Hult et al. (2022). The post-development part was conducted over a period of three weeks in September 2023. In both cases, the military utility of photonic radar was assessed, using a technology report from the Fraunhofer Institute for Technological Trend Analysis (Gabel, 2021). This technology report was chosen since it was representative of the reports typically used in MUAFT and because one of the authors had the expertise to assume the role of advocate/facilitator in both parts of the experiment. To avoid biasing the results, we did not use the same expert participants for the two parts of the experiment. We cannot guarantee that those in the post-development part had not accessed the reporting from the reference part, but we have received no indication, from the participants or others, that this occurred.

Many of the experts in the reference part of the experiment, 7 out of 15, participated as part of their duties. The students and three representatives of other agencies and industry participated voluntarily. The participants in the expert group could be divided into four categories: five end users/officers, three researchers, two systems engineers and five students.

For the post-development part of the experiment, we aimed for a maximum of 25 expert participants to avoid excessive administration. Invitations to participate were sent out successively to staff at the agencies usually participating, handpicked for relevant knowledge, and to relevant student groups. The participants were not told beforehand what the task would be – only that it would concern the evaluation of new methods for technology forecasting. They participated according to their available time and to the extent of their interest. Given that the focus of the study was the method rather than the quality of overall assessments, no special effort was made to increase diversity in scientific perspectives of the expert group.

The experts interested in participating in the post-development part of the experiment responded to the invitations by filling out a web-based form, providing their email address and categorizing themselves in one of four categories: “end user” (incl. officers), “researcher”, “systems engineer” or “other”. Following basic vetting, the participants were added as users in the web-based application. The expert group in the post-development part of the experiment consisted of four end users, seven researchers, two systems engineers and six other participants. In the last category there were three officer cadets, one civilian alumnus, and two officers working with education and science and technology (S&T) respectively, according to these participants’ own categorizations. The group comprised 4 women and 15 men, aged 20 to 65, all Swedish citizens.

Both parts of the experiment were conducted with the support of the same administrator and the same advocate for the technology of interest. In the post-development part, the advocate also assumed the role of facilitator.

Before commencing the three-week demonstration period, the participants were asked to view three five-minute instructional videos. The videos were kept short to ensure a high proportion of the participants would view them. The first video presented the purpose of the experiment and details on the use of data, the second explained the overall purpose of the method, and the third provided instruction on how to access and use the web-based application. Participants were also given contact information for the system administrator in case they needed support or wished to report any problems. No such contacts were recorded during the experiment.

During the demonstration period, the technology assessment team and the authors discussed issues such as how to respond to inputs from the participants or how the facilitator should act in a particular situation. The experiences from these discussions were used to make continuous updates to the method throughout the experiment.

The post-development method is documented in the section “Demonstration of MUAFT v2” below.

The evaluation in Stage V compares the extent to which the reference part of the experiment and the post-development part of the experiment meet the derived requirements from Stage II. The comparison is based on data collected from anonymous surveys administered to participants in the experiment and on observations made by the authors during and after the experiment.

Details on the survey, including the questions and response rates, are presented below in the “Evaluation” section. Only 20 people were invited to the post-development survey; the drawback with such a small number of participants is that we can only claim differences in estimates to be indicative rather than statistically significant.

Requirements of the Modified Method

The context and constraints from the previous version of the method were unchanged. The assessment of the future military utility of a technology took place within the framework of the Swedish technology forecast process (Andersson et al., 2019; Silfverskiöld et al., 2021). The original key requirements for the method were that it should include assessments of consequences to the military capability of interest seen as a whole, and perform cost-efficient assessments.

However, as already indicated, the validity of the established method is weak on two significant grounds: too few expert participants meant competence was insufficiently broad (P1), and the discussions were not sufficiently anonymous (P2). The proposed solution to both problems, supported by previous research, was to replace the in-person seminars with a web-based application for discussions among experts online.

To elaborate, experience has shown that a single expert participant can greatly influence the assessment in the current MUAFT method by coming up with a key idea or pointing out something significant during the work. This has raised questions about how many key ideas have been missed in those assessments where the “right” person was not present. As a wide range of competences and experiences is also important for uncovering all the consequences of the introduction of a new technology, a large panel with diverse professional and academic backgrounds is desirable. The group that currently participates in the MUAFT evaluations is comparatively homogeneous and consists of officers with similar professional and educational backgrounds, most of them affiliated with the institution performing the evaluations.

However, with a suitable web-based application in place, and drawing on previous research on crowdsourcing, we have reason to expect a significant increase in the number of participants and anticipate that the workload for each group member will decrease; the total time invested by each participant in a complete technology assessment can thus be expected to be only a few hours. This streamlined approach aims to attract individuals with valuable knowledge, experience, and insights, such as soldiers, non-commissioned officers, junior officers with recent operational experience, high-ranking officers, and subject-matter experts in key drivers of military capability development.

The second issue with the current MUAFT method is the lack of anonymity. Although voting is performed anonymously, the meetings where expert participants perform the assessment have been conducted either face-to-face or through video conferences, compromising anonymity in the discussions (Silfverskiöld et al., 2021). To rectify this, the web-based application should ensure that the participants cannot discover each other’s identities. So that participants can interact and follow each other’s reasoning throughout the process, however, each should have a fixed pseudonym visible to the others. The proposed changes aim to reduce bias in comparison to the current method and mitigate effects such as prestige.

To summarize, the MUAFT method must be modified to meet the following new requirements. A web-based application for discussions among experts online shall allow for asynchronous (Req. 1) and anonymous (Req. 2) participation, regardless of participants’ locations. This is in turn expected to enhance opportunities to participate and reduce the time invested by invited experts (Req. 3). The logic is that with less work needed from each participant, the updated method will make it possible for a wider range of experts to participate, thereby increasing the validity of the method. For us to consider the modification of the method successful, it is also necessary that the participating experts do not perceive the method’s overall usability as an obstacle to participation (Req. 4).

Artefact – The MUAFT Method (v2)

Design and Development of the Application

Based on the requirements, development of a web-based application to support the post-development MUAFT method started with the identification of the minimum set of features required for a working product. Following this, the application was developed using modern software development principles. The application has three user roles: administrators, facilitators, and participants. Administrators are responsible for creating new assessments, adding users to the system, and assigning users to assessments; a system administrator usually fills this role. Most users are facilitators or participants, and the application is designed so that a user who serves as a facilitator in one assessment can simultaneously be a participant in another. Facilitators have responsibilities similar to those of the facilitator in the current MUAFT method – they lead the discussion and are responsible for updating the report to reflect the group’s assessment. Participants support the assessment by making comments and suggesting changes to the report according to their expertise.
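The per-assessment nature of the facilitator and participant roles can be sketched as a small data model. This is a minimal illustration under stated assumptions, not the application’s actual implementation: the names `Role`, `Assessment`, `assign`, and `role_of`, the example email address, and the second assessment title are hypothetical. The administrator role is left out because, as described above, it is system-wide rather than tied to an assessment.

```python
from dataclasses import dataclass, field
from enum import Enum


class Role(Enum):
    FACILITATOR = "facilitator"
    PARTICIPANT = "participant"


@dataclass
class Assessment:
    title: str
    # Maps a user's email address to that user's role in this assessment only.
    roles: dict = field(default_factory=dict)

    def assign(self, email, role):
        """Give a user a role that applies within this assessment."""
        self.roles[email] = role

    def role_of(self, email):
        """Return the user's role here, or None if the user is not assigned."""
        return self.roles.get(email)


radar = Assessment("Photonic radar")
other = Assessment("Another technology")  # hypothetical second assessment

# The same user facilitates one assessment and participates in another.
radar.assign("a@example.org", Role.FACILITATOR)
other.assign("a@example.org", Role.PARTICIPANT)
```

Storing the role on the assessment rather than on the user is one simple way to let the same person hold different roles in simultaneous assessments.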

Throughout development, the question of how to stimulate and motivate users during the entire assessment process was a central consideration. User motivation and engagement are decisive for the quality of the predictions (Mellers et al., 2015). A key factor in this is that the user interface should be intuitive and perceived as easy to use.

To provide anonymity, participants are identified by a participant number. The numbers are unique to each assessment: someone participating in two simultaneous assessments may have a different number for each. Users use their email address to log in. This is the only piece of personal information collected about them. Once an assessment is finished, the connection between participant number and user identity can be removed in the database, rendering the data from an assessment truly anonymous. In addition to ensuring anonymity in accordance with the Delphi method, this also ensures compliance with data protection regulations such as GDPR and simplifies sharing of assessment data.
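The pseudonymization scheme described above can be sketched as follows. The sketch assumes a three-digit number range and an in-memory mapping in place of the database; the class name `PseudonymRegistry` and its methods are hypothetical and do not describe the application’s actual schema.

```python
import secrets


class PseudonymRegistry:
    """Holds the email-to-number link for ONE assessment until it is closed."""

    def __init__(self):
        self._number_by_email = {}

    def enrol(self, email):
        """Return a participant number that is unique within this assessment."""
        if email not in self._number_by_email:
            taken = set(self._number_by_email.values())
            if len(taken) >= 900:  # assumed range 100-999 is exhausted
                raise RuntimeError("no free participant numbers left")
            number = 100 + secrets.randbelow(900)
            while number in taken:
                number = 100 + secrets.randbelow(900)
            self._number_by_email[email] = number
        return self._number_by_email[email]

    def anonymize(self):
        """Delete the link to identities; return the numbers that remain."""
        numbers = sorted(self._number_by_email.values())
        self._number_by_email.clear()
        return numbers


# One registry per assessment, so the same user gets independent numbers.
first = PseudonymRegistry()
second = PseudonymRegistry()
n_first = first.enrol("a@example.org")
n_second = second.enrol("a@example.org")  # drawn independently of n_first
again = first.enrol("a@example.org")      # stable within one assessment
remaining = first.anonymize()             # identities are no longer linked
```

Once `anonymize` has run, opinions stored under a participant number can no longer be tied back to an email address, which mirrors the removal of the connection in the database after an assessment is finished.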

Figure 2 below shows the participant user interface. The left part of the screen is dedicated to participant opinions. At the top is the participant’s current opinion (1) along with an agreement slider (2). The participant number (3) is also displayed here. The slider is meant to be used by participants to indicate their current agreement with the current topic of discussion, as indicated by the facilitator. Participants can update their opinions and agreement levels at any time. Below the participant’s opinion is a list of other participants’ opinions (4). The list can be sorted by several variables, such as time or agreement level. It is possible to view all of a participant’s past opinions by clicking the “opinion history” link (5).

Screenshot of the participant user interface.
*Note*: The figure has been annotated with numbers in red, showing the location of the opinion input field (1), agreement slider (2), user number (3), opinion list (4), opinion history link (5), facilitator message (6), assessment file links (7), and draft report (8).
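The opinion list and opinion history described above can be illustrated with a short sketch. The record layout, the 0 to 100 slider scale, the function names, and the sample opinions are assumptions made for illustration; the source specifies only that opinions carry an agreement level, can be updated at any time, and can be sorted by variables such as time or agreement level.

```python
from dataclasses import dataclass
from datetime import datetime


@dataclass(frozen=True)
class Opinion:
    participant: int   # participant number, not an identity
    text: str          # opinion text, entered as Markdown
    agreement: int     # slider value; the 0-100 scale is an assumption
    posted: datetime


def latest_per_participant(history):
    """Return each participant's most recent opinion."""
    latest = {}
    for op in sorted(history, key=lambda o: o.posted):
        latest[op.participant] = op  # later entries overwrite earlier ones
    return list(latest.values())


def opinion_history(history, participant):
    """All opinions by one participant, oldest first."""
    return sorted((o for o in history if o.participant == participant),
                  key=lambda o: o.posted)


# Invented sample data.
history = [
    Opinion(17, "Range claims look optimistic.", 30, datetime(2023, 9, 4, 9, 0)),
    Opinion(42, "Useful for passive sensing.", 80, datetime(2023, 9, 4, 10, 0)),
    Opinion(17, "Convinced after the update.", 70, datetime(2023, 9, 6, 8, 0)),
]
current = latest_per_participant(history)
by_agreement = sorted(current, key=lambda o: o.agreement, reverse=True)
by_time = sorted(current, key=lambda o: o.posted, reverse=True)
```

Keeping every submitted opinion and deriving the current view from the full history is one way to support both the sortable list (4) and the “opinion history” link (5).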

The right part of the screen is dedicated to report-related information. At the top is the current message from the facilitator (6). The facilitator typically uses this feature to ask participants to focus their comments and opinions on a specific issue or part of the report. This is followed by one or more links to files related to the assessment (7), such as the technology report. Below that is the current version of the report (8).

In addition to the functionality available to participants, facilitators can update the report, upload files, send messages to participants, and view statistics on participation and agreement levels. The facilitator leads the discussion through messages to participants and periodically updates the document based on participants’ comments and opinions. Both the report and participant opinions are entered in Markdown format. This provides a user-friendly way to include basic formatting, links, tables, and so on. It also makes the report easy to render as a web page and to export to formats such as Word or PDF.
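The kind of participation and agreement statistics available to the facilitator might be computed along the following lines. The source does not state which statistics the application reports; the three measures below, the function name `participation_stats`, and the sample values are assumptions.

```python
from statistics import mean, pstdev


def participation_stats(enrolled, agreement_by_participant):
    """Summarize who has responded and how much they agree.

    `enrolled` is the set of participant numbers in the assessment;
    `agreement_by_participant` maps a number to the latest slider value.
    """
    responded = enrolled & set(agreement_by_participant)
    values = [agreement_by_participant[p] for p in responded]
    return {
        "participation_rate": len(responded) / len(enrolled) if enrolled else 0.0,
        "mean_agreement": mean(values) if values else None,
        # Population standard deviation as a simple measure of disagreement.
        "agreement_spread": pstdev(values) if values else None,
    }


# Invented example: four enrolled participants, three have responded.
stats = participation_stats({17, 42, 58, 63}, {17: 70, 42: 80, 58: 60})
```

A low participation rate could prompt the facilitator to send a reminder message, while a large spread might suggest that a topic needs further discussion before the report is updated.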

Apart from minor bugs and usability issues, the internal test of the application also revealed that users found it difficult to remember to visit the platform regularly to participate in discussions during the assessment period. This prompted the addition of email alerts for new comments, updates to the report, and facilitator messages.

The application records all actions in a database. This includes all participant opinions, facilitator messages, and changes to the report, along with timestamps. During each assessment, the data is used to provide the facilitator with statistics about participation. After an assessment, the data can be used to support analysis. Another feature of the high-fidelity logging of opinions and changes to the report is that it allows the origin of ideas, both those that made it into the report and those that were rejected, to be traced. This traceability is an improvement over the existing MUAFT method.
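To make the logging idea concrete, the following is a minimal sketch of an append-only action log, written in Python with the standard-library `sqlite3` module. The table name, column names, and helper functions are hypothetical and are not taken from the actual application, and the example entries are invented. The sketch only illustrates how timestamped records of opinions and report changes could allow an idea to be traced back to its first appearance.

```python
import sqlite3
from datetime import datetime, timezone

# Hypothetical schema: one row per recorded action (not the application's real schema).
SCHEMA = """
CREATE TABLE IF NOT EXISTS action_log (
    id          INTEGER PRIMARY KEY AUTOINCREMENT,
    recorded_at TEXT NOT NULL,  -- ISO 8601 timestamp (UTC)
    actor       TEXT NOT NULL,  -- anonymous participant number or 'facilitator'
    action      TEXT NOT NULL,  -- 'opinion', 'message', or 'report_update'
    body        TEXT NOT NULL   -- Markdown text of the opinion, message, or report
)
"""


def open_log(path=":memory:"):
    """Open (or create) the log database and make sure the table exists."""
    conn = sqlite3.connect(path)
    conn.execute(SCHEMA)
    return conn


def record(conn, actor, action, body, when=None):
    """Append one timestamped action to the log."""
    when = when or datetime.now(timezone.utc)
    conn.execute(
        "INSERT INTO action_log (recorded_at, actor, action, body) "
        "VALUES (?, ?, ?, ?)",
        (when.isoformat(), actor, action, body),
    )
    conn.commit()


def first_mention(conn, phrase):
    """Return (recorded_at, actor, action) of the earliest entry that
    contains the phrase (case-insensitive), or None if there is none."""
    return conn.execute(
        "SELECT recorded_at, actor, action FROM action_log "
        "WHERE instr(lower(body), lower(?)) > 0 ORDER BY id LIMIT 1",
        (phrase,),
    ).fetchone()


# Invented example: an opinion that later appears in the report.
log = open_log()
record(log, "participant 7", "opinion", "Jamming robustness is unclear.")
record(log, "facilitator", "report_update", "W: jamming robustness is unclear.")
origin = first_mention(log, "jamming robustness")
print(origin[1], origin[2])  # participant 7 opinion
```

Ordering by the auto-incremented `id` rather than by timestamp is a design choice in this sketch: it keeps the insertion order unambiguous even if two actions share the same timestamp.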

Demonstration of MUAFT v2

The demonstration presented follows the process structure of the developed method, as shown on the right side of Figure 3. With a focus on changes in the updated MUAFT method, the description aims to provide a coherent overview of the method’s use; the updated method is thus described largely based on the previous description by Andersson et al. (2019).

Process views of the initial method (left) and the developed MUAFT v2 method (right).

First, the sponsor delivers technology assessment reports from selected research institutes to the assessment group (see the “technology report” step in Figure 3). For the experiment demonstrating the updated method, a report on photonic radar technology was used (Gabel, 2021).

Preparation

Ideally, the report, which should provide an estimate of the current state and a prediction of the future state of the technology within the specified time frame, is assigned to an analyst whose expertise and interest align well with the technology in focus. The analyst evaluates the report assuming the role of an advocate, supporting the technology’s adoption. The report should be accepted as largely accurate; otherwise, the assessment team risks shifting its focus toward reviewing the research institute’s work instead.

Building on the report, the technology advocate designs one or more conceptual technical systems that leverage the new technology, situating them within one or two plausible future military scenarios. These systems and scenarios are selected to clearly demonstrate the technology’s advantages. By framing a solution within a specific scenario, the relevant military capabilities are indirectly defined, enabling an expert group to evaluate the potential of the technology’s military utility.

The technology advocate then drafts a memo to serve as a foundation for discussions on its application. The initial version should provide an overview of the technology’s evolution, covering its past, present, and anticipated future states, along with its identified opportunities and limitations. If the input report includes additional forces for change, such as market forces or regulatory developments, these should also be summarized in the memo.

Future state of the technology:

Microwave photonics could circumvent the above-mentioned problems of conventional high-frequency electronics. This technology could make it possible to achieve low phase noise, ultra-large bandwidth, and excellent tunability, together with low-distortion signal propagation and very small transmission losses. Photonic technologies also allow for multidimensional multiplexing, energy-efficient purely analogue and ultra-fast signal processing, as well as immunity from electromagnetic interference, such as electronic warfare and High Power Microwaves. Photonic radar signal generation and processing additionally benefits from the availability and maturity of highly coherent optical sources. In contrast to Quantum radars, Photonic radars have a traditional RF front-end… (Entry by facilitator of the demo session).

The process continues with identifying assumptions that support the future conceptual technical system and the scenario at a specified future time, typically twenty years ahead. While these assumptions primarily focus on the development of the technology, they may also encompass other factors influencing military capabilities. Following this, a description of the conceptual technical system or systems is provided, outlining the key elements and functions. Based on these assumptions, future scenarios are then described, offering a vision of what might unfold. Finally, an initial iteration of a SWOT analysis – an analysis of strengths, weaknesses, opportunities, and threats (Weihrich, 1982) – is conducted to assess the potential use of the technology within the assigned scenarios (see “Asynchronous Web-Based Review” below).

The next four sections of the memo are added during the asynchronous web-based assessment and contain assessments of several key aspects.

First, the technology’s contribution to military effectiveness and its impact on the capabilities of the military actor in focus are evaluated. Next, the technology’s footprint is assessed – specifically its impact on military suitability and affordability. Following this, an assessment is made regarding the need for military research and development; finally, the aggregated future military utility of the technology in focus is evaluated (see “Conclusions on military utility and recommendations” below).

The memo should be comprehensive enough that a participant does not need to read the original report or other literature to participate meaningfully, but not so lengthy that participation is discouraged.

The Asynchronous Web-Based Review

The technology advocate will facilitate the asynchronous web-based assessment, proposed to span four weeks. The objective is to produce a credible forecast of a military scenario that highlights the technology’s potential, drawing fully on the expert group’s knowledge and experience.

Before the assessment begins, the facilitator uploads the technical report and enters the first version of the prepared memo in the application. This is done well in advance to ensure that participants have sufficient time to prepare. A preliminary schedule for the web-based assessment seminar is also provided, allowing participants to plan their involvement. Although participants can choose individually when to provide feedback on different sections of the memo, they must broadly adhere to the schedule to ensure they do not miss crucial parts of the assessment.

During the assessment, the facilitator seeks to stimulate the discussion through interaction with participants, progressively supplementing the memo with input agreed to be worthy of recording. The facilitator also coordinates the seminar, deciding when to proceed to the next step. The primary role of the other participants is to critically evaluate the proposed assessment based on the memo’s current content and their own knowledge and experience, while also providing suggestions for additional input.

Following this, the application of the technical system within the specified military scenario, including the integration of the technology in focus, is analyzed in four steps.

Step 1: SWOT Analysis

The purpose of a SWOT analysis (Weihrich, 1982) in this context is to identify strengths, weaknesses, opportunities, and threats from the use of the conceptual technical system in the specific scenario. This is used as a basis for continued assessment.

The facilitator enters a few examples of possible findings into the SWOT matrix in the report to encourage input from the participants. Then, the participants suggest further input to the matrix until there are no new opinions, or the facilitator decides to move on to the next step.

(S): Considerably increased detection of stealthy UAVs compared to classical radar systems. … (W): Uncertainty if PR will be more robust or have smaller size. … (O): PR could possibly be used on larger platforms, saving volume, weight, and energy consumption and allowing for better robustness of sensors, communication systems, and EW systems. … (T): Possibly development and implementation of PR-systems will be too expensive to be affordable. (Entry by facilitator of the demo session)

Step 2: Assessment of Capability Impact

The primary objective of the second step is to establish a foundation for evaluating the military effectiveness dimension of military utility (Andersson et al., 2015). The impact of the technology is typically evaluated in relation to the elements of combat power found in various military doctrines: fires, movement and manoeuvre, sustainment, command and control, protection, and intelligence and information. In the photonic radar experiment, the technology was assessed to have its greatest impact on the fires element. “Increased capability to handle low flying, small stealthy targets (i.e., targets with small radar cross section [RCS]).” (Entry by facilitator of the demo session)

Step 3: Assessment of Footprint

The aim of the third step is to establish a foundation for evaluating the military suitability and affordability aspects of military utility. The assessment produces a compilation of expected footprints, derived from the technology in the scenario, and organized according to the DOTMLPFI model (NATO ACT, 2021). Table 1 outlines the factors and includes examples from the photonic radar assessment. The example statements are from the reference part of the experiment performed using the previous MUAFT method.

Table 1

Footprints from the assessed photonic radar technology.

INFLUENCED FACTOR	FOOTPRINT
Doctrine	MIMO might imply changed doctrines and tactics regarding operational use of the sensor chain.
Organization	The unmanned systems and the new sensor chain will have to be incorporated in the organisation.
Training	Commanders need education to understand the usability of photonic radars to exploit new capabilities due to Low Probability of Intercept (LPI).
Materiel	New unmanned platforms to carry photonic radars and new communication links, possibly satellites, are needed.
Leadership	A new kind of leadership may have to be developed to communicate with machines.
Personnel	Fewer personnel will be required on the introduction of unmanned platforms.
Facilities	Facilities for maintenance of unmanned platforms and photonic radars are needed.
Interoperability	Military standards and standardization agreements must be followed.

*Note*: The results from the evaluation of the updated method were considered insufficient to illustrate this step.

Step 4: Assessment of the Need for Military R&D

The fourth step seeks to determine whether military research and development is needed to enable the technology’s integration into service. In the demonstration, the suggestion read:

There is a need for SwAF funded research on the following issues: Assessment of the robustness against jamming of photonic radar; to which extent can photonic radar be detected by an opponent; and in which way can photonic radars be subject to deception and what are the countermeasures? (Entry by facilitator of the demo session)

Conclusions on Military Utility

Before proceeding, the facilitator ensures that the memo is updated with all relevant input from the earlier phases to create a stable basis for conclusions.

Either a consensus or a stable difference in opinion regarding the assessment of the military utility of the photonic radar technology is pursued using a variant of the Delphi method (Jaiswal, 1997, pp. 213–215). The facilitator requests that the anonymous participants provide their personal overall assessment of the technology’s future military utility, choosing one of four responses (“significant”, “moderate”, “uncertain”, or “negligible”), and submit their responses before a specified deadline. After the deadline, the facilitator asks participants to motivate their respective assessments. The facilitator then asks each expert participant for a new assessment. The procedure is iterated until consensus has been reached or until the results have converged towards a stable outcome.
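The stopping rule of this iterative procedure can be sketched in Python. This is a minimal illustration, not a description of how the procedure was actually carried out. The text does not define “consensus” or “stable outcome” numerically, so the sketch assumes that consensus means every participant gives the same response, and that stability means the distribution of responses is unchanged between two consecutive rounds. All function names are hypothetical and the votes are invented.

```python
from collections import Counter

RESPONSES = ("significant", "moderate", "uncertain", "negligible")


def tally(votes):
    """Count how many participants chose each of the four responses."""
    unknown = set(votes) - set(RESPONSES)
    if unknown:
        raise ValueError(f"unknown responses: {sorted(unknown)}")
    counts = Counter(votes)
    return {response: counts[response] for response in RESPONSES}


def round_status(previous_votes, current_votes):
    """Classify a round as 'consensus', 'stable', or 'continue'.

    Assumed definitions (not taken from the source):
    - consensus: all participants give the same response
    - stable: the tally equals the tally of the previous round
    """
    current = tally(current_votes)
    if current_votes and max(current.values()) == len(current_votes):
        return "consensus"
    if previous_votes is not None and tally(previous_votes) == current:
        return "stable"
    return "continue"


def run_delphi(rounds):
    """Walk through successive rounds of votes until the stopping rule triggers.

    Returns (round number, status, tally of that round).
    """
    previous = None
    for number, votes in enumerate(rounds, start=1):
        status = round_status(previous, votes)
        if status != "continue":
            return number, status, tally(votes)
        previous = votes
    final = tally(rounds[-1]) if rounds else {}
    return len(rounds), "continue", final


# Invented votes from four anonymous participants over three rounds.
example_rounds = [
    ["significant", "moderate", "moderate", "uncertain"],
    ["moderate", "moderate", "moderate", "uncertain"],
    ["moderate", "moderate", "moderate", "uncertain"],
]
number, status, counts = run_delphi(example_rounds)
print(number, status)  # 3 stable
```

In practice, a facilitator would probably apply a looser criterion than exact equality of tallies; the strict version is used here only because it is unambiguous.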

Based on the collective voting results, the technologies are categorized into three groups of recommendations. The sponsor is usually advised to pursue technologies that significantly enhance its military capabilities, to monitor those with moderate or uncertain impact, and to avoid investing in technologies with negligible impact. The demonstration concluded with the following overall assessment of the military utility of photonic radar technology:

The expert group estimates that TRL 9 will be reached no earlier than 2045. Using the Delphi method, the expert group has assessed photonic radar technology as having moderate potential for military utility. This assessment is motivated by the potential for the SwAF to acquire a new capability with photonic radars: the ability to detect slow, low-flying objects with a small radar cross-section (RCS), as well as objects on the surface. We recommend that the SwAF initially monitors the development of photonic radars and considers initiating a research project on the subject at a later stage, if suitable. (Entry by facilitator of the demo session)
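The recommendation rule described above, which maps the four Delphi responses onto three groups of recommendations, can be written as a simple lookup. The sketch below is illustrative only; the short labels “pursue”, “monitor”, and “avoid” are shorthand chosen here for the three groups and do not appear as formal category names in the method.

```python
# Mapping from the assessed military utility to the recommendation group,
# following the rule stated in the text. The labels are shorthand.
RECOMMENDATION = {
    "significant": "pursue",
    "moderate": "monitor",
    "uncertain": "monitor",
    "negligible": "avoid",
}


def recommend(assessment):
    """Return the recommendation group for an overall assessment."""
    try:
        return RECOMMENDATION[assessment]
    except KeyError:
        raise ValueError(f"unknown assessment: {assessment!r}") from None


# The demonstration assessed photonic radar as having moderate potential.
print(recommend("moderate"))  # monitor
```

The result for the demonstration agrees with the quoted recommendation that the SwAF initially monitors the development of photonic radars.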

Evaluation

This section reports on results from the evaluation stage (Stage V) in the design science approach described in the “Method” section above. First, the results of a survey of the participants in both parts of the experiment are presented. Next, the experiences of the authors after the post-development experiment are presented. A brief comparison of the two resulting reports follows, and the section ends with an analysis of the evaluation results.

Expert Participant Survey Results

To provide a baseline, a questionnaire with six questions was first sent to all experts and students who had participated in at least one of the seminars conducted in the last two years. There were 20 responses. After the post-development experiment, a new questionnaire with the same six questions and an additional ten was sent to the ten expert participants. The first two questions in both questionnaires relate to the participants’ experience with the method and to their estimates of the time they had invested in the process. The last question in both questionnaires offered an opportunity to make text comments. All other questions were formulated as statements with answers on a Likert-type scale (“strongly disagree”, “disagree”, “neutral”, “agree”, “strongly agree”, and “don’t know”). The questions and survey results are presented in Figures 4 and 5.

Survey responses to questions with interval variables in the reference and post-development surveys.

Survey responses from participants in assessments of the military utility using the MUAFT method before and after the reported development.
*Note*: “Don’t know” responses are reported separately on the right-hand side.

Half of the respondents in the reference survey had participated in the assessment of two or fewer technologies, while the other half had more experience; two respondents had participated in ten or more assessments. We note that four respondents in the reference group, which consisted of previous participants in MUAFT assessments, indicated that they had participated in no prior assessments. We believe this may be due to them misinterpreting the question as asking for the number of technologies evaluated prior to their previous participation. In comparison, the post-development respondents reported less experience with the method, with a majority reporting no previous experience. The estimated time invested by the expert participants is significantly lower in the post-development experiment. The estimates of how easy it is to participate indicate that the method’s usability is high both before and after development. The participants also seem to consider their knowledge and experience to have been drawn on in both cases. When it comes to the question of sufficient competence in the expert groups, the participants appear to be somewhat neutral. While the median value for perceived competence in the post-development experiment is slightly positive, the difference is not significant. The estimates of the method’s reliability indicate that the participants do not have strong opinions in either case.
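As an illustration of how a median can be obtained from Likert-type responses, the sketch below codes the five ordered answers as 1 to 5 and counts “don’t know” answers separately, in line with how they are reported in the figures. The numeric coding, the function name, and the example responses are assumptions made here for illustration; they do not reproduce the survey data or the authors’ exact statistical procedure.

```python
from statistics import median

# Assumed ordinal coding of the Likert-type scale (1 = most negative).
LIKERT = {
    "strongly disagree": 1,
    "disagree": 2,
    "neutral": 3,
    "agree": 4,
    "strongly agree": 5,
}
DONT_KNOW = "don't know"  # counted separately, excluded from the median


def summarize(responses):
    """Return the number of scored answers, their median, and the
    number of 'don't know' answers for one survey statement."""
    scores = [LIKERT[r] for r in responses if r != DONT_KNOW]
    dont_know = sum(1 for r in responses if r == DONT_KNOW)
    return {
        "n": len(scores),
        "median": median(scores) if scores else None,
        "dont_know": dont_know,
    }


# Invented responses to a single statement.
example = ["agree", "neutral", "agree", "don't know", "strongly agree"]
print(summarize(example))
```

With the coding above, a median greater than 3 corresponds to a slightly positive response and a median of 3 to a neutral one.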

The additional questions, sent to participants after the post-development experiment, were intended to provide information on the possible effects of developing the method on assessment participation, time invested, and usability. The questions and survey results are presented in Figure 6.

Survey responses to the additional questions sent to participants in the post-development experiment.
*Note*: “Don’t know” responses are reported separately on the right-hand side.

Participants’ opinions on whether they had participated to the extent they wanted varied widely, as indicated by the responses to the first question in Figure 6. Respondents indicated a clear uncertainty about whether the mix of experiences and competences in the group was sufficient for the task, with only one respondent giving a response other than “neutral” or “don’t know”. We interpret this as resulting from the strong degree of anonymity produced by the new method and recognize that the method needs improvement when it comes to how competence and experience can be conveyed among participants. There were no strong opinions on the time allotted for the assessment task, as indicated by the results of the eighth statement. Respondents were divided about the usability of the user interface.

Expert Participant Responses

Many text responses in the reference survey corroborate the conclusions on the utility of the method reported by Silfverskiöld et al. (2021). Some comments (“it is critical that experts are included in the seminars”, for example, or “the biggest disadvantage I have seen this year is that some seminars have had few participants”) specifically concern the participation of experts. Other comments also concern the amount of time invested in work; “the procedure with two seminar sessions is too time-consuming”, one respondent wrote, while another wrote that they “would have liked to have participated more, but have been short of time … if you have to attend every seminar and read all the texts, time that you often don’t have and then prioritize away.”

In the post-development survey, participants indicated that they had invested less time than they had aimed for: “I had a couple of other tasks at work competing for time”, wrote one; it was “difficult to balance participation in the study with other duties” wrote another. Some of the comments indicated a need for clearer guidelines on when to do the work: “On a couple of occasions I missed the deadline because of things at work and at home”, for example; another wrote that “with a clearer timetable before, it would have been easier to plan MUAFT time in the calendar”. On the other hand, one of the participants thought that there was still “too high a threshold to participate, there is no way (and it is likely difficult to create a way) to participate to different extents – or for short moments.”

Some comments in the reference survey specifically concern the method’s usability, and reinforce a perception that poor attendance before development is not a question of the difficulty of being an expert participant: “It is easy to participate”, for example, and “the forecasts are well thought out and the methodology around them good. I look forward to being able to participate more times”.

The comments on usability post-development are all suggestions to improve the web-based application (“There is an ‘agree’ slider which is very confusing. What do I agree with?”). However, some of the suggestions are related to increased participation. One suggestion is to “take inspiration from, for example, social media platforms for increased user interaction … you should be allowed to react to short pieces of text and graphics”. Another concerns support for more types of devices, given that “the website only works on a computer and not on a phone … it is more difficult to find time to participate.”

Research Team Observations

One week after the post-development experiment concluded, the research team compiled their observations. The observation subjects were successively identified during the post-development demonstration and were set out in accordance with the major activities in the assessment process demonstrated (see Table 2). As the observations are too detailed to reproduce here, they have been incorporated into the post-development version of the method as documented in the section “Design and Development of the Application” above.

Table 2

The structure used to capture experiences from the research team.

MAJOR PROCESS ACTIVITIES	CONCERNS DISCUSSED
Choose advocate	Simultaneously being the facilitator?
Preparation	Providing technical support; providing education to expert participants; maintaining the application; entering the first part of the memo; selecting and inviting expert participants; offering incentives for participation; administering participants with assured anonymity.
Asynchronous web-based assessment	Dividing the review into appropriate steps; the facilitator’s dialogue with the participants, including formulating messages, updates, and their frequency; management of those invited who do not participate from the outset; management of additional participants during review period; length of review period and possible end criteria; tool improvements towards increased validity, increased anonymity and robustness/reliability.
Conclusions and recommendations	Whether conclusions and recommendations should form a separate step in the process; design of voting procedure for enhanced anonymity.

A Comparison of the Resulting Assessment Reports

The research design does not require a comparison of the two reports produced after the reference and post-development assessments respectively. An overview comparison, however, does show that the reference report is richer in comments, arguments, and recommendations, and that the two processes end up with different assessments for the technology of interest. In the reference report, photonic radar technology is assessed to be of significant future military utility.

Analysis of Results

Evaluation results are analyzed relative to the derived requirements on the artefact stated in the relevant section above (“Requirements of the Modified Method”).

Requirement 1: The results show it is feasible to integrate an online web-based tool with the MUAFT method to allow for asynchronous participation.

The conclusion is supported with three arguments. First, post-development, the estimated time invested was lower than before (Req. 3). Second, using a web-based tool does not seem to have been an obstacle to participation (Req. 4). Out of 20 registrations, 15 people participated actively, from every desired category: end users, researchers, systems engineers, and others (students). Participants were divided when it came to the usability of the web-based application, and some respondents were unable to participate as fully as they desired; this points to potential for improvement. Enhancing the user experience and minimizing usage barriers are likely to boost engagement, thereby increasing the validity of the results. Despite the identified usability issues, the findings indicate that post-development participants remain satisfied with both how their knowledge and experience were utilized and the method’s reliability.

Requirement 2: The results show that the conditions for anonymous participation have improved considerably.

The demonstration illustrates the feasibility of designing the web-based application and the method so that participants may engage in the assessment without disclosing their identities to others involved. While participants may inadvertently disclose their identity in their posts, it is reasonable to assume that this risk can be mitigated through proper training. Before development, in comparison, participants met in person during two seminars.

There is one disadvantage: it appears that there will be an increased workload placed on the management team when employing the developed method.

First, a technical administrator is required to set up the application, to administer accounts, and to provide support – meaning a new competence category has become necessary. Second, the results indicate a correlation between facilitator activity and the participants’ desire to invest time in the assessment. Post-development, the facilitator should engage with the expert group daily throughout the entire assessment period. Before development, work was concentrated on the two seminars and the participants’ own preparation. Guidelines for facilitator engagement with the expert group, to encourage involvement, were developed successively during the experiment; there is a need for further study.

Discussion

A method to assess the military utility of future technology has been developed to address issues regarding validity and the anonymity of participating experts reported in an earlier study. The solution draws on theories of crowdsourcing, seeking to broaden perspectives and to increase participation in the expert group, and on theories of the Delphi method regarding the anonymity of group participants and recent research on argumentative surveys, seeking to make better use of the collective knowledge of the group. The new version of the method has been implemented through the development and use of a web-based application. The research study was structured and conducted using a design science approach. The analysis of the evaluation results shows that the modification has been successful. It is possible to use a web-based application to ensure anonymous discussions over an extended period, allowing experts to participate at locations and hours of their own choosing.

There are limitations to the study. Lacking both an expert group of sufficient size and diverse composition and adequate time to reach a definitive result, we have not been able to demonstrate that the method improves validity. We have, however, shown that the web-based application can be used for expert participation, and argue that the only requirement for real-world implementation is a group of satisfactory composition and number. A survey of participants in the tool’s demonstration has confirmed that confidence in the method’s validity and reliability was not negatively affected by the modification.

The study has not tested whether the method’s original key requirements continue to be met – specifically whether the method includes assessments of the consequences to the overall military capability of interest and whether the assessments are cost-efficient (see the section “Requirements of the Modified Method” above). However, the assessments remain structured according to the capability-centric dimensions of military utility: military effectiveness, military suitability, and affordability (Andersson et al., 2015). Additionally, with results indicating a significant reduction in the work time invested by each participating expert, we believe more experts could participate within the original method’s budget. It is therefore reasonable to claim that the original requirements can still be met with the modified method.

But how novel is the use of crowdsourcing in this problem domain? In 2013, Majchrzak and Malhotra proposed a research agenda to delineate the contributions of information systems research to crowdsourcing for innovation – a field that has grown significantly since then (Ghezzi et al., 2018). This body of work, however, primarily leverages the public to solve specific problems rather than to predict future developments. When combining this research area with the community investigating methods for forecasting technological development, the available studies become notably fewer. Nevertheless, there are some relevant examples. Passive crowdsourcing has shown promise in producing statistical predictions about future technological developments (Briscoe et al., 2015). Other instances of success are found in more specialized domains, such as predicting chart-topping artists and music (Steininger & Gatzemeier, 2019) or enhancing the accuracy of disaster predictions through crowdsourced intelligence integrated into numerical models (Wang et al., 2024). The uniqueness of our application lies in its general perspective on technological development, its long-term horizon, and its relevance to the military domain.

Regarding our application of the Delphi method, we believe our study contributes to the emerging field of dynamic argumentative Delphi research (Cuhls, 2024). Developments in this field will be monitored, especially to further develop the role of the facilitator in future research.

The primary goal was to increase the validity of the method, and our results suggest that we have achieved this. By significantly reducing the time each participant spends on the work and by inviting volunteers with a high level of interest and knowledge in the specific technology area under study, we expect it to be possible to conduct a broader assessment at the same or lower cost than before. The only cost-increasing aspect of the updated method is the need for additional technical administration.

There likely remain other benefits not yet accounted for. One major benefit anticipated is the substantial growth of the network discussing these issues, including the automatic dissemination of knowledge. End users are also invited to participate in the assessments. The extent of these benefits will need to be determined by future research.

Conclusion

Technology forecasting is an important activity in the capability development of today’s armed forces. This study takes a design science approach to investigate whether crowdsourcing and the anonymous, asynchronous participation of experts using a web-based application is a feasible path to improve the validity of existing methods for assessing the military utility of future technologies.

The study’s primary findings include the development of a dedicated web-based application designed to facilitate successive assessments of the military utility of future technological developments by engaging online participants. The results indicate that the application effectively supports asynchronous assessments and, along with the procedures it enables, can be configured to preserve participant anonymity. These enhancements to the MUAFT method are likely to leverage the benefits of crowdsourcing and improve the method’s overall validity. Future research should implement this approach in real-world assessments of emerging technologies, involving a sufficiently large and diverse group of experts and students. Only through repeated, real-world applications can the method’s validity and reliability be thoroughly evaluated. However, scaling up its use will undoubtedly raise new considerations related to usability and security.

Acknowledgements

The authors would like to extend their warmest thanks to the colleagues and students at the Department of Systems Science for Defence and Security at the Swedish Defence University, the Swedish Armed Forces, and sister agencies for their participation in the experiments and demonstrations. Special thanks are extended to Therese Almbladh for sharing her experiences in leading previous projects in technological forecasting and for her extensive support throughout the project.

Competing Interests

The authors have no competing interests to declare.

Author Contributions

Marcus Dansarie: Conceptualization, Methodology, Software, Validation, Formal analysis, Investigation, Data curation, Writing – Original draft, Visualization

Kent Andersson: Methodology, Validation, Formal analysis, Investigation, Writing – Original draft, Project administration

Stefan Silfverskiöld: Investigation, Resources, Writing – Review & Editing, Supervision

Term	Definition
Conceptualization	Ideas; formulation or evolution of overarching research goals and aims
Methodology	Development or design of methodology; creation of models
Software	Programming, software development; designing computer programs; implementation of the computer code and supporting algorithms; testing of existing code components
Validation	Verification, whether as a part of the activity or separate, of the overall replication/reproducibility of results/experiments and other research outputs
Formal analysis	Application of statistical, mathematical, computational, or other formal techniques to analyze or synthesize study data
Investigation	Conducting a research and investigation process, specifically performing the experiments, or data/evidence collection
Resources	Provision of study materials, reagents, materials, patients, laboratory samples, animals, instrumentation, computing resources, or other analysis tools
Data Curation	Management activities to annotate (produce metadata), scrub data and maintain research data (including software code, where it is necessary for interpreting the data itself) for initial use and later reuse
Writing – Original Draft	Preparation, creation and/or presentation of the published work, specifically writing the initial draft (including substantive translation)
Writing – Review & Editing	Preparation, creation and/or presentation of the published work by those from the original research group, specifically critical review, commentary or revision – including pre- or post-publication stages
Visualization	Preparation, creation and/or presentation of the published work, specifically visualization/data presentation
Supervision	Oversight and leadership responsibility for the research activity planning and execution, including mentorship external to the core team
Project administration	Management and coordination responsibility for the research activity planning and execution
Funding acquisition	Acquisition of the financial support for the project leading to this publication

Author Information

Marcus Dansarie is a Ph.D. student at the Department of Systems Science for Defence and Security at the Swedish Defence University and at the Department of Informatics at the University of Skövde. He received the M.Sc. degree in Information Warfare Systems Engineering from the Naval Postgraduate School, Monterey, CA, USA, in 2017. His primary research interest is in security in wireless communication systems, including both technical and organizational aspects of vulnerabilities as well as threat mitigation. He is a Lieutenant in the Swedish Armed Forces.

Kent Andersson is a senior lecturer and associate professor in Systems Science for Defence and Security at the Swedish Defence University. He received his Licentiate degree in solid-state physics from Uppsala University in 1993 and his Ph.D. in military sciences from the National Defence University of Finland in 2018. The title of the dissertation was “On the Military Utility of Spectral Design in Signature Management: a Systems Approach”. He is a lieutenant colonel in the Swedish Armed Forces and has extensive experience in the development of command and control systems. His current research is in the field of military capability development, specifically decision-making based on military utility assessments.

Stefan Silfverskiöld is a senior lecturer and assistant professor in Systems Science for Defence and Security at the Swedish Defence University. He received his Licentiate degree and Ph.D. degree in engineering physics from Uppsala University, Sweden, in 1997 and 2002, respectively. The title of the dissertation was “Effects of Lightning Electromagnetic Pulse and High Power Microwaves on Military Electric Systems”. His current research is in the fields of capability development and military utility assessment. He is a Captain (Navy) in the Swedish Armed Forces and is currently Head of the Department of Systems Science for Defence and Security. He has previously been Defence Attaché adjoint for Sweden in Paris and Dean and Deputy Head of Research at the Swedish Defence University.