Introduction
Our society is digitizing rapidly. Ever more of our daily activities are facilitated by digital support systems, or directly involve digital artifacts such as health apps, social-media feeds, and computer games. This makes us, individually and collectively, increasingly dependent on the correct performance of digital artifacts and support systems. Unfortunately, this correct performance is not guaranteed. Word processors may fail to respond, an operating system may cause a computer to crash, or one’s avatar in a computer game might get stuck in a wall.
In most philosophies of technical artifacts, these episodes in which digital artifacts fail would be analyzed in terms of artifact malfunctioning. Even those who deny that functions are essential to artifacts maintain that functions are central elements of the clusters of features that characterize artifact kinds, such as Phillips screwdrivers and Eames 1950 desk chairs (Juvshik 2021). Digital artifact kinds such as word processors and health apps seem no exception. Furthermore, virtually all philosophical accounts of functions take as a central analysandum that items may fail to perform their function, i.e., that they may malfunction (e.g., Preston 2009; Houkes & Vermaas 2010).
Yet according to Luciano Floridi, Nir Fresco and Giuseppe Primiero (henceforth: FFP), ‘in a strict sense, it seems impossible for software to malfunction (2015: 1217).’ At first glance, this seems at odds with the everyday experience of software failure. However, rather than denying the phenomenon of failure, FFP claim that it cannot be analyzed in terms of malfunctioning. Therefore, they conclude that software presents a problem case for philosophical theories of artifacts and their functions: their analysis of software failure would reveal that there is a class of technical artifacts to which we ascribe a function, but that cannot malfunction.
This analysis raises interesting issues for a metaphysics of technical artifacts in general and that of software in particular. In addition, FFP’s analysis has repercussions for the practice of software engineering, by taking all software failures as design errors unless they are caused by hardware problems. Thus, it attributes extensive responsibilities to software engineers. On the one hand, this echoes self-presentations of software designers and analyses of software failure in the engineering literature. It also turns out to rest on assumptions about a central creative activity in software engineering, namely implementation. This characterization is prima facie plausible and can, like the attribution of design errors, be found in the software-engineering literature. FFP’s argument is therefore aligned with a more broadly accepted understanding of the nature of software and its relation to human creators. On the other hand, however, actual practices in contemporary software engineering are at odds with this (self-)understanding and with FFP’s conclusions. Their metaphysics of software, and its implications for the responsibility of software engineers, presupposes that these engineers have a measure of control over their creations that, in practice, they cannot achieve; software has become far too complex to reliably locate design errors. Moreover, FFP as well as many software engineers presuppose that the performance of digital artifacts is controlled through self-contained and complete sets of instructions. In practice, the creation and running of software are heavily mediated processes involving many agential activities, none of which results in a self-contained, let alone complete, set of instructions. As a result, software performance may vary, often in undesirable ways, without either hardware failures or design errors. The upshot is that, in an episode of ‘software failure’, it is difficult to identify anything as a self-contained software token that is realized on a local machine and to which a function may be ascribed which it may be said to fail to perform. The metaphysics of software and other digital artifacts must therefore be considerably more complicated than presupposed in FFP’s analysis and, perhaps, any extension of metaphysical accounts developed for more traditional classes of artifacts such as screwdrivers and desk chairs.
The paper is structured as follows. First, we discuss FFP’s argument for their central claim and describe how it hinges on the notion of ‘implementation’. We also identify the implications of the argument for philosophical function theories and for software engineering. Then, we analyze the notion of ‘implementation’ by drawing on maker-intention-based analyses of artifact kinds, in particular in contexts of automated production. By doing so, we identify two presuppositions about the creative activity of software engineers, which we shall call complete control and self-containedness. We present arguments against each of these presuppositions, using brief descriptions of some practices in contemporary software engineering.
Why Software Cannot Malfunction
In their 2015 paper, Floridi, Fresco and Primiero argue that ‘in a strict sense, it seems impossible for software to malfunction (1217).’1 In this section, we characterize how they use ‘software’ and ‘malfunction’;2 then present their argument regarding software malfunctioning; and identify the repercussions regarding design errors and the responsibilities of software engineers.
FFP distinguish software from two more abstract computational items (programs and algorithms) as well as from hardware. Programs are defined as ‘single, complete, and self-contained ordered sequence[s] of instructions (2015: 1213).’ They implement an algorithm in a programming language such as C++ or Python, but are not yet machine-executable. Software, then, is ‘any ordered sequence of machine-executable instructions (ibid).’ Furthermore, here as in an earlier paper (Fresco & Primiero 2013), they associate these objects with ‘levels of abstraction’ and characteristic creative activities and responsible agents: an Algorithm Design Level, under the purview of algorithm designers; and an Algorithm Implementation Level, under the purview of (software) engineers. Finally, hardware comprises the physical devices whose states are changed by executing instructions. Here, we focus on contexts in which a single physical device, a local machine such as a particular laptop, executes all instructions.
Regarding malfunction, FFP build on earlier philosophical analyses. Like Maarten Franssen (2006), they discuss malfunctioning statements against the background of evaluating artifacts as tokens of a type (e.g., ‘This is a poor car engine’). More specifically, they follow Jesse Hughes’ (2009) understanding of malfunctioning as a token’s incapability to perform as well as other tokens of its type. Additionally, they distinguish dysfunction and misfunction, resulting in the following pair of definitions:
‘An artefact token t dysfunctions if and only if it is less reliable or effective in performing its function F than one justifiably expects for tokens of its type T.’ (2015: 1208)
and
‘An artefact token t misfunctions if and only if […]
(1) using t produces some specific side-effects e of type E;
(2) because of e, one has reason not to use t; and
(3) other (“normal”) tokens of the same type do not produce the same side-effects of type E.’ (2015: 1202)
This definition rules out by fiat that artifact types such as Eames 1950 desk chairs may malfunction. As Hughes puts it: ‘malfunction applies to individual tokens, not types. Types may be poorly designed, so poorly that … no token can do [sic] realize its functional goal, but types do not malfunction (2009: 195).’ This does not exclude failures at the level of artifact types, but such type-level failures should be analyzed as design errors rather than type-malfunctioning.
With this definition in place, FFP only need a substantive argument that software tokens can neither dysfunction nor misfunction. To this end, they state:
‘Thesis 1 (Software Dysfunction) A software token t cannot dysfunction, since it cannot be less reliable or effective in performing its function F compared with other tokens of the same type T independently of the supporting hardware used to run it.’ (2015: 1215)
‘Thesis 2 (Software Token Misfunction) Software tokens of a given type T in isolation do not misfunction, since they all inherit a single software design D and are not comparable with other “normal” tokens of the same type T.’ (2015: 1216)
Their argument for these theses can be reconstructed as follows:
1. All software tokens are implementations of the same underlying program, which implements an algorithm in a programming language.
2. The implementation of an algorithm in a program and consequently in software fixes the performance of this software at the type level, barring hardware failures.
3. If so, all software tokens function in the same way.
4. Theses 1–2 follow; hence, software tokens cannot malfunction.
Step 1 adds to the levels-of-abstraction analysis a characterization of the relations between the three levels of algorithm, program, and software. FFP refer to these relations as well as the result of running executable instructions as ‘implementation’. Implementation is variously taken to be a formal relation between objects (e.g., a program as an implementation of an algorithm), a process by which this relation is established (e.g., a program implementing an algorithm), the outcome of this process (e.g., the physical implementation of a program), and a creative activity (e.g., a software engineer implementing a program in software). The latter aligns with earlier work by Fresco & Primiero (2013), and reflects other analyses of the implementation relation, such as William J. Rapaport’s: ‘implementation is not a binary relation … A cognitive agent C uses M to implement A as I, possibly for some purpose P (2005: 286).’
Step 2 characterizes this relation between levels as one that fixes the function of software tokens. The implementation of algorithms in particular plays a crucial role: ‘once the algorithm is implemented using some programming language, the function of the software is fixed by its type (2015: 1215).’ While it is common for analyses of technical artifacts to identify mechanisms that establish functions as normative standards against which the performance of artifacts is to be judged, these do not determine that all tokens function in the same way – only that all tokens have the same function. Thus, such accounts provide a basis for malfunctioning statements. Returning to FFP’s argument: for Step 3 to follow, implementation, by contrast, must fix the functioning or actual performance of tokens rather than their function. If so, Step 4 follows trivially from the definition of malfunctioning: if all tokens perform in the same way, one cannot be less reliable or effective than another.
In the next section, we reconstruct how the notion of implementation must be understood to have this strong performance-fixing role. Before doing so, we highlight the most important consequence of FFP’s analysis and situate it in a broader literature.
Like many other authors, FFP discuss the nature of the artifact in relation to the creative activities of intentional agents: implementation is under the purview of software engineers. The performance-fixing role of implementation entails that there is no software token malfunctioning. Yet, this does not rule out type-level failure, but rather entails that any software failure (i.e., a failure that cannot be attributed to hardware) reflects poorly on the creative activity. Indeed, now referring to the characteristic activity as ‘design’, FFP write: ‘Typically, “software malfunctions” are simply design errors for which only the designer can be deemed responsible (2015: 1215).’
This responsibility seems more extensive than that of the designers of other artifacts. If one’s chair wobbles, this may likewise be attributed to poor design, and other tokens of the artifact type would suffer from the same problem. However, not every chair wobbles because of an error made by its designer; it might also be a case of token malfunctioning. Whereas most analyses of artifact malfunctioning would leave open this diagnosis, FFP rule it out, thus intuitively extending the responsibilities of software engineers for the performance of their creations.
Many software engineers, or authors with at least a deep familiarity with software-engineering practices, would agree. They likewise hold human agents responsible for every failure that cannot be attributed to hardware. One brief example comes from a highly influential paper on the dependability and security of computational systems (Avižienis et al. 2004). Central aspects of this paper, which has been cited thousands of times, conform to FFP’s analysis. Firstly, the concept of implementation is as central as in FFP’s analysis, and variously refers to a mechanism in or behavior of a computational system, comprising both process and outcome, and to strategies and choices by agents. Secondly, the paper features an extensive taxonomy of faults in computational systems, organized in nested dichotomies. Faults are characterized primarily in terms of incorrect services, i.e., behavior of the local machine that does not conform to user expectations. Notably, the order of these dichotomies casts any software failure as human-made: ‘software’ failures are never classified as ‘natural’, only as ‘human-made’.
Implementation as a Creative Activity
In the previous section, we discussed how FFP’s conclusion depends on the notion of implementation. This notion should be such that implementation not only fixes the function of software tokens but their actual performance as well. If so, incorrect services cannot be analyzed as token malfunctioning but are instead due to design errors or hardware failure. Design errors would manifest in all services based on the software type; hardware failures would not.
In this section, we investigate what conception of implementation, understood as a creative activity by intentional agents, would support these conclusions. For this, we draw on work in the metaphysics of artifacts that focuses on creative activities. We discuss how, in this work, a relevant creative activity – that of automated production – has been analyzed, to clarify what has to be achieved in implementation for it to fix performance. This allows us to identify two presuppositions about the creative activity of software engineers, which we shall call complete control and self-containedness. We show that these presuppositions, much like the notion of implementation and the diagnosis of incorrect service in terms of design errors, can be found throughout the software-engineering literature.
It is worth stressing that FFP also touch upon another possible ground for their claims, namely software’s ‘peculiar dual ontological nature (2015: 1215)’: software is, on the one hand, connected to an abstract object (an algorithm implemented in a programming language) and, on the other hand, realized on a physical device. They do not analyze this duality further. Yet, it should be clear that it does not, in and of itself, lead to the conclusion that there is no software token malfunctioning. Other authors have presented similar accounts of artifacts, which connect or identify them with abstract objects that have concrete instances, and often draw on a type-token distinction.3 Virtually all these accounts explicitly allow for token malfunctioning. Any peculiar nature that grounds FFP’s argument therefore cannot reside in duality per se, but in the particular form of duality, i.e., the relation between algorithm, program, software, and hardware. This relation is characterized as one of implementation, so any peculiarity must be identifiable by explicating this relation.
To do so, we will analyze implementation as a creative activity. We thereby focus on the initial creation of technical artifacts, leaving aside several causes and responsibilities for failures at later stages of an artifact’s lifecycle (e.g., those related to maintenance). This focus is shared with most philosophical analyses of technical artifacts, especially those which emphasize creative intentions: these typically analyze an artisanal context, in which an individual maker creates a concrete particular object, as when a carpenter makes a single chair.4 Mostly focusing on this context, Amie Thomasson (2003) analyzes the nature of artifacts in terms of the intentions of their makers. In particular, she argues that makers need to have substantive and substantively correct conceptions of what constitutes a member of an artifact kind. Therefore, a term like ‘chair’ picks out entities that are the products of substantively correct and therefore largely successful intentions to create something of that kind. This leaves room for flawed artifacts. For one thing, substantively correct creative intentions may still be incorrect in certain details. A carpenter may, for instance, have an inadequate grasp of chair joinery, producing chairs with wobbly legs. Alternatively, the productive activity itself may be slightly flawed, although it is governed by entirely correct intentions. Even an expert carpenter occasionally exerts too much or too little force on a chisel. Evidently, the former cause of incorrect outcomes amounts to a design error, which will affect all chairs produced by the artisan, whereas the latter affects only occasional tokens of the type. By contrast, on FFP’s analysis, implementation, as a creative activity, cannot lead to defective tokens, even though it still allows for errors of omission and commission (‘design errors’).
Indeed, a software engineer’s creative activities might appear to be such that any flaw in them either manifests during development or, if it manifests when the software is run, manifests whenever it is run. This is intrinsically connected to what constitutes incorrect performance of software in the first place, namely undesirable behavior of a local machine. This behavior potentially reflects on software only because it is, intuitively, dictated by the software and thus prescribed by the software engineer. This intuition is almost invariably expressed in introductory explanations of programming and software engineering: that programming is ‘telling the computer what to do’.
The focus on issuing instructions as (part of) the creative activity enables an analogy with a creative activity other than artisanal practices, namely that of automated production. In the metaphysics of artifacts, various authors have discussed the latter because it is not obviously compatible with an account that centers on makers’ intentions. In many of today’s factories, the activities that directly contribute to the creation of artifacts are unlikely to be those of an intentional agent; rather, they are operations performed by robot arms, conveyor belts and other manufacturing tools. Yet, intentionalist accounts may seek the requisite creative activities elsewhere, namely in manufacturing engineering, i.e., the design of the manufacturing process itself (e.g., Evnine 2019; Juvshik 2021; Paek 2023).
Now imagine that a manufacturing engineer designs a production process for car engines, specifying a configuration of manufacturing tools such as robot arms and a series of operations to be performed by them. In this design process, errors may be made; the designer might, for instance, omit the operation of fastening a bolt. This, like a carpenter’s inadequate command of chair joinery, would lead to flaws in every token of the type produced in the designed manufacturing process. The manufacturing tools themselves are another source of flaws. If they systematically perform poorly, this again manifests in every token. If they behave erratically, or occasionally perform poorly, only some tokens will be flawed. In this way, token malfunctioning may be understood as the result of manufacturing-tool problems. A third source is related to the ineliminable variety in any mechanical manufacturing process: no matter how carefully a manufacturing engineer designs the process and isolates the manufacturing set-up from external influences, no two tokens produced by the process will be identical. Impurities in the raw materials, slippage on a conveyor belt, vibrations, etc. all lead to variations in the produced tokens. Whereas many tokens will stay within tolerance margins and thus perform as expected, some will exceed the margins and manifest failures.
This (simplified) analysis of manufacturing engineering can be likened to software engineering. Where the manufacturing engineer specifies a set of manufacturing tools and their intended operations, a software engineer specifies instructions for a family of local machines that will carry out these instructions. In both cases, faulty instructions or omissions lead to problems in all tokens produced. Problems in the manufacturing tools, which affect only some tokens, can be likened to hardware failures. Yet if implementation, i.e., the creative activity of the software engineer, fixes the performance of tokens (barring hardware failure), it is thereby presupposed that any residual variation in the output is due to hardware failure: the instructions to the machine must tell it completely and exactly what to do. If not, there would be a source of token malfunctioning that is not reducible to hardware failure. For this, there must be a dividing line, transparent to the engineer, between an initial sequence of instructions that fixes the performance and further realization steps (i.e., details of execution) that do not affect the performance. We call this presupposition that of complete control, and use the term control threshold to refer to the dividing line in the implementation process. On this presupposition, software engineers can compose instructions for the machine that are so fine-grained and complete that the same instructions are issued to all machines, and result in the same outcome (i.e., provide the same service) unless there is a failure of one local machine.5
Another, closely related, presupposition is that of self-containedness. This assumes that there are no external influences that adversely affect the production process that leads to the machine providing a service; if there were, then unless these influences had systematic effects, they would interfere only occasionally and lead to incorrect services, again giving rise to token malfunctioning. FFP elevate this absence of external influences to a definition for programs: ‘single, completed, and self-contained ordered sequence[s] of instructions (2015: 1213).’ As FFP’s definition makes clear, self-containedness pertains on the one hand to the sets of instructions executed by local machines: these need to specify everything that happens when these machines provide their services (up to the control threshold). On the other hand, it pertains to what is providing the service as well as to the service provided: these need to be distinct entities, in order to identify that something (a concrete particular object) fails to provide the correct service (token behavior). The latter aspect of self-containedness is shared with many philosophical accounts of technologies, which focus on self-contained entities or ‘artifacts’ (e.g., Thomasson 2003; Franssen 2006; Houkes & Vermaas 2010).
Like the notion of implementation, the complete-control and self-containedness presuppositions are expressed throughout the literature on software engineering. In fact, they trace back to early attempts at the mechanization of computational tasks. Charles Babbage is widely credited with the conception of the first programmable computer, the Analytical Engine. It was the spiritual successor to the Difference Engine, designed to eliminate human errors in the calculation of mathematical and astronomical tables at a time when the computational labor was divided between mathematicians who devised computational procedures and human computers who carried them out (Babbage 1824). Reflecting on Babbage’s Analytical Engine, Luigi Menabrea (1843) claimed the machine would afford ‘rigid accuracy’ and ‘economy of time’. The idea carried over to early attempts at formalizing notions such as program and its close cousin, algorithm. Donald Knuth, for instance, states that the actions required for each step must be ‘rigorously and unambiguously specified for each case,’ and specified in ‘formally defined programming languages’ to eliminate the ‘possibility the reader might not understand exactly what the author intended (1968/1973: 5).’6
We find similar ideas in more contemporary sources. Bjarne Stroustrup, the inventor of the C++ programming language, writes in an entry-level textbook for software engineers: ‘To get a computer to do something, you (or someone else) have to tell it exactly – in excruciating detail – what to do. Such a description of “what to do” is called a program, and programming is the activity of writing and testing such programs (2024: § 1.1).’ 7 Current and past editions of the seminal Structure and Interpretation of Computer Programs state that, given a program, only hardware failures may alter a computational process: ‘A computational process, in a correctly working computer, executes programs precisely and accurately (Abelson, Sussman & Sussman 1996: Ch. 1; Abelson et al. 2022: Ch. 1).’ An excerpt from Mathematical Foundations of Software Engineering highlights the performance-fixing quality of design and implementation:
Each copy of the software code is identical, and the software code is either correct or incorrect. That is, software failures are due to design and implementation errors, rather than to the software physically wearing out over time. (O’Regan 2023: 243)
This, again, leads to any untoward behavior of software being attributed to its creators. For instance, the National Security Agency claims that ‘poor or careless memory management’ by developers creates vulnerabilities in software (2022: 2). Engineers themselves appear to share this sentiment. In a public response to an update of EVE Online that removed a critical file of the operating system (in FFP 2015: 1216), the director of the development team wrote ‘Windows doesn’t protect those files and therefore software developers must take care not to touch them. We should have been more careful (Thorsteinsson 2007).’
Control and Self-Containedness in Software Engineering
In the previous sections, we reviewed FFP’s argument regarding software malfunctioning, made explicit how it analyzes failures mainly in terms of design errors, and reconstructed the underlying presuppositions of complete control and self-containedness. At various points, we showed how FFP’s terminology and conclusions, as well as the presuppositions of their argument, are found in the literature on software engineering. Thus, as strange as it may seem to claim that software cannot malfunction; that virtually all failures are due to design errors; and that software engineers exert complete control over the behavior of local machines through fully detailed sets of instructions: these claims have deep roots in the self-conceptions of software engineers.
In this section, we argue that despite these deep roots, actual practices in contemporary software engineering are at odds with the presuppositions of complete control and self-containedness. This undermines the basis of FFP’s claims, and thereby at least seriously qualifies the blanket attribution of responsibility-for-error to software engineers.
First, we turn to the assumption of complete control. If software engineers are responsible for correct performance, barring hardware issues, the implementation steps up to the control threshold should allow for (human) errors and the agents involved should be able to locate any errors made. The control threshold is crossed when the performance of software tokens is fixed. Figure 1 visualizes FFP’s conception of the process by which, ultimately, behavior is produced in local machines (right-hand side of row b). This process involves program types (algorithms), program tokens, software types, and self-contained executables that target a specific machine architecture (row a) as well as software tokens, which are identical copies of an executable (row b). Rows a and b in Figure 1 correspond to the elements that fall within the software engineer’s purview, and those that do not, respectively. The implementation steps up to and including the creation of the executable are initiated during the software engineering process, and any intermediary products are, in theory, available for inspection. Later tasks, by contrast, which concern the copying of code onto local machines and requesting its execution, are typically not carried out by software engineers. Still, the assumption of complete control entails that any consequence of the activities in row b can be inferred from the state of affairs at the time of crossing the control threshold.

Figure 1
A graphical representation of the types, tokens, and implementation steps that relate them as defined by FFP. An executable is obtained by compiling and linking a program token, which is constituted by a collection of source files, for a particular machine architecture (a). A software token is a copy of the executable and will always lead to the same external behavior when run on a local machine of the target architecture (b).8
Although FFP’s account presupposes a control threshold, it is unclear where they locate it. On the one hand, they state that the implementation of an algorithm in a programming language fixes the function of software (2015: 1215), which places the control threshold towards the beginning of row (a). On the other hand, they claim that software performance is fixed by its type. This, combined with the identification of software types as collections of object files, suggests the threshold is crossed during compilation (2015: 1215). FFP note that compiler bugs affect all tokens equally (2015: 1216). Compilers and compiler settings do indeed partly determine software behavior (cf. Figure 2 below). David Monniaux (2008) provides examples of bugs that are introduced by how a compiler translates floating-point computations into machine-executable instructions.9 Like compilers, linkers can have bugs or different behavior, which would similarly affect all tokens and thus move the control threshold. However, this pushing forward and concomitant re-identification of software types cannot extend all the way to the executed instructions: doing so would turn those into single-token types, which makes applying any type-relative notion of malfunctioning problematic.
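Monniaux’s point can be made concrete with a minimal sketch of our own (the values and comments below are ours, not his). On targets that keep intermediate results in the x87 unit’s 80-bit registers, whether the sum below is rounded to a 64-bit double before the subtraction may depend on the compiler and its optimization settings, so different builds of the very same source file can print different results.

```cpp
// A minimal sketch (our illustration, not Monniaux's example) of
// compiler-dependent floating-point behavior. On historical x87 targets the
// intermediate sum (a + b) may be kept at 80-bit extended precision or
// rounded to a 64-bit double, depending on the compiler and its flags; the
// program prints 1 in the former case and 0 in the latter.
#include <cstdio>

int main() {
    volatile double a = 1.0e16;  // 'volatile' discourages constant folding
    volatile double b = 1.0;
    double x = (a + b) - a;      // exactly 1.0 at extended precision,
                                 // 0.0 when the sum is rounded to double
    std::printf("%.17g\n", x);
    return 0;
}
```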

Figure 2
How a program is compiled depends on more than the target instruction-set architecture alone. The C++ program shown in (a) calculates the sum of the numbers 1 up to 5 inclusive. The version compiled using Apple clang version 15.0.0 (clang-1500.3.9.4) with optimizations enabled (b) is considerably shorter than the version compiled with optimizations disabled (c).
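The source shown in Figure 2(a) is not reproduced here; a minimal C++ program of the kind described, summing the integers 1 up to 5 inclusive, might look as follows. With optimizations enabled, a compiler can fold the entire loop into the constant 15, which is why the optimized output (b) is so much shorter than the unoptimized output (c).

```cpp
// A minimal sketch of the kind of program described in Figure 2(a): it
// computes the sum of the integers 1 up to 5 inclusive. An optimizing
// compiler can replace the whole loop by the constant 15.
#include <iostream>

int main() {
    int sum = 0;
    for (int i = 1; i <= 5; ++i) {
        sum += i;
    }
    std::cout << sum << '\n';  // prints 15
    return 0;
}
```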
Causes of malfunction must then be due to localizable faults earlier, rather than later, in the process. As anyone who has done any programming can testify, this localization is itself a non-trivial task, and the experts agree: fault localization is ‘widely recognized to be one of the most tedious, time-consuming, and expensive – yet equally critical – activities in program debugging (Wong et al. 2016: 707).’ Localizing faults is not just tedious; it is also itself prone to errors. Since software systems nowadays are so large-scale and complex, resulting from the efforts of many different agents (contributing constitutive elements of larger-scale software to libraries, software repositories, etc.), there is no way in which a single human agent could locate every potential fault with sufficiently high probability. Proposing and testing fault-localization techniques is therefore a lively branch of research within software engineering. Nowadays, a number of traditional localization techniques are no longer held to be ‘effective in isolating the root causes of failures (Wong et al. 2016: 710),’ spurring even further growth and divergence of automated techniques, none of which is regarded as satisfactory.10 One basic problem is that most techniques are tested with artificial faults, and that success at such tests has only limited predictive value for finding real faults (e.g., Pearson et al. 2017; Habib & Pradel 2018).
To be sure, a fault or ‘bug’ in software would still amount to a design error, even if it was made inadvertently. Thus, although the considerations above cast doubt on the presupposition of complete control, they do not necessarily identify token-level malfunctioning of software. Yet, the following example demonstrates that identical software tokens can perform differently even on identical, well-functioning machines, further calling into question the (position of the) supposed control threshold and the completeness of control. How modern processors respond to instructions is not determined by the instruction alone, but is also susceptible to systemic effects. Contributing factors include other concurrently running software and the contents of the machine’s memory. For instance, if a processor encounters the first instruction to read a value from a memory region, it will retrieve a copy of that region from external memory and store it locally, inside its cache (Kowarschik & Weiß 2003). This enables the processor to respond quickly to identical or similar instructions in the near future. This is analogous to a worker on an assembly line who, anticipating a delay in the manufacturing process, takes preventive measures. Clearly, this mechanism not only impacts the internal behavior,11 which Fresco and Primiero (2013) grant software control over, but also has observable side effects.12 These effects, moreover, are the result of the processor working as intended.
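The observable side effect of caching can be made vivid with a small timing experiment (our own illustration; the buffer size and number of passes are arbitrary). The sketch below runs the very same loop three times over the same data; on typical hardware the first, ‘cold’ pass takes measurably longer than the later, cache-warmed passes, even though the executed instructions are identical.

```cpp
// A minimal sketch of a cache side effect: identical instruction sequences
// take different amounts of time depending on the processor's cache state.
// The buffer size (roughly 4 MB) is an arbitrary choice for illustration.
#include <chrono>
#include <cstdio>
#include <vector>

int main() {
    std::vector<int> data(1 << 20, 1);  // about 4 MB of integers, all set to 1
    long long sum = 0;

    for (int pass = 0; pass < 3; ++pass) {
        auto start = std::chrono::steady_clock::now();
        for (int x : data) {            // the same instructions on every pass
            sum += x;
        }
        auto stop = std::chrono::steady_clock::now();
        auto us = std::chrono::duration_cast<std::chrono::microseconds>(stop - start).count();
        std::printf("pass %d: %lld microseconds (running sum: %lld)\n",
                    pass, static_cast<long long>(us), sum);
    }
    return 0;
}
```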
Behavior also depends on the exact model of processor. In Monniaux’s words: ‘different processors within the same architecture can implement the same transcendental functions with different accuracies’ which leads to ‘floating-point discrepancies between the current Intel embedded processors (based on i386/i387) and the current Pentium workstation processors (2008: 20–21).’
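Such discrepancies can, in principle, be observed directly. The following sketch (ours, not Monniaux’s) prints the exact bit pattern of a transcendental result; the value obtained may differ between processors, mathematics libraries, and compiler settings, even though the source code is identical.

```cpp
// A minimal sketch for observing implementation-dependent transcendental
// results: the printed bit pattern may differ between processors, math
// libraries, and compiler settings. The argument 1e22 is chosen because
// argument reduction for such large inputs is notoriously
// implementation-dependent.
#include <cmath>
#include <cstdint>
#include <cstdio>
#include <cstring>

int main() {
    double y = std::sin(1.0e22);
    std::uint64_t bits = 0;
    std::memcpy(&bits, &y, sizeof bits);   // reinterpret the double's bits
    std::printf("sin(1e22) = %.17g  (bit pattern: 0x%016llx)\n",
                y, static_cast<unsigned long long>(bits));
    return 0;
}
```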
A brief look at the mechanisms that enable software to run across different software and hardware configurations undermines the presupposition of self-containedness. It also reveals another pathway for divergent behavior that results from identical software tokens running on well-functioning hardware. While early computational systems mirrored the division of labor sketched at the end of the previous section, the distance between designing and performing computations has widened considerably through what Dijkstra termed ‘configurational abstraction’: the extension of physical machines by layers of software, which together constitute a virtual machine with an enriched or different ‘instruction repertoire (Dijkstra 1970/2022: 365).’
Configurational abstraction provides a uniform interface across various hardware and software combinations. Rather than producing bespoke tokens for every combination, designers compile and link their work once (see row a in Figure 1) and rely on their users’ machines’ ability to translate the instructions from the virtual machine’s repertoire to that of the actual hardware. Thus, software will run on machines composed of different models of processors, graphics cards, memory modules, storage devices, monitors, etc. The notion of a ‘specific machine architecture’ abstracts from a great variety of hardware and software combinations, and the concept of a local machine (Figure 1, row b) black-boxes the mechanisms necessary for a token to run on it. Configurational abstractions can indeed be opaque to software engineers. For instance, when Apple changed the instruction-set architecture of its computers, it introduced a software translation layer to maintain compatibility with software designed before the transition was announced (Apple Inc. 2020: 1h39). Emulation allows software to run on systems that are decades younger and consist of components that were unavailable when the software was designed (Acker 2021).
Consideration of the translation mechanisms involved affords a more fine-grained diagnosis of the alleged ‘bug from 2008 in the multiplayer video game EVE Online (Floridi et al. 2015: 1216).’ The public apology offered explains how the installation software used to update the video game inadvertently deleted an essential file of the Windows operating system. The installer was composed from a specialized instruction repertoire provided by the Nullsoft Scriptable Install System (NSIS). This translated the constituent instructions into an equivalent sequence using Windows’s more general and extensive repertoire. Crucially, EVE’s creators missed an inconsistency in NSIS’s repertoire. What they assumed was an instruction to delete a file from their video game, NSIS instead interpreted as a request to delete a homonymous file from the operating system. NSIS accordingly issued Windows an instruction to self-sabotage, which it followed. Arguably, there were three moments at which the problem could have been prevented: when the instructions were written, and twice during translation. Indeed, affected users inquired why Windows did not protect its files (Thorsteinsson 2007). Microsoft could have written Windows to deny problematic file deletion requests. Moreover, it could have updated its operating system in response, ensuring that future versions remain unaffected by the execution of EVE Online’s (unchanged) flawed installer.
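The general phenomenon of an under-specified file reference picking out the ‘wrong’ homonymous file can be illustrated with a generic sketch (our illustration, using a hypothetical file name; it is not a reconstruction of the actual NSIS mechanism). Which file a relative name denotes depends on the context in which the instruction is executed, not on the instruction itself.

```cpp
// A generic sketch (hypothetical file name, not the actual NSIS/EVE Online
// code): the same relative file name resolves against whatever the current
// working directory happens to be when the instruction is executed, so the
// instruction alone does not determine which file is affected.
#include <filesystem>
#include <iostream>

int main() {
    std::filesystem::path relative{"settings.ini"};   // hypothetical name, for illustration
    std::cout << "this name currently resolves to: "
              << std::filesystem::absolute(relative) << '\n';
    return 0;
}
```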
Even for stable hardware and software configurations, i.e., those in which hardware and translation mechanism are fixed, divergent behavior still occurs. For instance, modern software is allocated memory by the operating system. Instead of describing every detail of the allocation process, software instructs the processor to jump to a memory-allocation procedure. Where the procedure resides in memory is typically unknown during compilation and linking. The operating system must therefore substitute the actual memory address after it has loaded the memory allocator into the local machine’s memory. How allocation then proceeds depends on the allocation patterns of other running software, the amount of available memory, whether previously allocated memory may be reclaimed by temporarily storing its current contents on a slower access medium, etc. (Silberschatz et al. 2011: Ch. 8–9). How a system handles a network request similarly hinges on the quality of the network connection, the congestion of the physical medium, and the network activity of other running software (Kozierok 2005).
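Some of this run-to-run variation is directly observable on a single, well-functioning machine. In the sketch below (our own illustration), the addresses at which the loader places code and at which the allocator places heap blocks typically differ on every run, owing to the allocator’s state and to address-space layout randomization, even though the executed software token is an identical copy.

```cpp
// A minimal sketch of run-to-run variation on one well-functioning machine:
// the printed addresses of code, stack, and heap typically differ between
// runs of the very same executable.
#include <cstdio>
#include <cstdlib>

static void routine() {}  // a stand-in for any dynamically loaded procedure

int main() {
    int local = 0;
    void* heap_block = std::malloc(64);   // placement depends on allocator state

    // Casting a function pointer to void* is implementation-defined but
    // widely supported; it suffices for this illustration.
    std::printf("code  at %p\n", reinterpret_cast<void*>(&routine));
    std::printf("stack at %p\n", static_cast<void*>(&local));
    std::printf("heap  at %p\n", heap_block);

    std::free(heap_block);
    return 0;
}
```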
This shows that, nowadays, the creation and running of software are heavily mediated processes involving many agential activities. From the coding of an algorithm in a programming language, to the carrying out of a machine code instruction, all activities admit certain degrees of freedom and uncertainty. Using ‘implementation’ as a blanket term for these activities may be justified to some extent. Indeed, the creativity of the agents involved is not only expected but valued. Optimizing compilers produce faster and/or smaller code (see Figure 2). Dynamic loading and linking by operating systems enable code sharing and reduce the maintenance burden for software engineers who write applications for the platform. Caching mechanisms increase processors’ throughput and compensate for working memory being orders of magnitude slower.
Yet, these mediated processes leave it very unclear what the software ‘token’ is that is being realized when something is run on a machine, and what type it is a token of. Computational systems nowadays are so variegated, distributed, and dynamic that the control threshold is shifted far beyond the reach of software engineers. If software types fix performance, many tokens are arguably the only instantiation of their type, and analyzing failure in terms of a malfunctioning artifact becomes problematic for reasons that FFP do not touch upon. The actual performance of software, it seems, is not any more transparent, or otherwise under the control of any (human) agent, than that of other technical artifacts. The assumption that software engineers fix the actual performance of software, and thereby have responsibility for that performance, is therefore wrong: they cannot take this responsibility any more than an architect can take responsibility for every flaw in a building. Attributing responsibility for software failure will be as susceptible to many-hands problems and responsibility gaps as attributions for failures of other complex engineered systems.
Conclusions
In this paper, we have discussed an argument to the effect that software cannot malfunction. This argument would highlight a fundamental difference between software and other technical artifacts and identify a major complication for philosophical function theories. Yet, the repercussions extend to software engineering itself: failures of digital artifacts would be diagnosed as the result of design errors, unless they can be attributed to hardware issues. We argued that these conclusions rest on two presuppositions, which we called ‘complete control’ and ‘self-containedness’. Software is taken to be a self-contained unit through which the software engineer exerts complete control over a machine’s behavior. Furthermore, the argument presupposes a control threshold that divides the activity of implementation into an initial, performance-fixing sequence of steps that is fully transparent to designers, followed by a possibly empty sequence of realization steps. Throughout, we illustrated that the argument, its repercussions, and presuppositions are echoed in the literature on software engineering. However, we argued that the presuppositions of the argument are untenable in light of current practices. For one thing, software engineers cannot exert complete control because software has become too complex for any technique to reliably locate every fault in it. For another, the creation and running of software are nowadays heavily mediated processes involving many agential activities. As a result, it is very much possible for software tokens of the same type that run on well-functioning local machines to produce different external behavior, some of which amounts to an incorrect service.
This undermines the presuppositions of complete control and self-containedness. It does not, however, lead to a straightforward diagnosis of some cases of failure as software-token malfunctioning. Rather, our considerations in this paper show how difficult it is to identify any self-contained software token that is realized on a local machine and to ascribe a function to this token which it may be said to fail to perform. As such, software and perhaps also other digital artifacts do seem different from more extensively analyzed classes of technical artifacts, for which identifying a malfunctioning token is reasonably straightforward.
Our analysis of contemporary software practices may therefore reveal shortcomings in FFP’s argument against software malfunctioning. It does not, however, immediately entail an alternative to the metaphysics of software offered by FFP. It remains to be seen whether a metaphysics of software and digital artifacts can be developed that is tailored to contemporary practices – and whether such a metaphysics would be largely similar to or fundamentally different from that of traditional technical artifacts.
Notes
[1] All references in this section are to Floridi et al. (2015) unless otherwise noted.
[2] For ‘software’, ‘hardware’ and ‘malfunction’, we use simplified versions of the terminology presented by FFP, also in related papers by these authors. Our later arguments do not, we submit, depend on these simplifications, nor does our assessment of FFP’s conclusions turn mainly on the implications of more detailed choices in their terminology.
[3] E.g., Moor (1978) and Irmak (2012) for software; the latter draws on a similar account of musical works (Levinson 1980). A dualist metaphysics of technical artifacts is presented in, e.g., Houkes et al. (2011) and Kroes (2012).
[4] Maker-intention-based metaphysics of technical artifacts have well-documented shortcomings, for instance in ignoring user-centered practices; see, e.g., Preston (2009) and Koslicki (2018: Ch. 8) for this and other counterarguments. Our analysis takes for granted FFP’s implicit endorsement of a maker-intention-based metaphysics (e.g., in the reference to creative activities and authorial responsibilities) and analyzes what its presuppositions must be.
[5] Fresco and Primiero claim that software not only fixes the observable behavior, but also the internal behavior of the system, any deviations of which must be contingent on hardware failures (2013: 257–258).
[6] For a short overview, see Granström (2011: 77–85).
[7] This is probably an oversimplification for didactic purposes; we do not claim that Stroustrup is unaware of the complex, mediated relation between programs and machine behavior. Still, precisely through presentations in didactic contexts, the presuppositions of complete control and self-containedness remain entrenched in software engineering.
[8] We gratefully acknowledge Luis Zuno, who produced the artwork used to illustrate the external behavior under the CC0-license.
[10] ‘The software community has not yet developed a sound software reliability predictor model.’ (O’Regan 2023: 242).
[11] We borrow the distinction between internal behavior and external behavior from Plotkin (2004: 20).
[12] From these side effects, software running on a machine can infer the contents of otherwise inaccessible memory locations (Kocher et al. 2019). See Curtis-Trudel (2023) for a more detailed analysis of Primiero’s later work in light of speculative execution.
Acknowledgements
We would like to thank the audience at the ‘Digital Artifacts’ conference, organized by Alexandre Declos, Kathrin Koslicki and Olivier Massin at the Institute of Philosophy of the University of Neuchâtel, as well as two anonymous reviewers for various helpful suggestions.
Funding information
This publication is part of the project Understanding Software (with project number 023.022.004) of the research program Doctoral Grant for Teachers which is partly financed by the Dutch Research Council (NWO).
Competing Interests
The authors have no competing interests to declare.
