
Quantifying Levels of Influence and Causal Responsibility in Dynamic Decision Making Events

Published: 19 December 2023

Abstract

Intelligent systems support human operators’ decision-making processes, many of which are dynamic and involve temporal changes in the decision-related parameters. As we increasingly depend on automation, it becomes imperative to understand and quantify its influence on the operator’s decisions and to evaluate its implications for the human’s causal responsibility for outcomes. Past studies proposed a model for human responsibility in static decision-making processes involving intelligent systems. We present a model for dynamic, non-stationary decision-making events based on the concept of causation strength. We apply it to a test case of a dynamic binary categorization decision. The results show that for automation to influence humans significantly, it must have high detection sensitivity. However, this condition is insufficient since it is unlikely that automation, irrespective of its sensitivity, will sway humans with high detection sensitivity away from their original position. Specific combinations of automation and human detection sensitivities are required for automation to have a major influence. Moreover, the automation influence and the human causal responsibility that can be derived from it are sensitive to possible changes in the human’s detection capabilities due to fatigue or other factors, creating a “Responsibility Cliff.” This should be considered during system design and when policies and regulations are defined. This model constitutes a basis for further analyses of complex events in which human and automation sensitivity levels change over time and for evaluating human involvement in such events.

1 Introduction

Collaboration of human operators1 and automation in decision-making is growing in all areas of life. Examples include advanced warning and driver assistance systems in cars [6, 49, 64], diagnostic procedures in healthcare [24, 54, 62], systems that provide information and recommendations to pilots during flights [2, 61, 80], and security analysis systems used by security administrators in information technology (IT) environments [65]. Automation could also provide support in multi-criteria decision-making or with linguistic models [21, 42, 43, 82], and could enhance team performance [12, 14]. Human operators can work with automation in various ways at the strategic, tactical, or operational levels [23]. This paper focuses on operational-level cooperation, which in many situations can take one of two forms. In some scenarios, the automation serves as a decision support system (DSS), intended to assist and advise the human operator’s decision-making to improve the overall task performance [60], operating at levels 2-4 in the Sheridan and Verplank model [73]. In other scenarios, the human takes a supervisory role, as may be required by regulation or policies, to ensure that the automation performs its function correctly [3, 11], operating at Sheridan and Verplank level 6 [73] in what is referred to in this paper as Automated Decision Making (ADM) [20, 48]. In either case, the automation collects information from the environment, processes it, and makes recommendations based on its inputs and analyses. At the same time, the operators also observe the environment independently, collect information, process it, can also consider recommendations from the automation, and establish their assessment of the situation. In a DSS, the human decides on the required action. In an ADM, by contrast, the automation makes the decision, and the operator can intervene to change it if it seems incorrect.
Such automation is becoming more sophisticated and incorporates not only advanced sensors and detection devices but also artificial intelligence (AI) with machine learning (ML) and deep learning (DL) algorithms that analyze the available data. As automation becomes more “intelligent,” it becomes essential to define the human operators’ and the automation’s influence on the process and their respective responsibility for its outcomes [55, 66, 78]. Are human operators in control when they are added to supervise an automated process? Are humans still fully responsible for outcomes if they receive advice from a DSS that analyzes the situation much better than they can? These are not theoretical questions but rather issues that need to be considered when an investment in DSS or ADM is made, and their implications need to be understood. Understanding the contribution of automation and human operators to the operation and the potential adverse outcomes of automated vehicles or unmanned aerial vehicles (UAVs) is critical for establishing proper guidelines for their operation, which will allow widespread deployment and use of such technologies. These considerations are also relevant to regulators who may require human involvement in critical processes, demanding that humans have ultimate control without knowing how much they genuinely contribute to the situation. Establishing such operational guidelines and regulations requires a forward-looking analysis of automation and human involvement. Unfortunately, most research on this topic has focused on retrospective analyses of specific events after the outcome was available. Such results are not easily translatable to guidelines for future systems and devices.
A first attempt to a priori quantify the human causal responsibility when a human operator collaborates with intelligent automation was made by Douer and Meyer for single-decision events [15, 16, 17]. Their Responsibility Quantification (ResQu) model used information theory to quantify the contribution of the human and the machine to the final decision in a single-decision event, using it to quantify the human’s causal responsibility for the resulting action. While this serves certain situations, most real-life events are dynamic and involve changes in the environment, the decision maker, the available information, or the values of the outcomes. As explained below, the ResQu model cannot quantify human involvement in such dynamic events, and a new approach is required.
We present here a method for quantifying the Level of Influence (LoI) of the automation and the human on the decision-making and the resulting action in dynamic processes. The LoI can be used to understand the potential benefit of adding a human or automation to a process. It can be used to inform and advise system designers, process managers, and policymakers about the relative importance of humans and automation in a system.
The rest of this paper is organized as follows: We first review the related work in this field and highlight the research questions we address in this paper. A model for human-automation dynamic collaboration is described in Section 2, and an application of the model to a binary dynamic decision-making event with normally distributed noise is described in Section 3. Section 4 discusses the insights we gained from the results and identifies directions for future work. Finally, Section 5 presents the conclusions.

Related Work

There has been considerable research on the interaction between human operators and automated systems (for instance, in [58, 73]). Multiple models have been proposed to define this interaction, such as shared-control [72], co-active design [36, 37], and a layered model of collaboration [57], to name a few. These models identify which functions are to be performed by the automation and which are left to the human operator. In some cases, humans and automation perform similar activities, such as information gathering and analysis, but eventually, one makes the decision that leads to action.
DSSs are added to improve the outcomes of processes, e.g., to reduce traffic accidents, improve medical diagnoses, or prevent attacks on information systems. Analyses of DSSs have often focused on the retrospective examination of decisions. They did not provide forward-looking, prospective analyses of the value that can be obtained by such DSSs (for reviews of such studies, see [70] and [60, 75]). ADMs, by contrast, are meant to ensure that humans remain involved in decision-making to avoid situations in which critical decisions, especially those that can affect people’s lives and welfare, are entirely made by machines without human oversight. Humans may be included in the process due to insufficient trust in the automation design and capabilities or because they can add considerations of ethical or political implications, human mercy, or other aspects that may be outside the scope of the automation’s sensors and logic. Examples include the requirements to have alert drivers even when cars can supposedly drive themselves, e.g., in advanced Traffic-Aware Cruise Control mode [69], the demand to have “meaningful human involvement” in the operation of highly automated weapon systems (AWS) [3], the requirement to involve human officers when deciding about a person’s statutory status [22], and so on. However, simply adding a human to a process supported by a DSS or ADM does not mean that the human significantly influences the process or its outcomes. Given the system’s computational power, the complexity of the situation, and human cognitive limitations, the human may practically always be expected to accept the automation’s decision or recommendation. Therefore, the question of whether humans significantly influence such decisions is of the essence when designing such systems and determining regulations and policies.

Influence and Responsibility

As humans and automation collaborate to perform a task, the level of influence of each of them is less clear the more intelligent the automation becomes. This is particularly challenging at intermediate levels of automation when a process is partly autonomous with some human involvement. Humans are considered to have complete control if they wield a sword against an enemy, but to have no control if an AWS has all the information and a human merely follows its instructions (a paraphrase of the example given by Horowitz and Scharre [34]). However, what would be the level of influence of a human operator on an intelligent system that performs situation analyses and recommends actions? In such scenarios, the operators must remain vigilant and control the system. They may, however, also trust the technology to alert them when needed and possibly even abort a mission autonomously, should the situation change and, for instance, civilians enter an advanced weapon’s impact perimeter. What is the operator’s level of influence in this situation?
Furthermore, what would happen when the operator’s ability to identify the situation deteriorates while the automation still functions at its original level? Would this change the operator’s influence on the decision and the outcome (e.g., injury of innocent civilians) since they have become less capable? Some work was done to define a quantitative method to attribute causation and responsibility in such situations involving multiple agents [25, 44]. However, this work focused on the retrospective analysis of specific events for the attribution of actual, token-level causation (vs. prospective general causation) and the related responsibility of the agents to the outcome. They also assumed that all agents have the same capabilities, so they did not analyze the impact of possible differences between agents, which typically exist when humans and automation are concerned.
Unlike causality and control, responsibility is a multifaceted topic, including role, causal, legal, and moral responsibility [30, 31, 71, 79]. A human operator using a DSS is still the one performing (or avoiding) the action, and an operator of an ADM still has the ability and responsibility to become involved and correct any perceived errors made by the automation. The person has the authority to act, and as a result, carries the role responsibility for the action. That said, attribution of responsibility is, in many ways, a subjective matter. People may hold operators responsible for the outcomes even if they had no way to control them, such as concerns of being liable for accidents when riding in a fully automated car [1, 13] and the assignment of moral responsibility to humans who deploy AWS [76]. Furthermore, operators may be held liable due to indirect liability and not necessarily due to their performance in a specific event. For instance, liability could extend to activities before an event, the operator’s intent, and their awareness of the implications of the outcome.
To quantify human responsibility in human-machine collaboration, it is necessary to identify a quantitative measure independent of any subjective factors. We followed Lagnado et al. [44] and Douer and Meyer [16] and aimed to quantify the causal responsibility which is related to the human’s direct contribution to the outcome, irrespective of any legal, moral or ethical aspects. This allowed us to identify a Human Responsibility Indicator (HRI), which provides a possible measure of human responsibility for the outcome. However, it should not be confused with other aspects of responsibility.
Douer and Meyer [16] defined human causal responsibility as the ratio between the remaining uncertainty about the action taken, conditioned on all of the automation’s information-processing functions, and the overall uncertainty about the resulting action. This definition is based on entropy concepts from information theory. These can be used when the random variables that represent the human and DSS classifications are stationary and ergodic. Entropy can be calculated when the probability function is known and changes deterministically over time [38]. However, this is not true for dynamic events in which earlier decisions, based on random events, may change the probability distributions in non-deterministic ways [47]. As a result, the evaluation of human responsibility in dynamic decision-making processes cannot be based on computations of entropy reduction, and a different method is needed.
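For reference, the entropy-based ResQu definition described above can be written, in rough form, as
\begin{equation*} \text{ResQu} = \frac{H\big (Z \mid \text{automation outputs}\big)}{H(Z)}, \end{equation*}
where H denotes Shannon entropy and Z is the resulting action (using the notation of Section 2); this is a paraphrase of the definition in [16], not a new result. The model developed in Section 2 replaces this entropy ratio with a distance between outcome probability distributions, which remains well defined for non-stationary events.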

Dynamic Decision Making

A model must incorporate the temporal aspects of events when quantifying the impact of automation on dynamic decision-making processes that develop over time and may be influenced by various variables [5, 18, 52, 59]. We focus here on processes in which: (a) the situation continuously changes over time, (b) the decision maker needs to make consecutive decisions, taking their past decisions into account and considering how these changed the situation, and (c) incremental information could be gathered at each stage [40]. Typically, such a dynamic event is finite in time, lasting for a period T or N stages, for a continuous or discrete event, respectively. If the operator refrains from deciding at any time (or stage), the event will end after time T (or N stages).
Such dynamic decision-making processes have several characteristics [4, 18]:
(a) A series of decisions is required to reach the goal.
(b) The decisions are not independent, i.e., previous decisions may constrain future decisions.
(c) The state of the environment may change, either independently or as a consequence of the decision maker’s previous actions.
(d) The decisions have to be made in real-time.
Moreover, the dynamic environment consists of one or more components that can change over time and influence the event development:
(i) The environment can change by itself or due to an operator’s actions (or lack thereof).
(ii) The measurement sensitivity and setting can change due to internal or external factors (e.g., fatigue or weather conditions, respectively).
(iii) The value matrix for the different decisions can change over time, e.g., when a physician diagnoses a disease, the effectiveness of a treatment may decrease the later the correct diagnosis is made [39, 41, 56, 67].
(iv) The decision vector, which includes all possible decisions at every stage, can change in time as the available options change.
(v) The decision-making logic may change over time as a function of other parameters of the environment or the value matrix. Moreover, human operators are likely to deviate from the normative behavior, especially under pressure [59], and their adjusted behavior may change the decision-making process.
(vi) The decision arbitration, which is the process of choosing which decision to implement (the human’s or the automation’s), can change over time. For instance, a car collision avoidance system may respect the driver’s decision to speed up even when approaching an obstacle. Still, it may override this speeding-up decision and autonomously brake when the car gets too close to the obstacle [9].
We developed a model of decisions with such temporal changes. We used it as the basis for quantifying the human and automation influence on the decision and the resulting action.

2 An Analytic Model for Quantifying Influence in Human-machine Collaboration

The analysis of the decision-making process and the involvement of the human or the automation in the decision or the outcome is associated with Probabilistic Causation [19, 33]. Prospective, a priori estimation of the control of the process is related to the general, type-level causation of the outcome. An agent’s prospective general causation is this agent’s average causal contribution over all possible distributions of future events and resulting decisions in human-automation collaboration in a probabilistic world. An effect E is said to be caused by a cause C iff C raises the probability that E occurs, compared to the situation in which C does not happen, i.e., \(\mathbb {P}(E|C)\gt \mathbb {P}(E|\sim C)\) [ibid]. A numerical index for the strength of causation of such a cause C on the outcome E is provided by the difference between those probabilities: \(\Delta P = \mathbb {P}(E|C)-\mathbb {P}(E|\sim C)\) [19, 35, 53]. Therefore, if the effect E is the outcome of a decision-making process, and the cause C is the existence of a DSS that consults the human, then we define the level of influence (LoI) of an added agent (human or automation for ADM or DSS, respectively) by determining “how far” they moved the prospective probability distribution of the outcome from where it was before they were added, towards a different outcome probability distribution. Thereby, the model can characterize human involvement in decision-making processes in general. It cannot be used for analyzing individual decisions retrospectively.
The human-automation decision-making process was modeled following the stages of information processing: information acquisition, information analysis, decision and action selection, and action implementation [58], as represented in Figure 1, which is an adaptation to dynamic events of the model presented in [16]. The system is modeled as a continuous process (with t representing time) or a discrete process (with k representing the stage; \(k=0\) is the first stage). A discrete model is presented in Figure 1, with the current stage number denoted as k. The operator and automation both monitor the environment state \(E(k)\) and collect their observations from the environment, \(e^A(k)\) and \(e^H(k)\) for automation and human observed data, respectively. If this is not the first stage, they also consider the automation and human classifications in the previous stage, \(Y^C(k-1)\) and \(X^C(k-1)\) respectively, and their decisions at the previous stage, \(Y(k-1)\) and \(X(k-1)\). They analyze them and generate their current state classifications of the inputs, \(Y^C(k)\) and \(X^C(k)\), considering their bias, which depends on the payoff values they assign at this stage to the possible outcomes (Value or Payoff matrix, \(V(k)\)). Human operators also consider the automation classification output when making their classifications2. Both automation and human then select their chosen decision, \(Y(k)\) and \(X(k)\), from the set of all available decision alternatives at that stage, represented by the vector \(D(k)\). An Arbiter then makes a selection of which decision to implement, the human’s or the automation’s, resulting in an action \(Z(k)\) that influences the environment and generates a new environment state, \(E(k+1)\).
Fig. 1.
Fig. 1. A model of a dynamic decision-making process performed in collaboration between a human operator and a DSS. Shaded blocks are the DSS contribution and do not exist when humans operate without DSS support.
For stages after the first stage, the past information about the classifications and decisions made at previous stages is reflected in the decision-making state matrix, \(T(k)\ \ k\ge 1\), which is of dimension \(4\times k\) and includes four vectors with values from stage 0 to stage \(k-1\) of (i) the classifications of the automation \(\vec{Y}_{k-1}^C\), (ii) the classifications of the operator \(\vec{X}_{k-1}^C\), (iii) the decisions of the automation \(\vec{Y}_{k-1}\) and (iv) the decisions of the operator \(\vec{X}_{k-1}\), for \(k=1 \dots N-1\).
\begin{equation} T(k) = \begin{bmatrix} \vec{Y}_{k-1}^C \\ \vec{X}_{k-1}^C \\ \vec{Y}_{k-1} \\ \vec{X}_{k-1} \end{bmatrix} = \begin{bmatrix} \big (Y^C(0), Y^C(1),\dots Y^C(k-1)\big) \\ \big (X^C(0), X^C(1),\dots X^C(k-1)\big) \\ \big (Y(0), Y(1),\dots Y(k-1)\big) \\ \big (X(0), X(1),\dots X(k-1)\big) \end{bmatrix} \end{equation}
(1)
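For illustration, the bookkeeping implied by Equation (1) can be represented with a simple data structure that is extended at every stage. The Python sketch below is only an illustration of this state matrix; the class and method names are ours, not part of the model.

```python
import numpy as np

class DecisionState:
    """Decision-making state matrix T(k) of Equation (1): four histories holding the
    automation and human classifications and decisions from stages 0..k-1."""

    def __init__(self):
        self.Yc, self.Xc, self.Y, self.X = [], [], [], []

    def update(self, Yc_k, Xc_k, Y_k, X_k):
        """Append the classifications and decisions made at the stage just completed."""
        self.Yc.append(Yc_k)
        self.Xc.append(Xc_k)
        self.Y.append(Y_k)
        self.X.append(X_k)

    def T(self):
        """Return T(k) as a 4 x k array (empty before the first stage)."""
        return np.array([self.Yc, self.Xc, self.Y, self.X])
```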
Classifications in stage k are based on the decision-making state from the previous stage, \(T(k)\), the environmental observations in stage k, and the payoff matrix at the current stage \(V(k)\), which determines the classification bias. The classification process can be described as a function that represents the logic behind the classification process:
\begin{align} Y^C(k) &= f\big (e^A(k), V(k), T(k)\big) \end{align}
(2)
\begin{align} X^C(k) &= g\big (e^H(k), V(k), T(k), Y^C(k)\big) \end{align}
(3)
The functions \(f(\cdot)\) and \(g(\cdot)\) depend on the specific logic used for making a classification. For example, in a binary detection system, one could apply signal detection theory (SDT) principles [50] to build those functions using optimal criteria. The DSS and the operator can also use different types of environmental information, such as numerical values (continuous or discrete), linguistic terms, or visual or auditory inputs. For example, a driver assistance system uses video streams from in-car cameras together with numerical data from a light detection and ranging (Lidar) system to classify the objects surrounding the car and identify if there are obstacles in the car’s path. In another example, a border control DSS would process the human responses in an immigration form, which could be in natural language or discrete Likert-scale responses [46], to classify if a person should be admitted into the country. The processing of these different types of inputs is reflected in the classification functions \(f(\cdot)\) and \(g(\cdot)\), which are defined such that they can map all those different inputs to possible classifications \(X^C\) and \(Y^C\). In another system, automation can use reinforcement learning (RL) to provide better predictions. Moreover, the operator’s function \(g(\cdot)\) can incorporate Bayesian calculations to leverage the automation classification to modify their prior “belief” and optimize the resulting classification. Once the classifications are completed, the decisions can be made by the automation and the operator (depending on who makes the decision), using the decision functions \(H_Y\) and \(H_X\), respectively:
\begin{align} Y(k) &= H_Y\big (X^C(k), Y^C(k), D(k)\big) \end{align}
(4)
\begin{align} X(k) &= H_X\big (X^C(k), Y^C(k), D(k)\big) \end{align}
(5)
The decision functions \(H_Y\) and \(H_X\) depend on the logic used to make the decision. For example, consider a simple random-walk decision process [10] in which the human operator follows the logic: “(a) start with counter equals zero, (b) for automation binary classification in stage k: \(Y^C(k)=+1\) or \(Y^C(k)=-1\), add or subtract one from the counter, respectively, and (c) decide to ‘Engage’ if the counter reaches threshold \(\lambda\), ‘Abort’ if the counter reaches \(-\lambda\) or otherwise, continue to collect more information.” For this process, the human decision function \(H_X\) can be formalized as:
\begin{equation} H_X(k) = {\left\lbrace \begin{array}{ll} Engage &\sum _{i=0}^k Y^C(i) \ge \lambda \\ Abort &\sum _{i=0}^k Y^C(i) \le -\lambda \\ Continue &otherwise \end{array}\right.} \end{equation}
(6)
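For illustration, the random-walk rule of Equation (6) can be coded directly. The short Python sketch below assumes binary automation classifications coded as +1 and -1 and a threshold \(\lambda\); it is one example of a possible decision function \(H_X\), not part of the general model.

```python
def H_X_random_walk(Yc_history, lam):
    """Random-walk decision rule of Equation (6).

    Yc_history : automation classifications Y^C(0)..Y^C(k), each coded as +1 or -1.
    lam        : threshold lambda for committing to a decision.
    """
    counter = sum(Yc_history)      # running sum of the +1/-1 classifications
    if counter >= lam:
        return "Engage"
    if counter <= -lam:
        return "Abort"
    return "Continue"

# Example: with lambda = 2, two consecutive +1 classifications trigger 'Engage'.
print(H_X_random_walk([+1, +1], lam=2))   # -> Engage
```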
In the DSS case, the human operator makes the decisions, so the action Z equals the operator’s decision X. The observations of the environmental state are stochastic, hence, the operator’s decision, which is a function of the observations, is a random variable. The list of an operator’s decisions through all N stages of the event can be represented, as described above, by a random vector \(\vec{X}\) with different values for each stage:
\begin{equation} \vec{X}_{N-1} = \big (X(0), \dots , X(N-1)\big)\ \ X(i) \in D(i) \end{equation}
(7)
All possible values of \(\vec{X}_{N-1}\) define its sample space \(\Omega\), and the probability when the operator is using a DSS for each sample decision vector of dimension N: \(\vec{\chi } = (\chi _0, \chi _1,\dots , \chi _{N-1})\) can be calculated as follows:
\begin{equation} \mathbb {P}_{X}\big (\vec{\chi }\big) = \mathbb {P}\Big (X(0)=\chi _0, \dots , X(N-1)=\chi _{N-1}\Big) \ \ \forall \vec{\chi } \in \Omega \end{equation}
(8)
The collection of probabilities of all possible samples of \(\vec{\chi }\) defines the probability distribution of all possible decision combinations across the N stages of the event when a human operator is using a DSS. It determines the probability of the operator performing each specific series of decisions during the event.
To determine the influence of the DSS on the outcome, we quantify its influence on the probability distribution of the human operator’s decision. If the DSS did not influence it, then its influence on the outcome is defined as 0 since the probability of the outcome has not changed. In contrast, if the DSS significantly changes the probability distribution of the human’s decision, its influence on the outcome is significant. Such an analysis can be done by examining a similar reference event in which the operator decides without a DSS, as illustrated in Figure 1 without the shaded blocks. In such an event, the human operator uses the current environment state and their decision from the previous stage to determine the classification, \(\breve{X}^C(k)\), which will drive the decision, \(\breve{X}(k)\) that is used as the final decision, \(\breve{Z}(k)\). The probability for the human decisions without using a DSS in such an event is calculated for every sample of the random vector \(\vec{\breve{\chi }}_{N-1}\) in the sample space, similarly to (8):
\begin{equation} \mathbb {P}_{\breve{X}}\big (\vec{\breve{\chi }}\big) = \mathbb {P}\Big (\breve{X}(0)=\chi _0, \dots , \breve{X}(N-1)=\chi _{N-1}\Big) \ \ \forall \vec{\breve{\chi }} \in \Omega \end{equation}
(9)
The level of influence of the DSS on the operator’s decision-making is defined using covariation assessment for causation strength [35, 53], extended to a multi-dimensional probability space, as the distance between the probability distributions of the operator’s decisions throughout the event stages with and without the DSS. This distance is measured using the Hellinger distance [32], H, a measure of the similarity between two probability distributions. The Hellinger distance has been used successfully in machine learning systems [8, 26] and across different domains, from ecological systems [7, 45] to security [77]. It is a symmetric metric, bounded within the range [0,1], and is defined for all possible probability distributions. This measure indicates how much the DSS influenced the probability distribution of the operator’s decisions.
\begin{align} DSS_{inf} &\overset{def}{=}H(\vec{X}, \vec{\breve{X}}) = \frac{1}{\sqrt {2}}\sqrt {\sum _{\forall \vec{\chi }\in \Omega } \bigg (\sqrt {\mathbb {P}_X\big (\vec{\chi }\big)}-\sqrt {\mathbb {P}_{\breve{X}}\big (\vec{\chi }\big)} \bigg)^2 } \end{align}
(10)
The human responsibility indicator for this dynamic event can be estimated as the complement of the normalized DSS influence. Since the Hellinger distance is within the range [0,1], the human temporal responsibility, T-RESP, can be defined as
\begin{equation} \text{T-RESP} = 1- \frac{DSS_{inf}}{DSS_{inf_{max}}} \end{equation}
(11)
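A minimal numerical sketch of Equations (10) and (11) is given below. It assumes that the decision-vector probability distributions with and without the DSS have already been computed over a common enumeration of the sample space \(\Omega\), and that the maximal influence value for the given event length is available for normalization; the function and variable names are ours.

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance (Equation (10)) between two discrete probability
    distributions given as arrays over the same sample space Omega."""
    p, q = np.asarray(p, dtype=float), np.asarray(q, dtype=float)
    return np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)) / np.sqrt(2.0)

def t_resp(p_with_dss, p_without_dss, dss_inf_max):
    """Human responsibility indicator T-RESP (Equation (11)).
    dss_inf_max is the maximal DSS influence for this event length, used to
    normalize the Hellinger distance."""
    dss_inf = hellinger(p_with_dss, p_without_dss)
    return 1.0 - dss_inf / dss_inf_max
```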
The normalization of \(DSS_{inf}\) is performed for each event length independently since the maximum value of the influence depends on many factors that change according to the specific situation and how the environment and decision factors change over time, fundamentally changing the operator’s behavior.
Based on the above analysis, one can a priori estimate the benefit (in terms of the increase in expected value, EV) from using a DSS. The EV for the human decisions with DSS support is denoted by \(EV_X\), while the expected value of the human-only process is denoted by \(EV_{\breve{X}}\). The benefit of using the DSS is therefore defined as:
\begin{equation} \text{DSS benefit} \overset{def}{=}EV_X - EV_{\breve{X}} \end{equation}
(12)
The above model is generic and is the basis for calculating automation or human influence and their expected benefit for various decision-making processes with different probability distributions. For example, an ADM case can be analyzed similarly by taking as a reference the automation-only probabilities and comparing them to the probabilities of the combined system of humans and automation. The following sections demonstrate how such a calculation can be applied to a specific scenario and what conclusions can be drawn from it.

3 Application of the Model

3.1 Binary Decision Making with Normally Distributed Noise

The above model can be applied to many situations in which decision-making is required throughout an evolving dynamic event. To illustrate how this model can be used, we analyzed the schematic case of a manufacturing facility’s quality assurance (QA) department. The factory manufactures devices per customers’ orders. The manufacturing of each order is a single dynamic event that starts with the setup of the machine to the customer’s specifications (done only once, at the beginning of the order manufacturing) and continues with the manufacturing of N batches of M devices each (for a total of \(M\cdot N\) units in each order), with every batch considered to be a stage in the event. Machine setup can be done accurately or inaccurately, with a certain probability, resulting in intact or faulty devices, respectively, in all batches of that order. Before manufacturing each batch, a human QA inspector (the “operator”) checks the system setup to detect an inaccurate machine setup before the batch is manufactured. This inspection is not always correct.
Similarly, a QA system (the “automation”) can perform the same inspection with its own probability of success. Before manufacturing each batch, the automation informs the operator whether it determined that the machine was set accurately or not. The operator then performs their inspection, considers the automation’s advice, and decides whether to continue manufacturing that order. While the human could have completed this inspection alone, automation was added as a DSS to help improve the overall performance of this task. In terms of decision-making, we define an inaccurate setup as a “Positive” reading, which optimally should trigger an “Alarm” (i.e., aborting the manufacturing of the order). Values are associated with each decision, depending on whether the machine was set up accurately and the decision was correct. The inspector can decide to stop manufacturing at a specific batch k. The event is then terminated, and either a True Positive (\(V_{TP}(k)\)) or a False Positive (\(V_{FP}(k)\)) value is associated with the decision for an inaccurate or accurate setup, respectively. If the inspector does not stop the manufacturing at any point and completes the order, the event ends after the N stages. The inspector’s series of N decisions results in a value of True Negative (\(V_{TN}\)) or False Negative (\(V_{FN}\)), depending on whether the setup was accurate or not, respectively.
Our analysis of this application assumes a theoretical “ideal human operator” who uses normative decision-making methods to maximize the expected value. Such a human remembers their past classifications and decisions, leverages them for Bayesian inference, and employs dynamic programming for optimal decision making [63] (see a detailed analysis of the dynamic programming for this scenario in the Supplemental Material). This is not a realistic model of actual human decision-making, but it estimates the optimal performance level that can be reached in the task.
The insights we gain from analyzing results for an ideal human reveal characteristics of the system and can serve as the basis for the analysis of real-life scenarios involving the system. It should be noted that the human operator is “ideal” only in their ability to employ optimal decision-making processes. They do not necessarily have optimal, or even good, detection sensitivity. The automation is assumed to be a simple, memory-less sensor with a fixed unbiased threshold.
This case demonstrates how dynamic events can become non-stationary. Their analysis requires a model that can accommodate such situations. The operators’ past decisions change their response tendencies (biases) for the subsequent decisions and the likelihood of them stopping the order manufacturing. Therefore, the probability of the human decision to act is not constant with k, and the stochastic variable representing it is not stationary.

3.2 Event Forward-Looking Analysis

The forward-looking analysis is performed by calculating the probabilities at the first stage (\(k=0\)) and then calculating the probabilities forward for stages \(k=1 \dots N-1\), each based on the values from stage \(k-1\). The operator uses straightforward Bayesian inference to calculate the updated probabilities based on their previous probabilities and, where relevant, the automation’s current probabilities. Details of the calculation are included in the Supplemental Material.
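As an illustration of one stage of this forward calculation, the sketch below updates the operator’s belief that a Signal is present from their own observation and, when a DSS is present, from the automation’s binary classification. It assumes the equal-variance SDT model used in the numerical example of Section 3.4 (normal observation densities with sensitivity d′) and conditional independence of the operator’s observation and the automation’s classification given the true state; the exact recursion used in the paper is detailed in the Supplemental Material, and the function name and arguments here are ours.

```python
from scipy.stats import norm

def update_belief(prior, obs, d_prime_h, auto_says_signal=None,
                  p_tp_a=None, p_fp_a=None):
    """One-stage Bayesian update of the probability that a Signal is present.

    prior            : current probability of an inaccurate setup (Signal).
    obs              : the operator's scalar observation e^H(k).
    d_prime_h        : operator's sensitivity (Signal mean d', Noise mean 0, sd 1).
    auto_says_signal : automation classification Y^C(k) as True/False, or None without a DSS.
    p_tp_a, p_fp_a   : automation true- and false-positive probabilities (used with a DSS).
    """
    like_s = norm.pdf(obs, loc=d_prime_h, scale=1.0)   # P(e^H | Signal)
    like_n = norm.pdf(obs, loc=0.0, scale=1.0)         # P(e^H | Noise)
    if auto_says_signal is not None:
        # Assumes the automation output is conditionally independent of e^H given the state.
        like_s *= p_tp_a if auto_says_signal else (1.0 - p_tp_a)
        like_n *= p_fp_a if auto_says_signal else (1.0 - p_fp_a)
    return prior * like_s / (prior * like_s + (1.0 - prior) * like_n)
```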

3.3 The Distance between Probability Distributions and Automation Influence

It is possible to compute the probability distributions of the series of decisions until the operator stops the order manufacturing (at stage \(k, 0\le k \le N-1\), or never), with and without the automation recommendation. Our measurement of the influence of automation on human behavior is based on the distance between these distributions. If they are close, automation has little effect on human decisions. If the distributions are very different, automation affects the decisions.
The sample space of possible decisions in this scenario consists of the following vectors (with \(Abort_k\) and \(Continue_k\) representing decisions to stop or continue the order manufacturing in stage k):
\begin{equation} \begin{split} \Omega &= \big \lbrace \vec{\chi }_0=[\text{Abort}_0], \\ &\ \ \ \ \ \vec{\chi }_1=[\text{Continue}_0, \text{Abort}_1], \\ &\ \ \ \ \ \dots \\ &\ \ \ \ \ \vec{\chi }_{N-1}=[\text{Continue}_0, \text{Continue}_1, \dots , \text{Abort}_{N-1}]\\ &\ \ \ \ \ \vec{\chi }_N=[\text{Continue}_0, \text{Continue}_1, \dots , \text{Continue}_{N-1}] \big \rbrace \end{split} \end{equation}
(13)
Based on (9), the probabilities for the operator’s decision without automation assistance, represented as \(\breve{X}\), are (the conditional probabilities below are defined in the Supplemental Material):
\begin{equation} \mathbb {P}_{\breve{X}}(\breve{\chi }_k) = P_s\Big [\prod _{i=0}^{k-1} \breve{P}_{FN}(i)\Big ] \breve{P}_{TP}(k) + (1-P_s)\Big [\prod _{i=0}^{k-1} \breve{P}_{TN}(i)\Big ] \breve{P}_{FP}(k) \ \ \ \forall 0\le k\le N-1 \end{equation}
(14)
and
\begin{align} \mathbb {P}_{\breve{X}}(\breve{\chi }_N) &= P_s\prod _{i=0}^{N-1} \breve{P}_{FN}(i) + (1-P_s)\prod _{i=0}^{N-1} \breve{P}_{TN}(i) \end{align}
(15)
According to (8), the probabilities for the human operator’s decision with automation support, represented as \(\hat{X}\) are:
\begin{equation} \mathbb {P}_{\hat{X}}(\chi _k) = P_s\Big [\prod _{i=0}^{k-1} \hat{P}_{FN}(i)\Big ] \hat{P}_{TP}(k) + (1-P_s)\Big [\prod _{i=0}^{k-1} \hat{P}_{TN}(i)\Big ] \hat{P}_{FP}(k) \ \ \ \forall 0\le k\le N-1 \end{equation}
(16)
and
\begin{align} \mathbb {P}_{\hat{X}}(\chi _N) &= P_s\prod _{i=0}^{N-1} \hat{P}_{FN}(i) + (1-P_s)\prod _{i=0}^{N-1} \hat{P}_{TN}(i) \end{align}
(17)
The Level of Influence (LoI) of the automation on the outcome is estimated by the Hellinger distance (10) between these two distributions - the larger the distance, the stronger the automation influence. The benefit of automation in this application can then be determined, as explained above.
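A sketch of how Equations (14)-(17) and the resulting LoI can be computed is given below. It assumes that the per-stage conditional probabilities \(P_{TP}(k)\), \(P_{FP}(k)\), \(P_{TN}(k)\), and \(P_{FN}(k)\), with and without the DSS, have already been obtained from the forward-looking analysis; hellinger is the function sketched in Section 2, and the function and variable names are ours.

```python
import numpy as np

def decision_vector_probs(p_s, p_tp, p_fp, p_tn, p_fn):
    """Probabilities of the decision vectors in Omega (Equations (14)-(17)).

    p_s                    : prior probability of an inaccurate setup (Signal).
    p_tp, p_fp, p_tn, p_fn : per-stage conditional probabilities, each of length N.
    Returns an array of length N+1: entry k (k < N) is the probability of aborting
    at stage k; entry N is the probability of never aborting.
    """
    N = len(p_tp)
    probs = []
    for k in range(N):
        abort_k = (p_s * np.prod(p_fn[:k]) * p_tp[k]
                   + (1 - p_s) * np.prod(p_tn[:k]) * p_fp[k])
        probs.append(abort_k)
    probs.append(p_s * np.prod(p_fn) + (1 - p_s) * np.prod(p_tn))
    return np.array(probs)

# LoI of the automation: the Hellinger distance between the two distributions, e.g.,
# hellinger(decision_vector_probs(p_s, *probs_with_dss),
#           decision_vector_probs(p_s, *probs_without_dss))
```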

3.4 Numerical Example

To demonstrate how the model results can be used and interpreted, we present an example case for the above QA department scenario, using SDT concepts [27, 51, 81]. The manufacturing machine setup accuracy is measured by a single observable parameter which transforms the data into a scale value. Such a parameter is represented by a random variable Q that can take any real numerical value \(q \in \mathbb {R}\). An incorrect setup of the manufacturing machine (a “positive” reading) can be represented as the presence of a signal (denoted as S), with a prior probability of occurrence \(P_s\).
In contrast, a correct setup is referred to as noise (denoted as N), with a prior probability \(P_n = 1-P_s\). Therefore, the measured setup accuracy Q can be associated with one of two probability distributions, \(E_S\) or \(E_N\), depending on whether a signal is present (Figure 2). In the equal variance SDT model, the two probability distributions are typically assumed to be similar but with different means. The difference between the means (measured in standard deviations) is the detection sensitivity \(d^{\prime }\) of the detector (the human or automation in our case). The higher \(d^{\prime }\), the smaller the overlap between the two distributions, and the easier it is to determine whether there is a signal. In actual applications, the prior probability of an inaccurate setup (a Signal) would be available from historical data for an existing system, the testing of a new system, or theoretical analyses or simulations for a system that has not yet been built. The human detection sensitivities could be calculated from success rates in detecting such situations without a DSS. For example, assuming a normal probability distribution of the machine accuracy measure Q, the sensitivity, \(d^{\prime }\), can be calculated as \(d^{\prime }=Z(P_{TP})-Z(P_{FP})\), with Z being the inverse of the cumulative normal distribution and \(P_{TP}\) and \(P_{FP}\) being the True Positive and False Positive probabilities of detection.

For this numerical example, we used the signal prior probability \(P_S=0.2\) and a normal, equal variance distribution of Q with \(E_N \sim \mathcal {N}(\mu _N,\sigma _N^2)=\mathcal {N}(0,1)\) or \(E_S \sim \mathcal {N}(\mu _S,\sigma _S^2)=\mathcal {N}(d^{\prime },1)\) in the case of Noise or Signal, respectively. Varying values of the detection sensitivity \(d^{\prime }\) were used for the automation and the operator in the range \([0.6,\dots ,3.0]\) with steps of \(\Delta d^{\prime }=0.1\). This range represents agent sensitivity values from very poor (\(d^{\prime }=0.6\)) to extremely high (\(d^{\prime }=3.0\)). The event’s discrete dynamic programming numerical calculation, described in the Supplemental Material, was performed by taking observation steps \(\delta =0.01\), and the probability range was divided into 100 buckets (\(\epsilon =0.01\)). Event lengths of \(N=1\), 4, and 8 stages were used. We assumed that \(M=10\) units would be manufactured in every batch, with unit values of +1 or -1 for intact or faulty units, respectively. This resulted in outcome values of \(V_{TP}(k)=-10k\) for the cost of wasted material for faulty devices manufactured until this stage, \(V_{FP}(k)=+10k\) (\(k=0 \dots N-1\)) for the profit from intact devices manufactured until this stage, and \(V_{TN}(N)=+10N\) for the value of a complete manufactured order of intact devices. The \(V_{FN}(k)\) for each stage was calculated from the base value of maximum “waste” when the whole order is manufactured under inaccurate setup, \(V_{FN}(N)=-10N\), going backward using dynamic programming as defined in the Supplemental Material.
Fig. 2.
Fig. 2. SDT model for detecting a signal in the presence of normally distributed noise.
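For completeness, the sensitivity calculation mentioned above, \(d^{\prime }=Z(P_{TP})-Z(P_{FP})\), can be performed with the inverse cumulative normal distribution; the probabilities in the example call below are illustrative only.

```python
from scipy.stats import norm

def d_prime(p_tp, p_fp):
    """Detection sensitivity d' = Z(P_TP) - Z(P_FP) in the equal-variance SDT model."""
    return norm.ppf(p_tp) - norm.ppf(p_fp)

# Illustrative values: a 69% hit rate with a 31% false-alarm rate gives d' of about 1.
print(round(d_prime(0.69, 0.31), 2))   # -> 0.99
```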
A numerical analysis of the above example event provides us with a measure of the automation LoI on the operator’s decision in the case of a DSS when the automation is added to aid the human (Figure 3). Note that the results were not normalized to show the actual distance between the distributions.
Fig. 3.
Fig. 3. Comparing DSS level of influence (LoI) on the operator’s decision for dynamic events with lengths of N=1, 4, and 8 stages, drawn as 2D and 3D charts, as a function of the detection sensitivity of the human (\(d^{\prime }_H\)) and the automation (\(d^{\prime }_A\)). In the 2D charts, solid black lines represent the \(d^{\prime }_A\) and \(d^{\prime }_H\) combinations that maintain constant ratios R of 1.5, 2.0 and 3.0.
The results show growth in the automation influence when the automation sensitivity (\(d_A^{\prime }\)) increases, while an increase in the human sensitivity (\(d_H^{\prime }\)) reduces the automation influence. Maximum automation influence is achieved when the automation has maximum sensitivity while the human has minimal sensitivity. When the automation sensitivity is low, the automation recommendation does not significantly influence human decisions, so the automation influence is reduced to a minimum. Although, in theory, the Hellinger distance can take any value in the range [0,1], it is unlikely that the distance values in such scenarios would be close to 1 since that requires that the two probability distributions are never both positive for any sample. In reality, even with the lowest detection sensitivity, there will be some positive probability for almost all scenarios in the probability sample space, and the upper limit of 1 would not be reached. This can be demonstrated by comparing two extreme scenarios: an All-Knowing Agent (\(d^{\prime }=\infty\)), who would have the following series of probabilities for an “Abort” decision at stage k: \([P_s, 0, 0, 0, \dots ]\), and a random decision (“Pure Chance”) at each stage based on a coin toss (\(d^{\prime }=0\)), which would result in “Abort” probabilities of \([0.5, 0.5^2, 0.5^3, \dots ]\). The Hellinger distance between these extreme cases is 0.23, 0.68, and 0.79 for N=1, 4, and 8, respectively. Therefore, we refer to the maximal distance when the automation sensitivity is maximal and the human sensitivity is minimal as the upper limit used to normalize the influence.
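The extreme-case distances quoted above can be reproduced directly: with \(P_s=0.2\), the All-Knowing Agent’s full distribution over the sample space \(\Omega\) of (13) is \([P_s, 0, \dots , 0, 1-P_s]\) (abort at stage 0 with probability \(P_s\), otherwise never abort), and the Pure Chance distribution is \([0.5, 0.5^2, \dots , 0.5^N, 0.5^N]\). The short sketch below verifies the values 0.23, 0.68, and 0.79.

```python
import numpy as np

def hellinger(p, q):
    """Hellinger distance between two discrete distributions (Equation (10))."""
    return np.sqrt(np.sum((np.sqrt(p) - np.sqrt(q)) ** 2)) / np.sqrt(2.0)

def extreme_case_distance(N, p_s=0.2):
    """Distance between All-Knowing Agent and Pure Chance decision distributions.
    Entry k (k < N) is the probability of aborting at stage k; entry N is the
    probability of never aborting."""
    all_knowing = np.array([p_s] + [0.0] * (N - 1) + [1.0 - p_s])
    pure_chance = np.array([0.5 ** (k + 1) for k in range(N)] + [0.5 ** N])
    return hellinger(all_knowing, pure_chance)

for N in (1, 4, 8):
    print(N, round(extreme_case_distance(N), 2))   # -> 0.23, 0.68, 0.79
```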
The DSS influence depends not only on the relation between \(d^{\prime }_A\) and \(d^{\prime }_H\) but also on their absolute values, as shown by the solid black lines in Figure 3, panels (a)-(c): when the ratio of the two detection sensitivities \(R=d^{\prime }_A/d^{\prime }_H\) is kept constant, the DSS influence still changes, depending on the actual values of these parameters.
The DSS influence changes as a function of the human and automation detection sensitivities, and it is not the same for all event lengths. In this example, the DSS influence is stronger when the event is longer, as can be seen in Figure 3, and through the growth of the average DSS influence values across all 625 sample points in each figure, as shown in Table 1.
Table 1.
Event Length | Number of sample points | Minimum Influence | Maximum Influence | Total Influence (Sum) | Average Influence
1 | 625 | 0.000 | 0.29 | 34.36 | 0.055
4 | 625 | 0.004 | 0.29 | 44.59 | 0.071
8 | 625 | 0.005 | 0.31 | 52.68 | 0.084
Table 1. DSS Influence Indicators for Different Event Lengths
Once the automation influence is established, the human responsibility can be calculated using (11). The results for multiple event lengths are presented in Figure 4. Note that these charts show the data in a rotated axis view, compared to Figure 3, to better show the changes of the human responsibility as the sensitivity levels change. As shown in Figure 4, human responsibility decreases with the increase in automation sensitivity and the decrease in human sensitivity. By definition, T-RESP (11) is zero when the automation influence is maximal, which occurs when the automation sensitivity is maximal (\(d_A^{\prime }=3.0\)), and human sensitivity is minimal (\(d_H^{\prime }=0.6\)). Similarly, when the human’s sensitivity is high (\(d_H^{\prime }=3.0\)) and automation’s sensitivity is low (\(d_A^{\prime }=0.6\)), the human is not affected by the automation’s advice and human responsibility is maximal (close to 1).
Fig. 4.
Fig. 4. Calculation results of the Human Responsibility Indicator, T-RESP, when a DSS is added to assist a human operator in a dynamic decision-making event of lengths N=1, 4, and 8 stages, as a function of the combination of human and automation detection sensitivities. Note that the horizontal axes in the responsibility charts are rotated by \(180^{\circ }\) compared to the level of influence charts (Figure 3) to allow better visibility.
The results also show that the human responsibility indicator drops significantly in the analyzed scenario once the human detection sensitivity is below a certain threshold, between \(d^{\prime }_H\) values of 1.2 to 1.6, depending on the event length. This means that the human causal responsibility may, with specific detection sensitivity values, quickly drop when the detection sensitivity even slightly decreases.
This numerical example can demonstrate how the Hellinger distance represents the change in the decision probability. Since the probability distribution space is of dimension N, which is hard to illustrate for larger N values, we use a single decision (\(N=1\)) to show the Hellinger distance on a linear, one-dimensional space. Using the above use case, with normal distributions and \(d^{\prime }_A=d^{\prime }_H=2\), assuming signal probability \(P_S=0.2\) and the above value matrix for \(N=1\) (\(V_{TN}=10\), \(V_{FN}=-10\), \(V_{TP}=V_{FP}=0\)), we calculate the probability to take action (i.e., assume “Signal”) for an All-Knowing Agent, Pure Chance (both defined above), human-only and for human assisted by automation (DSS). The Hellinger distances between those probability distribution combinations are shown in the top part of Figure 5(a) on a straight line. The Hellinger distance metric is additive on this one-dimensional space since when the human acts normatively and adopts the automation’s recommendation, the human distribution “moves” from its original (non-assisted) location towards the optimal All-Knowing Agent’s distribution by a distance equal to the DSS influence on the human. As expected, the EV improves when the automation assists the human (Figure 5(a) bottom-part). It should be noted that the distributions are not supposed to be ordered on the line according to their EV, but their order depends on the values of the outcomes and the detection sensitivity. For example, for a different outcome value ratio of 0.25, the human and human-assisted distributions would be placed to the left of the All-Knowing Agent distribution. However, the additive attribute of the distance would remain (results not shown here).
Fig. 5.
Fig. 5. Illustration of the Hellinger distance and EV (top and bottom lines in each panel, respectively) for four probability distributions for a single stage decision of different agents: All Knowing Agent (\(d^{\prime }=\infty\)), Pure Chance (\(d^{\prime }=0\)), human only (\(d^{\prime }=2\)) and human with automation DSS (\(d^{\prime }=2\)). Decision probabilities were calculated using Bayesian inference, outcome values: \(V_{TN}=10\), \(V_{FN}=-10\), \(V_{TP}=V_{FP}=0\), and signal probability \(P_S=0.2\). (a) assuming a normative human behavior, both alone and when assisted, (b) assuming normative human when alone (for reference) but a counter-normative behavior when assisted by automation (opposite decision from a normative assisted human).
This analysis also shows that we must assume that the human aims to improve task performance and does not act randomly or against that goal. As can be seen in Figure 5(b), if the human deliberately decides to do the opposite of what a normative human would do (i.e., always choosing “Signal” when a normative human would choose “Noise,” and vice versa), this counter-normative behavior will drive the probability distribution of the automation-assisted human further away from the optimal All-Knowing Agent and result in the lowest EV. Our model would then show a considerable Hellinger distance, i.e., a significant automation influence and, therefore, supposedly, minimal human responsibility. In this case, it is hard to claim that the human is not responsible for outcomes, as they deliberately did not act towards improving the task performance. Thus, a prerequisite for the model is the assumption that the human aims to improve performance.

4 Discussion

4.1 Main Findings

The general influence of DSSs on operators’ decisions and their impact on human causal responsibility were analyzed in prior studies for cases of a single stationary decision. In this work, we extended previous research to quantify the a priori, general influence of automation on human choices in an evolving, non-stationary event that requires a series of decisions. We computed the Hellinger distance to measure the automation and human influence, using the causation strength concept so that a greater distance indicated a more substantial influence. The general model was then applied to a specific DSS scenario with predefined parameters to demonstrate how it can be used and what applicative conclusions we can draw from it. There are several key insights:

4.1.1 Human Responsibility Depends on the Human and Automation Sensitivity Levels, Not Only their Relative Values.

For the analyzed scenario, our results demonstrated that automation with high detection sensitivity could significantly influence a human operator with low sensitivity. However, if the automation’s sensitivity decreases or the human’s sensitivity increases, the influence drops relatively quickly. The relation between the measure of automation influence and the two sensitivities is complex. Automation influence changes differently following a reduction in automation sensitivity or increased human sensitivity. We were unable to identify an analytical representation for these changes. These findings suggest that the influence of automation in a dynamic event depends not only on how much better the automation sensitivity is, compared to the human’s sensitivity, but on the actual sensitivity values. As shown in Figures 3(a)-3(c), even with a constant ratio R between automation and human detection sensitivities, the automation’s influence changes, depending on the actual values of \(d^{\prime }_A\) and \(d^{\prime }_H\). Therefore, investing in automation that is “twice as good as the human” will not necessarily result in a certain influence level, but its effect depends on the sensitivity values.
Moreover, investing in a very sensitive DSS will not necessarily increase its influence if the human detection sensitivity is high. The actual values of both the automation’s and the human’s sensitivity must be considered. At the extremes, a human with low sensitivity (e.g., \(d_H^{\prime }=0.6\)) will be strongly influenced by ’very good’ (\(d_A^{\prime }=3.0\)) automation and will contribute little to the resulting outcome. On the other hand, a human with very high sensitivity (\(d_H^{\prime }=3.0\)) will not be influenced by low sensitivity automation (\(d_A^{\prime }=0.6\)). In different situations, matters are more complicated, and one needs to model and compute the influence of the automation, as demonstrated here. This observation is important, considering the requirement to have meaningful human control (MHC) in critical processes. Automation can be (almost) autonomous, but regulations, ethics, or the law may still require human supervision. The requirement seems to be met in DSS situations since the human operator is only “advised” by the automation but still makes the final decision, and it is tempting to assume that the human is in full control of the process [74]. However, we see here that this may be incorrect when the human is strongly influenced by automation, and therefore, the assumption that they are still “in control” may not be valid. When humans with relatively low detection sensitivity, such as novice trainees or tired operators, receive advice from a very good DSS, the recommendations will strongly influence their decisions, and the assumption that a human has meaningful control in this situation may not hold. Careful analyses are required for each case to ensure humans and automation operate within their predefined parameters and evaluate whether the human controls the situation.

4.1.2 Automation Influence Increases with Event Length.

No simple formula was found to estimate the level of influence as a function of the event length due to the many factors involved in the process, sometimes with contradictory impacts and changes over time. However, in the example we presented, the overall level of influence was greater in longer events with more stages. For them, automation influence was more substantial, even with lower automation sensitivity levels and with higher human sensitivity levels. Mathematically, this is explained by the fact that the longer the event (or the decision vector \(\vec{X}\)), the larger the possible Hellinger distance (e.g., as shown above, the distance between the decision probability space of an “All-Knowing Agent” and random coin tosses mentioned above is 0.23, 0.68 and 0.79 for event lengths 1, 4 and 8 stages, respectively). In practice, this translates into more possible options for the decision series in longer events, and, therefore, more ways for automation to influence human decisions.

4.1.3 Human Responsibility Indicator Can Suggest a Measure of Human Causal Responsibility.

The HRI T-RESP provides an indicative measure complementary to the normalized automation influence. It is zero when the automation has maximum impact in the analyzed detection sensitivities range and is maximal when the automation influence is minimal. This behavior suggests that it can be used as an indicative measure of the level of causal responsibility of the human operator, which is aligned with their ability to influence (or “control”) the situation. We use the term “responsibility” with some reservations. Although causal responsibility is defined in some cases as equivalent to “causation” [29], it is also mainly related to retrospective analyses or actual (or token) causation [28, 29]. A retrospective actual causation analysis requires that the human did act, which is not always the case when performing a probabilistic, a priori analysis. Therefore, we suggest referring to T-RESP as a measure that provides some information about human responsibility. However, how it is applied to each specific type of responsibility requires in-depth analyses and depends on the context. When the value of this indicator is high, the human has indeed fulfilled the causation requirement to be considered responsible for the action. However, additional factors should be considered when determining a person’s responsibility, such as how aware the person was of the consequences and ramifications of the action, what was the person’s intent, who can/should carry the responsibility share attributed to the automation (the designer of the system, the manager who decided to deploy it, etc.) and so on [28, 44, 68, 71].

4.1.4 A “Responsibility Cliff” May Exist when Human Detection Capability Deteriorates.

DSS influence increases with the decrease in human detection sensitivity (based on Figures 3 and 4), which means the human causal responsibility indicator decreases. However, this decrease is not necessarily linear, with a constant rate of change. In some situations, there may be a point at which the human responsibility drops faster with the \(d_H^{\prime }\) decline (e.g., for N=8, this can be observed in Figure 4(c) when \(d_H^{\prime }\lt 1.1\)), suggesting that humans would start to follow the automation recommendation with minimal contribution from their side. We call this situation a “Responsibility Cliff,” where small changes in human detection sensitivity significantly change human influence and the derived causal responsibility. The responsibility cliff is an important observation to which system designers and policymakers must pay attention. In many real-life situations, the human operator, whether a physician, driver, pilot, or quality assurance person, faces many dynamic decision-making events. In the above example, the QA inspector may oversee the production of many orders during an 8-hour shift and must go through such a decision-making event for each order. Such inspectors may be professionals and experienced with high \(d_H^{\prime }\), so from the point of view of the factory management, they are usually held responsible for their decisions. However, their levels of vigilance and attention may drop towards the end of their shift, lowering their \(d_H^{\prime }\) and hitting the Responsibility Cliff so that they may rely more on the DSS, and their responsibility indicator would significantly decrease. In such conditions, holding the operator responsible for the results while the automation greatly influences them may not be the correct or acceptable approach. Managers, policymakers, and system designers should consider the responsibility cliff when defining and evaluating operational procedures.
This example also demonstrates why the human responsibility indicator should be used with caution. One could argue that, even with the lower responsibility indicator values at the end of the shift, the QA inspector should still be held responsible for not alerting their superiors that they needed to be replaced because they were tired and their detection capability was impaired. Had they done so, the Responsibility Cliff would have been avoided, and the damaged order would not have been manufactured. Of course, this counterfactual addresses the bigger picture of the operator’s conduct over the whole shift rather than the specific decision-making event of that one order. Still, it demonstrates why the responsibility indicator should not be applied mechanically, without taking a closer look at the situation.
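To illustrate how such a cliff could be located numerically, the following sketch builds a deliberately simplified toy model, not the paper’s T-RESP computation: an equal-variance signal detection setup in which the operator combines their own observation with a binary automation recommendation via Bayes’ rule, the “influence” proxy is the probability that the recommendation flips the operator’s decision, and the “responsibility” proxy is its complement. All parameters (the automation sensitivity, the \(d_H^{\prime }\) grid, the unbiased criteria) are hypothetical, and whether the resulting curve shows a pronounced cliff depends entirely on the assumed model; the point is only to show how the steepest change can be identified from a sweep over \(d_H^{\prime }\).

```python
"""Toy illustration (not the paper's model): locating the steepest change of a
responsibility proxy as human detection sensitivity d'_H varies."""
from math import erf, sqrt, log

def phi(z):
    """Standard normal cumulative distribution function."""
    return 0.5 * (1.0 + erf(z / sqrt(2.0)))

def flip_probability(d_h, d_a, prior=0.5):
    """P(the automation recommendation flips the ideal operator's decision)."""
    hit_a = phi(d_a / 2.0)           # automation hit rate (unbiased criterion)
    fa_a = 1.0 - hit_a               # automation false-alarm rate
    c_alone = d_h / 2.0              # human-alone decision criterion (prior = 0.5)
    total = 0.0
    for p_state, mean, p_yes in ((prior, d_h, hit_a), (1.0 - prior, 0.0, fa_a)):
        for auto_says_signal in (True, False):
            p_auto = p_yes if auto_says_signal else 1.0 - p_yes
            lr_auto = hit_a / fa_a if auto_says_signal else (1.0 - hit_a) / (1.0 - fa_a)
            c_combined = c_alone - log(lr_auto) / d_h   # Bayes-shifted criterion
            lo, hi = sorted((c_alone, c_combined))
            # The decision flips exactly when the human observation falls between
            # the human-alone criterion and the combined (shifted) criterion.
            total += p_state * p_auto * (phi(hi - mean) - phi(lo - mean))
    return total

D_A = 2.5                                        # hypothetical automation sensitivity
grid = [0.2 + 0.05 * i for i in range(57)]       # d'_H swept from 0.2 to 3.0
resp_proxy = [1.0 - flip_probability(dh, D_A) for dh in grid]

# The steepest increase of the proxy with d'_H marks where, going the other way,
# responsibility would drop fastest as d'_H declines: a toy analogue of the cliff.
slopes = [(resp_proxy[i + 1] - resp_proxy[i]) / (grid[i + 1] - grid[i])
          for i in range(len(grid) - 1)]
steepest = max(range(len(slopes)), key=slopes.__getitem__)
print(f"Steepest change of the toy responsibility proxy near d'_H = {grid[steepest]:.2f}")
```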

4.2 Design Implications

Intelligent DSSs for assisting human operators in dynamic events require investment in developing new mechanisms, possibly leveraging artificial intelligence and machine learning (AI/ML), and implementing them in specific functional platforms, such as cars, planes, security systems, medical instrumentation, or QA systems, to name a few. With the posterior analyses of DSSs described in the literature, such systems’ benefit and contribution to better decision-making processes could only be evaluated after the system was built, i.e., after the investment had already been made. The new T-RESP model and the approach presented in this work allow an a priori quantitative estimation of the benefit of using the DSS in a dynamic event. This can be used to assess whether a system has the desired level of meaningful human influence. For example, investing in such a system could be appealing if the DSS sensitivity is significantly higher than the human sensitivity. However, for certain human detection capabilities, the human influence may be unacceptably low. Assuming the designers want to benefit from the enhanced automation capabilities, they need to find ways to increase the human detection capabilities to meet the regulatory influence level, e.g., by providing the operator with more information or by additional training. Alternatively, the process can be fully automated so that humans will not be involved. In contrast, if the human’s sensitivity is high, a DSS may have little value.
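As a design-time aid, the logic described above can be captured in a simple screening rule. The sketch below is illustrative only: the function name, the influence estimate it consumes (e.g., obtained from a T-RESP-style a priori analysis), and the thresholds for “meaningful human influence” and for a worthwhile sensitivity gap are hypothetical placeholders, not values taken from the paper or from any regulation.

```python
def design_recommendation(human_influence: float, d_h: float, d_a: float,
                          min_influence: float = 0.6,
                          sensitivity_gap: float = 1.0) -> str:
    """Illustrative screening of a DSS design option (hypothetical thresholds).

    human_influence: a priori estimate of the human's influence, in [0, 1]
                     (e.g., from a T-RESP-style analysis).
    d_h, d_a:        human and automation detection sensitivities.
    """
    if d_a - d_h < sensitivity_gap:
        # Automation is not much better than the human: little value in a DSS.
        return "DSS adds little value; rely on the human or improve the automation"
    if human_influence < min_influence:
        # Automation would dominate: raise d'_H (information, training) or automate fully.
        return "human influence too low: increase d'_H or remove the human from the loop"
    return "deploy the DSS with the human in the loop"

# Example with hypothetical numbers:
print(design_recommendation(human_influence=0.45, d_h=1.0, d_a=2.5))
```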
The analysis performed in this work used an “ideal human” as a reference, assuming that this operator always makes normative decisions to maximize the task performance (e.g., EV), even when tired and with diminished detection sensitivity. Moreover, they are assumed to remember their choices and the associated probabilities of the various options and perform a future-looking dynamic programming analysis. We acknowledge that people do not operate this way. Still, using the ideal human model allows us to identify an attribute of the combined “human + automation” system that can guide the system designers, regulators, and policymakers. While actual human behavior may deviate from the above, this reference measure shows how the human and automation collaboration will function under optimal conditions. It provides indications for the appropriate settings of system and process parameters.
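The “ideal human” assumption can be made concrete with a small backward-induction sketch. The code below is not the paper’s test case; the number of stages, the prior, the observation accuracy, and the payoffs are hypothetical, and the stop-or-wait structure is only one simple instance of a future-looking, EV-maximizing dynamic programming analysis of a binary categorization decision.

```python
"""Illustrative backward induction for an "ideal" operator (hypothetical setup)."""

N = 4                 # hypothetical number of decision stages
PRIOR = 0.5           # hypothetical prior P(target present)
P_OBS = 0.8           # hypothetical per-stage observation accuracy
V_HIT, V_FA = 10.0, -2.0     # hypothetical payoffs when deciding "present"
V_MISS, V_CR = -10.0, 1.0    # hypothetical payoffs when deciding "absent"

def act_value(p):
    """EV of committing now with belief p = P(present): pick the better label."""
    ev_present = p * V_HIT + (1 - p) * V_FA
    ev_absent = p * V_MISS + (1 - p) * V_CR
    return max(ev_present, ev_absent)

def posterior(p, says_present):
    """Bayesian belief update after one noisy observation."""
    like_present = P_OBS if says_present else 1 - P_OBS
    like_absent = (1 - P_OBS) if says_present else P_OBS
    num = p * like_present
    return num / (num + (1 - p) * like_absent)

def value(stage, p):
    """Best achievable EV from `stage` onward, given belief p (backward induction)."""
    if stage == N:                               # final stage: the operator must commit
        return act_value(p)
    p_yes = p * P_OBS + (1 - p) * (1 - P_OBS)    # P(next observation says "present")
    wait_ev = (p_yes * value(stage + 1, posterior(p, True)) +
               (1 - p_yes) * value(stage + 1, posterior(p, False)))
    return max(act_value(p), wait_ev)            # commit now vs. wait for more information

print(f"EV of the ideal operator's policy: {value(1, PRIOR):.3f}")
```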

5 Conclusions

Previous work on human and automation teaming focused mainly on retrospective analyses of specific events. It did not provide prospective, quantitative measures of the automation influence on human decisions and the derived human causal responsibility. Such a measure is essential because system designs, regulations, and work processes increasingly depend on understanding and quantifying the delicate balance between decisions made by humans and automation. Moreover, the work done in this area focused on static events with a single decision, whereas many real-life decision-making events are dynamic. The T-RESP model presented in this paper expands previous work on automation influence and human causal responsibility to dynamic events involving multiple interdependent decisions. The model further provides an a priori perspective and a quantitative measure for weighing alternatives in DSS design and in considerations of human responsibility. While the model is generic and can accommodate variations in the environment and in the detection sensitivity of the human and the automation, we presented a simplified numerical example to demonstrate how the model can be used and the insights that can be derived from it. The example yielded conclusions that can be applied to system design (e.g., ensuring that the human is given sufficient information so their causal responsibility is high enough to meet regulatory requirements), to operational procedures (e.g., avoiding long shifts in which the human \(d_H^{\prime }\) deteriorates below the Responsibility Cliff), and to indications of causal responsibility that could help regulators define the frameworks within which the use of automation is allowed.

Future research should focus on expanding this model by evaluating the implications of enhanced automation capabilities, allowing it to include advanced features, such as backward- and forward-looking analyses, that can make it a better match for human decision-making. It should also be noted that the model assumes that the classification and decision functions used by the human and the DSS follow deterministic logic that allows a priori calculation of the decision probabilities. If the logic cannot be defined in such deterministic terms, as may be the case with emerging systems that use generative AI (GenAI) to make decisions, further work may be required to adapt the model to such situations. Furthermore, empirical analyses of human behavior in such situations can be used to validate the model and to understand how well it matches the way human operators behave and how the automation influence and human responsibility are perceived. Such work can lead to a descriptive model that explains and possibly predicts how humans react to advice from automation in dynamic events. Lastly, future work can also analyze human responsibility in complex dynamic events in which human sensitivity changes over time. Thus, the model can serve as a tool for systematically analyzing the human’s role in collaborations with intelligent systems in dynamic events. While we focused on decision support, the approach can be developed and applied to other forms of human-AI interaction.

Acknowledgments

The authors would like to thank the anonymous reviewers for their valuable comments and helpful suggestions.

Footnotes

1
The term “operator” is used here in the broader sense to represent any human who uses or collaborates with automation towards a decision or action.
2
Such systems are asymmetric because the automation classification cannot use the human classification at the current stage as input since this would create an unresolved dependency loop.

Supplementary Material

3631611.supp (3631611.supp.pdf)
Supplementary material

References

[1]
Joanne M. Bennett, Kirsten L. Challinor, Oscar Modesto, and Prasannah Prabhakharan. 2020. Attribution of blame of crash causation across varying levels of vehicle automation. Safety Science 132 (2020), 104968.
[2]
Charles E. Billings. 1996. Human-Centered Aviation Automation: Principles and Guidelines. Technical Report. NASA, NASA Ames Research Center Moffett Field, CA United States. 222 pages. https://ntrs.nasa.gov/api/citations/19960016374/downloads/19960016374.pdf
[3]
Vincent Boulanin, Moa Peldan Carlsson, Netta Goussac, and Neil Davison. 2020. Limits on Autonomy in Weapon Systems: Identifying Practical Elements of Human Control. Technical Report. Stockholm International Peace Research Institute. https://www.sipri.org/sites/default/files/2020-06/2006_limits_of_autonomy.pdf
[4]
Berndt Brehmer. 1992. Dynamic decision making: Human control of complex systems. Acta Psychologica 81, 3 (1992), 211–241.
[5]
Jerome R. Busemeyer and James T. Townsend. 1993. Decision field theory: A dynamic-cognitive approach to decision making in an uncertain environment. Psychological Review 100, 3 (1993), 432–459.
[6]
William N. Caballero, David Ríos Insua, and David Banks. 2021. Decision support issues in automated driving systems. Intl. Trans. in Op. Res. 0 (2021), 1–29.
[7]
Jordi Catalan, M. Grazia Barbieri, Frederic Bartumeus, Peter Bitušík, Ivan Botev, Anton Brancelj, Dan Cogalniceanu, Marina Manca, Aldo Marchetto, Nadja Ognjanova-Rumenova, Sergi Pla, Maria Rieradevall, Sanna Sorvari, Elena Štefková, Evžen Stuchlík, and Marc Ventura. 2009. Ecological thresholds in European alpine lakes. Freshwater Biology 54, 12 (2009), 2494–2517.
[8]
David A. Cieslak and Nitesh V. Chawla. 2009. A framework for monitoring classifiers’ performance: When and why failure occurs? Knowledge and Information Systems 18, 1 (2009), 83–108.
[9]
Erik Coelingh, Andreas Eidehall, and Mattias Bengtsson. 2010. Collision warning with full auto brake and pedestrian detection - a practical example of automatic emergency braking. In 13th International IEEE Conference on Intelligent Transportation Systems. IEEE, 155–160.
[10]
D. R. Cox and H. D. Miller. 1965. The Theory of Stochastic Processes. Chapman & Hall/CRC, Boca Raton, FL.
[11]
Rebecca Crootof. 2015. The killer robots are here: Legal and policy implications. Cardozo Law Review 36, 5 (2015). https://scholarship.richmond.edu/law-faculty-publications/1595
[12]
Haydee Cuevas, Stephen Fiore, Barrett Caldwell, and Laura Strater. 2007. Augmenting team cognition in human-automation teams performing in complex operational environments. Aviation, Space, and Environmental Medicine 78 (2007), 63–70.
[13]
Mitchell L. Cunningham, Michael A. Regan, Timothy Horberry, K. Weeratunga, and Vinayak Dixit. 2019. Public opinion about automated vehicles in Australia: Results from a large-scale national survey. Transportation Research Part A: Policy and Practice 129 (2019), 1–18.
[14]
Mustafa Demir, Nathan J. McNeese, and Nancy J. Cooke. 2016. Team communication behaviors of the human-automation teaming. In 2016 IEEE International Multi-Disciplinary Conference on Cognitive Methods in Situation Awareness and Decision Support (CogSIMA). IEEE, 28–34.
[15]
Nir Douer and Joachim Meyer. 2020. Judging one’s own or another person’s responsibility in interactions with automation. Human Factors 00, 0 (2020), 001872082094051.
[16]
Nir Douer and Joachim Meyer. 2020. The responsibility quantification model of human interaction with automation. IEEE Transactions on Automation Science and Engineering 17, 2 (2020), 1044–1060.
[17]
Nir Douer and Joachim Meyer. 2021. Theoretical, measured and subjective responsibility in aided decision making. ACM Transactions on Interactive Intelligent Systems 11, 1 (2021), 1–37.
[18]
Ward Edwards. 1962. Dynamic decision theory and probabilistic information processing. Human Factors: The Journal of the Human Factors and Ergonomics Society 4, 2 (1962), 59–74.
[19]
Ellery Eells. 1991. Probabilistic Causality. Cambridge University Press.
[20]
Mica R. Endsley and David B. Kaber. 1999. Level of automation effects on performance, situation awareness and workload in a dynamic control task. Ergonomics 42, 3 (1999), 462–492.
[21]
Francisco J. Estrella, Macarena Espinilla, Francisco Herrera, and Luis Martínez. 2014. FLINTSTONES: A fuzzy linguistic decision tools enhancement suite based on the 2-tuple linguistic model and extensions. Information Sciences 280 (2014), 152–170.
[22]
European Union. 2016. Article 22 - Automated individual decision-making, including profiling. https://gdpr-info.eu/art-22-gdpr/
[23]
Frank Flemisch, David A. Abbink, Masahiko Itoh, Marie-Pierre Pacaux-Lemoine, and G. Weßel. 2019. Joining the blunt and the pointy end of the spear: Towards a common framework of joint action, human-machine cooperation, cooperative guidance and control, shared, traded and supervisory control. Cognition, Technology and Work 21, 4 (2019), 555–568.
[24]
Syed Jamal Safdar Gardezi, Ahmed Elazab, Baiying Lei, and Tianfu Wang. 2019. Breast cancer detection and diagnosis using mammographic data: Systematic review. Journal of Medical Internet Research 21, 7 (2019), e14464.
[25]
Tobias Gerstenberg and David A. Lagnado. 2010. Spreading the blame: The allocation of responsibility amongst multiple agents. Cognition 115, 1 (2010), 166–171.
[26]
Víctor González-Castro, Rocío Alaiz-Rodríguez, and Enrique Alegre. 2013. Class distribution estimation based on the Hellinger distance. Information Sciences 218 (2013), 146–164.
[27]
David M. Green and John A. Swets. 1966. Signal Detection Theory and Psychophysics. John Wiley & Sons. https://psycnet.apa.org/record/1967-02286-000
[28]
Guy Grinfeld, David Lagnado, Tobias Gerstenberg, James F. Woodward, and Marius Usher. 2020. Causal responsibility and robust causation. Frontiers in Psychology 11 (2020).
[29]
Herbert L. A. Hart. 2008. Punishment and Responsibility. Oxford University Press.
[30]
Herbert L. A. Hart and John Gardner. 2009. Punishment and Responsibility: Essays in the Philosophy of Law. Oxford University Press. 1–336 pages.
[31]
Herbert L. A. Hart and Tony Honoré. 1985. Causation in the Law. Oxford University Press.
[32]
Ernst Hellinger. 1909. Neue begründung der theorie quadratischer formen von unendlichvielen veränderlichen. Journal für die Reine und Angewandte Mathematik 1909, 136 (1909), 210–271.
[34]
Michael C. Horowitz and Paul Scharre. 2015. Meaningful human control in weapon systems: A primer. (2015), 2–16 pages. https://www.cnas.org/publications/reports/an-introduction-to-autonomy-in-weapon-systems
[35]
Herbert M. Jenkins and William C. Ward. 1965. Judgment of contingency between responses and outcomes. Psychological Monographs 79 (1965).
[36]
Joseph G. Johnson and Jerome R. Busemeyer. 2010. Decision making under risk and uncertainty. Wiley Interdisciplinary Reviews: Cognitive Science 1, 5 (2010), 736–749.
[37]
Matthew Johnson, Jeffrey M. Bradshaw, Paul J. Feltovich, Catholijn M. Jonker, M. Birna van Riemsdijk, and Maarten Sierhuis. 2014. Coactive design: Designing support for interdependence in joint activity. Journal of Human-Robot Interaction 3, 1 (2014), 43–69.
[38]
Amnon Katz. 1967. Principles of Statistical Mechanics (1st ed.). W. H. Freeman, San Francisco, CA.
[39]
Patricia M. Kennedy. 2013. Impact of delayed diagnosis and treatment in clinically isolated syndrome and multiple sclerosis. Journal of Neuroscience Nursing 45, 6 Suppl. 1 (2013), S3–S13.
[40]
José H. Kerstholt. 1994. The effect of time pressure on decision-making behaviour in a dynamic task environment. Acta Psychologica 86, 1 (1994), 89–104.
[41]
Don N. Kleinmuntz and James B. Thomas. 1987. The value of action and inference in dynamic decision making. Organizational Behavior and Human Decision Processes 39, 3 (1987), 341–364.
[42]
Alexander V. Korobov and Boris I. Yatsalo. 2021. Decerns-FT: Decision support system for analysis of multi-criteria problems in the fuzzy environment. Software Journal: Theory and Applications 1 (2021).
[43]
Álvaro Labella, Konstantinos Koasidis, Alexandros Nikas, Apostolos Arsenopoulos, and Haris Doukas. 2020. APOLLO: A fuzzy multi-criteria group decision-making tool in support of climate policy. International Journal of Computational Intelligence Systems 13, 1 (2020), 1539.
[44]
David A. Lagnado, Tobias Gerstenberg, and Ro’i Zultan. 2014. Causal responsibility and counterfactuals. Cognitive Science 37 (2014), 1036–1073.
[45]
Claire Lavigne, Benoit Ricci, Pierre Franck, and Rachid Senoussi. 2010. Spatial analyses of ecological count data: A density map comparison approach. Basic and Applied Ecology 11, 8 (2010), 734–742.
[46]
Rensis Likert. 1932. A technique for the measurement of attitudes. Archives of Psychology 22, 140 (1932), 5–55.
[47]
Ling Feng Liu, Han Ping Hu, Ya Shuang Deng, and Nai Da Ding. 2014. An entropy measure of non-stationary processes. Entropy 16, 3 (2014), 1493–1500.
[48]
Stine Lomborg, Anne Kaun, and Sne Scott Hansen. 2023. Automated decision-making: Toward a people-centred approach. Sociology Compass (2023).
[49]
Shih Nan Lu, Hsien Wei Tseng, Yang Han Lee, Yih Guang Jan, and Wei Chen Lee. 2010. Intelligent safety warning and alert system for car driving. Tamkang Journal of Science and Engineering 13, 4 (2010), 395–404.
[50]
Neil A. Macmillan. 2001. Signal detection theory. International Encyclopedia of the Social & Behavioral Sciences (2001), 14075–14078.
[51]
Neil A. Macmillan. 2005. Detection Theory. Psychology Press. 513 pages.
[52]
John A. Maule and Anne C. Edland. 1997. The effects of time pressure on human judgement and decision making. In Decision Making: Cognitive Models and Explanations, Rob Ranyard, Crozier Ray W., and Ola Svenson (Eds.). Routledge, New York, NY, Chapter 11, 189–204.
[53]
Craig R. M. McKenzie. 1994. The accuracy of intuitive judgment strategies: Covariation assessment and Bayesian inference. Cognitive Psychology 26, 3 (1994), 209–239.
[54]
Mário W. L. Moreira, Joel J. P. C. Rodrigues, Valery Korotaev, Jalal Al-Muhtadi, and Neeraj Kumar. 2019. A comprehensive review on smart decision support systems for health care. IEEE Systems Journal 13, 3 (2019), 3536–3545.
[55]
Emanuele Neri, Francesca Coppola, Vittorio Miele, Corrado Bibbolino, and Roberto Grassi. 2020. Artificial intelligence: Who is responsible for the diagnosis? Radiologia Medica 125, 6 (2020), 517–521.
[56]
Savas Ozsu, Funda Oztuna, Yilmaz Bulbul, Murat Topbas, Tevfik Ozlu, Polat Kosucu, and Asiye Ozsu. 2011. The role of risk factors in delayed diagnosis of pulmonary embolism. American Journal of Emergency Medicine 29, 1 (2011), 26–32.
[57]
Marie Pierre Pacaux-Lemoine and Frank Flemisch. 2019. Layers of shared and cooperative control, assistance, and automation. Cognition, Technology and Work 21, 4 (2019), 579–591.
[58]
Raja Parasuraman, Thomas B. Sheridan, and Christopher D. Wickens. 2000. A model for types and levels of human interaction with automation. IEEE Transactions on Systems, Man, and Cybernetics - Part A: Systems and Humans 30, 3 (2000), 286–297.
[59]
Gloria Phillips-Wren and Monica Adya. 2020. Decision making under stress: The role of information overload, time pressure, complexity, and uncertainty. Journal of Decision Systems 00, 00 (2020), 1–13.
[60]
Roger Alan Pick. 2008. Benefits of decision support systems. In Handbook on Decision Support Systems 1. Springer Berlin, Chapter 32, 719–730.
[61]
Amy R. Pritchett. 2009. Aviation automation: General perspectives and specific guidance for the design of modes and alerts. Reviews of Human Factors and Ergonomics 5, 1 (2009), 82–113.
[62]
Rangaraj M. Rangayyan, Shantanu Banik, and J. E. Leo Desautels. 2010. Computer-aided detection of architectural distortion in prior mammograms of interval cancer. Journal of Digital Imaging 23, 5 (2010), 611–631.
[63]
Amnon Rapoport. 1967. Dynamic programming models for multistage decision-making tasks. Journal of Mathematical Psychology 4, 1 (1967), 48–71.
[64]
SAE. 2016. Surface Vehicle Recommended Practice - Taxonomy and Definitions for Terms Related to Driving Automation Systems for On-Road Motor Vehicles.
[65]
Said A. Salloum, Muhammad Alshurideh, Ashraf Elnagar, and Khaled Shaalan. 2020. Machine learning and deep learning techniques for cybersecurity: A review. Advances in Intelligent Systems and Computing 1153 AISC (2020), 50–57.
[66]
Martin Sand, Juan Manuel Durán, and Karin Rolanda Jongsma. 2021. Responsibility beyond design: Physicians’ requirements for ethical medical AI. Bioethics 0, March (2021), 1–8.
[67]
S. Sanjeevi, T. Ivanics, L. Lundell, N. Kartalis, A. Andrén-Sandberg, J. Blomberg, M. Del Chiaro, and C. Ansorge. 2016. Impact of delay between imaging and treatment in patients with potentially curable pancreatic cancer. British Journal of Surgery 103, 3 (2016), 267–275.
[68]
Filippo Santoni de Sio and Giulio Mecacci. 2021. Four responsibility gaps with artificial intelligence: Why they matter and how to address them. Philosophy & Technology 34, 4 (2021), 1057–1084.
[69]
Filippo Santoni de Sio and Jeroen van den Hoven. 2018. Meaningful human control over autonomous systems: A philosophical account. Frontiers in Robotics and AI 5 (2018), 1–14.
[70]
Ramesh Sharda, Steve H. Barr, and James C. McDonnell. 1988. Decision support system effectiveness: A review and an empirical test. Management Science 34, 2 (1988), 139–159.
[71]
Kelly G. Shaver and Debra Drown. 1986. On causality, responsibility, and self-blame: A theoretical note. Journal of Personality and Social Psychology 50, 4 (1986), 697–702.
[72]
Thomas B. Sheridan. 2012. Human supervisory control. In Handbook of Human Factors and Ergonomics (4th ed.), Gavriel Salvendy (Ed.). John Wiley & Sons Inc., Hoboken, NJ, Chapter 34.
[73]
Thomas B. Sheridan and William L. Verplank. 1978. Human and Computer Control of Undersea Teleoperators. Technical Report. Office of Naval Research, Arlington, VA.
[74]
Marc Steen, Jurriaan van Diggelen, Tjerk Timan, and Nanda van der Stap. 2022. Meaningful human control of drones: Exploring human-machine teaming, informed by four different ethical perspectives. AI and Ethics (2022), 1–13.
[75]
Reed T. Sutton, David Pincock, Daniel C. Baumgart, Daniel C. Sadowski, Richard N. Fedorak, and Karen I. Kroeker. 2020. An overview of clinical decision support systems: Benefits, risks, and strategies for success. npj Digital Medicine 3, 1 (2020), 1–10.
[76]
Mariarosaria Taddeo and Alexander Blanchard. 2022. Accepting moral responsibility for the actions of autonomous weapons systems-a moral gambit. Philosophy & Technology 35, 3 (2022), 78.
[77]
Jin Tang, Yu Cheng, and Chi Zhou. 2009. Sketch-based SIP flooding detection using Hellinger distance. In GLOBECOM - IEEE Global Telecommunications Conference. IEEE, 1–6.
[78]
Daniel W. Tigard. 2021. Responsible AI and moral responsibility: A common appreciation. AI and Ethics 1, 2 (2021), 113–117.
[79]
Nicole Vincent. 2011. A structured taxonomy of responsibility concepts. In Moral Responsibility, Nicole A. Vincent, Ibo van de Poel, and Jeroen van den Hoven (Eds.). Springer Dordrecht Heidelberg London New York, Chapter 2, 15–35.
[80]
Christopher D. Wickens, Anne S. Mavor, Raja Parasuraman, and James P. McGee. 1998. The Future of Air Traffic Control. National Academies Press, Washington, D.C. 336 pages.
[81]
Thomas D. Wickens. 2002. Elementary Signal Detection Theory. Oxford University Press, New York, NY. 262 pages.
[82]
Boris Yatsalo, Vladimir Didenko, Sergey Gritsyuk, and Terry Sullivan. 2015. Decerns: A framework for multi-criteria decision analysis. International Journal of Computational Intelligence Systems 8, 3 (2015), 467.
