ARTICLE INFO

Keywords:
Cognitive restructuring
Emotional activation
Virtual reality
University students
Wearable monitoring
Cognitive behavioural therapy

ABSTRACT

Background: Arousal may be important for learning to restructure one’s negative cognitions, a core technique in depression treatment. In virtual reality (VR), situations may be experienced more vividly than, e.g., in an imaginative approach, potentially aiding the emotional activation of negative cognitions. However, it is unclear whether such activation and subsequent cognitive restructuring in VR elicit more physiological arousal, e.g. changes in skin conductance (SC) and heart rate (HR), and more self-reported arousal.
Method: In a cross-over experiment, 41 healthy students experienced two sets, one in VR, one face-to-face (F2F), of three situations aimed at activating negative cognitions. Order of the sets and mode of delivery were randomised. A wristband wearable monitored SC and HR; self-reported arousal was registered verbally.
Results: Repeated measures analyses of variance revealed significantly more SC peaks per minute, F(1, 40) = 13.89, p = .001, higher mean SC, F(1, 40) = 7.47, p = .009, and higher mean HR, F(1, 40) = 75.84, p < .001, in VR compared to F2F. No differences emerged on the paired-samples t-test for self-reported arousal, t(40) = −1.35, p = .18.
Discussion: To the best of our knowledge, this is the first study indicating that emotional activation and subsequent cognitive restructuring in VR can lead to significantly more physiological arousal compared to an imaginative approach. These findings need to be replicated before they can be extended to patient populations.
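As a reading aid for the reported statistics: with only two within-subject conditions (VR vs. F2F), a repeated measures analysis of variance has 1 and n − 1 degrees of freedom, and its F statistic equals the square of the paired-samples t statistic. This is why both tests carry 40 error degrees of freedom for 41 participants. The minimal Python sketch below illustrates this relationship. The function names and the four data points per condition are hypothetical and are not taken from the study.

```python
from math import sqrt
from statistics import mean, stdev


def paired_t(a, b):
    """Paired-samples t statistic and its degrees of freedom (n - 1)."""
    diffs = [x - y for x, y in zip(a, b)]
    n = len(diffs)
    t = mean(diffs) / (stdev(diffs) / sqrt(n))
    return t, n - 1


def rm_anova_two_levels(a, b):
    """Repeated measures ANOVA F for exactly two within-subject conditions.

    Returns (F, df_condition, df_error), where df_error = n - 1.
    """
    n = len(a)
    grand = mean(a + b)
    subject_means = [(x + y) / 2 for x, y in zip(a, b)]
    # Partition the total sum of squares into condition, subject, and error.
    ss_cond = n * ((mean(a) - grand) ** 2 + (mean(b) - grand) ** 2)
    ss_subj = 2 * sum((m - grand) ** 2 for m in subject_means)
    ss_total = sum((v - grand) ** 2 for v in a + b)
    ss_error = ss_total - ss_cond - ss_subj
    df_cond, df_error = 1, n - 1
    f = (ss_cond / df_cond) / (ss_error / df_error)
    return f, df_cond, df_error


if __name__ == "__main__":
    # Hypothetical scores for four participants, NOT data from the study.
    vr = [5, 7, 6, 9]
    f2f = [4, 5, 6, 6]
    t, df_t = paired_t(vr, f2f)
    f, df1, df2 = rm_anova_two_levels(vr, f2f)
    print(f"t({df_t}) = {t:.3f}, t squared = {t ** 2:.3f}")
    print(f"F({df1}, {df2}) = {f:.3f}")
```

Both routes yield the same test for a two-condition cross-over comparison; the sketch does not compute p-values, which would require the F or t distribution from a statistics library.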
 * Corresponding author. Vrije Universiteit Amsterdam, Faculty of Behavioural and Movement Sciences, Van der Boechorststraat 7, 1081 BT Amsterdam, the
Netherlands.
   E-mail address: f.bolinski@vu.nl (F. Bolinski).
https://doi.org/10.1016/j.brat.2021.103877
Received 11 December 2020; Received in revised form 16 April 2021; Accepted 26 April 2021
Available online 11 May 2021
0005-7967/© 2021 The Author(s). Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/).
F. Bolinski et al.                                                                                              Behaviour Research and Therapy 142 (2021) 103877
30-50% following psycho- or pharmacotherapy (Gueorguieva, Chekroud, & Krystal, 2017; Wojnarowski, Firth, Finegan, & Delgadillo, 2019), new therapeutic approaches for depression are needed.
    According to Beck’s cognitive theory of depression, negatively coloured cognitive structures, called schemas (e.g. “I am worthless”), lead to biased information processing and ultimately negatively influence individuals’ emotions and behaviours (Beck & Haigh, 2014; Beck, Rush, Shaw, & Emery, 1979). For instance, patients might overgeneralise (American Psychological Association, 2018; Thew, Gregory, Roberts, & Rimes, 2017), exemplified by the dysfunctional thought that a single failed exam means complete academic inadequacy, in turn leading to feelings of worthlessness and avoidance of similar situations in the future (Mellick et al., 2019). Research has confirmed that such negative thoughts are involved in the development and maintenance of depressive mood (Lorenzo-Luaces, German, & DeRubeis, 2015; Scher, Ingram, & Segal, 2005; Wenze, Gunthert, & Forand, 2010). Within cognitive behavioural therapy (CBT), cognitive restructuring is a core set of techniques to tackle these negative cognitions (Lorenzo-Luaces et al., 2015). Clark (2013, pp. 1–22) summarises the essential steps as identifying a situation that triggered a negative thought, disputing this cognitive pattern, and finally replacing it with more adaptive thoughts (e.g. “Anyone can have a bad day, I can retake the exam”). In therapy sessions, the identification, activation, and subsequent restructuring of negative cognitions can be undertaken using imagination. However, because of its focus on complex cognitive processes, this procedure has been described as particularly difficult for patients (Clark, 2013, pp. 1–22). Also, Greenberg, Auszra, and Herrmann (2007) have stressed the importance of activating negative schemas and emotions in the moment, as well as the patient’s sense of agency in changing these structures during therapy. Relying on imagination makes it difficult to ascertain if these requirements are met, how realistic patients’ experiences are, and how much of the therapeutic information they take home. These disadvantages could potentially be overcome by using the immersive capacities of VR (Lindner et al., 2019).
    As outlined in the review by Bruijniks, DeRubeis, Hollon, and Huibers (2019), successful psychological treatment for depression can depend on the extent to which patients remember and learn therapeutic skills, such as cognitive restructuring, which they can use on their own, as homework assignments and following the conclusion of therapy (Kuyken, Padesky, & Dudley, 2009). Research has long shown that emotional arousal is crucial to learning, or encoding of information into memory (McGaugh, 2018; Sharot & Phelps, 2004). For instance, Cahill and McGaugh (1995) conducted an experiment in which participants were read either a neutral or an emotionally arousing short story, accompanied by 12 slides. Two weeks later, participants in the arousal condition outperformed the control group by more than two slides correctly recalled (p < .01). This finding has since been replicated in similar studies (Osugi & Ohira, 2018). Specific mechanisms in the brain, particularly the involvement of the amygdala, and increases in stress hormones (e.g. norepinephrine), have been identified as accounting for the creation of long-lasting memories where emotional arousal is present (Anderson, Yamaguchi, Grabski, & Lacka, 2006; McGaugh, 2018). Similarly, emotional arousal during therapy is thought to signify access to the underlying negative cognitive schemas and is, therefore, a predictor of beneficial treatment outcomes (Carryer & Greenberg, 2010; Greenberg et al., 2007).
    The rationale for using VR for emotionally activating and subsequently restructuring negative cognitive constructs thus lies in its power to create vivid and immersive experiences, which are expected to result in the emotional arousal required for learning and eventually therapeutic change. Moreover, VR offers the possibility of adjusting stimuli to create optimal levels of arousal, one of the key qualities of current VRET interventions (Pot-Kolder et al., 2018). In the latter, this effect on arousal has already been demonstrated (Counotte et al., 2017; Šalkevičius, Damaševičius, Maskeliūnas, & Laukienė, 2019). For instance, phobic patients in a study by Diemer, Lohkamp, Mühlberger, and Zwanzger (2016) indicated a significant increase in arousal during a virtual height simulation. For VR to have clinical relevance and overcome the above-mentioned drawbacks of imaginative approaches, the resulting arousal should reach or even surpass the intensities achieved with current techniques. Pre-clinical evidence, e.g. through experiments, testing this assumption is needed before moving to patient populations.
    Generally, arousal can be assessed along a spectrum of subjective to objective measures: firstly, through self-report (Benjamin et al., 2010); secondly, through observer ratings (e.g. Client Expressed Emotional Arousal Scale III-Revisited, CEAS; Carryer & Greenberg, 2010); and thirdly, through monitoring of physiological markers, among which changes in skin conductance (SC) and heart rate (HR) are commonly used (Boucsein, Haarmann, & Schaefer, 2007). These variables reflect arousal as a response to stressful events or stimuli: the deactivation of the parasympathetic nervous system (PNS) and activation of the sympathetic nervous system (SNS) result in increased HR and sweat secretion, which leads to increased conductivity of the skin that can be measured through resistance to low-intensity electric current (Kyriakou et al., 2019; Wolfensberger & O’Connor, 1967). Both markers have proven to be valid and sensitive measures of emotional arousal (Christopoulos, Uy, & Yap, 2019; Malik, 1996). Importantly, research has shown that self-reported and physiological measures do not necessarily converge (Busscher, Spinhoven, & de Geus, 2020; Ciuk, Troy, & Jones, 2015). To investigate emotional arousal, it is consequently paramount to include physiological markers in addition to the subjective response. In recent years, wearable monitoring has made the collection of physiological data more flexible for experimental and therapeutic circumstances and more affordable (Kyriakou et al., 2019). Compared to wired and more intrusive equipment that often measures only one specific signal, e.g. electrocardiograms (ECGs), the wristband wearable used in this study (E4, Empatica, 2016) can unobtrusively monitor a variety of physiological variables, such as movement, SC, and blood volume pulse (BVP), which allows for the calculation of HR. Though the use of such devices in experimental studies is a relatively recent phenomenon, research suggests that these provide a promising alternative to established stationary appliances (De Witte et al., 2020; Debard et al., 2020; Konstantinou et al., 2020; Menghini et al., 2019; Ollander, Godin, Campagne, & Charbonnier, 2016).

2. Aim

    In this crossover experimental study, we investigated whether emotional activation of negative cognitions and subsequent restructuring conducted in VR results in increased arousal compared to an imaginative approach that can be used in face-to-face (F2F) therapy. Applying wearable technology, we monitored physiological arousal (i.e. SC and HR), while self-reported arousal was registered verbally. We used a virtual lunchroom called Lunchroom Zondag (English: Lunchroom Sunday), in which users experienced several situations that were designed to elicit negative automatic thoughts, such as being laughed at by other guests. Next, users were asked to generate alternative thoughts. These situations were also translated into an F2F protocol that served as a control condition. Therein, participants were asked to envision situations that are associated with negative automatic thoughts before generating alternative thoughts. Since this is, to the best of our knowledge, the first study of its kind, we collected data from a non-clinical sample of university students (henceforth referred to as students).

3. Methods

3.1. Sampling procedure

    Participants were recruited among students in psychology and educational studies at the Vrije Universiteit (VU) Amsterdam using the university’s online research participation portal. Therein, students could read information about the study along with inclusion and exclusion
criteria and sign up for a specific time slot. The experiment took place in a research laboratory at the university. Participants could choose between course credits or a gift voucher (15€) as a reward. In total, the experiment took around 45 min per participant (15 min completion of questionnaires and 15 min each for the VR and F2F session). Approval for this study was granted by the scientific and ethical committee of the faculty of behavioural and movement sciences at the VU (VCWE-2019-159).

3.2. Inclusion and exclusion criteria

    We collected data from a non-clinical sample since the VR application had not been tested before in a patient population and the research question did not pertain to clinical effects. We therefore did not want to risk any unnecessary distress in vulnerable individuals. Inclusion and exclusion criteria were assessed through self-report. Students between 16 and 35 years were eligible for participation in the experiment. They were excluded if they a) had received psychotherapy during the past year, to ensure that they had no recent experience with cognitive restructuring; b) scored ≥15 on the depression subscale of the Hospital Anxiety and Depression Scale (HADS; Zigmond & Snaith, 1983); or c) suffered from epilepsy, as the use of light-emitting sources (e.g. computer, VR system) might cause seizures in photosensitive epileptic individuals (Kasteleijn-Nolst Trenite et al., 2002). Finally, participants had to provide written informed consent and be able to speak and understand Dutch.

3.3. Design

    We used a 2 (sequence) by 2 (stimulus set) factorial design. Sequence levels were: first the VR, followed by the F2F component vs. the other way around. The stimulus set levels were: three situations contained in set A in the first part of the experiment, followed by three different situations of set B in the second part vs. the other way around. Table 1 contains a description of all situations. The comparison with an active imaginative F2F component was chosen to obtain an uninflated estimate of the effect of VR. Participants were randomly allocated (1:1:1:1) to the four resulting conditions to counter potential spillover effects (e.g. participants receiving set A in VR might repeat responses in F2F). An independent researcher used the blockrand package in RStudio (Snow, 2013) to generate a blocked randomisation sequence, which was implemented using envelopes sealed by the same researcher. Students were informed beforehand about their equal chance of beginning the experiment with either a VR or an F2F procedure. They were, however, blind to the content of the stimulus sets (i.e. to the fact that the same stimuli could be presented either in VR or in F2F). Questionnaires were administered on a computer in the lab at pre-test, between the two modalities, and at post-test. The pre-test and intermediate assessments were concluded by a short relaxation video of 01:30 (minutes:seconds) intended to reduce pre-test and carryover physiological stress responses. During the experiment, SC, BVP, and movement were recorded using the Empatica E4 wristband wearable (Empatica, 2016). The experiment was recorded on video to match physiological data with the different components of the experiment and to log self-reported arousal. Fig. 1 shows a flowchart of the design.

Table 1
Description of situations per stimulus set.

  Set A         Description
  Situation 1   A student is sitting behind his laptop. He is struggling with his thesis, thinking “If I cannot get something onto paper today, I can forget about my career.”
  Situation 2   In the role of waiter/waitress, the participant is approaching two girls sitting at a table. They want to order something and laugh, seemingly at the participant.
  Situation 3   In the role of waiter/waitress, the participant is approaching a man sitting at a table. He asks “For how long have you been working here?”

  Set B         Description
  Situation 1   Three men are sitting at a table. One of them is thinking “What I have to say does not matter. The other two should have met alone. I am not worthy of this friendship.”
  Situation 2   The participant is touching a jar with cutlery. The jar falls to the ground, making lots of noise and scattering cutlery all over the floor.
  Situation 3   In the role of waiter/waitress, the participant is approaching a colleague standing at the bar. He responds roughly: “Not now!”.

Note. All situations take place in the lunchroom. Participants either experience a person’s thoughts or are put in a situation that should trigger thoughts in themselves.

Fig. 1. Flowchart and study design. (For interpretation of the references to colour in this figure legend, the reader is referred to the Web version of this article.)

3.4. Measures

3.4.1. Primary outcome variables
    Physiological arousal. We assessed recommended parameters of physiological arousal, namely the number of SC peaks per minute, the mean SC, and the mean HR over the individual experimental components (Boucsein et al., 2012). The respective data were gathered using the wristband wearable Empatica E4 (Empatica, 2016). Research has shown that monitoring cardiac data this way correlates highly (r > 0.80) with more complex stationary equipment, such as electrocardiograms (ECG), and that satisfactory estimations of mean HR can be acquired, even when the subject is moving (Konstantinou et al., 2020; Menghini et al., 2019; Ollander et al., 2016). The wearable monitoring of SC on the wrist, however, shows less convergence with laboratory-based instruments that measure conductivity on the fingers, potentially due to fewer sweat glands on the wrist, differences in skin temperature between the locations, and the generally lower sampling frequency of the wearable device (Konstantinou et al., 2020; Menghini et al., 2019). Nevertheless, other studies suggest that despite this lack of convergence, wearable monitoring of SC on the wrist results in better detection of stress responses compared to stationary equipment (Ollander et al., 2016).
    Self-reported arousal. This was registered verbally at specific points during the experimental procedure by asking participants “How strong is this feeling on a scale from 0-10, with 10 being very strong?” (see questions 5 and 2 in Table 2 and Table 3 respectively). If participants gave more than one answer, the first response given was recorded. These responses were part of the cognitive restructuring protocol and therefore have no established psychometric properties. However, research on stress shows that, for instance, globally assessed subjective units of distress are a valid measure of distress (Tanner, 2012).

3.5. Secondary variables

    Emotions. Emotions were measured using the Positive and Negative Affect Schedule (PANAS; Watson, Clark, & Tellegen, 1988). Participants are asked to indicate the intensity with which they feel 20 emotions at the very moment on a 5-point Likert scale (1 = very slightly/not at all; 5 = extremely). Ten items each comprise the subscales for negative (e.g. hostile) and positive (e.g. excited) emotions. Internal consistency of the PANAS has proven to be satisfactory, with Cronbach’s α = 0.83 and 0.79 for the positive and negative subscales respectively (Peeters, Ponds, & Vermeeren, 1996). The developers reported good construct validity (Watson et al., 1988).
    Depression and anxiety. Symptoms of depression and anxiety were assessed using the HADS (Zigmond & Snaith, 1983), which consists of 14 items, seven each related to symptoms of depression (e.g. “I have lost interest in my appearance”) and anxiety (e.g. “I get sudden feelings of panic.”). Each item is rated on a 4-point Likert scale with higher scores referring to more severe symptoms. Bjelland, Dahl, Haug, and Neckelmann (2002) have reported good psychometric properties with Cronbach’s α = 0.82 and 0.83 for the depression and anxiety subscales respectively. Stern (2014) suggests cut-off scores larger than 14 out of 21 for severe manifestations of depression and anxiety on the respective subscales.
    Other. Other questionnaires were administered but not used for
analyses (descriptives in Appendix A). These were the Igroup Presence Questionnaire (IPQ; Schubert, Friedmann, & Regenbrecht, 2001), the Dysfunctional Attitude Scale-A (DAS-A; de Graaf, Roelofs, & Huibers, 2009; Weissman & Beck, 1978), the Simulator Sickness Questionnaire (SSQ; Kennedy, Lane, Berbaum, & Lilienthal, 1993), and the System Usability Scale (SUS; Brooke, 1996; Mol et al., 2020).

3.6. Technical equipment

    The HTC Vive is a commercially available VR system. It consists of a head-mounted display (HMD) with 1080 × 1200 pixels per eye performing at a 90 Hz refresh rate (HTC & Valve Corporation, 2016). The HMD allows for a 360-degree field of vision created by two base stations. These use optical tracking by emitting light that is received by the HMD and an appendant handheld controller. The latter steers movement and confirmation of actions in the virtual environment. The setup was
connected to a desktop computer and display operated by the experimenter. The Steam VR platform (Valve Corporation, 2020) served as the connection between hardware input and the intervention software

Table 2
Example of a protocol sheet for VR procedure (situation 1 from set B).

  Introduction (read out loud): Next, I will turn on the VR application. In the VR environment you will be a waiter/waitress [again]. Before you enter the lunchroom, you will receive a tutorial, explaining how things work in the VR environment. You will see different people in the lunchroom and I will tell you where you should go next. In between, I will ask you some questions [again]. These questions relate to feelings and thoughts. Also, I will take notes, so please wait before you go on. It is pretty much self-explanatory. If you have any questions, do not hesitate to ask them at any time.
  Situation (read out loud): First, you will see three men sitting at a table. Could you go there? You can activate them with the yellow button.

  Feelings of VR person                                                           Rating
  1. How do you think the man feels?
  Potential follow-up questions:
  - Which emotions do you think he is experiencing in this situation?
  - So, you (emphasise you) would feel “[…]”?
  - Do you think he feels good or bad?
  2. How strong do you think he is experiencing this feeling on a scale from 0-10, with 10 being very strong?

  Credibility                                                                     Rating
  3. How credible would you rate this thought “[…]” on a scale from 0-10?
  Potential follow-up question:
  - Do you find this situation realistic?

  Feelings of participant                                                         Rating
  4. How would you feel in this situation?
  Potential follow-up questions:
  - What emotions would you feel in this situation?
  - Would you feel good or bad?
  5. How strong is this feeling on a scale from 0-10, with 10 being very strong?

  Formulation of alternative thoughts                                             Rating
  6. What else could you think in this situation, instead of “[…]”?
  7. Now that you formulated this alternative thought, what do you feel in this situation?
  Potential follow-up questions:
  - What emotion would you have?
  - Would you feel good or bad?
  8. How strong is this feeling on a scale from 0-10?
  Conclusion: Thank you, we can continue with the next situation.

Note. Questions on the protocol sheets can differ slightly, based on whether the situation involves a direct interaction, or experiencing another person’s thoughts (see also Table 3).

Table 3
Example of a protocol sheet for F2F procedure (situation 2 from set A).

  Introduction (read out loud): Next, I will read out a situation to you. I am asking you to really immerse yourself into that situation. You are working as a waiter/waitress in a lunchroom. It is a nice, open, and light café. Different people are sitting in this café.
  Situation (read out loud): Two girls are sitting at a table, they have not ordered yet. Their cell phones are on the table. You approach them. One girl asks “Can we order something?”. She sounds a bit giddy. Both girls look at you and start laughing.

  Feelings of the participant                                                     Rating
  1. How do you feel?
  Potential follow-up questions:
  - Which emotions do you experience in this situation?
  - Do you feel good or bad?
  2. How strong is this feeling on a scale from 0-10, with 10 being very strong?

  Thoughts and credibility                                                        Rating
  3. What do you think in this situation?
  Potential follow-up question:
  - What went through your head?
  4. How credible do you find this thought “[…]” on a scale from 0-10?

  Formulation of alternative thoughts                                             Rating
  5. What else could you think in this situation, instead of “[…]”?
  6. Now that you formulated this alternative thought, what do you feel in this situation?
  Potential follow-up questions:
  - What emotion would you have?
  - Would you feel good or bad?
  7. How strong is this feeling on a scale from 0-10?
  Conclusion: Thank you, we can continue with the next situation.

Note. Questions on the protocol sheets can differ slightly, based on whether the situation involves a direct interaction, or experiencing another person’s thoughts (see also Table 2).

It was developed in an iterative way with end-users (healthy people, patients, and therapists) involved throughout the development process. The end product with rather simple human figures (see Fig. 2 for a screenshot) is the result of this iterative process. More high-end graphics (e.g. detailed facial expressions of individuals in the virtual environment) created expectations of perfection and therefore caused distraction in some users. Before conducting this experiment, the intervention was tested for acceptability and safety in a pilot study among 17 people from the general population.
    Lunchroom Zondag consists of three components: the virtual lunchroom, a virtual tutorial, and an interface displayed on a secondary screen. Using this interface, the experimenter (or therapist) can start and terminate the application, select the stimuli to be displayed in the virtual
(IJsfontein, 2019). Sound was played through two external speakers                               environment, and monitor what the participant sees, as the output is
directed towards the HMD.                                                                        displayed in the interface. The tutorial is set in a town square just outside
    Physiological data were gathered using the Empatica E4 (Empatica,                            the lunchroom, which allows the user to walk around more freely,
2016), a research-grade wristband wearable. Its two silver-plated elec                          thereby learning how to move around. The participant is guided through
trodes measure electrodermal activity (i.e. SC) at the inner wrist at a 4                        the functions of the program and learns how to activate individuals. In
Hz sampling rate; HR was determined using a photoplethysmography                                 the virtual lunchroom, participants take on the role of a waiter/waitress.
(PPG) sensor, which measures the BVP using green and red LEDs emit                              Using the controller, they can activate three different situations in which
ting light that is reflected as a function of blood oxygenation. The more                        they either directly interact with the virtual environment (e.g. acci
the blood is oxygenated, the more light is absorbed. During a heartbeat,                         dently dropping spoons on the floor, being laughed at by other guests),
less light is therefore reflected. The BVP signal is measured at a sampling                      or hear the thoughts of individuals in the situation (e.g. a student
frequency of 64 Hz. From this BVP signal, HR was calculated. Movement                            struggling with finishing his thesis). The locations of these situations are
was recorded through the built-in three-axis accelerometer to allow for                          indicated by a yellow circle. After activation, the conclusion of every
the filtering of artefacts. Each session was videotaped using a handheld                         situation is confirmed by the experimenter using the dashboard,
camera stationed on a tripod.                                                                    allowing the participant to continue with the next situation. A descrip
                                                                                                 tion of all situations per stimulus set is given in Table 1. For the F2F
3.7. Application: Lunchroom Zondag                                                               intervention, the VR component was typed out.
   The VR application Lunchroom Zondag was developed by a con                                   3.8. General study procedure
sortium of game developers, mental health treatment centres, and uni
versities (https://www.e-mence.org/en). The aim was to create an                                    Potential participants were greeted by the experimenter in the lab
innovative VR application for the prevention or treatment of depression.                         and were given an information sheet, accompanied by a short verbal
F. Bolinski et al.                                                                                           Behaviour Research and Therapy 142 (2021) 103877
explanation of the general aim of the experiment and the fact that the session would be recorded on video to match physiological data with the experimental procedures. After written informed consent was given in the lab, the video recording was started, the wearable placed on the wrist of the participant’s non-dominant hand (Empatica Support, 2019), and activated. Next, the pre-test questionnaire was administered and concluded by the relaxation video. This baseline assessment was also implemented as a settling-in period for physiological measurement (De Witte et al., 2020). Students whose answers met one of the exclusion criteria received an on-screen prompt informing them that participation was not possible. They received minimal compensation (€5 or equivalent credits) and in case of increased depression scores were referred to their general practitioner. Eligible participants were randomly allocated to one of the four conditions. The PANAS was administered again during intermediate and post-test assessments (see Fig. 1).

3.9. Experimental procedures

    All experimental sessions were conducted by the same experimenter (CVB, master's student in clinical psychology). Both the VR and F2F components were conducted according to pre-defined protocols, in line with the steps described in Clark (2013, pp. 1–22): Participants experienced the situation (emotional activation), either in the virtual environment or by imagining it in the F2F component guided by the experimenter, and were asked about their feelings, thoughts, the credibility of these, and alternative thoughts. Detailed examples of the protocols are provided in Table 2 (VR) and Table 3 (F2F).

3.9.1. VR procedure

    After mounting and adjusting the VR headset, the participant received the controller and the tutorial was started. Subsequently, the experimenter closed the tutorial and activated the stimuli belonging to the randomised condition (i.e. set A vs. set B), thereby changing the virtual environment from tutorial to the lunchroom setting. Within the
virtual environment, the experimenter told the participant which stimulus to activate. By activating the stimulus, the participant experienced the situation in the virtual environment. Next, the experimenter asked about the participant’s feelings regarding the situation, their thoughts, the credibility of these, and alternative cognitions, before progressing to the next situation. The participant remained in the virtual environment throughout the entire experimental procedure (i.e. until after generating alternative thoughts for the third VR situation, after which the session was concluded). There was no further interaction with the virtual environment. All procedures in the VR component were carried out while the participant was standing.

3.9.2. F2F procedure

    In the F2F session, participants sat down opposite the experimenter. The latter then explained the procedure again and followed the protocol by reading out loud the first scenario according to the randomisation sequence (i.e. set A vs. set B). Before the scenario was read out, participants were asked to imagine it vividly and subsequently were asked about their feelings regarding the situation, their thoughts and the credibility of these, and to generate alternative cognitions. The experimenter then continued with the next situation.

3.10. Data preparation

    Two researchers independently added timestamps to the video recordings, marking 1) the start of the wearable recordings; 2) the beginning and end of the tutorial; and 3) the beginning and end of the six stimuli (i.e. three situations experienced in VR and three imaginative situations in F2F). Interrater discrepancies larger than 3 s were resolved by a joint re-evaluation of the recordings. The time stamps were then used to extract the sequences from the recorded physiological arousal, more specifically 1) the pre-test assessment as a measure of baseline arousal, 2) the VR tutorial, 3) the VR component of the experiment, 4) the mid-assessment, and 5) the F2F component of the experiment, with order depending on randomisation sequence.

    The number of SC peaks per minute, mean SC, and mean HR were calculated after the completion of the experiment by a research team (NDW, BB, GD) independent from those executing the study. The preprocessing of the data was executed as described by Debard et al. (2020). First, the SC was filtered using a band-pass filter to remove artefacts. Then, the number of SC peaks in the specific experimental component was divided by the length of that component. Before calculating the mean SC, a median filter was used to remove outliers caused by artefacts in the signal. Before calculating its mean, HR was determined using the BVP signal. First, a band-pass filter was used to remove artefacts, then a peak detection algorithm was used on this filtered signal. The time between these peaks is called the inter-beat interval (IBI). HR is the inverse of the IBI and was calculated accordingly. Moreover, self-reported arousal was retrieved from the recordings and transformed to mean scores (i.e. one mean score was calculated for the three VR scenarios, and one mean score was calculated for the three F2F scenarios).

3.11. Statistical analyses

3.11.1. Main analyses

    If not otherwise specified, all analyses were conducted using SPSS version 26, with the significance level set at α = 0.05. Means and standard deviations of all variables were calculated. To test whether there was a difference in physiological arousal between the VR and F2F component, three one-way repeated measures analyses of variance (RM-ANOVAs) were conducted, one on each of the three within-subject factors: number of SC peaks per minute, mean SC, and mean HR. Each within-subjects factor had five levels denoting the discrete components within the experiment: 1) rest state (pre-test assessment), 2) VR tutorial, 3) VR component, 4) rest state between components (mid-assessment), and 5) F2F component. Statistically significant main effects were followed up by three simple contrasts with Bonferroni correction (p = .017), comparing 1) the VR vs the F2F component, 2) the VR tutorial vs the VR component, and 3) the VR tutorial vs the F2F component. The first contrast reflected the primary research question. The second and third contrasts were conducted to investigate the unique effect of the emotional activation and cognitive restructuring in VR, as opposed to the arousal that potentially resulted from the mere excitement of using VR. Studentised residuals of each level of the within-subjects factor were investigated and values > 3 SD or < -3 SD were considered outliers (Osborne, 2017, p. 63). Sensitivity analyses were run without these outliers. Approximately normally distributed data, investigated through Q-Q plots, was considered appropriate and violations of sphericity were countered by reporting Greenhouse-Geisser corrected results. A paired-samples t-test was conducted to investigate the difference in means between self-reported arousal in the VR and F2F component. Outliers were identified through boxplots, with difference scores beyond 1.5 box lengths from the edge of the box removed in a sensitivity analysis.

3.11.2. Other exploratory analyses

    Associations between the physiological and the self-reported arousal in the experimental components were investigated using Pearson’s correlations. Moreover, the main analyses on measures of physiological and self-reported arousal were repeated as two-way mixed ANOVAs with sequence (i.e. first VR vs. first F2F) entered as the between-subjects factor to investigate its potential interaction with the outcome. Statistically significant interactions were followed by testing simple effects of sequence per level of the between-subjects factor through one-way ANOVAs. To test whether there was a differential change in emotions between the conditions, another set of two-way mixed ANOVAs was conducted on the three assessments of the two PANAS subscales. Time was entered as the within-subjects factor and sequence (i.e. first VR vs. first F2F) as the between-subjects factor. Statistically significant interactions were again followed by testing simple main effects of sequence through ANOVAs.

4. Results

4.1. Participants, pre-test, and experimental characteristics

    The procedure was piloted with two individuals. Forty-two students subsequently signed up for participation. One participant was excluded at pre-test (received psychotherapy in the previous year). The remaining 41 participants consisted of 34 female and seven male students between 17 and 34 years old (M = 20.59; SD = 3.07). Mean scores on the HADS depression and anxiety subscales were 3.19 (SD = 2.61) and 6.07 (SD = 3.77) respectively. Though no participant exceeded the exclusion cut-off on the HADS depression subscale (≥15), ten reached the cut-off score of eight for mild clinical manifestations of anxiety only (N = 9), or both anxiety and depression (N = 1). The average duration (minutes:seconds) for the VR tutorial was 01:55, for the VR component 08:06, and for the F2F component 08:09.

4.2. Primary outcome: arousal

    Differences in the physiological arousal measures per experimental component are presented visually in Fig. 3a–c (means and standard deviations in Appendix B). Results of the RM-ANOVAs showed statistically significant differences in means between the levels of the within-subjects factor for the number of SC peaks per minute, F (2.96, 118.46) = 17.52, p < .001, η2p = 0.31, the mean SC, F (1.6, 63.87) = 10.47, p < .001, η2p = 0.21, and mean HR, F (2.81, 112.21) = 32.63, p < .001, η2p = 0.45.

    The simple contrasts revealed statistically significantly more SC peaks per minute in the VR (M = 4.13, SD = 3.21) compared to the F2F
component (M = 2.61, SD = 2.81), F (1, 40) = 13.89, p = .001, η2p = 0.26, and in the VR component compared to the VR tutorial (M = 1.69, SD = 2.01), F (1, 40) = 53.61, p < .001, η2p = 0.57. The contrast between the VR tutorial and the F2F component failed to reach the Bonferroni corrected significance level, F (1, 40) = 5.11, p = .03, η2p = 0.11. Similarly, the contrasts showed higher mean SC in the VR (M = 2.38, SD = 3.19) compared to the F2F component (M = 1.31, SD = 1.34), F (1, 40) = 7.47, p = .001, η2p = 0.16, and in the VR component compared to the VR tutorial (M = 1.21, SD = 1.47), F (1, 40) = 12.42, p = .001, η2p = 0.24. No difference was found between the VR tutorial and the F2F component, F (1, 40) = 0.33, p = .57, η2p = 0.01. Removal of one outlier on this variable did not change the main or contrast effects.

    For mean HR, significantly higher values were found in the VR (M = 115.46, SD = 13.37) compared to the F2F component (M = 104.9, SD = 11.28), F (1, 40) = 75.84, p < .001, η2p = 0.66. Significantly higher mean HR was also found in the VR tutorial (M = 117.04, SD = 18.56) compared to the F2F component, F (1, 40) = 45.63, p < .001, η2p = 0.53. The contrast between the VR tutorial and the VR component was not significant, F (1, 40) = 1.32, p = .26, η2p = 0.03.

    The results of the paired-samples t-test showed no difference in mean

Fig. 3a. SC peaks per minute per experimental component with error bars indicating Bonferroni corrected 98.33% confidence intervals.
Fig. 3b. Mean SC per experimental component with error bars indicating Bonferroni corrected 98.33% confidence intervals.
Fig. 3c. Mean HR per experimental component with error bars indicating Bonferroni corrected 98.33% confidence intervals.
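
The 98.33% confidence level in the figure captions and the corrected significance level of p = .017 both follow from dividing α = 0.05 across the three simple contrasts. The short Python sketch below re-derives that arithmetic. It also recovers partial eta squared from a reported F statistic using the standard identity η2p = F·df1 / (F·df1 + df2). The function names are illustrative assumptions and are not taken from the study, which ran its analyses in SPSS.

```python
def bonferroni(alpha: float, n_contrasts: int) -> tuple[float, float]:
    """Return the per-contrast alpha and the matching confidence level in percent."""
    per_test = alpha / n_contrasts
    return per_test, 100.0 * (1.0 - per_test)


def partial_eta_squared(f_value: float, df_effect: int, df_error: int) -> float:
    """Partial eta squared implied by an F statistic and its degrees of freedom."""
    return (f_value * df_effect) / (f_value * df_effect + df_error)


# Three simple contrasts tested at an overall alpha of 0.05 (values from the text).
per_test, ci_level = bonferroni(0.05, 3)
print(round(per_test, 3), round(ci_level, 2))  # 0.017 98.33

# VR vs. F2F contrast for SC peaks per minute: F(1, 40) = 13.89.
print(round(partial_eta_squared(13.89, 1, 40), 2))  # 0.26
```

The second printed value matches the effect size reported for the VR vs. F2F contrast on SC peaks per minute; small rounding differences are possible for other contrasts because the reported F values are themselves rounded.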
measurement artefacts in HR that could not be accounted for. It is also conceivable that the difference in posture influenced the results since students were standing during the VR component, whereas the F2F procedure was conducted sitting down. Whether, and if so to what extent, this had an effect is difficult to ascertain. Research is dated and limited to comparisons between the process of standing up and standing, showing that the process is accompanied by an initial increase in HR, which normalises in less than 20 s (Borst et al., 1982). The time between the standing and sitting experimental components in our study would have exceeded this timeframe. Lastly, HR was generally elevated throughout the entire experiment. Reasons for this observation, such as general excitement of participants, are difficult to ascertain, particularly in the absence of similar studies.

    Given the novelty of the VR approach for emotionally activating and restructuring cognitions, an integration of these findings into previous research is limited by default. We are not aware of any studies that monitored arousal during F2F therapy for depression. However, results suggest that the F2F protocol we employed did induce some physiological arousal (see Fig. 3a–c and Appendix B). Instead, our results extend a line of studies showing that VRET for anxiety and related disorders can increase physiological arousal, specifically SC (Diemer, Mühlberger, Pauli, & Zwanzger, 2014; Meyerbröker & Emmelkamp, 2010). More recently, Counotte et al. (2017) have shown that increasing social stressors in a VRET environment, such as the number of avatars present and their level of hostility, led to increased HR and SC in a group of 55 psychosis patients. In fact, the ability to manipulate the level of stressful stimuli in VR is considered one of the crucial advantages of VRET for anxiety and psychotic disorders (Emmelkamp, Meyerbröker, & Morina, 2020; Pot-Kolder et al., 2018).

    Our findings did not show differences in self-reported arousal between the VR and F2F component. One explanation lies in the way these were assessed in this context, namely as part of several questions that made up the cognitive restructuring protocol. For instance, some participants were undecided and gave multiple responses in a short duration, of which only the first response was recorded. Moreover, in contrast to the continuous measurement of SC and HR, self-reported arousal was gathered from one specific instance per situation and then averaged across situations. Finally, response bias might have led to socially desirable answers, resulting in similar responses across the experimental conditions (Lavrakas, 2008).

5.2. Other outcomes

    In our study, self-reported arousal did not correlate significantly with its physiological counterpart, a finding that has been consistently reported in previous research (Ciuk et al., 2015). Furthermore, Busscher et al. (2020) have shown that even at different intensity levels of exposure therapy for fear of flying, self-reported arousal varies independently from physiological measures. Participants in their study first watched a flight video, then used a flight simulator, and ultimately went on an actual flight. Their results showed that HR and self-reported arousal correlated significantly only during the initial video screening, though this relationship was small (r = 0.30), and that greater convergence between measures did not predict reductions in flight anxiety.

    Participants in our study who began the experiment with the F2F procedure showed more SC peaks per minute in this component and a higher mean HR in the mid-assessment between the F2F and VR components. It is conceivable that the former resulted from the direct interaction with the experimenter, which increased stress that gradually abated. The latter might have been a result of anticipation, i.e., some students were looking forward to using the VR application. The results on the PANAS point in a similar direction, with negative emotions steadily decreasing during the experimental procedure, and participants who ended with the VR component reporting significantly more positive emotions at the post-assessment. However, self-reported arousal was higher in the VR component for participants who first underwent the F2F procedure. One possible explanation is that those participants experienced a contrast between the two conditions and had an internal expectation that arousal is higher in VR.

5.3. Limitations and future directions

    The results of our experiment need to be interpreted with caution, given several limitations. First and foremost, it must be considered a first exploratory step and therefore requires replication. Though the use of a non-clinical sample is warranted by safety principles when investigating novel therapeutic approaches, it precludes generalisation to other populations. For instance, it is not possible to ascertain that the VR application can induce similar responses in depressed individuals, particularly since previous research has shown disturbances in physiological arousal in this group (Hartmann, Schmidt, Sander, & Hegerl, 2019). However, the use of a direct translation of the VR procedure into a F2F protocol and the randomisation of both the sequence of the two modalities, as well as the stimuli used, increases the confidence that our findings are not inflated.

    Caution is also warranted with regard to the physiological data obtained through wearable monitoring. Though studies show that both SC and HR can be measured on the wrist with satisfactory accuracy, measures of SC in particular do not correspond to those of more complex stationary equipment (De Witte et al., 2020; Konstantinou et al., 2020; Menghini et al., 2019; Ollander et al., 2016). Moreover, to develop optimal virtual environments for various users, within-subject comparisons between different stimuli are needed. The duration of the scenarios in our experiment was relatively short and the low measurement frequency of wearable devices is not sensitive enough to compare arousal between the individual situations that participants experienced, or between the components within each situation. For instance, it was not possible to test whether arousal generated during the emotional activation (i.e. through the initial experience) carried over to the generation of alternative thoughts. In light of this limitation, future research could prolong the duration of individual experimental components to allow for comparison between them.

    Another limitation concerns the conclusions based on self-reported arousal, for reasons outlined above. Also, the setup did not allow for discrimination between positive and negative emotional arousal. Given the need to wear a headset in parts of the experiment, non-verbal alternatives that required participants to indicate the intensity of different types of emotions using pictorials (Bynion & Feldner, 2017) were less feasible. Recently, Toet, Heijn, Brouwer, Mioch, and van Erp (2019) developed such a pictorial-based tool that could be integrated into the VR environment. This could greatly improve the design of future studies.

    Besides replication, the next steps are to investigate whether increased arousal is reflected in better retention of therapeutic content. The crossover design was chosen as a strong paradigm to investigate the principle that VR emotional activation and cognitive restructuring lead to relatively more arousal compared to a F2F procedure. It is however less suitable to assess the effect of arousal on retention of alternative thoughts. Based on the results of the current study, future experiments should include follow-up assessments that measure retention of alternative thoughts. Moreover, to our knowledge, only one experimental study has assessed whether using VR for cognitive restructuring leads to a reduction in negative cognitions. Prudenzi et al. (2019) compared cognitive defusion, a technique based on acceptance and commitment therapy (ACT; Hayes, Luoma, Bond, Masuda, & Lillis, 2006) in VR to an inactive control condition. Compared to the latter, participants in the VR condition reported their personal thoughts to be significantly less negative, less believable, and less uncomfortable at post-test (Prudenzi et al., 2019).

6. Conclusion

    Emotional activation and subsequent cognitive restructuring in VR
led to significantly higher levels of physiological arousal compared to the F2F procedure. This effect exceeded the mere excitement of using a VR application since the actual VR component led to higher levels of SC compared to a VR tutorial. No differences were found on self-reported arousal.

Ethics

    This study was approved by the scientific and ethical committee of the faculty of behavioural and movement sciences at the Vrije Universiteit Amsterdam (VCWE-2019-159). It has been preregistered at www.figshare.com under DOI 10.6084/m9.figshare.9731306.v1.

Funding

    This study was funded by Opportunities for West II (Dutch: Kansen voor West II) through the European Regional Development Fund (ERDF).

CRediT authorship contribution statement

    Investigation, Writing – original draft. Anne Etzelmüller: Conceptualization, Methodology, Investigation, Writing – review & editing. Nele A. J. De Witte: Data curation, Formal analysis, Writing – review & editing. Cecile van Beurden: Investigation, Writing – review & editing. Glen Debard: Data curation, Formal analysis, Writing – review & editing. Bert Bonroy: Data curation, Formal analysis, Writing – review & editing. Pim Cuijpers: Writing – review & editing. Heleen Riper: Writing – review & editing. Annet Kleiboer: Conceptualization, Methodology, Investigation, Writing – review & editing.

Declaration of competing interest

    Anne Etzelmueller is employed by the Institute for health trainings online (GET.ON/HelloBetter) as research coordinator.

Acknowledgements

    The authors would like to thank the eGGZ center consortium for their collaboration, in particular IJsfontein, the ARQ center for psychotrauma, Interapy, and the Amsterdam Medical Center (AMC).
Table A2
Means and standard deviations of the PANAS at all assessment points per modality.
F. Bolinski et al.                                                                                                                              Behaviour Research and Therapy 142 (2021) 103877
References                                                                                             Croft, R. J., Gonsalvez, C. J., Gander, J., Lechem, L., & Barry, R. J. (2004). Differential
                                                                                                            relations between heart rate and skin conductance, and public speaking anxiety.
                                                                                                            Journal of Behavior Therapy and Experimental Psychiatry, 35(3), 259–271. https://doi.
American Psychological Association. (2018). APA dictionary of psychology:
                                                                                                            org/10.1016/j.jbtep.2004.04.012.
     Overgeneralization. Retrieved from https://dictionary.apa.org/overgeneralization.
                                                                                                       Cuijpers, P., Karyotaki, E., Reijnders, M., & Ebert, D. D. (2019). Was eysenck right after
Anderson, A. K., Yamaguchi, Y., Grabski, W., & Lacka, D. (2006). Emotional memories
                                                                                                            all? A reassessment of the effects of psychotherapy for adult depression. Epidemiology
     are not all created equal: Evidence for selective memory enhancement. Learning &
                                                                                                            and Psychiatric Sciences, 28(1), 21–30. https://doi.org/10.1017/
     Memory (Cold Spring Harbor, N.Y.), 13(6), 711–718. https://doi.org/10.1101/
                                                                                                            S2045796018000057.
     lm.388906.
                                                                                                       De Witte, N. A. J., Scheveneels, S., Sels, R., Debard, G., Hermans, D., & Van Daele, T.
Beck, A. T., & Haigh, E. A. (2014). Advances in cognitive theory and therapy: The generic
                                                                                                            (2020). Augmenting exposure therapy: Mobile augmented reality for specific phobia.
     cognitive model. Annual Review of Clinical Psychology, 10, 1–24. https://doi.org/
                                                                                                            Frontiers in Virtual Reality, 1(8). https://doi.org/10.3389/frvir.2020.00008.
     10.1146/annurev-clinpsy-032813-153734.
                                                                                                       Debard, G., De Witte, N., Sels, R., Mertens, M., Van Daele, T., & Bonroy, B. (2020).
Beck, A. T., Rush, A. J., Shaw, B. F., & Emery, G. (1979). Cognitive therapy of depression.
                                                                                                            Making wearable technology available for mental healthcare through an online
     New York: The Guilford Press.
                                                                                                            platform with stress detection algorithms: The carewear project. Journal of Sensors,
Benjamin, C. L., O’Neil, K. A., Crawley, S. A., Beidas, R. S., Coles, M., & Kendall, P. C.
                                                                                                            2020, 8846077. https://doi.org/10.1155/2020/8846077.
     (2010). Patterns and predictors of subjective units of distress in anxious youth.
                                                                                                       Diemer, J., Lohkamp, N., Mühlberger, A., & Zwanzger, P. (2016). Fear and physiological
     Behavioural and Cognitive Psychotherapy, 38(4), 497–504. https://doi.org/10.1017/
                                                                                                            arousal during a virtual height challenge–effects in patients with acrophobia and
     S1352465810000287.
                                                                                                            healthy controls. Journal of Anxiety Disorders, 37, 30–39. https://doi.org/10.1016/j.
Bjelland, I., Dahl, A. A., Haug, T. T., & Neckelmann, D. (2002). The validity of the
                                                                                                            janxdis.2015.10.007.
     Hospital Anxiety and Depression Scale. An updated literature review. Journal of
                                                                                                       Diemer, J., Mühlberger, A., Pauli, P., & Zwanzger, P. (2014). Virtual reality exposure in
     Psychosomatic Research, 52(2), 69–77.
                                                                                                            anxiety disorders: Impact on psychophysiological reactivity. World Journal of
Borst, C., Wieling, W., Van Brederode, J., Hond, A., De Rijk, L., & Dunning, A. (1982).
                                                                                                            Biological Psychiatry, 15(6), 427–442. https://doi.org/10.3109/
     Mechanisms of initial heart rate response to postural change. American Journal of
                                                                                                            15622975.2014.892632.
     Physiology - Heart and Circulatory Physiology, 243(5), H676–H681.
                                                                                                       Emmelkamp, P. M. G., Meyerbröker, K., & Morina, N. (2020). Virtual reality therapy in
Boucsein, W., Fowles, D. C., Grimnes, S., Ben-Shakhar, G., roth, W. T., Dawson, M. E.,
                                                                                                            social anxiety disorder. Current Psychiatry Reports, 22(7), 32. https://doi.org/
     et al. (2012). Publication recommendations for electrodermal measurements.
                                                                                                            10.1007/s11920-020-01156-1.
     Psychophysiology, 49(8), 1017–1034. https://doi.org/10.1111/j.1469-
                                                                                                       Empatica. (2016). Empatica E4. Retrieved from https://empatica.com/.
     8986.2012.01384.x.
                                                                                                       Empatica Support. (2019). Empatica Support. Retrieved from https://support.empatica.
Boucsein, W., Haarmann, A., & Schaefer, F. (2007). Combining skin conductance and heart
                                                                                                            com/hc/en-us/articles/206374015-Wear-your-E4-wristband-#EDA-note.
     rate variability for adaptive automation during simulated IFR flight (Berlin, Heidelberg).
                                                                                                       Georgescu, R., Fodor, L. A., Dobrean, A., & Cristea, I. A. (2020). Psychological
Brooke, J. (1996). In P. W. Jordan, B. Thomas, B. A. Weerdmeester, & A. L. McClelland
                                                                                                            interventions using virtual reality for pain associated with medical procedures: A
     (Eds.), “SUS—a quick and dirty usability scale”. London, UK: Taylor and Francis.
                                                                                                            systematic review and meta-analysis. Psychological Medicine, 50(11), 1795–1807.
Bruijniks, S. J. E., DeRubeis, R. J., Hollon, S. D., & Huibers, M. J. H. (2019). The potential
                                                                                                            https://doi.org/10.1017/s0033291719001855.
     role of learning capacity in cognitive behavior therapy for depression: A systematic
                                                                                                       de Graaf, L. E., Roelofs, J., & Huibers, M. J. (2009). Measuring dysfunctional attitudes in
     review of the evidence and future directions for improving therapeutic learning.
                                                                                                            the general population: The dysfunctional attitude scale (form A) revised. Cognitive
     Clinical Psychological Science, 7(4), 668–692. https://doi.org/10.1177/
                                                                                                            Therapy and Research, 33(4), 345–355. https://doi.org/10.1007/s10608-009-9229-
     2167702619830391.
                                                                                                            y.
Busscher, B., Spinhoven, P., & de Geus, E. J. C. (2020). Synchronous change in subjective
                                                                                                       Greenberg, L. S., Auszra, L., & Herrmann, I. R. (2007). The relationship among emotional
     and physiological reactivity during flight as an indicator of treatment outcome for
                                                                                                            productivity, emotional arousal and outcome in experiential therapy of depression.
     aviophobia: A longitudinal study with 3-year follow-up. Journal of Behavior Therapy
                                                                                                            Psychotherapy Research, 17(4), 482–493. https://doi.org/10.1080/
     and Experimental Psychiatry, 67, 101443. https://doi.org/10.1016/j.
                                                                                                            10503300600977800.
     jbtep.2018.12.004.
                                                                                                       Gueorguieva, R., Chekroud, A. M., & Krystal, J. H. (2017). Trajectories of relapse in
Bynion, T.-M., & Feldner, M. T. (2017). Self-assessment manikin. Encyclopedia of
                                                                                                            randomised, placebo-controlled trials of treatment discontinuation in major
     personality and individual differences, 1–3.
                                                                                                            depressive disorder: An individual patient-level data meta-analysis. The Lancet
Cahill, L., & McGaugh, J. L. (1995). A novel demonstration of enhanced memory
                                                                                                            Psychiatry, 4(3), 230–237. https://doi.org/10.1016/S2215-0366(17)30038-X.
     associated with emotional arousal. Consciousness and Cognition, 4(4), 410–421.
                                                                                                       Hartmann, R., Schmidt, F. M., Sander, C., & Hegerl, U. (2019). Heart rate variability as
     https://doi.org/10.1006/ccog.1995.1048.
                                                                                                            indicator of clinical state in depression. Frontiers in Psychiatry, 9(735). https://doi.
Carl, E., Stein, A. T., Levihn-Coon, A., Pogue, J. R., Rothbaum, B., Emmelkamp, P., …
                                                                                                            org/10.3389/fpsyt.2018.00735.
     Powers, M. B. (2019). Virtual reality exposure therapy for anxiety and related
                                                                                                       Hayes, S. C., Luoma, J. B., Bond, F. W., Masuda, A., & Lillis, J. (2006). Acceptance and
     disorders: A meta-analysis of randomized controlled trials. Journal of Anxiety
                                                                                                            commitment therapy: Model, processes and outcomes. Behaviour Research and
     Disorders, 61, 27–36. https://doi.org/10.1016/j.janxdis.2018.08.003.
                                                                                                            Therapy, 44(1), 1–25. https://doi.org/10.1016/j.brat.2005.06.006.
Carryer, J. R., & Greenberg, L. S. (2010). Optimal levels of emotional arousal in
                                                                                                       HTC, & Corporation, V. (2016). HTC vive. Retrieved from https://www.vive.com/eu
     experiential therapy of depression. Journal of Consulting and Clinical Psychology, 78
                                                                                                            /product/.
     (2), 190.
                                                                                                       IJsfontein. (2019). Lunchroom Zondag. Retrieved from https://www.ijsfontein.nl/pro
Christopoulos, G. I., Uy, M. A., & Yap, W. J. (2019). The body and the brain: Measuring
                                                                                                            jecten/lunchroom-zondag.
     skin conductance responses to understand the emotional experience. Organizational
                                                                                                       James, S. L., Abate, D., Abate, K. H., Abay, S. M., Abbafati, C., Abbasi, N., …
     Research Methods, 22(1), 394–420. https://doi.org/10.1177/1094428116681073.
                                                                                                            Murray, C. J. L. (2018). Global, regional, and national incidence, prevalence, and
Ciuk, D., Troy, A., & Jones, M. (2015). Measuring emotion: Self-reports vs. physiological
                                                                                                            years lived with disability for 354 diseases and injuries for 195 countries and
     indicators. Physiological Indicators (April 16, 2015).
                                                                                                            territories, 1990–2017: A systematic analysis for the global burden of disease study
Clark, D. A. (2013). Cognitive restructuring. The Wiley handbook of cognitive behavioral
                                                                                                            2017. The Lancet, 392(10159), 1789–1858. https://doi.org/10.1016/S0140-6736
     therapy.
                                                                                                            (18)32279-7.
Corporation, V. (2020). SteamVR. Retrieved from https://steamvr.com/en/.
                                                                                                       Kasteleijn-Nolst Trenite, D. G., Martins da Silva, A., Ricci, S., Rubboli, G., Tassinari, C. A.,
Counotte, J., Pot-Kolder, R., van Roon, A. M., Hoskam, O., van der Gaag, M., & Veling, W.
                                                                                                            Lopes, J., … Segers, J. P. (2002). Video games are exciting: A European study of
     (2017). High psychosis liability is associated with altered autonomic balance during
                                                                                                            video game-induced seizures and epilepsy. Epileptic Disorders, 4(2), 121–128.
     exposure to Virtual Reality social stressors. Schizophrenia Research, 184, 14–20.
                                                                                                       Kennedy, R. S., Lane, N. E., Berbaum, K. S., & Lilienthal, M. G. (1993). Simulator sickness
     https://doi.org/10.1016/j.schres.2016.11.025.
                                                                                                            questionnaire: An enhanced method for quantifying simulator sickness. The
                                                                                                            International Journal of Aviation Psychology, 3(3), 203–220.
                                                                                                  12
F. Bolinski et al.                                                                                                                            Behaviour Research and Therapy 142 (2021) 103877
Konstantinou, P., Trigeorgi, A., Georgiou, C., Gloster, A. T., Panayiotou, G., & Karekla, M.              behavioural therapy versus waiting list control for paranoid ideation and social
    (2020). Comparing apples and oranges or different types of citrus fruits? Using                       avoidance in patients with psychotic disorders: A single-blind randomised controlled
    wearable versus stationary devices to analyze psychophysiological data.                               trial. The Lancet Psychiatry, 5(3), 217–226. https://doi.org/10.1016/S2215-0366
    Psychophysiology, 57(5). https://doi.org/10.1111/psyp.13551. e13551.                                  (18)30053-1.
Krohn, S., Tromp, J., Quinque, E. M., Belger, J., Klotzsche, F., Rekers, S., … Thöne-                Prudenzi, A., Rooney, B., Presti, G., Lombardo, M., Lombardo, D., Messina, C., et al.
    Otto, A. (2020). Multidimensional evaluation of virtual reality paradigms in clinical                 (2019). Testing the effectiveness of virtual reality as a defusion technique for coping
    neuropsychology: Application of the VR-check framework. Journal of Medical Internet                   with unwanted thoughts. Virtual Reality, 23(2), 179–185. https://doi.org/10.1007/
    Research, 22(4). https://doi.org/10.2196/16724. e16724.                                               s10055-018-0372-1.
Kuyken, W., Padesky, C. A., & Dudley, R. (2009). Collaborative case conceptualization:                Rizzo, A. S., & Koenig, S. T. (2017). Is clinical virtual reality ready for primetime?
    Working effectively with clients in cognitive-behavioral therapy. New York, NY, US:                   Neuropsychology, 31(8), 877–899. https://doi.org/10.1037/neu0000405.
    Guilford Press.                                                                                   Salkevicius, J., Damaaevjius, R., Maskeljknas, R., & Laukien, I. (2019). Anxiety level
Kyriakou, K., Resch, B., Sagl, G., Petutschnig, A., Werner, C., Niederseer, D., … Pykett, J.              recognition for virtual reality therapy system using physiological signals. Electronics,
    (2019). Detecting moments of stress from measurements of wearable physiological                       8, 1039.
    sensors. Sensors, 19(17), 3805.                                                                   Scher, C. D., Ingram, R. E., & Segal, Z. V. (2005). Cognitive reactivity and vulnerability:
Lavrakas, P. J. (2008). Encyclopedia of survey research methods. https://doi.org/10.4135/                 Empirical evaluation of construct activation and cognitive diatheses in unipolar
    9781412963947.                                                                                        depression. Clinical Psychology Review, 25(4), 487–510. https://doi.org/10.1016/j.
Lindner, P., Hamilton, W., Miloff, A., & Carlbring, P. (2019). How to treat depression                    cpr.2005.01.005.
    with low-intensity virtual reality interventions: Perspectives on translating cognitive           Schubert, T., Friedmann, F., & Regenbrecht, H. (2001). The experience of presence:
    behavioral techniques into the virtual reality modality and how to make anti-                         Factor Analytic insights. Presence, 10, 266–281. https://doi.org/10.1162/
    depressive use of virtual reality–unique experiences. Frontiers in Psychiatry, 10(792).               105474601300343603.
    https://doi.org/10.3389/fpsyt.2019.00792.                                                         Sharot, T., & Phelps, E. A. (2004). How arousal modulates memory: Disentangling the
Lorenzo-Luaces, L., German, R. E., & DeRubeis, R. J. (2015). It’s complicated: The                        effects of attention and retention. Cognitive, Affective, & Behavioral Neuroscience, 4(3),
    relation between cognitive change procedures, cognitive change, and symptom                           294–306. https://doi.org/10.3758/cabn.4.3.294.
    change in cognitive therapy for depression. Clinical Psychology Review, (41), 3–15.               Snow, G. (2013). Package ’blockrand’. Randomization for block random clinical trials.
    https://doi.org/10.1016/j.cpr.2014.12.003.                                                            Retrieved from https://cran.r-project.org/web/packages/blockrand/.
Malik, M. (1996). Heart rate variability: Standards of measurement, physiological                     Stern, A. F. (2014). The hospital anxiety and depression scale. Occupational Medicine, 64
    interpretation and clinical use. Task force of the European society of cardiology and                 (5), 393–394. https://doi.org/10.1093/occmed/kqu024.
    the North American society of pacing and electrophysiology. Circulation, 93(5),                   Tanner, B. A. (2012). Validity of global physical and emotional SUDS. Applied
    1043–1065.                                                                                            Psychophysiology and Biofeedback, 37(1), 31–34. https://doi.org/10.1007/s10484-
McGaugh, J. L. (2018). Emotional arousal regulation of memory consolidation. Current                      011-9174-x.
    Opinion in Behavioral Sciences, 19, 55–60. https://doi.org/10.1016/j.                             Thew, G. R., Gregory, J. D., Roberts, K., & Rimes, K. A. (2017). Self-critical thinking and
    cobeha.2017.10.003.                                                                                   overgeneralization in depression and eating disorders: An experimental study.
Mellick, W. H., Mills, J. A., Kroska, E. B., Calarge, C. A., Sharp, C., & Dindo, L. N. (2019).            Behavioural and Cognitive Psychotherapy, 45(5), 510–523. https://doi.org/10.1017/
    Experiential avoidance predicts persistence of major depressive disorder and                          S1352465817000327.
    generalized anxiety disorder in late adolescence. Journal of Clinical Psychiatry, 80(6),          Toet, A., Heijn, F., Brouwer, A.-M., Mioch, T., & van Erp, J. B. F. (2019). The EmojiGrid as
    18m12265. https://doi.org/10.4088/JCP.18m12265.                                                       an immersive self-report tool for the affective assessment of 360. Cham: VR Videos.
Menghini, L., Gianfranchi, E., Cellini, N., Patron, E., Tagliabue, M., & Sarlo, M. (2019).            Tremayne, P., & Barry, R. J. (2001). Elite pistol shooters: Physiological patterning of best
    Stressing the accuracy: Wrist-worn wearable sensor validation over different                          vs. worst shots. International Journal of Psychophysiology, 41(1), 19–29. https://doi.
    conditions. Psychophysiology, 56(11). https://doi.org/10.1111/psyp.13441. e13441.                     org/10.1016/s0167-8760(00)00175-6.
Meyerbröker, K., & Emmelkamp, P. M. G. (2010). Virtual reality exposure therapy in                   Watson, D., Clark, L. A., & Tellegen, A. (1988). Development and validation of brief
    anxiety disorders: A systematic review of process-and-outcome studies. Depression                     measures of positive and negative affect: The PANAS scales. Journal of Personality
    and Anxiety, 27(10), 933–944. https://doi.org/10.1002/da.20734.                                       and Social Psychology, 54. https://doi.org/10.1037/0022-3514.54.6.1063.
Mol, M., van Schaik, A., Dozeman, E., Ruwaard, J., Vis, C., Ebert, D. D., … Smit, J. H.               Weissman, A. N., & Beck, A. T. (1978). Development and validation of the dysfunctional
    (2020). Dimensionality of the system usability scale among professionals using                        attitude scale: A preliminary investigation.
    internet-based interventions for depression: A confirmatory factor analysis. BMC                  Wenze, S. J., Gunthert, K. C., & Forand, N. R. (2010). Cognitive reactivity in everyday life
    Psychiatry, 20(1), 218. https://doi.org/10.1186/s12888-020-02627-8.                                   as a prospective predictor of depressive symptoms. Cognitive Therapy and Research,
North, M. M., North, S. M., & Coble, J. R. (1998). Virtual reality therapy: An effective                  34(6), 554–562. https://doi.org/10.1007/s10608-010-9299-x.
    treatment for phobias. Studies in Health Technology and Informatics, 58, 112–119.                 Wilhelm, F. H., Pfaltz, M. C., Gross, J. J., Mauss, I. B., Kim, S. I., & Wiederhold, B. K.
Ollander, S., Godin, C., Campagne, A., & Charbonnier, S. (9-12 Oct. 2016). A comparison                   (2005). Mechanisms of virtual reality exposure therapy: The role of the behavioral
    of wearable and stationary sensors for stress detection. Paper presented at the 2016.                 activation and behavioral inhibition systems. Applied Psychophysiology and
    In IEEE international conference on systems, man, and cybernetics (SMC).                              Biofeedback, 30(3), 271–284. https://doi.org/10.1007/s10484-005-6383-1.
Opris, D., Pintea, S., Garcia-Palacios, A., Botella, C., Szamoskozi, S., & David, D. (2012).          Wojnarowski, C., Firth, N., Finegan, M., & Delgadillo, J. (2019). Predictors of depression
    Virtual reality exposure therapy in anxiety disorders: A quantitative meta-analysis.                  relapse and recurrence after cognitive behavioural therapy: A systematic review and
    Depression and Anxiety, 29(2), 85–93. https://doi.org/10.1002/da.20910.                               meta-analysis. Behavioural and Cognitive Psychotherapy, 47(5), 514–529.
Osborne, J. (2017). Regression & linear modeling: Best practices and modern methods.                  Wolfensberger, W., & O’Connor, N. (1967). Relative effectiveness of galvanic skin
Osugi, A., & Ohira, H. (2018). Emotional arousal at memory encoding enhanced P300 in                      response latency, amplitude and duration scores as measures of arousal and
    the concealed information test. Frontiers in Psychology, 8(2334). https://doi.org/                    habituation in normal and retarded adults. Psychophysiology, 3(4), 345–350. https://
    10.3389/fpsyg.2017.02334.                                                                             doi.org/10.1111/j.1469-8986.1967.tb02718.x.
Peeters, F., Ponds, R., & Vermeeren, M. (1996). Affectiviteit en zelfbeoordeling van                  Zigmond, A. S., & Snaith, R. P. (1983). The hospital anxiety and depression scale. Acta
    depressie en angst. [Affect and self-report of depression and anxiety.]. Tijdschrift                  Psychiatrica Scandinavica, 67(6), 361–370. https://doi.org/10.1111/j.1600-
    Voor Psychiatrie, 38(3), 240–250.                                                                     0447.1983.tb09716.x.
Pot-Kolder, R. M. C. A., Geraets, C. N. W., Veling, W., van Beilen, M., Staring, A. B. P.,
    Gijsman, H. J., … van der Gaag, M. (2018). Virtual-reality-based cognitive
13