
MASTER'S THESIS

Predicting human perception while reading using eye tracking and other metrics

Author: Soumy Jacob
Supervisors: Shoya Ishimaru, Dr.-Ing. Syed Saqib Bukhari, Prof. Dr. Andreas Dengel

Thesis submitted in partial fulfillment of the requirements


for the degree of Master of Science in Computer Science

in the

Department of Computer Science


TU Kaiserslautern, Germany

August 10, 2018



Declaration of Authorship
I, Soumy Jacob, declare that this thesis titled, “Predicting human perception while
reading using eye tracking and other metrics” and the work presented in it are my
own. I confirm that:

• This work was done wholly or mainly while in candidature for a research de-
gree at this University.

• Where any part of this thesis has previously been submitted for a degree or
any other qualification at this University or any other institution, this has been
clearly stated.

• Where I have consulted the published work of others, this is always clearly
attributed.

• Where I have quoted from the work of others, the source is always given. With
the exception of such quotations, this thesis is entirely my own work.

• I have acknowledged all main sources of help.

• Where the thesis is based on work done by myself jointly with others, I have
made clear exactly what was done by others and what I have contributed
myself.

Signed:

Date:

“So we fix our eyes not on what is seen, but on what is unseen, since what is seen is temporary, but what is unseen is eternal.”


Abstract

Cognitive state is the state of mind or the process of thought. Eye movements, heart rate variability and electrodermal activity have been used effectively to investigate neuropsychology. These measures have been used to analyse emotional and cognitive states involving arousal, frustration, mental workload, comprehension, and tiredness. We argue that these measures can also be used during the task of reading to predict a person's level of interest in the reading material. Being aware of this interest can promote a person's habit of reading, provide internal feedback, present relevant information and also improve teaching and, subsequently, learning.

In this thesis, we present how useful features were extracted from eye tracking and physiological (wristband) data recorded in a natural environment and used to detect a reader's interest in a text and his/her level of confidence while solving questions based on the text. On a dataset of 12 high school students' reading behaviors, we found that changes in pupil diameter and nose temperature are highly correlated with cognitive states, including the level of confidence and effort, during reading/solving tasks on Physics learning materials. For the main research topic, we used a different dataset comprising the reading behavior of 15 university students on 18 newspaper articles, and used eye tracking measures and physiological data such as heart rate and electrodermal activity to predict which documents each participant finds interesting or uninteresting. We classified their interest into two/four classes with an accuracy of 75%/50% using eye movements and 68%/50% using physiological data from an E4 wristband. This research can be incorporated into the real-time prediction of a user's reading interest and inform future designs of human-document interaction.

Acknowledgements
I would like to thank my supervisors, Shoya Ishimaru, Dr.-Ing. Syed Saqib Bukhari and Prof. Dr. h.c. Andreas Dengel, for their unfailing support throughout my studies at TU Kaiserslautern. My special gratitude goes to Shoya Ishimaru, who offered me relentless guidance, knowledge and resources during my thesis research at DFKI.

I am also grateful to my parents, brothers and other family members for their moral and emotional support all along the way. And finally, last but by no means least, to my friends and colleagues, for their assistance with the experiments, for all the detailed discussions and for making DFKI a cracking place to work at.

Contents

Declaration of Authorship iii

Abstract vii

Acknowledgements ix

1 Introduction 1
1.1 Motivation . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 2
1.2 Contributions of the thesis . . . . . . . . . . . . . . . . . . . . . . . . . . 4
1.3 Thesis Structure . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 5

2 Technical Background & Related Work 7


2.1 Eye Tracking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 7
2.2 Medical and physiological sensing . . . . . . . . . . . . . . . . . . . . . 8
2.2.1 Electrodermal Activity . . . . . . . . . . . . . . . . . . . . . . . . 10
2.2.2 Blood Volume Pulse and Heart Rate . . . . . . . . . . . . . . . . 12

3 Biometric Sensors 13
3.1 SMI REDn Scientific eye tracker . . . . . . . . . . . . . . . . . . . . . . . 13
3.2 Empatica E4 wristband . . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.3 Flir One thermal imaging camera . . . . . . . . . . . . . . . . . . . . . . 16

4 Feasibility Study: Sensor data correlation with cognitive state while reading 17
4.1 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.1.1 Eye Tracking . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 17
4.1.2 Nose temperature tracking . . . . . . . . . . . . . . . . . . . . . 18
4.2 Experimental Design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 19

4.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 21

5 Main research : Detecting ’interest’ while reading using biometric sensors 25


5.1 Method . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 25
5.1.1 Gaze event detection . . . . . . . . . . . . . . . . . . . . . . . . . 25
5.1.2 Features from Eye Gaze data . . . . . . . . . . . . . . . . . . . . 28
5.1.3 Features from EDA . . . . . . . . . . . . . . . . . . . . . . . . . . 29
5.1.4 Features from BVP . . . . . . . . . . . . . . . . . . . . . . . . . . 31
5.1.5 Classification . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 32
Support Vector Classifier . . . . . . . . . . . . . . . . . . . . . . 33
Random Forest Classifier . . . . . . . . . . . . . . . . . . . . . . 34
Cross Validation . . . . . . . . . . . . . . . . . . . . . . . . . . . 34
5.2 Experimental design . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.3 Results . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38

6 Discussion 47

7 Conclusion and Future work 51

Bibliography 55

List of Figures

2.1 EEG measurement setup. . . . . . . . . . . . . . . . . . . . . . . . . . . . 9


2.2 ECG measurement setup. . . . . . . . . . . . . . . . . . . . . . . . . . . 10

3.1 SMI REDn Scientific eye tracker. . . . . . . . . . . . . . . . . . . . . . . 13


3.2 E4 wristband and signals . . . . . . . . . . . . . . . . . . . . . . . . . . . 14
3.3 Flir One thermal imaging camera. . . . . . . . . . . . . . . . . . . . . . . 15

4.1 Fixations during solving . . . . . . . . . . . . . . . . . . . . . . . . . . . 18


4.2 Changes in nose temperature . . . . . . . . . . . . . . . . . . . . . . . . 19
4.3 Experimental setup for Experiment 1 . . . . . . . . . . . . . . . . . . . . 20
4.4 Pearson correlation between surveys and slope of nose temperature. . 22
4.5 Pearson correlation between surveys and mean of pupil diameter. . . . 22

5.1 Gaze scan path . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 26


5.2 Gaze events calculated by sensor signal from an eye tracker . . . . . . 27
5.3 Fixation detection by Buscher et al. . . . . . . . . . . . . . . . . . . . . . 28
5.4 EDA Signal and its components - Phasic and Tonic. . . . . . . . . . . . 30
5.5 Blood Volume pulse signal after peak detection. . . . . . . . . . . . . . 32
5.6 Cross Validation methods. Color notation: red - a single recording, yellow - data specific to one document, green - data specific to one participant. 'p_i d_j' denotes the recording of the i-th participant's j-th document. . . . . . . 34
5.7 Experiment 2 setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 35
5.8 Experiment 2 setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.9 Experiment 2 setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 36
5.10 Experiment 2 setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37
5.11 Experiment 2 setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 37

5.12 Experiment 2 setup . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38


5.13 SVM LORO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
5.14 SVM LORO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 40
5.15 RF LORO . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 41
5.16 Pearson correlations between features and interests of each participant 41
5.17 Confusion matrix with 50% accuracy for four-class SVC. . . . . . . . . 43
5.18 Confusion matrix with 68% accuracy for binary SVC. (1 and 2 vs. 3
and 4) . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 44
5.19 Confusion matrix with 38% accuracy for four-class Random Forest. . . 44

6.1 Error in Experiment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 48


6.2 Error in Experiment . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 49

List of Tables

4.1 Thirteen Survey Questions. . . . . . . . . . . . . . . . . . . . . . . . . . 20


4.2 Pearson Correlation and p-values of features on all 13 survey questions. 23

5.1 List of features from eye data. ’SD’ is short for standard deviation . . . 29
5.2 List of features from E4 data . . . . . . . . . . . . . . . . . . . . . . . . . 33
5.3 Classification accuracy using SVM classifier on eye-related and other
features[%] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 38
5.4 Classification accuracy using Random Forest on eye-related and other
features[%] . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 39
5.5 Importance of eye-related features in the Random Forest using all fea-
tures . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . . 42
5.6 SVM classification of E4-related features . . . . . . . . . . . . . . . . . . 43
5.7 Importance of physiological features with the Random Forest Classi-
fier (values between 0 and 1) . . . . . . . . . . . . . . . . . . . . . . . . . 45

7.1 SVM classification results. . . . . . . . . . . . . . . . . . . . . . . . . . . 51



Chapter 1

Introduction

’Empowering education to produce innovation’, the aim of Education 4.0, is imperative given the rapid emergence of the fourth Industrial Revolution, which requires enhancements in human capital. A very high demand is placed on knowledge production and its application in innovation. The learning model of Education 4.0 puts strong emphasis on imbibing – learning concepts; iterating – rigorous use of the acquired skills; interpreting – applying these skills in new situations; interest – invoking curiosity about a subject so that it can be studied deeply; and innovating – originality and novelty in thinking and building new concepts. This model is deeply rooted in reading, learning and cultivating interest, and this has to start during schooling and be given the utmost importance.

Reading has always been the foundation of all learning and thereby of education. This soft skill is a source of creative power, but one which has to be instilled in people. Reading is a complex linguistic activity based not just on perception and motion but also on cognition. Interest in reading can be motivated by concentration, curiosity, and demand. It may not arise out of habit, but it helps motivate the habit and subsequently the learning process. Interest motivates earnestness in a person. According to Ibrahim Bafadal, "Reading is a process of capturing or acquiring the concepts intended by the author, interpret, evaluate the author's concepts, and reflect, or act as intended of those concepts". Hence reading depends not only on the ability to interpret and evaluate the content but also on the will to do so for a comprehensive understanding [1–3]. This urge to read, if recognized, can be used to improve the material made available to the reader and to support better human-document interaction and design. Predicting a reader's interest can help make documents more interactive or dynamic.

1.1 Motivation

To promote learnability, readers need to find the reading material interesting and, if they do, be presented with even more information until they are fully satisfied with the knowledge they have acquired. Cognitive states like cognitive load, tiredness and comprehension have been discussed and worked on by other researchers, but 'interest' in the reading material is most important for determining whether the user will retain and reuse the text he/she has just read. Recognizing a reader's interest or boredom in the reading material will not only foster learning but also assist teachers in finding out what students are passionate about. This can help provide students with a one-teacher-for-every-student experience which the current education system simply cannot provide. Hence this thesis focuses on how interest and other such cognitive processes can be detected and predicted using measures from eye trackers and other biometric sensors, and on how data mining is performed for efficient data/signal processing to extract useful features from the raw biometric data.

For this purpose, we needed factors (sensors) that would help to better understand the reader's thoughts and emotions and thereby evolve personalization to reflect them. Various sensors have been used to subtly extract relevant data that represent the human thought process. Aggregated data from such devices can help a system understand how internal and external factors affect the reader, his interest and his engagement while reading. As a result, text can be redesigned in order to keep the reader engaged, attentive and interested. Emotion sensing is a popular field these days, but it typically makes use of natural language understanding, which requires explicit speech-based feedback from the user that is not always available. Emotion sensing via sensors, in contrast, enables comfortable and natural user interactions. This motivated us to find sources of feedback which can be extracted implicitly and without disturbing the user's task at hand. By capturing reactions to the content of the text, a system can come up with more recommendations and remedies. More than just creating a visual design, it could combine monitoring and analysis of behaviour or thought patterns, assess the emotion and respond accordingly.

To develop such a system, it is necessary to investigate what kind of features


can be used to recognize the learners’ cognitive states. One of the most effective
approaches to investigate reading activity is to utilize data that represent a person’s
cognitive neuropsychology and physiology - eye movements, heart rate, skin con-
ductance, face/skin temperature etc.
While reading, the eyes recognize information, and eye movements are a form of continuous reaction to the visual information. They form a vocabulary of their own comprised of fixations, saccades, regressions, pupil diameter and blinks [4]. During this process, sensory, motor and mental processes like attention, memory, pattern recognition and decision making take place. The eyes make a series of short and fast movements called saccades, during which vision is suppressed, separated by short pauses called fixations, during which visual information is perceived. The processing of the perceived information takes place in the frontal, temporal and parietal lobes. This can be further studied with neuro-physiological data obtained through functional magnetic resonance imaging, but fMRI offers limited insight into reading because eye movements are removed from the process. Saccades and fixations indirectly represent the brain's linguistic processes and thus its cognitive functions, which lead to comprehension during reading. These measures behave differently under changing circumstances and give us an insight into cognitive states – thought processes and emotions – during the reading task [5]. Thus eye tracking, the real-time tracking of gaze position, has been widely used to study the process of reading and also to analyze a person's attention, tiredness and cognitive load.
Cognition and emotion are consistently accompanied by autonomic nervous system responses. Physiological measures like heart rate variability, electrodermal activity and skin temperature reflect a person's affective state during anger, boredom, frustration, engagement and arousal. Research has been done on correlating these affective states with learning in order to enrich the experience. EEG, ECG etc. have been used to measure such data in previous research, but nowadays wristbands measure physiological data just as well. The Empatica E4 wristband (Figure 3.2) is one such device, with biometric sensors that monitor this data in real time. These measures, if used to predict a user's interest, comprehension and difficulty, can influence his/her interaction with the learning environment and thereby foster the learning process [6]. This can further assist teaching techniques and promote understanding and active interest in students.
For this research, cognitive processes during reading are monitored using pervasive remote eye trackers, thermal cameras and wristbands instead of obtrusive and distracting equipment such as electrodes, head-mounted eye trackers or glasses. This ensures that students involved in the reading task can stay focused on acquiring knowledge. We use the SMI REDn Scientific eye tracker, the FLIR ONE for iOS (thermal camera) and the Empatica E4 wristband for heart rate and skin conductance in this research. E4 data processing is all the more useful because the research can then be extended to reading on paper as well, not just off computer screens. These sensors are unobtrusive to the user and are widely used in daily life.

1.2 Contributions of the thesis

The main objective of this thesis is to investigate whether learning and the experience of learning can be improved by monitoring/predicting the interest and engagement of the reader using various sensors in real time. In this context, the following problem statements are also investigated.

(1) How accurately can unobtrusive biometric sensors like remote eye trackers and physiological data-sensing wristbands be used to record eye gaze, pupil diameter, blinks, photoplethysmography (PPG), electrodermal activity, and skin temperature during the task of reading?

(2) How can meaningful features be extracted from this data (biometric data mining) to provide better insight into the affective state of the reader?

(3) How can machine learning algorithms be used for efficient feature selection and classification of the reader's level of interest, confidence, comprehension etc.?

(4) What challenges are faced during data collection, feature extraction and classification?

We also conducted experiments with students to collect data for this research; these are explained in detail for future reference.

1.3 Thesis Structure

Chapter 2 discusses the technical background of the study, namely cognitive state analysis, eye tracking, and medical and physiological sensing - HR, EDA and skin temperature - together with the related work that has been surveyed for learning and research purposes. Chapter 3 introduces the biometric sensors used. The first experiment, conducted in a high school to assess the feasibility of this research topic and to investigate augmenting dynamic text according to eye responses and nose temperature, is described in detail in Chapter 4. In Chapter 5, we delve into the main research topic of this thesis: the experiment, the biometric sensor data mining, and the results. The evaluation and discussion of the results and findings follows in Chapter 6, and Chapter 7 concludes the thesis and outlines future work.

Chapter 2

Technical Background & Related Work

There has been a lot of research on designing systems which focus on visualization and interaction methods that allow the user to better attend to a task with minimal mental effort. This research focuses on interpreting a person's interest in reading or in the task at hand. This in turn can be used to provide more, or better, information that the user subconsciously requires - much like having one teacher for every student to improve the education system.

2.1 Eye Tracking

Cognitive state refers to one's thought processes and state of mind; it pertains to cognition, or how we process information. Interest, curiosity, confidence, comprehension, amnesia, lack of attention, tiredness etc. are all examples of cognitive states. Various studies discuss eye tracking measures and their role in interpreting cognitive processes and cognitive load. The measures used are voluntary – fixations and saccades – and involuntary – pupil diameter and blinking. According to Ruddmann et al., the direction of gaze indicates repeated interest in an area and the importance of that area of interest in the current activity [7]. The path followed by the eye gaze over the various zones, the time spent on each zone, the number of saccadic eye movements, the average fixation duration and the total distance travelled by the gaze are used as input to detect the cognitive state of the user during human-computer interaction. Chen et al. observed that fixation duration and fixation rate are indicators of an increase in attention due to the complexity of the current task [8]. They delved into the relevance of saccades in interpreting human mental effort and performance when solving a task. They also found that an increase in blink latency and a decrease in blink rate indicate high mental effort, and that studying the diameter of the pupil helps to gauge task difficulty and cognitive effort. Blinking is most often an involuntary measure, but in certain cases it is voluntary and can be studied to measure attention and tiredness [9]. To understand how well content is understood, Zagermann et al. suggested aligning actual saccadic eye movements with ideal saccadic movements. They also used regression count and reading speed together with fixations and dwell time on certain areas of interest on the screen to predict cognitive load and understanding. Saccade speed and length, combined with other measures, achieved high accuracy in measuring human performance. On the other hand, Manuel et al. suggested that a decrease in saccade speed indicates tiredness and an increase in saccade speed indicates task complexity [10]. According to them, an increase in blink rate, a decrease in blinking velocity and a decrease in eyelid openness are indicators of tiredness and stress. Porta et al. further observed that a decrease in pupil diameter at the end of a task indicates tiredness [11]. To estimate proficiency in the English language (low, mid and high), Yoshimura et al. used the sum of fixation durations and the sum of angular saccadic velocities with SVM classification and achieved an accuracy of 90.9% [12]. Yang et al. suggested that saccades are initiated in response to the results of cognitive processes taking place during the fixations [4].

2.2 Medical and physiological sensing

Brain activity and cognitive processes can be measured through EEG and MEG. MEG captures changes in magnetic fields at the scalp caused by electrical currents in brain neurons [13], while EEG refers to the recording of the brain's spontaneous electrical activity over a period of time from multiple electrodes placed on the scalp [14]. fMRI, or functional magnetic resonance imaging, measures the changes in blood flow to an area of the brain to detect which part of the brain is in use. Functional Near-Infrared Spectroscopy (fNIRS) uses near-infrared spectroscopy for functional neuroimaging and measures brain activity through hemodynamic responses [15]. However, these techniques are quite intrusive for day-to-day activities and too bulky to be worn regularly, considering the application scenario in which high school students use the system. They can be used neither in situ nor in public (Figure 2.1) due to the time-consuming and expensive setup and the complex analysis.

FIGURE 2.1: EEG measurement setup.
On the other hand, activity of the autonomic nervous system (ANS), comprising the sympathetic and the parasympathetic nervous systems, is a measurable component of emotional response [16]. The nose temperature drops when a person experiences high workload [17] or high engagement in reading [18]. The diameter of the pupil also reflects workload; this relation has been investigated in tasks involving memory [19] and driving a car [20]. Compared to sensing brain activation, these signals have the advantage that they can be measured remotely, without wearing devices.

There has been growing interest in the study of the relation between cognitive performance and heart rate variability (HRV). HRV is a simple measure of the interaction between the autonomic nervous system (ANS) and the cardiovascular system, ideally measured using electrocardiography (ECG) as in Figure 2.2. The analysis of HRV is based on the study of temporal oscillations between heartbeats [21]. EDA, or electrodermal activity (changes in the electrical conductance of the skin), is a sensitive psychophysiological index of changes in autonomic sympathetic arousal that are integrated with emotional and cognitive states [22]. HRV, EDA, respiration and brain signals have been used to predict mental stress and cognitive load in various studies [23].

FIGURE 2.2: ECG measurement setup.

2.2.1 Electrodermal Activity

EDA, or electrodermal activity, refers to autonomic changes in the electrical properties of the skin. It is a physiological signal which reliably represents the sympathetic nervous system (SNS) and hence can be used to assess the human emotional state. Skin conductance makes an ideal measure of sympathetic activation, as it is not connected to parasympathetic nerves, unlike other physiological measures such as heart rate. EDA is recorded by measuring the conductivity of the skin, because skin conductance is proportional to sweat secretion. EDA is composed of a low-frequency tonic component and a high-frequency phasic component, which have different time scales and relationships to exogenous stimuli. Tonic phenomena include slow drifts of the baseline skin conductance level (SCL) and spontaneous fluctuations (SF) in skin conductance. The phasic component, i.e., the skin conductance response (SCR), reflects the short-term response to a stimulus. The typical shape of an SCR comprises a relatively rapid rise from the current conductance level followed by a slower, asymptotic exponential decay back to the baseline. SCR amplitude has been shown to be a good representation of different levels of arousal in a human. Anatomically, EDA changes are due to sudomotor nerve activity (SMNA), which is part of the SNS and directly controls the eccrine sweat glands. Therefore, EDA can easily be monitored through voltage/current measurements between two fingers, where the concentration of eccrine sweat glands is higher than at other body sites.
EDA has been closely associated with emotional and cognitive processing, and has also been used as a sensitivity index for emotional processing and sympathetic activity [24]. This coupling between cognitive states, arousal, emotion and attention enables EDA to be used as an objective index of emotional states. Implicit emotional responses that may occur unconsciously or unintentionally, such as threat, anticipation, salience and novelty, can also be examined using EDA.

Research has shown that EDA is also a useful indicator of attentional processing, where salient stimuli and resource-demanding tasks evoke increased EDA responses. Analysis of the data by Setz et al. showed that the EDA peak height and the instantaneous peak rate depict the stress level of a person [25], thereby distinguishing stress from cognitive load in an office environment. Boucsein gives an extensive summary of early EDA research in relation to stress [26]. He shows that the SCL and non-specific SCRs are sensitive and valid indicators of stress, whereas other physiological measures (e.g., heart rate) do not show equal sensitivity [27]. Most cited studies use electrical stimuli or movies to induce a reaction of the EDA. The Lazarus group showed that the SCL and the heart rate increased significantly during the presentation of violent films [28]. Nomikos et al. showed that even the expectation of an unpleasant event could cause a similar reaction in SCL as the event itself [29]. Several studies investigating the effect of the anticipation of electrical stimulation suggest that the rise of the SCL reflects increased cognitive activity related to the avoidance of aversive events rather than an emotional component; the frequency of non-specific SCRs, on the other hand, reveals the emotional component of the stress reaction. Further studies used experimental settings that were closer to a real-life office environment than simple electrical stimuli. Several authors investigated how involuntary interruptions in the work flow due to long system response times influenced the EDA; an increase of non-specific SCRs for long system response times could be demonstrated. Jacobs et al. also showed an increase in skin conductivity during mental stress [30].

2.2.2 Blood Volume Pulse and Heart Rate

Blood volume pulse (BVP) is the change in the volume of blood over a given period of time. BVP can be affected by heart rate, heart rate variability (HRV) and respiration rate. Certain emotions can trigger the release of hormones, such as epinephrine and norepinephrine, which increase blood flow to bring more oxygen to the muscles. Blood volume can also change due to the widening or contraction of blood vessels. These emotions can then be interpreted from BVP because the blood flow is affected [31].

BVP can be monitored using photoplethysmography (PPG), a non-invasive technique that relies on light absorption and reflection. It can give important information about the cardiovascular system of the patient/user. It is also relatively cheap, but can only be used on certain skin areas. The detected signal forms a wave, which represents the change in blood volume and is also proportional to the heart rate. Adjacent local peaks in this wave indicate heartbeats, and the time interval between these peaks is the inter-beat interval (IBI). Heart rate, IBI and BVP have been associated with frustration and anxiety [32].

Chapter 3

Biometric Sensors

3.1 SMI REDn Scientific eye tracker

The SMI REDn Scientific eye tracker (Figure 3.1) is a light, scientific-grade mobile device that can be used in environments comfortable to users. It can be connected to any laptop or desktop with a USB 3.0 port running iViewRED. It has a sampling frequency of 60 Hz, a gaze position accuracy of 0.4° and 13-point Smart Calibration Technology. Ideally, it operates at a distance of 40 to 100 cm and is compatible with most eye-wear such as glasses and lenses. The recorded gaze data contains a timestamp, left and right gaze coordinates (x and y, with the screen edges as the coordinate system) and left and right pupil diameters.

FIGURE 3.1: SMI REDn Scientific eye tracker.



3.2 Empatica E4 wristband

The Empatica E4 wristband (Figure 3.2) was used for the acquisition of real-time physiological data with the help of sensors designed to gather high-quality data. It has a PPG sensor which measures BVP, an EDA sensor to measure the electrical properties of the skin, a 3-axis accelerometer to monitor activity, an infrared thermopile to measure skin temperature, an event mark button to log events related to the signals and an internal real-time clock.

FIGURE 3.2: E4 wristband and signals.

The data is recorded in .csv format, where the first row of every file is the initial time of the session expressed as a Unix timestamp in UTC and the second row is the sample rate expressed in Hz. TEMP.csv holds data from the temperature sensor expressed in degrees Celsius and has a sampling rate of 4 Hz. EDA.csv has data from the electrodermal activity sensor expressed in microsiemens at a 4 Hz sampling rate. BVP.csv is the data from the photoplethysmograph recorded at 64 Hz. ACC.csv contains data from the 3-axis accelerometer, which is configured to measure acceleration in the range [-2g, 2g]; the unit in this file is therefore 1/64 g, and data from the x, y, and z axes are in the first, second, and third column, respectively. This data is not used in our research, since there is not much movement of the non-dominant hand during reading. IBI.csv has the time between individual heartbeats extracted from the BVP signal; no sample rate is needed for this file. Its first column is the time (with respect to the initial time) of the detected inter-beat interval expressed in seconds, and its second column is the duration in seconds of the detected inter-beat interval (i.e., the distance in seconds from the previous beat). HR.csv has the average heart rate extracted from the BVP signal; its first row is the initial time of the session expressed as a Unix timestamp in UTC and its second row is the sample rate expressed in Hz. Tags.csv contains event mark times. Each row corresponds to a physical button press on the device, at the same time as the status LED is first illuminated; the time is expressed as a Unix timestamp in UTC and is synchronized with the initial time of the session indicated in the related data files from the corresponding session.
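As an illustration of the file layout just described, the following minimal sketch (our own helper, not part of the Empatica SDK) loads one single-channel file such as EDA.csv or TEMP.csv and attaches a timestamp to every sample:

```python
import csv

def load_e4_channel(path):
    """Load a single-channel E4 file (EDA.csv, TEMP.csv, BVP.csv or HR.csv).

    Row 1 holds the session start as a Unix timestamp (UTC), row 2 the
    sample rate in Hz, and the remaining rows the samples.
    Returns a list of (timestamp, value) pairs.
    """
    with open(path) as f:
        rows = [row for row in csv.reader(f) if row]
    t0 = float(rows[0][0])   # session start (Unix time, UTC)
    fs = float(rows[1][0])   # sample rate in Hz
    samples = [float(r[0]) for r in rows[2:]]
    return [(t0 + i / fs, v) for i, v in enumerate(samples)]

# Example (hypothetical file names from one recording session):
# eda = load_e4_channel("EDA.csv")    # 4 Hz, microsiemens
# temp = load_e4_channel("TEMP.csv")  # 4 Hz, degrees Celsius
```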

3.3 Flir One thermal imaging camera

The FLIR ONE thermal imaging camera for iOS (Figure 3.3) blends images from a thermal camera and a VGA visible-light camera to create thermal images with enhanced detail and resolution. It connects easily to Apple mobile devices and measures the temperature of any spot in a scene between -20 °C and 120 °C. It can also detect temperature differences as small as 0.1 °C.

FIGURE 3.3: FLIR One thermal imaging camera.

Chapter 4

Feasibility Study: Sensor data correlation with cognitive state while reading

4.1 Method

We select the combination of eye tracking and thermal image analysis to analyse cognitive states because they can be sensed without bothering readers and do not interfere with each other. This study was done to assess the feasibility of this research with users who had no prior exposure to biometric sensors. This section describes the pre-processing and feature calculations for the sensing [33].

4.1.1 Eye Tracking

We utilize an SMI remote eye tracker which can be attached to a display to track eye movements. Eye gaze data is composed of two metrics - fixations and saccades. A fixation occurs when the gaze falls on something of interest in the screen area and usually lasts for about 100-150 ms. The rapid movement of the eye between fixations is called a saccade. As preprocessing, we filter raw eye movements into fixations and saccades on the basis of the approach proposed by Buscher et al., which is explained in detail later [34]. Figure 4.1 shows one example of the filtering.

FIGURE 4.1: Fixations on a display during solving a task. The colors represent the order (from red to blue) and the durations are visualized as radii.

The average of the left and right pupil diameters at any time instant is used as the pupil diameter feature for this work. The duration of each fixation during each question is aggregated to get the fixation duration feature. The length of a saccade is derived from the known two-dimensional coordinates of the fixations on the screen at given timestamps. Similarly to the fixation duration feature, the sum of the saccade lengths for each participant and each question is used as the feature value. The mean and standard deviation of fixation durations and saccade lengths are calculated as features.
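A minimal sketch of this per-question aggregation is shown below; the fixation dictionary layout and function name are illustrative assumptions, not the exact structures used in the recording pipeline:

```python
import math
import statistics as stats

def eye_features(fixations, pupil_diameters):
    """Aggregate the eye features for one participant and one question.

    fixations: list of dicts with 'duration' (ms) and screen coordinates
               'x', 'y', in temporal order (hypothetical structure).
    pupil_diameters: averaged left/right pupil diameters for the question.
    """
    durations = [f["duration"] for f in fixations]
    # Saccade length: distance between consecutive fixation coordinates.
    saccade_lengths = [
        math.hypot(b["x"] - a["x"], b["y"] - a["y"])
        for a, b in zip(fixations, fixations[1:])
    ]
    return {
        "pupil_mean": stats.mean(pupil_diameters),
        "fixation_duration_sum": sum(durations),
        "fixation_duration_mean": stats.mean(durations),
        "fixation_duration_sd": stats.stdev(durations),
        "saccade_length_sum": sum(saccade_lengths),
        "saccade_length_mean": stats.mean(saccade_lengths),
        "saccade_length_sd": stats.stdev(saccade_lengths),
    }
```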

4.1.2 Nose temperature tracking

We utilize the FLIR One for iOS, a commercial thermal camera which can be attached to a smartphone or tablet, to measure face temperatures. We use the sensor logging application for the device developed by Ishimaru et al. and record the changing temperatures as a video. The positions of the face and the nose in each frame are detected using the method proposed by Baltrusaitis et al. [35]. The temperature data consists of the nose temperature of each participant at a given time during the experiment, as shown in Figure 4.2.

FIGURE 4.2: Examples of the changing nose temperature of two participants (top: low workload, bottom: high workload) while they are reading the textbook (red) and solving the tasks (orange). The x-axis represents timestamps (sec.).

From an initial analysis, we found that the temperature generally increases when the students read the textbook and decreases when they start solving exercises. From this data, the slope and the standard deviation of each participant's nose temperature during the solving of each question are calculated; the slope and standard deviation serve to measure the rise/fall and the fluctuations in temperature.
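A short sketch of the slope and standard deviation computation, assuming the nose temperature has already been extracted per frame; fitting a least-squares line with numpy.polyfit is our assumption, as the thesis only states that a slope is computed:

```python
import numpy as np

def nose_temperature_features(timestamps, temperatures):
    """Slope (degrees per second) and standard deviation of the nose
    temperature within one reading or solving segment."""
    t = np.asarray(timestamps, dtype=float)
    y = np.asarray(temperatures, dtype=float)
    slope, _intercept = np.polyfit(t, y, deg=1)   # line of best fit
    return float(slope), float(np.std(y, ddof=1))
```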

4.2 Experimental Design

We recorded students' reading behaviors and investigated effective features for recognizing cognitive states. In the following sections, we present the experimental setup, the analysis results, and the findings from this experiment.

Figure 4.3 shows an overview of the experimental setup. The SMI 60 Hz remote eye tracker was set up alongside a normal desktop computer to record the eye gaze data, and the FLIR One for iOS was set up to capture the thermal radiation from the face. The eye tracker uses a reflection of infrared light to measure eye gaze; we made sure that this had no significant effect on the thermal images before starting the experiment. We asked 14 sixth-grade students (11 or 12 years old) to participate in the experiment. They read a Physics textbook on a screen and solved eight exercises related to the content. As shown in Figure 4.3, the textbook content was displayed on the left page of the screen, and the exercises were displayed on the right page.

FIGURE 4.3: An experimental setup using an eye tracker and a thermal camera.

Id Type Scope Survey


s1 interest macro I enjoy solving physics problems.
s2 interest macro I am concerned about homework with
topics dealing with physics.
s3 interest micro I like the content of the textbook.
s4 interest micro I am interested in learning more about the subject
of the textbook as well as lectures and homework.
s5 interest micro I would like to know more on the topic
of textbook in school.
s6 confidence macro I am good at physics more than other
subjects.
s7 confidence micro The textbook text was easy to understand.
s8 confidence micro I knew what I had to answer during
solving the tasks.
s9 workload micro I had to make an effort to solve the questions.
s10 workload micro It was difficult to find the right
information to solve the questions in the text.
s11 workload micro I needed more assistance while reading
the textbook.
s12 workload micro The textbook made me curious to know
more about vibration and acoustics.
s13 expertise macro My physics record is about....

TABLE 4.1: Thirteen Survey Questions.

Participants were allowed to use a calculator while solving the exercises; eye movements on the calculator were excluded from the analysis. Note that 7 participants read the text first and the other 7 read the questions first, but we treat them as the same condition because there were no significant differences in their performance as measured by the exercise scores. To collect ground truth for the cognitive states, we asked participants to answer surveys on a paper form after the recording. We prepared the surveys shown in Table 4.1 from the viewpoint of Physics education research. They can be categorized along two indexes: type and scope. We asked questions on three subjective cognitive states - interest, confidence, and workload - together with one objective measure, expertise. Some of the surveys are general questions about Physics learning (macro) and the others are specific questions about the content (micro). The survey brought to light the interest these students had in learning and researching physics. The ratings ranged from 1 to 6, 6 being "I agree completely and wholly" and 1 being "I do not agree with it at all". Note that there are two differences between the original survey form used in the experiment and the one reported in this thesis. (1) The order of the surveys in the form during the experiment was s7, s3, s8, s6, s12, s11, s1, s10, s4, s2, s9, s5, and s13; here, they are sorted by their semantic meaning. (2) The surveys were in German during the experiment; they are translated into English in the table for the readers of this thesis.

We reluctantly excluded two of the 14 participants' data as outliers. Their reading and solving times were much faster than those of the other participants, they seemed to select answers randomly, and their exercise scores were zero. They may not have been attentive enough or may not have understood the purpose of the experiment.

4.3 Results

Table 4.2 shows the Pearson correlations and p-values (in brackets) between the features and the surveys. High correlations with p-values less than 0.05 are highlighted in bold. From these values, we derived three insights. First, surveys related to workload, including s10 "It was difficult to find the right information to solve the questions in the textbook." and s9 "I had to make an effort to solve the questions.", can be measured by a decrease of the nose temperature during solving exercises (p = 0.001 and p = 0.012). Second, an increase in pupil diameter represents a student's interest, including s3 "I like the content of the textbook." and s1 "I enjoy solving physics problems." (p = 0.008 and p = 0.030 during reading; p = 0.006 and 0.013 during solving). Third, students who read the textbook and exercises with small saccades felt high confidence in their understanding, reflected in s7 "The textbook was easy to understand" and s8 "I knew what I had to answer during solving the tasks" (p = 0.025 and p = 0.035). The relations between the surveys and the most effective features, the slope of the nose temperature and the mean pupil diameter, are visualized in Figures 4.4 and 4.5.

FIGURE 4.4: Pearson correlation between surveys and slope of nose temperature.

FIGURE 4.5: Pearson correlation between surveys and mean of pupil diameter.
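The values in Table 4.2 are standard Pearson coefficients with their two-sided p-values; a minimal sketch of how a single feature/survey pair could be correlated (SciPy is assumed tooling, not named in the thesis):

```python
from scipy.stats import pearsonr

def correlate_feature_with_survey(feature_values, survey_ratings):
    """Pearson correlation between one feature (one value per participant)
    and one survey question (one 1-6 rating per participant)."""
    r, p = pearsonr(feature_values, survey_ratings)
    return r, p

# Example with hypothetical per-participant arrays:
# r, p = correlate_feature_with_survey(pupil_mean_reading, s3_ratings)
```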

interest
feature s1 s2 s3 s4 s5
nose slope reading -0.0 (0.97) 0.3 (0.37) 0.4 (0.23) 0.0 (0.88) 0.0 (0.89)
nose slope solving -0.3 (0.39) 0.3 (0.40) -0.4 (0.25) -0.0 (0.96) 0.2 (0.53)
nose std reading 0.0 (0.92) 0.3 (0.38) 0.4 (0.18) 0.1 (0.73) 0.1 (0.84)
nose std solving 0.6 (0.04) -0.4 (0.22) 0.0 (0.90) -0.1 (0.76) -0.3 (0.42)
pupil mean reading 0.6 (0.03) -0.6 (0.03) 0.7 (0.01) 0.1 (0.72) 0.5 (0.14)
pupil mean solving 0.7 (0.01) -0.6 (0.06) 0.7 (0.01) 0.1 (0.80) 0.5 (0.12)
pupil std reading 0.4 (0.16) -0.4 (0.23) 0.8 (0.00) 0.4 (0.22) 0.5 (0.14)
pupil std solving 0.2 (0.57) -0.4 (0.21) 0.6 (0.03) 0.6 (0.06) 0.4 (0.25)
fixation mean reading -0.4 (0.25) 0.3 (0.36) -0.4 (0.18) -0.1 (0.87) -0.1 (0.80)
fixation mean solving -0.3 (0.29) 0.0 (0.91) -0.4 (0.16) 0.0 (0.99) -0.2 (0.51)
fixation std reading -0.3 (0.30) 0.2 (0.63) -0.3 (0.40) -0.1 (0.86) -0.0 (0.89)
fixation std solving -0.4 (0.26) -0.1 (0.84) -0.1 (0.83) -0.1 (0.83) -0.1 (0.68)
saccade mean reading 0.1 (0.81) -0.6 (0.05) -0.1 (0.82) 0.2 (0.50) -0.1 (0.75)
saccade mean solving 0.3 (0.28) -0.3 (0.41) 0.4 (0.22) -0.3 (0.41) -0.3 (0.29)
saccade std reading 0.0 (0.96) -0.5 (0.10) -0.1 (0.80) 0.3 (0.28) -0.1 (0.68)
saccade std solving 0.3 (0.35) -0.2 (0.61) 0.4 (0.18) -0.0 (0.96) -0.2 (0.49)
confidence
feature s6 s7 s8
nose slope reading 0.2 (0.60) -0.2 (0.49) 0.1 (0.75)
nose slope solving 0.6 (0.04) 0.1 (0.74) 0.4 (0.22)
nose std reading 0.0 (0.88) -0.2 (0.55) 0.3 (0.40)
nose std solving -0.2 (0.58) -0.4 (0.19) -0.6 (0.03)
pupil mean reading 0.1 (0.80) 0.1 (0.88) -0.7 (0.01)
pupil mean solving 0.1 (0.76) 0.1 (0.77) -0.7 (0.02)
pupil std reading -0.1 (0.67) -0.0 (0.92) -0.4 (0.20)
pupil std solving -0.3 (0.40) -0.0 (1.00) -0.2 (0.47)
fixation mean reading 0.0 (0.97) 0.3 (0.42) 0.5 (0.08)
fixation mean solving -0.2 (0.44) 0.5 (0.14) 0.4 (0.19)
fixation std reading -0.0 (0.97) 0.2 (0.51) 0.4 (0.24)
fixation std solving -0.2 (0.44) 0.5 (0.11) 0.1 (0.70)
saccade mean reading -0.0 (0.93) -0.6 (0.02) -0.5 (0.14)
saccade mean solving -0.3 (0.42) -0.5 (0.10) -0.6 (0.03)
saccade std reading -0.1 (0.87) -0.7 (0.01) -0.3 (0.29)
saccade std solving -0.2 (0.50) -0.5 (0.11) -0.5 (0.12)
workload expertise
feature s9 s10 s11 s12 s13
nose slope reading 0.1 (0.73) -0.5 (0.09) -0.2 (0.50) 0.2 (0.53) -0.2 (0.61)
nose slope solving -0.7 (0.01) -0.8 (0.00) -0.2 (0.48) 0.3 (0.41) -0.4 (0.16)
nose std reading 0.2 (0.51) -0.4 (0.15) -0.1 (0.76) -0.1 (0.73) 0.1 (0.85)
nose std solving 0.1 (0.73) 0.3 (0.29) 0.2 (0.57) 0.1 (0.70) 0.1 (0.71)
pupil mean reading 0.3 (0.39) 0.5 (0.10) 0.5 (0.08) -0.0 (1.00) -0.2 (0.50)
pupil mean solving 0.3 (0.27) 0.5 (0.12) 0.5 (0.10) -0.2 (0.63) -0.1 (0.71)
pupil std reading 0.5 (0.07) 0.5 (0.09) 0.4 (0.25) 0.0 (0.99) -0.1 (0.69)
pupil std solving 0.5 (0.13) 0.6 (0.03) 0.3 (0.40) 0.0 (0.91) -0.2 (0.57)
fixation mean reading -0.1 (0.85) -0.1 (0.83) -0.2 (0.58) -0.5 (0.14) 0.3 (0.37)
fixation mean solving 0.1 (0.81) 0.3 (0.32) -0.4 (0.24) -0.3 (0.33) 0.2 (0.59)
fixation std reading -0.1 (0.73) -0.1 (0.87) 0.1 (0.76) -0.3 (0.31) 0.2 (0.63)
fixation std solving 0.2 (0.52) 0.4 (0.26) -0.2 (0.53) 0.1 (0.84) -0.1 (0.75)
saccade mean reading -0.5 (0.12) -0.0 (0.95) 0.6 (0.05) 0.7 (0.02) -0.4 (0.23)
saccade mean solving 0.3 (0.43) 0.1 (0.74) 0.3 (0.41) 0.2 (0.50) 0.2 (0.60)
saccade std reading -0.4 (0.26) -0.0 (0.94) 0.5 (0.10) 0.7 (0.01) -0.4 (0.24)
saccade std solving 0.3 (0.42) 0.1 (0.85) 0.1 (0.64) 0.4 (0.24) -0.1 (0.87)

TABLE 4.2: Pearson Correlation and p-values of features on all 13


survey questions.

Chapter 5

Main research: Detecting 'interest' while reading using biometric sensors

5.1 Method

This section describes the main topic of research for this thesis, which focuses on investigating whether the reading interest of a person can be predicted from neurophysiological biometric data [36].

5.1.1 Gaze event detection

The reading behavior of the eye is composed of three basic metrics: fixations, saccades and blinks, as depicted in Figure 5.1. A fixation occurs when the gaze falls on something of interest in the screen area and usually lasts for about 100-150 ms. The rapid movement of the eye between fixations is called a saccade. Blinks and pupil diameters can also be obtained from the eye tracker. Figure 5.2 shows an example of the gaze events while reading an article.

FIGURE 5.1: An example of a news article and eye gaze (circle: fixation, line: saccade) on the document.

As part of preprocessing, we filter raw eye movements to get fixations and saccades on the basis of the approach proposed by Buscher et al. [34]. The midpoint of the left and the right gaze coordinates is taken as the gaze point, but only if both values are non-zero; otherwise the (left or right) non-zero coordinate is taken as the gaze point. A fixation typically consists of more than six successive gaze locations grouped together, which makes the minimum fixation duration 100 ms, as mentioned earlier, for data recorded at a rate of 60 Hz. The successive gaze points forming a new fixation should fit inside a threshold rectangle of 30x30 pixels. All further gaze points falling inside a 50x50 pixel rectangle are considered to belong to the current fixation; this is done so that noise and small eye movements are tolerated. If a gaze point does not fall in the rectangle, it is either an outlier or the start of a new fixation, which then merges with six further points. The fixation is considered to have ended if at least six successive gaze points cannot be merged (see Figure 5.3).

The movement or transition from one fixation to the next is recorded as a saccade. Saccades are further divided into forward saccades and backward saccades.
FIGURE 5.2: Gaze events calculated by the sensor signal from an eye tracker.

The x-coordinates of successive fixations denote the direction of the saccade. Forward saccades imply regular reading behavior, while backward saccades can be either regressions or line breaks. Regressions are backward eye movements which allow re-reading of the text [37]. Line breaks are separated from regressions by analyzing the length of the backward saccade: if the length is equal to or greater than the length of a line, it is categorized as a line break (observed as peaks in the saccade length in Figure 5.2). Also, regressions move in the direction opposite to the reading direction and mostly upward, while line breaks move downward to facilitate reading of the next line.

FIGURE 5.3: Fixation detection by Buscher et al.

Further, we use the pupil diameter obtained from the raw gaze data, which is the average of the left and right pupil diameters if both are non-zero. Another characteristic eye behavior we computed was blinks. The average duration of a human eye blink is 100-400 ms. Hence, 6-24 consecutive zeroes in the left and the right gaze coordinates are considered a blink in our approach. The average latency between two consecutive blinks is one second, and blinks detected within a shorter interval are considered noise.
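A simplified sketch of the detection rules described above (six samples at 60 Hz, a 30x30 px rectangle to start a fixation, 50x50 px to extend it, and 6-24 consecutive zero samples for a blink); the actual implementation of Buscher et al. handles outliers and merging in more detail:

```python
def detect_fixations(gaze_points, start_box=30, extend_box=50, min_points=6):
    """Group successive gaze points (x, y) into fixations.

    A candidate fixation starts when min_points successive points fall
    within a start_box x start_box rectangle around its first point;
    afterwards, points within an extend_box x extend_box rectangle are
    merged into the same fixation. Returns a list of point lists.
    """
    fixations, candidate = [], []
    for x, y in gaze_points:
        if not candidate:
            candidate = [(x, y)]
            continue
        x0, y0 = candidate[0]
        half = (start_box if len(candidate) < min_points else extend_box) / 2
        if abs(x - x0) <= half and abs(y - y0) <= half:
            candidate.append((x, y))
        else:
            if len(candidate) >= min_points:
                fixations.append(candidate)
            candidate = [(x, y)]
    if len(candidate) >= min_points:
        fixations.append(candidate)
    return fixations


def count_blinks(left_gaze, right_gaze, min_len=6, max_len=24):
    """Count runs of 6-24 consecutive samples where both eyes report (0, 0),
    i.e. roughly 100-400 ms of missing gaze at 60 Hz."""
    blinks, run = 0, 0
    for l, r in zip(left_gaze, right_gaze):
        if l == (0, 0) and r == (0, 0):
            run += 1
        else:
            if min_len <= run <= max_len:
                blinks += 1
            run = 0
    if min_len <= run <= max_len:
        blinks += 1
    return blinks
```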

5.1.2 Features from Eye Gaze data

On the basis of the gaze events, we extracted seventeen features for further analysis, as listed in Table 5.1. Fixation duration is the time taken for each fixation. Forward saccade length is the distance between the two consecutive fixations that make up the saccade. Forward saccade speed and regression speed are the length of the saccade divided by the time taken for the saccade. Regression ratio describes the fraction of regressions out of the total number of saccades (i.e., (12) / ((11) + (12))). Regression length is the distance between the two fixation coordinates that make up the regression. Pupil diameter is the diameter of the right pupil obtained from the raw gaze data, provided the x and y gaze coordinates are non-zero, and is otherwise taken as zero. Since pupil diameter is user and environment dependent, it was taken as a value relative to the pupil diameter during the questionnaire, which served as a baseline. Blink frequency is the number of blinks divided by the total time taken by the reader (for each document). Blink interval is the time lag between two consecutive blinks. The mean (5.1) and standard deviation (5.3) are calculated as follows.

TABLE 5.1: List of features from eye data. 'SD' is short for standard deviation.

No. Feature
1-2 {mean, SD} of fixation duration
3-4 {mean, SD} of forward saccade length
5-6 {mean, SD} of forward saccade speed
7-8 {mean, SD} of regression length
9-10 {mean, SD} of regression speed
11-12 count of {forward saccades, regressions}
13 regression ratio
14-15 {mean, SD} of pupil diameter
16 blink frequency
17 SD of blink interval

\mathrm{Mean}(X) = \frac{1}{N} \sum_{i=0}^{N-1} x_i \qquad (5.1)

\mathrm{Var}(X) = \frac{1}{N-1} \sum_{i=0}^{N-1} \left( x_i - \mathrm{Mean}(X) \right)^2 \qquad (5.2)

\mathrm{SD}(X) = \sqrt{\mathrm{Var}(X)} \qquad (5.3)
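As an illustration of features 11-17, the following sketch derives the forward/backward split, the regression ratio and the blink statistics from already detected events; the data layout and the line-break threshold are assumptions made for the example:

```python
import statistics as stats

def saccade_and_blink_features(fixation_xs, blink_times, total_time_s, line_length_px):
    """fixation_xs: x-coordinates of successive fixations (pixels).
    blink_times: blink onset times in seconds.
    total_time_s: total reading time for the document.
    line_length_px: approximate line length; backward jumps at least this
    long are treated as line breaks rather than regressions (assumption)."""
    forward, regressions = 0, 0
    for x0, x1 in zip(fixation_xs, fixation_xs[1:]):
        dx = x1 - x0
        if dx >= 0:
            forward += 1                      # forward saccade
        elif abs(dx) < line_length_px:
            regressions += 1                  # short backward jump: regression
        # long backward jumps are treated as line breaks and ignored here
    total = forward + regressions
    intervals = [b - a for a, b in zip(blink_times, blink_times[1:])]
    return {
        "regression_ratio": regressions / total if total else 0.0,
        "blink_frequency": len(blink_times) / total_time_s,
        "blink_interval_sd": stats.stdev(intervals) if len(intervals) > 1 else 0.0,
    }
```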

5.1.3 Features from EDA

EDA processing is done using the cvxEDA algorithm, which represents the phasic responses as the output of a linear time-invariant system driven by a sparse non-negative driver signal [38]. The model assumes that the observed EDA is the sum of the phasic activity, a slow tonic component, and an additive independent and identically distributed zero-average Gaussian noise term which incorporates model prediction errors as well as measurement errors and artifacts. This model is physiologically inspired and fully explains EDA through a rigorous methodology based on Bayesian statistics, mathematical convex optimization and sparsity. The algorithm was evaluated in three different experimental sessions to test its robustness to noise, its ability to separate and identify stimulus inputs, and its capability of properly describing the activity of the autonomic nervous system in response to strong affective stimulation. Once the processing is done, the EDA signal is decomposed into time-domain and frequency-domain features in order to assess sympathetic system activity.

FIGURE 5.4: EDA signal and its components - phasic and tonic.
5.1. Method 31

The parameters retrieved from the algorithm are the phasic component, the sparse
SMNA driver of the phasic component, the tonic component, the coefficients of the
tonic spline, the offset and slope of the linear drift term, the model residuals, and
the value of the objective function being minimized. The features extracted for
classification are (1) the slope of the tonic part of the signal, for which the slope of
the line of best fit (linear regression) was used, (2) EPC, the sum of all positive EDA
changes, (3) the minimum peak amplitude of the phasic signal, (4) the maximum peak
amplitude of the phasic signal, (5) the mean amplitude of the phasic signal and (6) the
number of phasic responses [39].
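A minimal sketch of this feature extraction is shown below. It assumes the reference Python implementation of cvxEDA is importable and returns the seven outputs listed above as NumPy arrays; the module name and the peak-detection settings are illustrative assumptions, not the implementation used in this thesis.

import numpy as np
from scipy.stats import linregress
from scipy.signal import find_peaks
# Assumption: the reference Python implementation of cvxEDA is available as a module
# named cvxEDA exposing cvxEDA(y, delta) -> (phasic, driver, tonic, spline_coeffs,
# drift, residuals, objective), each as a NumPy array.
from cvxEDA import cvxEDA

def eda_features(eda, fs=4.0):
    """Decompose an E4 EDA recording (sampled at 4 Hz) and compute features (1)-(6)."""
    y = (eda - eda.mean()) / eda.std()                 # z-normalised signal before decomposition
    phasic, driver, tonic, _, _, _, _ = cvxEDA(y, 1.0 / fs)

    t = np.arange(len(eda)) / fs
    tonic_slope = linregress(t, tonic).slope           # (1) slope of the tonic line of best fit

    diffs = np.diff(eda)
    epc = diffs[diffs > 0].sum()                       # (2) EPC: sum of all positive EDA changes

    peaks, props = find_peaks(phasic, height=0)        # phasic responses (peaks above zero)
    heights = props["peak_heights"] if len(peaks) else np.zeros(1)
    return {
        "tonic_slope": tonic_slope,
        "epc": epc,
        "min_phasic_peak": heights.min(),              # (3)
        "max_phasic_peak": heights.max(),              # (4)
        "mean_phasic_amplitude": heights.mean(),       # (5)
        "n_phasic_responses": len(peaks),              # (6)
    }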

5.1.4 Features from BVP

The Empatica developer SDK includes a sample application which calculates IBI
values from the wristband. This method was accurate when the wrist was held com-
pletely still, but it was not very reliable, since even the smallest movement artifacts
resulted in no IBI data at all. Hence, the blood volume pulse data was used to derive
HR and IBI. The BVP signal consists of beats, each characterized by a QRS-like complex
whose R component is the peak relevant for computing HR and IBI. To detect peaks in
the data, we determine a region of interest for each R-peak and separate them from one
another. A moving average is calculated for this purpose, so that each region of interest
lies above it. The highest point within a region of interest is taken as the position of
the peak in the BVP signal (see Figure 5.5). The time interval between consecutive
R-peaks is the inter-beat interval, and 60/IBI (with the IBI in seconds) is the heart rate
in beats per minute. Although the sensor suppresses noise on its own, manual distortions
are not handled; such noise is created by detaching the device or by movements like
sneezing. To further smooth the signal and remove unwanted frequencies, a Butterworth
low-pass filter was applied before peak detection. It acts as a frequency gate, cutting
off frequencies above a certain threshold, although it is a soft filter and will not fully
suppress frequencies quite close to the threshold. We used a cutoff frequency of 2.5 Hz for
this research.
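The pipeline can be sketched as follows, assuming the E4 BVP stream is sampled at 64 Hz; the filter order and the moving-average window size are illustrative choices that are not specified in the text.

import numpy as np
from scipy.signal import butter, filtfilt

def ibi_and_hr(bvp, fs=64.0, cutoff=2.5):
    """Low-pass filter the BVP, threshold it with a moving average, take the maximum
    of each region of interest as an R-peak, and derive IBI (s) and HR (bpm)."""
    # 2nd-order Butterworth low-pass filter with a 2.5 Hz cutoff (filter order is an assumption)
    b, a = butter(2, cutoff / (fs / 2.0), btype="low")
    filtered = filtfilt(b, a, bvp)

    # Moving average used as the threshold; regions of interest lie above it.
    window = int(0.75 * fs)                                        # window length is an assumption
    mov_avg = np.convolve(filtered, np.ones(window) / window, mode="same")

    peaks, region = [], []
    for i, (x, m) in enumerate(zip(filtered, mov_avg)):
        if x > m:                                                  # inside a region of interest
            region.append(i)
        elif region:                                               # region just ended: keep its maximum
            peaks.append(region[int(np.argmax(filtered[region]))])
            region = []

    ibi = np.diff(peaks) / fs                                      # inter-beat intervals in seconds
    hr = 60.0 / ibi                                                # heart rate in beats per minute
    return np.array(peaks), ibi, hr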
FIGURE 5.5: Blood volume pulse signal after peak detection.

Features relevant to HR, IBI and BVP were then extracted from the data. The features
used were (7-8) the mean and standard deviation of BVP, (9-10) the mean and standard
deviation of HR, (11-12) the difference in mean and standard deviation of HR amplitude
between task and baseline, (13) the standard deviation of IBI normalized by the baseline
(data recorded during the 5 seconds before the start), and (14) RMSSD, the square root
of the mean of the squared successive differences between IBIs.
Features from the skin temperature recorded by the wearable were also included, namely
(15-16) the mean and standard deviation of skin temperature, (17) the difference between
the mean temperature during the task and the baseline, and (18) the slope of the
temperature (of the line of regression).
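Given the IBI, HR, BVP and temperature series for the task and a baseline segment, these features could be computed as in the sketch below. Interpreting "normalized by baseline" as division by the baseline value is an assumption on our part, as the exact normalization is not spelled out; all names are illustrative.

import numpy as np
from scipy.stats import linregress

def hr_ibi_temp_features(bvp, hr, ibi, temp, hr_base, ibi_base, temp_base, fs_temp=4.0):
    """Sketch of features (7)-(18); *_base arrays hold the baseline segment
    (recorded 5 seconds before the start)."""
    t = np.arange(len(temp)) / fs_temp
    return {
        "mean_bvp": bvp.mean(), "sd_bvp": bvp.std(),                    # (7-8)
        "mean_hr": hr.mean(), "sd_hr": hr.std(),                        # (9-10)
        "diff_mean_hr": hr.mean() - hr_base.mean(),                     # (11)
        "diff_sd_hr": hr.std() - hr_base.std(),                         # (12)
        "sd_ibi_norm": ibi.std() / ibi_base.std(),                      # (13) baseline normalization is an assumption
        "rmssd": np.sqrt(np.mean(np.diff(ibi) ** 2)),                   # (14)
        "mean_temp": temp.mean(), "sd_temp": temp.std(),                # (15-16)
        "diff_mean_temp": temp.mean() - temp_base.mean(),               # (17)
        "temp_slope": linregress(t, temp).slope,                        # (18)
    }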

5.1.5 Classification

The aforementioned features were used to classify the eye tracking data and the E4
data into two and four levels of interest. We utilize both Support Vector Machine
(SVM) and Random Forest classifiers and evaluate their classification accuracy.

TABLE 5.2: List of features from E4 data

No. feature
1 Slope of the tonic component
2 EPC - sum of all positive EDA changes
3, 4 {Min, max} peak amplitude in the phasic component
5 Mean amplitude of the phasic component
6 Number of phasic responses
7, 8 {Mean,SD} of BVP
9, 10 {Mean, SD} of HR
11, 12 Difference in {mean, SD} of HR amplitude during task and baseline
13 SD of IBI normalized by baseline
14 RMSSD between IBI normalized by baseline
15, 16 {Mean, SD} of temperature
17 Mean difference of temperature with baseline
18 Slope of temperature


Support Vector Classifier

SVM categorizes the test data with an optimal separating hyperplane which is defined
by the labeled training data. A hyperplane is a decision boundary that best separates
the input space. To give the hyperplane some room to divide the space, hyperparameters
such as C and gamma are used. C is a tuning parameter that decides how closely the
algorithm fits the training data: larger values make it more sensitive to individual
training samples, while smaller values yield a smoother decision boundary. SVM is often
implemented with a kernel, which implicitly transforms the input space into higher
dimensions. Here we use a radial basis function (RBF) kernel, whose parameter gamma
controls how far the influence of a single training example reaches and thus how
complex the regions in the feature space become.
Hyper-parameters are not optimized by the SVC or the Random Forest classifier them-
selves. To tune them, GridSearchCV was used, which performs an exhaustive search
over a grid of candidate parameters. A list of parameters was passed as an argument
together with each classifier. We also ran it on the different iterations of the cross
validation (explained later) to find which parameters were selected most often.
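In scikit-learn this amounts to the following sketch; the parameter grid and the dummy data shapes are hypothetical, and only the 3-fold grid search and the RBF SVC mirror the setup described here.

import numpy as np
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# Dummy stand-ins for the real feature matrix (one row per recording) and interest labels (1-4).
X = np.random.rand(200, 17)
y = np.random.randint(1, 5, size=200)

# Hypothetical parameter grid; the grid actually searched is not listed in the thesis.
param_grid = {"C": [1, 2, 4, 8, 16], "gamma": [0.001, 0.01, 0.1, 1.0]}

# 3-fold grid-search cross validation over an RBF SVC, as described in this section.
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=3)
search.fit(X, y)
print(search.best_params_, search.best_score_)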

FIGURE 5.6: Cross validation methods. Color notation: red - a single recording,
yellow - data specific to one document, green - data specific to one participant.
'p_i d_j' denotes the recording of the i-th participant's j-th document.

Random Forest Classifier

A Random Forest classifier randomly selects subsets of the training data, builds a set
of decision trees on them, and uses the votes of the trees to determine the category of
the test data. Selecting its hyper parameters (the number of estimators, the maximum
depth of the trees, the maximum number of features to consider, the minimum number of
samples required to split an internal node, and the minimum number of samples required
at a leaf node) is crucial and non-trivial; a configuration sketch is given below.
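The following sketch instantiates such a classifier with the tuned values reported later in Section 5.3; the dummy data shapes are illustrative.

import numpy as np
from sklearn.ensemble import RandomForestClassifier

X = np.random.rand(200, 17)               # dummy gaze feature matrix
y = np.random.randint(1, 5, size=200)     # dummy interest labels (1-4)

# Hyper-parameter values taken from the tuned configuration reported in Section 5.3.
rf = RandomForestClassifier(
    n_estimators=90,        # number of trees
    max_depth=10,           # maximum depth of each tree
    max_features="sqrt",    # square root of the number of features considered at each split
    min_samples_split=17,   # minimum samples required to split an internal node
    min_samples_leaf=3,     # minimum samples required at a leaf node
)
rf.fit(X, y)
print(rf.feature_importances_)            # basis for the importance rankings in Tables 5.5 and 5.7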

Cross Validation

In our approach, we used 3-fold grid-search cross validation as the hyper parameter
optimization technique. It searches exhaustively through a manually defined set of
parameters and finds those that achieve the highest score in the validation procedure.
For each classification task, we separate the data into a training part for parameter
optimization and a validation/evaluation part.
We followed three different approaches to separate the training and test data before
classification. Leave-one-recording-out uses each recording (the data of one participant
on one document, see Figure 5.6) as test data and the rest as training data; the average
accuracy over all cases is reported as the classification accuracy. Similarly, the
leave-one-document-out approach excludes the data of one document completely from the
training set and uses it for testing. The leave-one-participant-out approach uses the
data from all participants except one for training and the data from the left-out
participant for testing. This approach is particularly relevant because, in a realistic
scenario, the system will not have had training data from a new user.
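All three schemes can be expressed with scikit-learn's LeaveOneGroupOut splitter by changing the group labels; the data shapes and group layout below are dummy placeholders, while C and gamma are the tuned values reported in Section 5.3.

import numpy as np
from sklearn.model_selection import LeaveOneGroupOut, cross_val_score
from sklearn.svm import SVC

# Dummy layout: 13 participants x 18 documents with 17 gaze features per recording.
X = np.random.rand(13 * 18, 17)
y = np.random.randint(1, 5, size=13 * 18)
participants = np.repeat(np.arange(13), 18)        # participant ID per recording
documents = np.tile(np.arange(18), 13)             # document ID per recording

clf = SVC(kernel="rbf", C=8, gamma=0.012)          # tuned values from Section 5.3
logo = LeaveOneGroupOut()
lopo = cross_val_score(clf, X, y, groups=participants, cv=logo)   # leave-one-participant-out
lodo = cross_val_score(clf, X, y, groups=documents, cv=logo)      # leave-one-document-out
# Leave-one-recording-out corresponds to leave-one-out over individual recordings.
print(lopo.mean(), lodo.mean())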

5.2 Experimental design

In order to evaluate our proposed interest detection method, we conducted an exper-
iment. This section describes the experimental design and the analysis results.
Figure 5.7 shows an overview of the experimental setup. We used an SMI REDn
Scientific 60 Hz remote eye tracker set up alongside a normal desktop computer to
record gaze data. The eye tracker was unobtrusive and placed at a distance of around
30 cm from the participant. Heart rate and skin conductance were also measured during
the experiment with an E4 wristband, which the participant was asked to wear at the
beginning of the experiment.

FIGURE 5.7: An overview of the experimental setup. A participant is reading a news article on a display.

Newspaper articles suited the purpose of the experiment better than any other kind of
text, and in order to capture the reader's interest we selected a wide range of topics
such as technology, politics, sports and cooking. Thirteen university students (mean
age: 25, std: 3) participated in the experiment, and each of them was asked to read
eighteen newspaper articles of 403-649 words each (mean: 555, std: 70), as shown in
Figure 5.1.

FIGURE 5.8: Screen 1 : Start screen with an eye-pleasing background including
step-wise instructions. a) Press power on E4 wristband. b) Calibrate eye tracker.
c) Start reading.

FIGURE 5.9: Screen 2 : 13-point calibration.

Once the experiment started, the participant went through a sequence of steps as shown
in Figures 5.8 through 5.12. The start screen (Figure 5.8) allowed the participant to
rest their eyes and also reminded them of the steps to take before reading the article,
namely calibration (see Figure 5.9) and pressing the wristband power button to log the
starting timestamp. The next step was to read the article, as in Figure 5.10, which
takes about 3-4 minutes. After reading each document, participants answered six
questions: (1) the degree to which the reader comprehended the article; (2) the level
of interest they had in the article; (3) how difficult the article was for the reader;
(4) and (5) their attention and tiredness, respectively. The answers were used as
ground truth (on a scale from 1 to 4, where for question (2), 1 indicated "No interest"
and 4 indicated "High interest"). The last screen showed one question on the content
of the article (i.e., objective comprehension) with four answers to choose from.

FIGURE 5.10: Screen 3 : Reading the newspaper article.

FIGURE 5.11: Screen 5 : Subjective question on interest with four answer options (0%, 33%, 66% and 100%).
FIGURE 5.12: Screen 9 : Objective question on comprehension with four answer options based on the newspaper article.

The recordings were done in two one-hour sessions in order to avoid eye fatigue, and
they were conducted in a controlled environment. The tracker and the desktop were kept
in a stable position. The lighting of the room was set so as not to affect the gaze
data (pupil diameter). A 13-point calibration was performed for every document to avoid
errors or shifts in the gaze points. The experiment was supervised by at least one
person, who helped the participant and controlled the settings.

5.3 Results

                          leave-one-        leave-one-      leave-one-
                          participant-out   document-out    recording-out
1. reading speed                25                32              35
2. subj. comprehension          25                32              35
3. obj. comprehension           34                34              34
4. eye movements                32                47              50
combination 1 and 4             36                47              46
combination 2 and 4             50                66              64
combination 3 and 4             35                47              46
combination all                 49                62              62

TABLE 5.3: Classification accuracy using the SVM classifier on eye-related and other features [%]

                          leave-one-        leave-one-      leave-one-
                          participant-out   document-out    recording-out
1. reading speed                29                29              27
2. subj. comprehension          56                58              56
3. obj. comprehension           38                38              38
4. eye movements                33                45              47
combination 1 and 4             30                44              45
combination 2 and 4             46                60              58
combination 3 and 4             32                43              43
combination all                 42                58              58

TABLE 5.4: Classification accuracy using Random Forest on eye-related and other features [%]

Table 5.3 reports the classification accuracies using SVM with a Radial Basis Function
kernel; the optimal hyper parameters are C: 8, gamma: 0.012. Table 5.4 shows the
classification accuracies using Random Forest with the hyper parameters: number of
estimators: 90, maximum depth of the tree: 10, maximum features to consider: square
root of the number of all features, minimum number of samples required to split an
internal node: 17, minimum number of samples required to be at a leaf node: 3. We also
applied feature reduction techniques such as PCA and LDA, but found no notable
improvement in the classification. The data, when visualized in 2D and 3D, appeared
randomly distributed, indicating that the projected components captured little
class-relevant variance. After classification we obtained 50% accuracy with the SVM
classifier for the 4-class task and 75% for binary classification.
The distribution of the predicted classes, shown as confusion matrices in Figure 5.13
and Figure 5.15, is quite promising. The overall correlation between the features used
and the labels is nevertheless quite small. However, when individual participants were
considered, the correlations between features and interest shown in Figure 5.16
revealed (1) a negative correlation between the mean/standard deviation of fixation
duration and the interest labels, (2) a small positive correlation for the standard
deviation of regression velocity, and (3) a small positive correlation for the number
of forward saccades. These effects were observed for only about half of the
participants or fewer; the rest showed no or only very slight correlations. Feature
importances were also computed using the Random Forest classifier, as shown in
Table 5.5. Mean forward-saccade velocity, mean fixation duration and mean regression
velocity were found to have the highest importance (in that order), and the number of
regressions had the lowest. Since feature importances cannot be estimated from an SVM
classifier with an RBF kernel, we used a backward stepwise method to eliminate the
least important features, i.e., those that led to misclassification. Following this
method we eliminated the SD of forward saccade velocity, the SD of forward saccade
length and blink frequency.

FIGURE 5.13: The confusion matrix of the 4-class SVM classifier with only gaze features and leave-one-recording-out training.

FIGURE 5.14: The confusion matrix of the binary SVM classifier with only gaze features and leave-one-recording-out training.

FIGURE 5.15: A confusion matrix of the Random Forest classifier with gaze features and leave-one-recording-out training.

FIGURE 5.16: Pearson correlations between gaze features and interest of each participant.
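The backward stepwise elimination can be sketched as a greedy loop that repeatedly removes the feature whose removal yields the best cross-validated accuracy; the stopping rule and the dummy data below are illustrative assumptions rather than the exact procedure used in this thesis.

import numpy as np
from sklearn.model_selection import cross_val_score
from sklearn.svm import SVC

def backward_elimination(X, y, n_drop=3, cv=3):
    """Greedy backward stepwise elimination: repeatedly drop the feature whose
    removal gives the best cross-validated accuracy."""
    kept = list(range(X.shape[1]))
    for _ in range(n_drop):
        candidates = []
        for f in kept:
            cols = [c for c in kept if c != f]
            acc = cross_val_score(SVC(kernel="rbf", C=8, gamma=0.012),
                                  X[:, cols], y, cv=cv).mean()
            candidates.append((acc, f))
        best_acc, worst_feature = max(candidates)     # removing this feature helps most (or hurts least)
        kept.remove(worst_feature)
    return kept                                       # indices of the retained features

# Dummy data shaped like the gaze feature matrix (200 recordings x 17 features).
X = np.random.rand(200, 17)
y = np.random.randint(1, 5, size=200)
print(backward_elimination(X, y))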

Feature Importance

subjective comprehension 0.2760


mean forward saccade speed 0.1019
mean fixation duration 0.0961
mean regression speed 0.0524
count forward saccade 0.0471
regression ratio 0.0413
SD fixation duration 0.0408
SD regression speed 0.0386
SD blink interval 0.0368
SD forward saccade length 0.0365
objective comprehension 0.0337
reading speed 0.0336
mean regression length 0.0305
mean forward saccade length 0.0213
blink frequency 0.0213
SD regression length 0.0205
SD forward saccade speed 0.0201
mean pupil size 0.0191
count regression 0.0163
SD pupil size 0.0161

TABLE 5.5: Importance of eye-related features in the Random Forest using all features

For further analysis, we included non-eye-related measures such as reading speed and
the level of subjective/objective comprehension of the user. The accuracies obtained
with the SVM and Random Forest classifiers are tabulated in Table 5.3 and Table 5.4.
Figure 5.16 shows the Pearson correlation of each feature with the level of interest
for every participant. The level of subjective comprehension can be seen to have a very
strong effect on a person's level of reading interest (denoted by red for high and blue
for low). We obtained an accuracy of 60% when subjective comprehension was used for
classification (Table 5.3), and an accuracy of 61% for leave-one-document-out when all
measures (reading speed, subjective and objective comprehension, and eye gaze features)
were used (Table 5.4). Using only subjective comprehension and eye-related features, we
achieved an accuracy of 62%.

FIGURE 5.17: Confusion matrix with 50% accuracy for the four-class SVC.

        2-class   4-class
LOPO       52        37
LODO       63        46
LORO       68        50

TABLE 5.6: SVM classification accuracy of E4-related features [%]

The features from the E4 data were used for classification in the same way as for the
eye gaze data. Table 5.6 shows the classification accuracies using the SVM classifier
with hyper parameters C: 4, gamma: 0.18 and a Radial Basis Function kernel.
We eliminated features such as the maximum phasic peak, the mean BVP and the standard
deviation of temperature, owing to their role in misclassification of the interest
levels. An accuracy of 50% was achieved with the 4-class SVM classifier and 68% with
binary classification, using cross validation and the leave-one-recording-out (LORO)
approach (see Figure 5.17 and Figure 5.18). Feature importances were calculated using
the Random Forest classifier and are shown in Table 5.7, although the classification
results with RF themselves were not very good (see the confusion matrix in Figure 5.19).

FIGURE 5.18: Confusion matrix with 68% accuracy for the binary SVC (1 and 2 vs. 3 and 4).

FIGURE 5.19: Confusion matrix with 38% accuracy for the four-class Random Forest.
SVM classification was also performed on the eye and the E4 data together, aiming for
higher accuracy, but the accuracy achieved was only around 50%, as shown in the
corresponding confusion matrix. Two approaches were taken for this purpose: (1)
combining all features of the E4 and the eye data and performing classification as
described earlier, and (2) using two separate classifiers for the E4 and the eye
features to obtain prediction probabilities for the four interest levels and then
averaging them to determine the final interest level.
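A minimal sketch of the second, probability-averaging approach is given below; the data shapes are dummy placeholders and, for brevity, the probabilities are computed on the training data rather than on the held-out splits described in Section 5.1.5.

import numpy as np
from sklearn.svm import SVC

# Dummy stand-ins: 17 gaze features and 18 E4 features per recording, labels 1-4.
X_eye = np.random.rand(200, 17)
X_e4 = np.random.rand(200, 18)
y = np.random.randint(1, 5, size=200)

# One probabilistic SVC per modality, using the per-modality hyper parameters reported above.
clf_eye = SVC(kernel="rbf", C=8, gamma=0.012, probability=True).fit(X_eye, y)
clf_e4 = SVC(kernel="rbf", C=4, gamma=0.18, probability=True).fit(X_e4, y)

# Average the class probabilities and pick the most likely interest level.
proba = (clf_eye.predict_proba(X_eye) + clf_e4.predict_proba(X_e4)) / 2.0
prediction = clf_eye.classes_[np.argmax(proba, axis=1)]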

Feature Importance

Mean HR 0.0879
Mean Temperature 0.0721
Number of phasic responses 0.0708
EPC - sum of all positive EDA changes 0.0639
SD of IBI normalized by baseline 0.0598
Difference in SD of HR amplitude during task and baseline 0.0550
Slope of temperature 0.0545
Slope of the tonic component 0.0542
Mean difference of temperature with baseline 0.0537
Mean amplitude of the phasic component 0.0484
Sum of all Phasic Amplitudes 0.0435
Maximum peak of Phasic component 0.0434
RMSSD between IBI normalized by baseline 0.0431
SD of BVP 0.0425
Mean of BVP 0.0413
SD of HR 0.0407
SD of Temperature 0.0366
Difference in Mean of HR amplitude during task and baseline 0.0305
Minimum peak of Phasic component 0.0291
Area under curve of Phasic component 0.0289

TABLE 5.7: Importance of physiological features with the Random Forest classifier (values between 0 and 1)


Chapter 6

Discussion

For the first experiment, the temporal resolution of temperature sensing is an issue.
Although the change in the nose temperature is an effective feature for understanding a
student's effort, it takes a long time to be observed (see Figure 4.2). In the
application scenario, it can be used for measurement at the level of each learning unit
or page, but it seems difficult to apply our measurements to smaller parts such as
individual paragraphs, images, or sentences. We need to investigate how far the time
resolution can be reduced. We conducted two pilot experiments before the final
experiment using the Tobii eye tracker, but found calibration errors and missing data
which rendered the data useless. Hence, we used the SMI eye tracker for the final
experiment.
The results of the main research topic are promising enough to motivate further
research in this field. Although the SVM and RF classification accuracies were not as
high as expected, this research shed light on using a remote eye tracker and a
wristband for cognitive state measurement. The initial analysis and pre-processing of
the noisy raw gaze data to extract useful features was challenging but successful. We
also found that mean forward-saccade velocity, mean fixation duration, mean regression
velocity, mean heart rate, the number of phasic peaks and the positive EDA changes play
a vital role in predicting a reader's interest, and that an SVM classifier with an RBF
kernel is best suited to classify the gaze-based features. We expected better accuracy
when the features from both sensors were combined, but the results were disappointing.
A higher correlation of the features with the labels was also expected, but the
observed correlations were quite small.

FIGURE 6.1: Fixations are completely shifted to one direction due to a calibration error before reading.

The correlation varied considerably from participant to participant for all features
except the ones mentioned earlier. This led us to believe that physiological
predictions are user-dependent and not just document-dependent. For example, pupil
diameter may not undergo the same changes in every user during the same physiological
process; these features depend on the user, their cognitive state and their reading
behavior. We also found that the collection of ground truth related to interest and
understanding is highly prone to human error and individual behavior. Records
pertaining to document number 16 were removed because only one of the participants had
correctly answered the objective question, which suggests that participants did not
comprehend the document as expected. We also removed the data of two participants from
the train-test data due to errors in the calibration of the eye tracker. There were
cases where the fixations were completely shifted to one direction (see Figure 6.1) or
simply not detected by the eye tracker (see Figure 6.2). We also performed micro-level
analysis on windowed data for each document, but this was unsuccessful because the
ground truth pertains to each document as a whole.
FIGURE 6.2: Fixations are not detected in the latter part of the article.

A person's subjective comprehension correlates strongly with the level of interest
they can have while reading, which makes sense, since interest can only arise if the
person truly understands the text. Using eye measures during reading is an added
advantage for understanding this and should be explored in depth, since they can
realistically be collected during reading without reader intervention. An
investigation of human behaviour and the effects of psychological factors on our
learning ability was carried out before this research, but it would be bold to claim
in-depth knowledge of the topic; this research can nevertheless be a starting point
for better findings in this field and for making learning easier. We also focused on
using unobtrusive and pervasive sensors, so that learning can take place in any
environment, but better sensors would monitor biometric activity more reliably and
could yield better accuracy.
Technical improvements that could be made include better filtering, parameter
optimization and alternative classification algorithms. Feature selection could also be
improved, although the features selected in this project were chosen after exhaustive
research. It cannot be denied that the data is biased by participant behaviour, age,
thought processes, attention level, time of day, events of the day, etc. The small
number of participants limited the amount of training data and hence constrained the
algorithms that could be used for classification. Sensor-related factors that affected the

data were noise, bad calibration, missing data due to shielded eyes or wrist movement,
the tightness of the wristband, which can easily flatten the pulse wave or prevent its
detection entirely, and nervousness, which can also result in an undetected pulse. A
higher sampling frequency of the EDA signal would have helped in extracting the phasic
component.

Chapter 7

Conclusion and Future work

The idea of using a system with internal, implicit feedback could be much more useful
for a student than using an external affective agent, as the latter might disrupt the
user's concentration. This thesis demonstrated cognitive state analysis using an eye
tracker and a wristband. To assess the feasibility of this research topic, we conducted
an experiment in which 12 high school students read and solved learning materials in
physics, and we investigated the relation between the sensor signals and their
cognitive states (from surveys). The change in pupil diameter was highly correlated
with interest. Although the temporal resolution was not high enough for a real-time
application scenario, the change in nose temperature reflected their effort in
reading/solving the learning materials. This work investigated "when" and "how"
additional information should be displayed in an anticipating textbook based on the
measurement of cognitive states.

                              LOPO   LODO   LORO
EYE - SVM, 4-class              32     47     50
EYE - SVM, Binary               62     73     75
E4 - SVM, 4-class               37     46     50
E4 - SVM, Binary                52     63     68
EYE and E4 - SVM, 4-class       41     47     50
EYE and E4 - SVM, Binary        68     71     71

TABLE 7.1: SVM classification results [%].

The main research, based on the experiment with 15 students reading newspaper articles,
demonstrated that eye measures from a remote eye tracker and physiological data from a
wristband can be used to predict a reader's interest. We extracted 17 features from the
raw gaze data obtained from the tracker and used them to classify the data into
different levels of interest. Although the correlations of the features with the
interest labels were not as high as expected, the means of forward-saccade velocity,
fixation duration and regression velocity were significant. The same analysis was done
on the wristband data as well, with 16 features. Both resulted in an accuracy of around
50% in predicting the user's level of interest when split into 4 classes, and up to
around 75% accuracy when predicting boredom versus interest in a binary classification.
This work can be extended to include data from other sensors, such as an infrared
thermal camera to measure nose temperature or a camera to recognize facial expressions.
Heat loss from the user's body indicates drowsiness and could also be monitored. A
device created at MIT's Computer Science and Artificial Intelligence Laboratory emits
radio signals that reflect off a person's front and back; by measuring heartbeat and
breathing, the device can accurately detect emotional reactions, making it another
option for contactless sensing. Data acquisition could also be improved by controlling
the environment to provide a stress-free or at-home experience for the reader. This
research can open the door to a one-teacher-for-every-student education system. It can
also motivate learning with internal feedback, which is not as annoying or distracting
as an external feedback system could be. A head-mounted eye tracker or a more accurate
wristband might have produced better results, but this thesis was rooted in using
sensors which do not trouble the user at all and which are easily available for further
research. Moreover, the wristband-based analysis can be used not only when reading from
a normal desktop display but also when reading from paper.
For future work, with better accuracy, the text could be personalized according to the
user reading the article, and better-suited content could be presented to keep the user
engaged. The augmentation and anticipation would be based on the user's internal
thought processes, state of mind, tiredness, and so on. This research could also be
extended to create a teaching assistant that gives the teacher in a classroom better
insight into the students and their understanding. Better-designed articles for an
improved human-computer interaction experience could also be a good outcome of this
research.

Bibliography

[1] S. Hidi. “Interest, reading, and learning: Theoretical and practical considera-
tions”. In: Educational Psychology Review 13.3 (2001), pp. 191–209.

[2] S. Squires. The effects of reading interest, reading purpose, and reading maturity on
reading comprehension of high school students. Baker University, 2014.

[3] G. E. Raney, S. J. Campbell, and J. C. Bovee. “Using eye movements to eval-


uate the cognitive processes involved in text comprehension”. In: Journal of
visualized experiments: JoVE 83 (2014).

[4] S.-N. Yang and G. W. McConkie. “Eye movements during reading: A theory
of saccade initiation times”. In: Vision research 41.25 (2001), pp. 3567–3585.

[5] K. Rayner, K. H. Chace, T. J. Slattery, and J. Ashby. “Eye movements as reflec-


tions of comprehension processes in reading”. In: Scientific studies of reading
10.3 (2006), pp. 241–255.

[6] L. Copeland, T. Gedeon, and S. Caldwell. “Framework for Dynamic Text Pre-
sentation in eLearning”. In: Procedia Computer Science 39 (2014), pp. 150–153.

[7] D. S. Rudmann, G. W. McConkie, and X. S. Zheng. “Eyetracking in cognitive


state detection for HCI”. In: Proceedings of the 5th international conference on Mul-
timodal interfaces. ACM. 2003, pp. 159–163.

[8] S. Chen, J. Epps, N. Ruiz, and F. Chen. “Eye activity as a measure of human
mental effort in HCI”. In: Proceedings of the 16th international conference on Intel-
ligent user interfaces. ACM. 2011, pp. 315–318.

[9] J. Zagermann, U. Pfeil, and H. Reiterer. “Measuring Cognitive Load using Eye
Tracking Technology in Visual Computing”. In: BELIV’16: Proceedings of the
Sixth Workshop on Beyond Time and Errors on Novel Evaluation Methods for Visu-
alization. 2016, pp. 78–85.

[10] V. M. G. Barrios, C. Gütl, A. M. Preis, K. Andrews, M. Pivec, F. Mödritscher,


and C. Trummer. “AdELE: A framework for adaptive e-learning through eye
tracking”. In: Proceedings of IKNOW (2004), pp. 609–616.

[11] M. Porta, S. Ricotti, and C. J. Perez. “Emotional e-learning through eye track-
ing”. In: Global Engineering Education Conference (EDUCON), 2012 IEEE. IEEE.
2012, pp. 1–6.

[12] K. Yoshimura, K. Kise, and K. Kunze. “The eye as the window of the language
ability: Estimation of English skills by analyzing eye movement while reading
documents”. In: Document Analysis and Recognition (ICDAR), 2015 13th Interna-
tional Conference on. IEEE. 2015, pp. 251–255.

[13] J. Frey, M. Daniel, J. Castet, M. Hachet, and F. Lotte. “Framework for


electroencephalography-based evaluation of user experience”. In: Proceedings
of the 2016 CHI Conference on Human Factors in Computing Systems. ACM. 2016,
pp. 2283–2294.

[14] T. Lan, A. Adami, D. Erdogmus, and M. Pavel. “Estimating cognitive state


using EEG signals”. In: Signal Processing Conference, 2005 13th European. IEEE.
2005, pp. 1–4.

[15] S. Ishimaru, K. Kunze, K. Kise, and M. Inami. “Position Paper: Brain Teasers -
Toward Wearable Computing That Engages Our Mind”. In: Proceedings of the
2014 ACM International Joint Conference on Pervasive and Ubiquitous Computing:
Adjunct Publication. UbiComp ’14 Adjunct. Seattle, Washington: ACM, 2014,
pp. 1405–1408.

[16] R. M. Sapolsky. Why Zebras Don’t Get Ulcers. W. Ross MacDonald School Re-
source Services Library, 2008.

[17] P. Karthikeyan, M. Murugappan, and S. Yaacob. “Descriptive analysis of skin


temperature variability of sympathetic nervous system activity in stress”. In:
Journal of Physical Therapy Science 24.12 (2012), pp. 1341–1344.

[18] K. Kunze, S. Sanchez, T. Dingler, O. Augereau, K. Kise, M. Inami, and T.


Tsutomu. “The augmented narrative: toward estimating reader engagement”.

In: Proceedings of the 6th augmented human international conference. ACM. 2015,
pp. 163–164.

[19] D. Kahneman and J. Beatty. “Pupil diameter and load on memory”. In: Science
154.3756 (1966), pp. 1583–1585.

[20] O. Palinko, A. L. Kun, A. Shyrokov, and P. Heeman. “Estimating cognitive


load using remote eye tracking in a driving simulator”. In: Proceedings of the
2010 symposium on eye-tracking research & applications. ACM. 2010, pp. 141–144.

[21] A. Luque-Casado, M. Zabala, E. Morales, M. Mateo-March, and D. Sanabria.


“Cognitive performance and heart rate variability: the influence of fitness
level”. In: PloS one 8.2 (2013), e56935.

[22] H. D. Critchley. “Electrodermal responses: what happens in the brain”. In: The
Neuroscientist 8.2 (2002), pp. 132–142.

[23] K. Masood. “EDA as a Discriminate Feature in Computation of Mental Stress”.


In: The Second International Conference on Digital Information Processing, Data
Mining, and Wireless Communications (DIPDMWC2015). 2015, p. 199.

[24] A. Greco, A. Lanata, L. Citi, N. Vanello, G. Valenza, and E. P. Scilingo. “Skin


admittance measurement for emotion recognition: A study over frequency
sweep”. In: Electronics 5.3 (2016), p. 46.

[25] C. Setz, B. Arnrich, J. Schumm, R. La Marca, G. Tröster, and U. Ehlert. “Dis-


criminating stress from cognitive load using a wearable EDA device”. In: IEEE
Transactions on information technology in biomedicine 14.2 (2010), pp. 410–417.

[26] W. Boucsein. Electrodermal activity. Springer Science & Business Media, 2012.

[27] Society for Psychophysiological Research Ad Hoc Committee on Electrodermal
Measures, W. Boucsein, D. C. Fowles, S. Grimnes, G. Ben-Shakhar, W. T. Roth,
M. E. Dawson, and D. L. Filion. “Publication recommendations for electroder-
mal measurements”. In: Psychophysiology 49.8 (2012), pp. 1017–1034.

[28] R. S. Lazarus and E. M. Opton Jr. “The study of psychological stress: A sum-
mary of theoretical formulations and experimental findings”. In: Anxiety and
behavior 1 (1966).

[29] M. S. Nomikos, E. Opton Jr, and J. R. Averill. “Surprise versus suspense in the
production of stress reaction.” In: Journal of Personality and Social Psychology
8.2p1 (1968), p. 204.

[30] S. C. Jacobs, R. Friedman, J. D. Parker, G. H. Tofler, A. H. Jimenez, J. E. Muller,


H. Benson, and P. H. Stone. “Use of skin conductance changes during mental
stress testing as an index of autonomic arousal in cardiovascular research”. In:
American heart journal 128.6 (1994), pp. 1170–1177.

[31] D. Combatalade. “Basics of heart rate variability applied to psychophysiol-
ogy”. In: MAR953-00, Thought Technology Ltd., Montreal (2010).

[32] A. Kushki, J. Fairley, S. Merja, G. King, and T. Chau. “Comparison of blood vol-
ume pulse and skin conductance responses to mental and affective stimuli at
different anatomical sites”. In: Physiological measurement 32.10 (2011), p. 1529.

[33] S. Ishimaru, S. Jacob, A. Roy, S. S. Bukhari, C. Heisel, N. Großmann, M. Thees,


J. Kuhn, and A. Dengel. “Cognitive State Measurement on Learning Materials
by Utilizing Eye Tracker and Thermal Camera”. In: Proceedings of the 14th IAPR
International Conference on Document Analysis and Recognition. Vol. 8. IEEE. 2017,
pp. 32–36.

[34] G. Buscher, A. Dengel, and L. van Elst. “Eye movements as implicit relevance
feedback”. In: CHI’08 extended abstracts on Human factors in computing systems.
ACM. 2008, pp. 2991–2996.

[35] T. Baltrusaitis, P. Robinson, and L.-P. Morency. “Constrained local neural fields
for robust facial landmark detection in the wild”. In: Computer Vision Work-
shops (ICCVW), 2013 IEEE International Conference on. IEEE. 2013, pp. 354–361.

[36] S. Jacob, S. S. Bukhari, S. Ishimaru, and A. Dengel. “Gaze-based interest detec-


tion on newspaper articles”. In: Proceedings of the 7th Workshop on Pervasive Eye
Tracking and Mobile Eye-Based Interaction. ACM. 2018, p. 4.

[37] R. W. Booth and U. W. Weger. “The function of regressions in reading: Back-


ward eye movements allow rereading”. In: Memory & cognition 41.1 (2013),
pp. 82–97.

[38] A. Greco, G. Valenza, A. Lanata, E. P. Scilingo, and L. Citi. “cvxEDA: A convex


optimization approach to electrodermal activity processing”. In: IEEE Transac-
tions on Biomedical Engineering 63.4 (2016), pp. 797–804.

[39] D. Leiner, A. Fahr, and H. Früh. “EDA positive change: A simple algorithm
for electrodermal activity to measure general audience arousal during media
exposure”. In: Communication Methods and Measures 6.4 (2012), pp. 237–250.
