0% found this document useful (0 votes)

92 views13 pages

Academic Performance

This article proposes a model called Augmented Education (AugmentED) to predict students' academic performance using multisource behavioral data. The model aggregates data from online/offline learning and in-class/out-of-class behaviors. Metrics are estimated to measure linear/nonlinear behavioral changes. Features representing temporal lifestyle patterns are extracted using LSTM. Machine learning is used to predict performance. Visualized feedback is designed to help students optimize interactions with the university. An experiment on a real-world dataset of 156 students found the AugmentED model can accurately predict academic performance.

Uploaded by

NAGA KUMARI ODUGU

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

0% found this document useful (0 votes)

92 views13 pages

Academic Performance

Uploaded by

NAGA KUMARI ODUGU

We take content rights seriously. If you suspect this is your content, claim it here.

Available Formats

Download as PDF, TXT or read online on Scribd

You are on page 1/ 13

This article has been accepted for publication in a future issue of this journal, but has not been

fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2020.3002791, IEEE Access

Date of publication xxxx 00, 0000, date of current version xxxx 00, 0000.
Digital Object Identifier 10.1109/ACCESS.2017.Doi Number

Academic Performance Prediction Based

on Multisource, Multifeature Behavioral
Data
Liang Zhao1,*, Kun Chen1, Jie Song1, Xiaoliang Zhu1, Jianwen Sun1, Brian Caulfield2, Brian
Mac Namee2
1
National Engineering Laboratory for Educational Big Data (NELEBD), National Engineering Research Center for E-learning (NERCEL), Central China
Normal University (CCNU), Wuhan, P. R. China
2
Insight Center for Data Analytics, University College Dublin (UCD), Dublin, Ireland

Corresponding author: Liang Zhao (liang.zhao@mail.ccnu.edu.cn)

The authors acknowledge the support received from the National Key R&D Program of China (Grant No: 2017YFB1401300 and
2017YFB1401303), the National Natural Science Foundation of China (Grant No: 61977030), and “the Fundamental Research Funds for the
Central University” (Grant No: 20205170443)

ABSTRACT Digital data trails from disparate sources covering different aspects of student life are stored
daily in most modern university campuses. However, it remains challenging to (i) combine these data to
obtain a holistic view of a student, (ii) use these data to accurately predict academic performance, and (iii)
use such predictions to promote positive student engagement with the university. To initially alleviate this
problem, in this paper, a model named Augmented Education (AugmentED) is proposed. In our study, (1)
first, an experiment is conducted based on a real-world campus dataset of college students (N = 156) that
aggregates multisource behavioral data covering not only online and offline learning but also behaviors inside
and outside of the classroom. Specifically, to gain in-depth insight into the features leading to excellent or
poor performance, metrics measuring the linear and nonlinear behavioral changes (e.g., regularity and
stability) of campus lifestyles are estimated; furthermore, features representing dynamic changes in temporal
lifestyle patterns are extracted by the means of long short-term memory (LSTM). (2) Second, machine
learning-based classification algorithms are developed to predict academic performance. (3) Finally,
visualized feedback enabling students (especially at-risk students) to potentially optimize their interactions
with the university and achieve a study-life balance is designed. The experiments show that the AugmentED
model can predict students’ academic performance with high accuracy.

INDEX TERMS academic performance prediction, behavioral pattern, digital campus, machine learning (ML),
long short-term memory (LSTM)

I. INTRODUCTION ⚫ Lifestyle Behaviors (e.g., eating, physical activity,

As an important step to achieving personalized education, sleep patterns, social tie, and time management) [7-28];
academic performance prediction is a key issue in the and
education data mining field. It has been extensively ⚫ Learning Behaviors (e.g., class attendance, study
demonstrated that academic performance can be profoundly duration, library entry, and online learning) ([7,8,23-
affected by the following factors: 26,28-38]).
⚫ Students’ Personality (e.g., neuroticism, extraversion, For example, [2] investigated the incremental validity of the
and agreeableness) [1-4]; Big Five personality traits in predicting college GPA. [21]
⚫ Personal Status (e.g., gender, age, height, weight, demonstrated that physical fitness in boys and obesity status
physical fitness, cardiorespiratory fitness, aerobic in girls could be important factors related to academic
fitness, stress, mood, mental health, intelligence, and achievement. Meanwhile, [22] showed that a regular lifestyle
executive functions) [1-12]; could lead to good performance among college students. [24]
showed that the degree of effort exerted while working could

VOLUME XX, 2017 1

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.
This article has been accepted for publication in a future issue of this journal, but has not been fully edited. Content may change prior to final publication. Citation information: DOI
10.1109/ACCESS.2020.3002791, IEEE Access
Author Name: Preparation of Papers for IEEE Access (February 2017)

be strongly correlated with academic performance. systems) are summarized in Table I. We first discuss the
Additionally, [32] showed that compared with high- and online prediction system, System A [32] (proposed by Z. Liu).
medium-achieving students, low-achieving students were less This system is relatively simple because its data is only
emotionally engaged throughout the semester and tended to captured from either SPOC or MOOC. Regarding the latter
express more confusions during the final stage of the semester. three offline prediction systems, i.e., Systems B ~ D [8,22,24]
By analyzing the effect of the factors influencing (proposed by R. Wang, Y. Cao, and Z. Wang respectively),
academic performance, many systems using data to predict the number of data sources is reduced, while the
academic performance have been developed in the literature corresponding scale size rapidly increases; Unfortunately, the
[1-4,7,8,13-19,22-27,29-31,33,34,37-41]. For instance, in number of different types of behaviors that could be
[8], academic performance was predicted based on passive considered is decreased. Ideally, multisource data at a
sensing data and self-reports from students’ smart phones. In medium/large scale could help lead to a better prediction
[23], a multitask predictive framework that captures system design. However, in practice, due to limitations, such
intersemester and intermajor correlations and integrates as computing capability, either data diversity or the sample
student similarity was built to predict students’ academic size is sacrificed during the system design process.
performance. In [34], based on homework submission data, TABLE I
FOUR TYPICAL PREDICTION SYSTEMS (PROPOSED BY PREVIOUS
the academic performance of students enrolled in a blended RESEARCHERS)
learning course was predicted.
Systems Scale Data Source Behaviors
According to their predicted academic performance, early Size (N)
feedbacks and interventions could be individually applied to System A
at-risk students. For example, in [33], to help students with (single- ⚫ online study
On-
source + 243 SPOC/MOOC ⚫ discussions on the
a low GPA, basic interventions are defined based on GPA line
medium forum
predictions. However, the research on the scale) [32]
feedback/intervention is still in the early stage, its ⚫ wearable sensors ⚫ activity
achievements are relatively few.  smart phone ⚫ conversation
 accelerometer ⚫ sleeping
In recent years, compared with primary and secondary System B
 light sensor ⚫ location
(multisource
education (i.e. K12) [6,10,12,17], more and more attentions 30  microphone ⚫ socializing
+ small
have been paid to the academic performance prediction for  GPS/Bluetooth ⚫ exercising
scale) [8]
⚫ self-reports ⚫ mental health
higher education [7-9,14,15,22-25,27,28,30-32,36-38]. The  SurveyMonkey ⚫ stress
reasons contributing to this phenomenon warrant further  mobile EMA ⚫ mood
investigation and might include the following. First, for ⚫ usage of smart card
Off-  showering
college students on a modern campus, life involves a System C
line ⚫ WiFi  eating
combination of studying, eating, exercising, socializing, etc. (almost
⚫ campus network  consumption
multisource 528
(see Fig. 1) [7,8,22-25,27,42,43]. All activities that students ⚫ smart card ⚫ trajectory
+ medium
engage in (e.g., borrowing a book from the library) leave a ⚫ class schedule  wake-up time…
scale) [24]
⚫ network
digital trail in some database. Therefore, it is relatively easy  network cost…
to track college students’ behaviors, e.g. online learning System D ⚫ usage of smart card
behaviors captured from massive open online courses (single-  showering
source + 18960 smart card  eating
(MOOC) and small private online courses (SPOC) large scale)  library entry-exit
platforms [30-32,36-38]. Second, given the diverse range [22]  fetching water
of activities listed above, it could be difficult for college
students to maintain balanced, self-discipline, well-being To initially alleviate the challenges mentioned above, a
university experiences, including excellent academic model named Augmented Education (AugmentED) is
performance. proposed in this paper. As shown in Fig. 2, this model mainly
Although many academic performance prediction systems consists of the following three modules: (1) a Data Module in
have been developed for college students, the following which multisource data on campus covering a large variety of
challenges persist: (i) capturing a sufficiently rich profile of a data trails are aggregated and fused, and the
student and integrating these data to obtain a holistic view; (ii) characteristics/features that can represent students’ behavioral
exploring the factors affecting students’ academic change from three different perspectives are evaluated; (2) a
performance and using this information to develop a robust Prediction Module in which academic performance
prediction model with high accuracy; and (iii) taking prediction is considered a classification problem that is solved
advantage of the prediction model to deliver personalized by machine learning (ML)-based algorithms; and (3) a
services that potentially enable students to drive behavioral Feedback Module in which visualized feedback is delivered
change and optimize their study-life balance. individually based on the predictions made and feature
To address these challenges, four representative prediction analysis. Finally, AugmentED is examined using a real-world
systems (including one online system and three offline dataset of 156 college students.

2 VOLUME XX, 2017

The remainder of this paper is organized as follows. In IV, the experimental results are discussed and analyzed.
Section II, a literature review is given. In Section III, the Finally, a brief conclusion is given in Section V.
methodology of AugmentED is described in detail. In Section

Consumption Clinic visit

Online

Meal Trajectory

Outside
Inside the Digital of the
Digital classroom Campus
classroom
Campus

Library interaction Learning

Offline

Emotion
(a) (b)
FIGURE 1. Digital data remaining on a modern campus: (a) Multisource; (b) Multispace, covering not only online and offline learning but also students’
behaviors inside and outside of the classrooms.

Data Module Prediction Module Feedback Module

Raw Data Data Trails Characters & Features

➢Slope feature
➢Breakpoint selection
online smart learning emotion library Behavioral Change ➢RSS
learning card interaction -Linear visualized feedback
(BC-Linear)
➢Entropy
➢HMM-based intelligence
Entropy algorithm
meal consumption trajectory Behavioral Change ➢LyE
-nonLinear ➢HurstE
(BC-nonLinear) ➢DFA

➢LSTM-based
WiFi central clinic Behavioral Change features cross
storage visit -LSTM validation feature analysis
(BC-LSTM)

FIGURE 2. Overview of AugmentED. In the data module, the features blocked in dashed boxes (including LyE, HurstE, DFA, and LSTM-based features)
are proposed in our study, to the best of our knowledge, which is used for the first time in student’s behavioral analysis.

II. RELATED WORK 1) BEHAVIORAL CHANGE-LINEAR (BC-LINEAR)

Traditionally, behavioral change is mainly quantified by two
A. FEATURE EXTRACTION linear metrics: behavioral slope and behavioral breakpoint.
Feature evaluation plays an important role in designing First, the behavioral slope can be captured by computing
prediction systems. Features that measure the various the slope of the behavioral time series of each student using
behavioral patterns can enhance our understanding of how a a linear regression [8]. The value of the slope indicates the
student’s behavior changes as the semester progresses. In direction and strength of the behavioral changes, e.g., a
this part, on the one hand, previous features that quantify positive slope with a greater absolute value indicates a faster
students’ behavioral patterns are summarized; On the other increase in behavioral change [8]. Given a mid-term day
hand, new features worthy of inclusion are also introduced. during the semester [8], both the pre-slope and post-slope can
In general, behavioral change can be quantified by the be calculated to represent the students’ behavioral change
following three groups of metrics.

2 VOLUME XX, 2017

during the first and second halves of the semester, statistical self-affinity or long-range dependence) of a
respectively. time series [56]. For example, in [56], DFA is used to
Second, the behavioral breakpoint can be captured by quantify the long-range correlation of a heart rate time
computing the rate of behavioral changes occurring across series, and it is demonstrated that a time series with a
the semester. The value of the breakpoint identifies the day small DFA value indicates less long-range correlation
during the semester before and after which a student’s behavior than a series with a large DFA value.
behavioral patterns differed. Two linear regressions can be Therefore, in heart rate analyses, DFA is considered a
used to fit a behavioral time series and then use the Bayesian long-range correlation indicator that can distinguish
information criterion (BIC) to select the best breakpoint [8]. healthy subjects from those with severe heart disease
If a single regression algorithm is selected, the breakpoint [56].
can be set to the last day. In summary, the above three nonlinear metrics can measure
2) BEHAVIORAL CHANGE-NONLINEAR (BC-NONLINEAR) the stability, predictability, and long-range correlation of a
In recent years, nonlinear metrics have been increasingly time series. Although these metrics have already been
applied to time series analysis [22,44-59]. extensively applied in time series analyses, e.g., gait time
Regarding the students’ behavioral time series, nonlinear series [47], in this study, for the first time, they are used in a
metrics have been used to discover nonlinear behavioral behavioral time series analysis. These metrics can enhance
patterns. We consider entropy an example. In [22], entropy our understanding of not only whether a student’s behavior
is proposed to quantify the regularity/orderliness of students’ is stable, predictable, and long-range correlated, but also how
behaviors, and it was demonstrated that a small entropy value good a student’s behavior is (e.g., self-discipline).
generally leads to high regularity and high academic 3) BEHAVIORAL CHANGE-LSTM (BC-LTSM)
performance. Another example is entropy calculated based Features represent temporal change over time is also worthy
on a Hidden Markov Model (HMM) analysis [44], which is of study. Such features can be extracted by long short-term
called HMM-based entropy for simplicity in our study. memory (LSTM) [58], which in this paper is called LSTM-
HMM-based entropy is proposed to quantify the based features for short. LSTM-based features have been
uncertainty/diversity of students’ behaviors, e.g., the applied in many fields, including for example emotion
uncertainty between the transition of different behaviors and recognition [59,60], traffic forecast [61] and video action
the various activities that a behavior exhibits. In [44], HMM- classification [62]. However, these features have not been
based entropy is evaluated by the following two steps: (i) applied in lifestyle behavioral analysis previously.
extracting the hidden states of a behavioral time series by
HMM [45,46]; and (ii) subsequently calculating the HMM- B. PREDICTION ALGORITHMS
based entropy of the extracted hidden states. In general, academic performance prediction can be
To further recognize students’ activities and discover their considered either a regression or a classification problem. A
nonlinear behavioral patterns, the following three new wide variety of algorithms have been used/proposed in
metrics, which have not been applied in students’ behavioral literatures to predict academic performance.
time series analysis previously, are also worth to be studied. For example, in [8], Lasso (least absolute shrinkage and
 Lyapunov Exponent (LyE) [47-51] is a measure of the selection operator) regularized linear regression model,
stability of a time series. For example, in [47], LyE is proposed by Tibshirani [63] in 1996, is used to predict
used to quantify the stability of a gait time series, and academic performance. In [24], four supervised learning
the results demonstrate that a time series with a large algorithms (consisting of support vector machine (SVM),
LyE value is less stable than a series with a small LyE logistic regression (LR), decision tree and naï ve Bayes) are
value, i.e., generally, a large LyE value indicates high used to classify students’ performance. In [22], RankNET, a
instability. Therefore, in gait analyses, LyE is neural network method proposed by Burges et al. [64] in
considered a stability risk indicator for falls [47] that 2015, is used to predict the ranks of students’ semester
can distinguish healthy subjects from those at a high grades. Similarly, in [27], a layer-supervised MLP-based
risk of falling. method is proposed for academic performance prediction. In
 Hurst Exponent (HurstE) [52-54] is a measure of [32], a temporal emotion-aspect model (TEAM), modeling
predictability (in some studies, it is also called long- time jointly with emotions and aspects extracted from SPOC
term memory) of a time series. For example, in [53], platform, is proposed to explore the effect of most concerned
HurstE is applied to quantify the predictability of a emotion-aspects as well as their evolutionary trends on
financial time series, and the results demonstrate that a academic achievement. In [65], four classification methods
time series with a large HurstE value can be predicted (consisting of Naï ve-Bayes, SMO, J48, and JRip) are used to
more accurately than a series with a HurstE value close predict students’ performance by considering student
to 0.5. heterogeneity.
 Detrended Fluctuation Analysis (DFA) [54-57] is a In general, due to the lack of open-access, large-scale, and
measure of the long-range correlation (also called multisource data sets in the education field, on the one hand,

2 VOLUME XX, 2017

to some extent, it is impossible to compare the performances following reasons: (1) more students were enrolled in this
of the existing academic performance prediction algorithms; course (N = 156) than other comparable courses, and (2) these
On the other hand, the algorithms proposed in this field are 156 students were more active on our self-developed SPOC
relatively simple, which are mainly based on basic statistics platform, thus providing abundant valuable behavioral data.
models (e.g. ANOVA and Post hoc tests) or ML algorithms Our dataset consists of the following four data sources (see
(e.g. SVM and LR). Table II):
⚫ SPOC Data. Two different types of data were collected
C. MULTISOURCE AND MULTIFEATURE on the SPOC platform. The first type is log files, which
It has been verified in many literatures that the predictive are recorded when a student logs in or out of the system,
power could be improved by multisource data and and the second type is posts on the SPOC discussion
multifeatured fusion. For example, it is demonstrated that forum, which records discussions related to students’
the performances of predicting both at-risk students [65] and learning experience.
stock market [66] could be improved by combining multi- ⚫ Smart Card Data. Similar to most modern universities,
source data. Similarly, in [22,23], the performances of in our university, all students have a campus smart card
academic performance prediction are improved by combing registered under their real name. The usage of this smart
traditional diligence features with orderliness (and sleep card, such as for borrowing books from the library,
patterns) features. In [67], the accuracy of scholars’ entering the library, consuming meals in campus
scientific impact prediction is improved by using multi-field cafeterias, shopping on campus, or making an
feature extraction and fusion. In [68], a contrast experiments appointment with the school clinic, is captured daily.
of eleven different feature combinations were conducted, ⚫ WiFi Data. There are approximately 3000 wireless
demonstrating that the performances of sentiment access points at our university, covering most areas of
classification can be improved by multifeatured fusion. campus. Once a student passes by one of these points,
However, we note that multisource and/or multifeature the MAC address of his/her device (e.g., tablet, laptop,
data cannot always guarantee a higher predictive power. For or smart phone) can be recorded [40]. In our study, to
instance, [69] shows that the results of predictive modeling, distinguish among diverse behaviors, the entire campus
notwithstanding the fact that they are collected within a is divided into several different areas, including a study
single institution, strongly vary across courses. Actually, area and a relaxation/dormitory area.
compared with single course, the portability of the prediction ⚫ Central Storage Data. As shown in Table II, other
models across courses (multisource data) is lower [69]. features used in our study, including the students’
Therefore, the effect of multisource and multifeature data personal information and academic records, are recorded
needs to be varied in experiments. by the central storage system of our university.
For simplicity, the former three data sources are designated D1,
III. Methodology D2, and D3, see Table II. To evaluate the effect of multisource
In our study, academic performance prediction is considered data on the academic performance prediction, which is similar
as a classification problem. According to the high-low to the studies introduced in Section II.C, contrast experiments
discrimination index proposed by Kelley [41], academic of different data source combinations were conducted in our
performance is divided into low-, medium-, and high- groups. study (see Section IV). To be specific, based on D1, D2, and
Given a digital campus dataset, according to Fig. 2, the main D3, in total, the following seven data combinations could be
task is to first extract features from the raw multisource data; obtained: D1, D2, D3, D1+D2, D1+D3, D2+D3, and D1+D2+D3.
then select the features that are strongly correlated with The latter data source, i.e., Central Storage (which is relatively
academic performance and use these features to train the static and simple), is considered fundamental information
classification algorithm; and finally provide visualized shared by all seven combinations.
feedback based on the prediction results. In our study, privacy protection is seriously considered, and
In this section, the three modules designed in AugmentED all students’ identifying information is anonymized. The
(see Fig. 2) are described in detail. infringement of students’ privacy is avoided during both the
data collection period and data analysis period. First, the
A. DATA MODULE student IDs are already pseudonymous in our raw data.
A flowchart of this module is shown in Fig. 3, which includes Moreover, the resolution of the students’ spatial-temporal
the following three parts. trajectory is reduced. All information regarding the exact
1) RAW DATA date/area showing when/where a behavior occurred is
Permission to access the raw data was granted by the removed. Therefore, it would be reasonably difficult to
Academic Affairs Office of our university. The raw dataset reidentify individuals through our dataset.
used in our study was captured from students engaging in the 2) DATA TRIALS
course of “Freshman Seminar” during the fall semester of In our study, to initially understand how a student’s behavior
2018-2019. The “Freshman Seminar” was chosen for the changes as the semester progresses, on the one hand, data

2 VOLUME XX, 2017

trails across the whole semester is processed and organized in slopes are calculated. Additionally, to further measure
chronological order, including when, where and how a the amount of variance in the dataset that is not
behavior occurs; On the other hand, data trails per week is explained by the traditional regression model, the
summarized according to preliminary statistics, including the residual sum of squares (RSS) is also evaluated (see
flowing information in each week, e.g., how often a behavior Table II). In our study, those linear metrics are mainly
occurs (i.e. total frequency), how long does a behavior last (i.e. calculated by the python model sklearn.linear_model.
duration), and how much money does a student need. ⚫ BC-nonLinear. Similar to the traditional approach,
Regarding the SPOC data (D1), online learning is first, entropy and HMM-based entropy are evaluated in
quantified by (i) learning frequency and duration, which are our study, measuring the regularity and diversity of
extracted from the raw log files; and (ii) online learning campus lifestyles respectively. Notably, the hidden
emotion, which is extracted from the discussion forum. states are numerically extracted by the MATLAB
Regarding the Smart Card data (D2), multiple behaviors are function hmmestimate, then the HMM-based entropy of
involved, e.g. library interaction (including borrowing a book the extracted hidden states is evaluated by the
and library entry), see Table II. Regarding the WiFi data (D3), MATLAB function entropy. Second, to further
first, student’s trajectory is calculated, mainly including when discover nonlinear behavioral patterns, the following
a student comes to a place; how often does he/she visit this three nonlinear metrics are proposed and extracted for
place (i.e. frequency); how long does he/she stay there (i.e. the first time: LyE, HurstE, and DFA, measuring the
duration). Second, attendance is calculated by combining stability, predictability, and long-range correlation of
WiFi data with class schedules. Specifically, to distinguish campus lifestyles respectively. In our study, four
among behavioral patterns during different periods, three nonlinear metrics (entropy, LyE, HurstE, and DFA) are
types of durations (namely, durations on working days, on evaluated by a numpy-based python library, i.e. nolds,
weekends, and throughout the semester) and two types of based on the 0&1 sequence (see Appendix A).
attendances (namely, attendance during the final study week ⚫ BC-LSTM. LSTM-based features representing
and attendance throughout the semester) are evaluated in our dynamic changes in temporal behavioral patterns are
study. calculated as follows. First, as input information, data
3) FEATURE EXTRACTION trails from multiple behaviors are organized together
To gain a deeper insight into students’ behavioral patterns, week by week, see Fig. 3. In each week, the basic
as summarized in Section II.A, in our study behavioral information of all multiple behaviors involved in our
change is evaluated by linear, nonlinear, and deep learning study is summarized, including for example how many
(LSTM) methods, see Fig. 3. times having breakfast and borrowing books from
⚫ BC-Linear. Similar to the traditional approach, linear library etc. occurred respectively. Subsequently, this
behavioral change is quantified by behavioral slope and weekly information is fitted into a Keras LSTM
behavioral breakpoint. Students behavioral series are network, then features representing the weekly
fitted by two linear regressions, subsequently the behavioral patterns that might change throughout
optimized breakpoint is selected by BIC and behavioral semester are extracted.

TABLE II
CHARACTERISTICS AND FEATURES EVALUATED IN OUR STUDY
Raw Data BC-Linear BC-nonLinear BC-LSTM Basic info.
(Behavioral Change-Linear) (Behavioral Change-nonLinear) (Behavioral
Change-LSTM)
Data Data Data Content Slope Breakpoint RSS Entropy HMM-based LyE HurstE DFA LSTM-based Freq Duration
Source Label Entropy Features

On- online study         

SPOC D1 
line emotion   
borrowing a book         
library entry         
Smart eating         
D2 
Card breakfast         
Off- consumption         
line clinical visit         
study area         
WiFi D3 
relaxation area         
⚫ academic records: academic history and class schedule
Central Storage
⚫ personal information: gender, age, and grade
Note: (i) “” denotes features/characteristics extracted; (ii) “emotion” represents the students’ emotion patterns on SPOC forums in terms of engagement
behaviors and the following three different emotions [36-38]: positivity, negativity, and confusion; (iii) “Freq” represents the total times of a behavior
throughout the semester; and (iv) “grade” represents the year when a student starts his/her college study.

2 VOLUME XX, 2017

Feature Extraction
data trials per week Behavioral Change-Linear (BC-Linear)
Linear regressions + ➢Slope
Bayesian information ➢Breakpoint

Statistical Results
Statistical Results
Statistical Results
criterion (BIC) ➢RSS

Behavioral Change-LSTM (BC-LSTM)

FIGURE 3. Flowchart of the data module.

Finally, the robustness of the algorithm is tested by 10-fold

B. PREDICTION MODULE cross validation.
The main task of this module is to select features and use these
features to train the prediction algorithm. C. VISUALIZATION MODULE
1) FEATURE SELECTION The main task of this module is to provide personalized
In our study, 708 different types of features are extracted, feedback, including GPA prediction and a visualized summary
including 510 linear features, 119 nonlinear features, 50 of the students’ behavioral patterns.
LSTM-based features, and 29 basic features (including e.g.
frequency and duration, gender, age, and grade). For instance, IV. EXPERIMENTAL RESULTS
because multiple behaviors are involved in our study, there are In this section, first, the experimental results of AugmentED
20 DFA related features in total to quantify long-range is presented and analyzed. Second, to evaluate the
correlation for each behavior individually (e.g. library entry). effectiveness of multisource and multifeature, contrast
The distributions of the evaluated features and GPA are experiments are conducted, and the corresponding results are
spread in different value scopes. Therefore, to eliminate a discussed. Finally, visualized feedback offered to students are
potential effect on the correlation analysis, both the features designed.
and GPA are normalized by min-max normalization.
Additionally, to improve the performance of the prediction A. PREDICTION RESULTS
algorithms, the top 130 features with the most significant The experimental results of AugmentED are shown in the last
effect on academic performance are selected by the five rows of Table III (i.e., RF*, GBRT*, KNN*, SVM* and
SelectKBest function in a python library named scikit-learn. XGBoost*), which are highlighted in bold. Five indexes
(accuracy, precision, recall, f1, and AUC) are used to evaluate
2) PREDICTION ALGORITHM
the performance.
Subsequently, the selected features are used to train the ML- Notely, AugmentED is proposed based on
based classification algorithm for the academic performance (i) multisource data, i.e. D1+D2+D3 (including SPOC,
prediction. Specifically, in our study, five ML algorithms are Smart Card, and WiFi data);
applied, including RF (random forest), GBRT (gradient boost (ii) multiple features, i.e. C-III (including BC-Linear,
regression tree), KNN (k-nearest neighbor), SVM, and
BC-nonLinear, and BC-LSTM features).
XGBoost (extreme gradient boosting). The hyperparameters * in Tabe III denotes that C-III feature combination is used in
of the ML and LSTM algorithms are optimized by the corresponding ML algorithms for academic performance
GridSearchCV in scikit-learn.
prediction.
3) CROSS VALIDATION From Table III, it can be seen that, first, the academic
Our dataset is divided into a training set and a test set at the performance can be predicted by AugmentED with quite high
ratio of 7:3. The classification algorithm is first trained and accuracy. Second, the performance of the five different ML
then applied to the test set to predict academic performance. algorithms (RF*, GBRT*, KNN*, SVM* and XGBoost*) are

2 VOLUME XX, 2017

similar, which can all lead to a good prediction result. To SVM* 0.807 0.843 0.807 0.799 0.843
XGBoost* 0.807 0.843 0.807 0.803 0.867
clarify, we consider the case of precision values, see the 5th
RF 0.584 0.612 0.584 0.567 0.662
column of Table III. The precision values of five ML GBRT 0.571 0.584 0.571 0.560 0.626
algorithms are 0.873, 0.877, 0.863, 0.889, and 0.871 C-I KNN 0.551 0.593 0.551 0.558 0.641
respectively, indicating that (i) its minimum value is 0.863, i.e. D2+D3 SVM 0.545 0.550 0.545 0.530 0.579
the precision of AugmentED is no less than 86.3%; (ii) the XGBoost 0.559 0.563 0.559 0.544 0.579
(Smart C-II LSTM 0.449 0.469 0.449 0.434 0.551
difference between the minimum and maximum values is Card + RF* 0.781 0.819 0.781 0.782 0.837
0.026, which is quite small, i.e. AugmentED is independent of WiFi) GBRT* 0.781 0.801 0.781 0.779 0.816
ML algorithms. C-III KNN* 0.755 0.793 0.755 0.754 0.793
TABLE III SVM* 0.762 0.794 0.762 0.763 0.815
PREDICTION RESULTS (THE AVERAGE CLASSIFICATION RESULTS OF 10- XGBoost* 0.762 0.801 0.762 0.767 0.813
FOLDER CROSS VALIDATION) RF 0.630 0.679 0.630 0.630 0.699
Data Feature Algorithm accuracy precision recall f1 AUC GBRT 0.637 0.671 0.637 0.620 0.703
RF 0.604 0.642 0.604 0.602 0.698 D1+D2+ C-I KNN 0.604 0.647 0.604 0.596 0.669
GBRT 0.584 0.643 0.584 0.581 0.684 D3 SVM 0.635 0.696 0.635 0.644 0.716
C-I KNN 0.500 0.497 0.500 0.473 0.627 XGBoost 0.608 0.670 0.608 0.598 0.691
SVM 0.539 0.590 0.539 0.534 0.641 (SPOC + C-II LSTM 0.501 0.544 0.501 0.473 0.578
D1 XGBoost 0.521 0.586 0.521 0.520 0.605 Smart RF* 0.852 0.873 0.852 0.844 0.857
C-II LSTM 0.488 0.517 0.488 0.491 0.583 Card + GBRT* 0.852 0.877 0.852 0.852 0.876
(SPOC) RF* 0.776 0.806 0.776 0.774 0.807 WiFi) C-III KNN* 0.847 0.863 0.847 0.841 0.851
GBRT * 0.764 0.784 0.764 0.761 0.800 SVM* 0.866 0.889 0.866 0.865 0.872
C-III KNN* 0.801 0.825 0.801 0.802 0.840 XGBoost* 0.859 0.871 0.859 0.850 0.874
SVM* 0.795 0.818 0.795 0.792 0.836
Note: (i) D1+D2+D3 is the multiple data source used in AugmentED to predict
XGBoost* 0.789 0.823 0.789 0.787 0.836
academic performance, including SPOC, Smart Card and WiFi data; (ii) The
RF 0.507 0.538 0.507 0.496 0.600
rows highlighted in light blue, light pink, light green, denote the three
GBRT 0.487 0.510 0.487 0.481 0.604 following feature or feature combinations are used for prediction, C-I (BC-
C-I KNN 0.462 0.523 0.462 0.454 0.617 linear and BC-nonlinear features), C-II (only BC-LSTM, i.e., LSTM-based
SVM 0.528 0.571 0.528 0.515 0.637 features), C-III (BC-Linear, BC-nonLinear, and BC-LSTM features).
D2 XGBoost 0.508 0.522 0.508 0.495 0.617
C-II LSTM 0.436 0.467 0.436 0.425 0.552
(Smart B. COMPARATIVE EXPERIMENTS
RF* 0.742 0.795 0.742 0.746 0.808
Card)
GBRT* 0.737 0.778 0.737 0.733 0.791 In this part, contrast experiments are conducted to evaluate the
C-III KNN* 0.733 0.780 0.730 0.729 0.771 prediction effect of multisource and multifeature
SVM* 0.719 0.744 0.719 0.712 0.773 combinations.
XGBoost* 0.733 0.760 0.733 0.733 0.767
RF 0.424 0.408 0.424 0.395 0.511 1) MULTISOURCE
GBRT 0.437 0.473 0.437 0.428 0.532 Comparisons of the performance of different data source
C-I KNN 0.399 0.424 0.399 0.396 0.531 combinations are conducted, see the 1st column of Table III.
SVM 0.391 0.432 0.391 0.387 0.480
XGBoost 0.425 0.383 0.425 0.395 0.503
As shown in Table III, a large number of multiple data sources
D3
C-II LSTM 0.413 0.398 0.413 0.380 0.512 can lead to a more accurate prediction result.
(WiFi) RF* 0.545 0.620 0.545 0.539 0.639 To clarify, we consider the case of SVM*, from D1 to D1+D2
GBRT* 0.558 0.600 0.558 0.548 0.650 and D1+D2+D3 (see Tables III and Fig. 4), all five evaluation
C-III KNN* 0.501 0.591 0.501 0.487 0.607
indexes significantly improves with the types of data sources
SVM* 0.487 0.404 0.487 0.416 0.590
XGBoost* 0.546 0.624 0.546 0.529 0.646 increases. Specifically, (i) the accuracy values of D1, D1+D2,
RF 0.616 0.660 0.616 0.608 0.688 and D1+D2+D3 are 0.795, 0.821, 0.866, respectively; (ii) the
GBRT 0.603 0.676 0.603 0.605 0.692 precision values are 0.818, 0.848 and 0.889; (iii) the recall
C-I KNN 0.534 0.608 0.534 0.526 0.626
values are 0.795, 0.821 and 0.866; (iv) the f1 values are 0.792,
D1+D2 SVM 0.608 0.661 0.608 0.617 0.683
XGBoost 0.565 0.603 0.565 0.556 0.641 0.821 and 0.865; and (v) the AUCs value are 0.836, 0.862 and
(SPOC + C-II LSTM 0.493 0.520 0.493 0.488 0.579 0.872. It is verified that multisource data can enhance the in-
Smart RF* 0.814 0.839 0.814 0.815 0.855 depth insight gained into students’ behavioral patterns.
Card) GBRT* 0.809 0.843 0.809 0.809 0.853
C-III KNN* 0.815 0.839 0.815 0.815 0.857
SVM* 0.821 0.848 0.821 0.821 0.862
XGBoost* 0.833 0.860 0.833 0.826 0.813
RF 0.616 0.650 0.616 0.614 0.697
GBRT 0.616 0.664 0.616 0.612 0.702
C-I KNN 0.538 0.594 0.538 0.529 0.626
D1+D3 SVM 0.602 0.640 0.602 0.598 0.690
XGBoost 0.573 0.606 0.573 0.568 0.626
(SPOC +
C-II LSTM 0.488 0.500 0.488 0.469 0.559
WiFi)
RF* 0.800 0.867 0.800 0.799 0.849
C-III GBRT* 0.793 0.851 0.793 0.795 0.851
KNN* 0.814 0.859 0.814 0.811 0.857

VOLUME XX, 2017 9

accuracy value of SVM* is 0.866, which is much higher than

that of SVM and LSTM (i.e., 0.635 and 0.501, respectively),
see the 4th column of Table III. This result indicates that the
multifeature combination proposed in our study (i.e. C-III) can
0.8 significantly improve the predictive power.

C. IDENTIFICATION OF AT-RISK STUDNETS BASED ON

accuracy THE PREDICITON
percision
recall The prediction result obtained by AugmentED can be used to
f1 identify at-risk students, i.e., determine whether a student
AUC
0.6 belongs in a low performance group. It could be quite helpful
for early warning and feedback to be provided to at-risk
students before the final exam week.

D1+D2+D3
To illuminate how AugmentED could potentially help
D2+D3

D1+D3

D1+D2
students optimize their college lifestyles and consequently
improve their academic performance, a feedback example
0.4
D3

delivered to one at risk student is shown in Fig. 6.

FIGURE 4. Comparisons of the SVM* performance of different data source 0.25

3.0 0.35
combinations.
2.5
0.15 0.25
2) MULTIFEATURE 2.0
Comparisons of the performance of three different feature 1.5 at risk! 0.05 at risk!
0.15
at risk!
combinations (C-I, CI-II, C-III) are also conducted, see the 2nd low medium high low medium high low medium high
(a1) (a2) (a3)
column of Table III and Fig. 5. 1.1 0.7
⚫ C-I (including BC-linear and BC-nonlinear features), see 1.8
0.6
the rows of Table III highlighted in light blue. Its 1.4 0.8
0.5
corresponding MLs are denoted as RF, GBRT, KNN, at risk! at risk! at risk!
1.0 0.5 0.4
SVM, and XGboost; low medium high low medium high low medium high
⚫ C-II (only including BC-LSTM features), see the rows (b1)
1.00
(b2) (b3)
1.0
of Table III highlighted in light pink; 14
0.99
⚫ C-III (including BC-Linear, BC-nonLinear, and BC- 10
0.6
0.98
LSTM features), see the rows of Table III highlighted in
6 at risk! at risk! 0.2 at risk!
light green. Its corresponding MLs are denoted as RF*, 0.97
low medium
low medium high low medium high low high
GBRT*, KNN*, SVM*, and XGboost*. (c1) (c2) (c3)

FIGURE 6. Example of a feedback given to one at risk student, including the

average values and 95% confidence intervals of the following nine assistant
0.8 indicators from the low-, medium-, and high- performance groups: First, (a1)
D-Linear, (a2) D-postRSS, (a3) D-preSlope are the indicators representing
(weighted) linear, RSS in post-semester, slope in pre-semester patterns of all
behaviors (rather than one single behavior) respectively; Second, (b1) D-
nonLinear; (b2) D-Entropy; (b3) D-DFA are the indicators representing
0.6 (weighted) nonlinear, entropy, and DFA patterns of all behaviors. Finally, (c1) D-
LSTM is the indicators representing the temporal pattern of all behaviors, while
(c2) LSTM-49 and (c3) LSTM-1 are two of the 50 features extracted by our LSTM
network.
0.4 We note that except for the prediction result itself, the
extracted features that are strongly correlated with academic
performance can also be taken as assistant indicators, to
XGBoost*

0.2 identify at-risk students. Traditionally, those features can be

XGBoost
GBRT*

selected by either statistical analysis (e.g. by ANOVA) or ML

LSTM

LSTM
GBRT

KNN*

SVM*
SVM
KNN

algorithms (e.g. feature importance returned by RF). We

RF*
RF

0 recall that in our study, multiple behaviors are involved, and

each behavior is quantified by a plenty of -linear, -nonlinear,
FIGURE 5. Comparisons of the accuracy values of three feature
combinations in the (D1+D2+D3) dataset. and -LSTM features. Therefore, a particular feature (e.g.
As shown in Fig. 5 and Table III, all five evaluation indexes entropy) of one single behavior (e.g. either having breakfast or
(accuracy, precision, recall, f1, and AUC) of C-III are learning online) might not make sense to gain a
significantly higher than those of C-I and C-II. To clarify, we comprehensive evaluation of student’ behavioral patterns.
consider the case of SVM* in the (D1+D2+D3) dataset, the

VOLUME XX, 2017 9

From this perspective, in Fig. 6, nine assistant indicators are depth insight into student behavioral patterns and potentially
calculated and plotted. help students to optimize their interactions with the university.
We begin by discussing the indicators of -linear, -nonlinear, In our study, a model named AugmentED is proposed to
and -LSTM features (see Appendix B), which are denoted as predict the academic performance of college students. Our
D-linear, D-nonLinear and D-LSTM respectively, contributions in this study are related to three sources. First,
representing the (weighted) linear, nonlinear and temporal regarding data fusion, to the best of our knowledge, this work
pattern of all multiple behaviors involved in our study (rather is the first to capture, analyze and use multisource data
than one single behavior). Regarding these three indicators, covering not only online and offline learning but also campus-
(i) The average values and 95% confidence intervals life behaviors inside and outside of the classroom for academic
(from the low-, medium-, and high- academic performance prediction. Based on these multisource data, a
performance groups) are plotted in the left column of rich profile of a student is obtained. Second, regarding the
Fig. 6. feature evaluation, behavioral change is evaluated by linear,
(ii) The Pearson correlation between the indicators and nonlinear, and deep learning (LSTM) methods respectively,
academic performance is calculated, see the 2nd, 5th, which provides a systematical view of students’ behavioral
and 8th rows of Table IV which are highlighted in light patterns. Specifically, it is the first time that three novel
gray. nonlinear metrics (LyE, HurstE, and DFA) and LSTM are
Furthermore, six more indicators are calculated and applied in students’ behavioral time series analysis. Third, our
provided as supplementary, see the 2nd and 3rd columns of Fig. experimental results demonstrate that AugmentED can predict
6. The Pearson correlation between these indicators and academic performance with quite high accuracy, which help
academic performance is also calculated and listed in Table IV. to formulate personalized feedback for at-risk (or unself-
From Table IV it can be seen that all the nine indicators are disciplined) students.
strongly correlated with academic performance. Additionally, However, there are also some limitations in our study. To
in Fig. 6, the apparent distinction among three academic gain a multisource dataset, we scarified the scale the dataset
performance groups demonstrates that all the nine indicators by only using student-generated data within a single course.
can offer strong support in at-risk student identification. This limitation might have a certain negative influence on the
To clarify, we consider the case of D-linear. On the one generalization of AugmentED. Furthermore, in this study, we
hand, its average values and 95% confidence intervals from mainly focus on behavioral change. Other
low-, medium-, and high- academic performance groups are characteristics/features (e.g., peer effect, sleep) that are worthy
(1.4570.199, 2.1600.193, 3.0350.341), see Fig. 6(a1), of consideration were not evaluated in this study.
indicating clear separation. On the other hand, its correlation In conclusion, our study is based on a complete passive
coefficient is 0.534, see the 3rd row of Table IV, i.e., this daily data capture system that exists in most modern
indicator is significantly correlated with academic universities. This system can potentially lead to continual
performance. Therefore, D-linear can be taken as an indicator investigations on a larger scale. The knowledge obtained in
to explore which student is at risk because of the low this study can also potentially contribute to related research
performance he/she will achieve. among K-12 students.
TABLE IV
CORRELATION COEFFICIENT AND P-VALUE Appendix A
Assistant Indexes Correlation coefficient P-value
Fig.6(a1) D-Linear 0.534 7.18e-13 To evaluate the four nonlinear metrics (entropy, LyE, HurstE,
Fig.6(a2) D-postRSS 0.366 2.65e-06 and DFA) of the time series, we concentrate on the precise
Fig.6(a3) D-preSlope 0.425 3.12e-08 time of day during which the behaviors occurred. Therefore,
Fig.6(b1) D-nonLinear 0.392 4.28e-07 in our study, the involved time is first converted to a discrete
Fig.6(b2) D-entropy 0.402 2.02e-07
Fig.6(b3) D-DFA 0.345 1.05e-05
time sequence. Then, according to the represented discrete
Fig.6(c1) D-LSTM 0.703 1.32e-24 time sequence, the raw behavioral time series data are
Fig.6(c2) LSTM-49 0.254 0.001 converted to the 0&1 sequence as follows:
Fig.6(c3) LSTM-1 0.734 1.23e-27
STEP 1. TIME DATA REPRESENTATION
V. CONCLUSION AND FUTURE WORK The time data are converted to a discrete sequence with a
As an important issue in the education data mining field, normalized time interval by the following three steps:
academic performance prediction has been studied by many ⚫ Step 1.1. The entire semester was from 01/09/2018
researchers. However, due to lack of richness and diversity in (September 1st) to 20/01/2019 (January 20st) and includes
both data sources and features, there still exist a lot of a total of 140 days. Thus, each day can be numbered
challenges in prediction accuracy and interpretability. To from 1 to 140, resulting in a discrete sequence {p1, p2, . . .,
initially alleviate this problem, our study aims at developing a pi} = {1, 2, . . ., 140}, where i denotes the ith day in the
robust academic performance prediction model, to gain an in- semester;

VOLUME XX, 2017 9

⚫ Step 1.2. We divide each day into 48 time bins such that ( N − Rank ( xn )) / N , Corr ( X k )  0 (B-1)
each bin spans 30 minutes. Subsequently, every bin is Scorekn = 
 Rank ( xn ) / N , Corr ( X k )  0
encoded from 1 to 48, i.e., {q1, q2, . . ., qj} = {1, 2, . . .,
48}, where j denotes the jth time bin. For example, We assume that there are N students and K extracted
“0:01—0:30” is the 1st time bin, “0:31—1:00” is the 2nd features in total. Corr(Xk) is the Pearson correlation
bin, etc. coefficient between the kth feature XK and students’
⚫ Step 1.3. By combining the sequences of days and time academic performance, where k  K. Rank(xn) means the
bins, the time during the spring semester is mapped to a ranking of the nth student’s (denoted as un, where n  N)
discrete time sequence with length Nt, i.e., {T1, T2, . . ., feature among all students. For example, there are three
Tij} = {1, 2, . . ., Nt}, where students (u1, u2, u3), and their kth feature (e.g. slope value
Tij =( pi − 1)  48 + q j ， (A-1) of having breakfast) are (0.8, 0.4, 0.6), then we have
And Nt = 6720. Specifically, if the time is “03/09/2018, Scorek1 = 0, Scorek2 = 0.667, and Scorek3 = 0.333 because
10:24”, i.e., pi = 3 and qj = 21 (the 21st time bin of the 3rd Corr(Xk) > 0.
day), according to Eq. A-1, Tij = 2  48 + 21 = 117, i.e., ⚫ Step 2. The indicator of linear feature group, D-Linear,
“03/09/2018, 10:24” is encoded by 117. is calculated by utilizing the feature scores as follows,
K

STEP 2. BEHAVIORAL DATA REPRESENTATION

 ( Corr ( X
k =1
k ) * Scorekn ) (B-2)

Following the time data representation, the raw behavioral

data are converted to a 0&1 sequence by the following two We note that essentially D-Linear is the weighted mean of all
steps: linear feature scores, and its weights are the correlation
⚫ Step 2.1. First, a zero sequence Xij with length Nt is coefficients.
generated, and
REFERENCES
⚫ Step 2.2. If a behavior occurs at time Tij, the Tijth element [1] A. Furnham, and J. Monsen, “Personality traits and intelligence
of the corresponding discrete behavioral sequence Xij is predict academic school grades," Learning and Individual Differences,
set to 1, i.e., Xij = 1. For instance, if a student has a meal vol. 19, no. 1, pp. 0-33, 2009.
at “03/09/2018, 10:24” (where Tij = 117), the 117th [2] M. A. Conard, “Aptitude is not enough: How personality and behavior
predict academic performance,” Journal of Research in Personality,
element of the discrete meal sequence is set to 1, i.e., Xij vol. 40, no. 3, pp. 339-346, 2006.
= 117. [3] T. Chamorropremuzic, and A. Furnham, “Personality predicts
This process can by described as follows: academic performance: Evidence from two longitudinal university
samples,” Journal of Research in Personality, vol. 37, no. 4, pp. 319-
1, if a behavior happens at time Tij 338, 2003.
X ij =  ， (A-2) [4] R. Langford, C. P. Bonell, H. E. Jones, T. Pouliou, S. M. Murphy, and
0, otherwise E. Waters, “The WHO health promoting school framework for
improving the health and well‐being of students and their academic
where Xij  [0,1]. According to Eq. A-2, all behavioral data achievement,” Cochrane Database of Systematic Reviews, vol. 4, no.
listed in Table II (including SPOC online study, borrowing a 4, pp. CD008958, 2014.
book, library entry, meal consumption, breakfast consumption, [5] A. Jones, and K. Issroff, “Learning technologies: Affective and social
issues in computer-supported collaborative learning,” Computers &
consumption, clinical visits, and WiFi data in the study and Education, vol. 44, no. 4, pp. 395-408, 2005.
relaxation areas) are converted to discrete behavioral [6] D. N. A. G. Van, E. Hartman, J. Smith, and C. Visscher, “Modeling
sequences. relationships between physical fitness, executive functioning, and
academic achievement in primary school children,” Psychology of
Sport & Exercise, vol. 15, no. 4, pp. 319-325, 2014.
Appendix B [7] R. Wang, F. Chen, Z. Chen, T. Li, and A. T. Campbell, “StudentLife:
Regarding the nine assistant indicators described in Section Assessing mental health, academic performance and behavioral trends
IV.C, the former seven are calculated according to [24]; while of college students using smartphones,” In Proc. of the ACM
International Joint Conference on Pervasive & Ubiquitous Computing,
the latter two (LSTM-49, LSTM-1) are selected from the Seattle, WA, USA, 2014.
extracted 50 LSTM-features without any further processing. [8] R. Wang, G. Harari, P. Hao, X. Zhou, and A. T. Campbell, “SmartGPA:
The fundamental mathematical approach to calculate the How smartphones can assess and predict academic performance of
college students,” In Proc. of the ACM International Joint Conference
former seven indicators is the same. The major similarity on Pervasive & Ubiquitous Computing, Osaka, Japan, 2015.
between these indicators is that they all represent certain [9] M. T. Trockel, M. D. Barnes, and D. L. Egget, “Health-related
property of all multiple behaviors involved in our study. The variables and academic performance among first-year college students:
major difference is the input features used for calculation. To Implications for sleep and other behaviors,” Journal of American
College Health, vol. 49, no. 3, pp. 125-131, 2000.
clarify, in this section the mathematical approach to the [10] D. M. Hansen, S. D. Herrmann, K. Lambourne, J. Lee, and J. E.
calculation of D-linear is given. Donnelly, “Linear/nonlinear relations of activity and fitness with
⚫ Step 1. The score of each linear feature (e.g. slope) for children’s academic achievement,” Med Sci Sports Exerc. vol. 46, no.
12, pp. 2279-2285, 2014.
each student is calculated as follows, [11] A. K. Porter, K. J. Matthews, D. Salvo, and H. W. Kohl, “Associations
of physical activity, sedentary time, and screen time with
cardiovascular fitness in United States adolescents: Results from the

VOLUME XX, 2017 9

NHANES national youth fitness survey (NNYFS),” Journal of factors affecting students’ academic performance in higher education,”
Physical Activity and Health, pp. 1-21, 2017. IEEE Access, vol. 7, pp. 98725-98742, 2019.
[12] K. N. Aadland, O. Yngvar, A. Eivind, K. S. Bronnick, L. Arne, and G. [29] T. Phan, S. G. Mcneil, and B. R. Robin, “Students’ patterns of
K. Resaland, “Executive functions do not mediate prospective engagement and course performance in a massive open online course,”
relations between indices of physical activity and academic Computers & Education, vol. 95, pp. 36-44, 2016.
performance: the active smarter kids (ask) study,” Frontiers in [30] S. Liu, X. Peng, H. Cheng, Z. Liu, J. Sun, and C. Yang, “Unfolding
Psychology, vol. 8, pp. 1088, 2017. sentimental and behavioral tendencies of learners’ concerned topics
[13] M. Credé, S. G. Roch, and U. M. Kieszczynska, "Class attendance in from course reviews in a MOOC,” Journal of Educational Computing
college: A meta-analytic review of the relationship of class attendance Research, vol. 57, no. 3, pp. 670-696, 2018.
with grades and student characteristics," Review of Educational [31] S. Helal, J. Li, L. Liu, E. Ebrahimie, S. Dawson, and D. J. Murray,
Research, vol. 80, no. 2, pp. 272-295, 2010. “Predicting academic performance by considering student
[14] S. P. Gilbert, and C. C. Weaver, “Sleep quality and academic heterogeneity,” Knowledge-Based Systems, vol. 161, pp. 134-146,
performance in university students: A wake-up call for college 2018.
psychologists,” Journal of College Student Psychotherapy, vol. 24, no. [32] Z. Liu, C. Yang, L. S. Rüdian, S. Liu, L. Zhao, and T. Wang,
4, pp. 295-306, 2010. “Temporal emotion-aspect modeling for discovering what students are
[15] A. Wald, P. A. Muennig, K. A. O"Connell, and C. E. Garber, concerned about in online course forums,” Interactive Learning
“Associations between healthy lifestyle behaviors and academic Environments, vol. 27, pp. 598-627, 2019.
performance in U.S. undergraduates: A secondary analysis of the [33] A. Zollanvari, R. C. Kizilirmak, Y. H. Kho, and D. Hernandez-torrano,
American college health association\"s National College Health “Predicting students’ GPA and developing intervention strategies
Assessment II,” American Journal of Health Promotion, vol. 28, no. based on self-regulatory learning behaviors,” IEEE Access, vol. 5, pp.
5, pp. 298-305, 2014. 23792-23802, 2017.
[16] P. Scanlon, and A. Smeaton, “Identifying the impact of friends on their [34] A. Akram, C. Fu, Y. Li, M. Y. Javed, R. Lin, Y. Jiang, and Y. Tang,
peers academic performance,” In Proc. of the IEEE/ACM “Predicting students academic procrastination in blended learning
International Conference on Advances in Social Networks Analysis course using homework submission data,” IEEE Access, vol. 7, pp.
and Mining (ASONAM), San Francisco, CA, USA, 2016. 102487-102498, 2019.
[17] E. L. Faught, J. P. Ekwaru, D. Gleddie, K. E. Storey, M. Asbridge, and [35] L. Gao, Z. Zhao, L. Qi, Y. Liang, and J. Du, “Modeling the effort and
P. J. Veugelers, “The combined impact of diet, physical activity, sleep learning ability of students in MOOCs,” IEEE Access, vol. 7, pp.
and screen time on academic achievement: A prospective study of 128035-128042, 2019.
elementary school students in Nova Scotia, Canada,” International [36] Z. Liu, H. Cheng, S. Liu, and J. Sun, “Discovering the two-step lag
Journal of Behavioral Nutrition and Physical Activity, vol. 14, no. 1, behavioral patterns of learners in the college SPOC platform,”
pp. 29, 2017. International Journal of Information and Communication Technology
[18] E. L. Faught, D. Gleddie, K. E. Storey, C.M. Davison, and P. J. Education, vol. 13, no. 1, pp. 1-13, 2017.
Veugelers, “Healthy lifestyle behaviours are positively and [37] Z. Liu, W. Zhang, H. Cheng, J. Sun, and S. Liu, “Investigating
independently associated with academic achievement: an analysis of relationship between discourse behavioral patterns and academic
self-reported data from a nationally representative sample of Canadian achievements of students in SPOC discussion forum,” International
early adolescents,” Plos One, vol. 12, no. 7, pp. e0181938, 2017. Journal of Distance Education Technologies, vol. 16, no. 2, pp. 37-50,
[19] V. Kassarnig, E. Mones, A. Bjerre-Nielsen, P. Sapiezynski, D. D. 2018.
Lassen, and S. Lehmann, “Academic performance and behavioral [38] Z. Liu, N. Pinkwart, H. Liu, S. Liu, and G. Zhang, “Exploring students
patterns,” Epj Data Science, vol. 7, no. 1, pp. 10, 2017. engagement patterns in SPOC forums and their association with
[20] J. E. Donnelly, C. H. Hillman, J. L. Greene, D. M. Hansen, C. A. course performance,” EURASIA Journal of Mathematics, Science and
Gibson, D. K. Sullivan, “Physical activity and academic achievement Technology Education, vol. 14, no. 7, pp. 3143-3158, 2018.
across the curriculum: Results from a 3-year cluster-randomized trial,” [39] B. Kim, B. Vizitei, and V. Ganapathi, “GritNet: Student performance
Preventive Medicine, vol. 99, pp. 140-145, 2017. prediction with deep learning,” in Proc. of the 11th International
[21] N. Morita, T. Nakajima, K. Okita, T. Ishihara, M. Sagawa, and K. Conference on Educational Data Mining, Buffalo, NY, USA, 2018,
Yamatsu, “Relationships among fitness, obesity, screen time and pp. 625-629.
academic achievement in Japanese adolescents,” Physiology & [40] S. Sahebi, and P. Brusilovshky, “Student performance prediction by
Behavior, vol. 163, pp. 161-166, 2016. discovering inter-activity relations,” In Proc. of the 11th International
[22] Y. Cao, J. Gao, D. Lian, Z. Rong, J. Shi, and Q. Wang, “Orderliness Conference on Educational Data Mining, Buffalo, NY, USA, 2018,
predicts academic performance: Behavioural analysis on campus pp. 87-96.
lifestyle,” Journal of the Royal Society Interface, vol. 15, pp. 146, [41] T. L. Kelley, “The selection of upper and lower groups for the
2018. validation of test items,” Journal of Educational Psychology, vol. 30,
[23] H. Yao, D. Lian, Y. Cao, Y. Wu, and T. Zhou, “Predicting academic no. 1, pp. 17–24, 1939.
performance for college students: A campus behavior perspective,” [42] J. Heo, S. Yoon, W. S. Oh, J. W. Ma, S. Ju, and S. B. Yun, "Spatial
ACM Transactions on Intelligent Systems and Technology, vol. 1, no. computing goes to education and beyond: Can semantic trajectory
1, pp. 1-20, 2019. characterize students?" In Proc. of the 5th ACM SIGSPATIAL
[24] Z. Wang, X. Zhu, J. Huang, X. Li, and Y. Ji, “Prediction of academic International Workshop on Analytics for Big Geospatial Data
achievement based on digital campus,” In Proc. of the 11th (BigSpatial'16), Oct. 31-Nov. 03, Burlingame, CA, USA, 2016.
International Conference on Educational Data Mining, Buffalo, NY, [43] J. Heo, H. Lim, S. B. Yun, S. Ju, S. Park, and R. Lee, "Descriptive and
USA, 2018. predictive modeling of student achievement, satisfaction, and mental
[25] I. Pytlarz, S. Pu, and M. Patel, “What can we learn from college health for data-driven smart connected campus life service," In Proc.
students’ network transactions? Constructing useful features for of the 9th International Conference on Learning Analytics &
students prediction,” in Proc. of the 11th International Conference on Knowledge (LAK'19), March, Tempe, AZ, USA, 2019.
Educational Data Mining, Buffalo, NY, USA, 2018, pp. 444-448. [44] X. Zhang, G. Sun, Y. Pan, H. Sun, Y. He, and J. Tan, “Students
[26] S. Ahmad, K. Li, A. Amin, M. S. Abnwar, and W. Khan, “A multilayer performance modeling based on behavior pattern,” Journal of Ambient
prediction approach for the student cognitive skills measurement,” Intelligence and Humanized Computing, vol. 9, pp. 1659–1670, 2018.
IEEE Access, vol. 6, pp. 57470-57484, 2018. [45] S. R. Eddy, “Hidden markov models,” Current Opin Struct Biol, vol.
[27] S. Qu, K. Li, S. Zhang, and Y. Wang, “Predicting achievement of 6, no. 3, 361–365, 1996.
students in smart campus,” IEEE Access, vol. 6, pp. 60264-60273, [46] https://ww2.mathworks.cn/help/stats/hidden-markov-models-
2018. hmm.html.
[28] N. Alalwan, W. M. Al-Rahmi, O. Alfarraj, A. Alzahrani, N. Yahaya, [47] J. Howcroft, J. Kofman, and E. D. Lemaire, “Review of fall risk
and A. Al-Rahmi, “Integrated three theories to develop a model of assessment in geriatric populations using inertial sensors,” Journal of
Neuroengineering & Rehabilitation, vol. 10, no. 1, pp. 1-12, 2013.

VOLUME XX, 2017 9

[48] A. Wolf, J. B. Swift, H. L. Swinney, and J. A. Vastano, “Determining [69] R. Conijn, C. Snijders, A. Kleingeld, and U. Matzat, “Predicting
lyapounov exponents from a time series,” Physica D Nonlinear student performance from LMS data: A comparison of 17 blended
Phenomena, vol. 16, no. 3, pp. 285-317, 1985. courses using moodle LMS,” IEEE Trans. on Learning Technologies,
[49] M. T. Rosenstein, J. J. Collins, and C. J. D. Luca, “A practical method vol. 10, no. 1, pp. 17-29, 2017.
for calculating largest lyapunov exponents from small data set,”
Expert Systems with Applications, vol. 29, no. 3, pp. 506-514, 1993.
[50] S. M. Bruijn, D. J. J. Bregman, O. G. Meijer, P. J. Beek, and J. H.
Dieen, “Maximum lyapunov exponents as predictors of global gait
stability: A modeling approach,” Medical Engineering & Physics, vol.
34, no. 4, pp. 428-436, 2012.
[51] N. F. Güler, E. Ubeyli, and I. Güler, “Recurrent neural networks
employing lyapunov exponents for EEG signals classification,” Expert
Systems with Applications, vol. 29, no. 3, pp. 506-514, 2005.
[52] H. E. Hurst, “Suggested statistical model of some time series which
occurs in nature,” Nature, vol. 180, no. 4584, pp. 494, 1957.
[53] B. Qian, and K .Rasheed, “Hurst exponent and financial market
predictability,” In Proc. of the 2nd IASTED International Conference
on Financial Engineering and Applications, Cambridge, MA, USA,
2004, pp. 356-362.
[54] R. Weron, “Estimating long range dependence: Finite sample
properties and confidence intervals,” Physica A Statistical Mechanics
& Its Applications, vol. 312, no. 1, pp. 285-299, 2001.
[55] C. K. Peng, S. V. Buldyrev, S. Havlin, M. Simons, and A. L.
Goldberger, “Mosaic organization of DNA nucleotides,” Physical
Review E, vol. 49, pp. 1685-1689, 1994.
[56] C. K. Peng, S. Havlin, E. H. Stanley, and A. L. Goldberger,
“Quantification of scaling exponents and crossover phenomena in
nonstationary heartbeat time series,” Chaos: An Interdisciplinary
Journal of Nonlinear Science, vol. 5, no. 1, pp. 82-0, 1995.
[57] R. Hardstone, S. S. Poil, G. Schiavone, R. Jansen, V. V. Nikulin, and
H. D. Mansvelder, “Detrended fluctuation analysis: A scale-free view
on neuronal oscillations,” Frontiers in Physiology, vol. 3, pp. 1-12,
2012.
[58] S. Hochreiter and J. Schmidhuber,. “Long short-term memory,”
Neural Computation, vol. 9, no. 8, pp. 1735-1780, 1997.
[59] S. Alhagry, A. Fahmy, and R. El-Khoribi, “Emotion recognition based
on EEG using LSTM recurrent neural network,” International Journal
of Advanced Computer Science and Applications, vol. 8, no. 10, pp.
355-358, 2017.
[60] M. Wölller, A. Metallinou, F. Eyben B. Schuller, and S. Narayanan,
“Context-sensitive multimodal emotion recognition from speech and
facial expression using bidirectional LSTM modeling,” Interspeech,
Conference of the International Speech Communication Association
(ISCA), Makuhari, Chiba, Japan, September. DBLP, 2010.
[61] Z. Zhao, W. Chen, X. Wu, P. C. Y. Chen, and J. Liu, “LSTM network:
A deep learning approach for short-term traffic forecast,” IET
Intelligent Transport Systems, vol. 11, no. 2, pp. 68-75, 2017.
[62] M. Baccouche, F. Mamalet, C. Wolf, C. Garcia, and A. Baskurt,
“Action classification in soccer videos with long short-term memory
recurrent neural networks,” In: Proc. of ICANN 2020: International
Conference on Artificial Neural Networks, Thessaloniki, Greece,
September, Part II. DBLP, 2010.
[63] R. Tibshirani. “Regression shrinkage and selection via the lasso,”
Journal of the Royal Statistical Society. Series B (Methodological), vol.
58, no. 1, pp. 267–288, 1996.
[64] C. Burges, T. Shaked, E. Renshaw, A. Lazier, and G. N. Hullender,
“Learning to rank using gradient descent,” In: Proc. of the Twenty-
Second International Conference (ICML 2005), Bonn, Germany,
August 7-11, 2005.
[65] S. Helal, J. Li, L. Liu, E. Ebrahimie, S. Dawson, D. J. Murray, and Q.
Long, “Predicting academic performance by considering student
heterogeneity,” Knowledge-Based Systems, vol. 161, pp. 134-146,
2018.
[66] X. Zhang, S. Qu, J. Huang, B. Fang, and P. Yu, “Stock market
prediction via multi-source multiple instance learning,” IEEE Access,
vol. 6, pp. 50720-50728, 2018.
[67] Z. Wu, W. Lin, P. Liu, J. Chen, and L. Mao, “Predicting long-term
scientific impact based on multi-field feature extraction,” IEEE
ACCESS, vol. 7, 2019, 51759-51770.
[68] L. Qiu, Q. Lei, and Z. Zhang, “Advanced sentiment classification of
Tibetan microblogs on smart campuses based on multi-feature fusion,”
IEEE Access, vol. 6, pp. 17896-17904, 2018.

VOLUME XX, 2017 9

This work is licensed under a Creative Commons Attribution 4.0 License. For more information, see https://creativecommons.org/licenses/by/4.0/.

Article 4
No ratings yet
Article 4
9 pages
Predicting Student Performance From Online Engagement Activities Using Novel Statistical Features
No ratings yet
Predicting Student Performance From Online Engagement Activities Using Novel Statistical Features
19 pages
Impact of Online/Virtual Learning On Students' Academic Performance Using Machine Learning Approaches
No ratings yet
Impact of Online/Virtual Learning On Students' Academic Performance Using Machine Learning Approaches
26 pages
Machine Learning Approach To Student
No ratings yet
Machine Learning Approach To Student
15 pages
Academic Performance Prediction Based On Multisource, Multi Feature Behavioral Data
No ratings yet
Academic Performance Prediction Based On Multisource, Multi Feature Behavioral Data
6 pages
Prediction of Student Academic Performance Based On Their Emotional Wellbeing and Interaction On Various e Learning Platforms
No ratings yet
Prediction of Student Academic Performance Based On Their Emotional Wellbeing and Interaction On Various e Learning Platforms
30 pages
Ramaswami 2020
No ratings yet
Ramaswami 2020
5 pages
Base Paper 1
No ratings yet
Base Paper 1
7 pages
(IJCST-V11I4P11) :vaibhav Sharma, Manoj Patil
No ratings yet
(IJCST-V11I4P11) :vaibhav Sharma, Manoj Patil
3 pages
Lucky Mini Project
No ratings yet
Lucky Mini Project
32 pages
Leveraging Machine Learning Approaches For Predicting Students' Academic Success An Analytical Perspective
No ratings yet
Leveraging Machine Learning Approaches For Predicting Students' Academic Success An Analytical Perspective
16 pages
Comp Applic in Engineering - 2022 - Arashpour
No ratings yet
Comp Applic in Engineering - 2022 - Arashpour
17 pages
A Deep Learning Approach Towards Student Performance Prediction in Online Courses Challenges Based On A Global Perspective
No ratings yet
A Deep Learning Approach Towards Student Performance Prediction in Online Courses Challenges Based On A Global Perspective
6 pages
Machine Learning Approaches For Student Performance Prediction
No ratings yet
Machine Learning Approaches For Student Performance Prediction
6 pages
An E Cient Deep Learning Approach For Prediction of Student Performance Using Neural Network
No ratings yet
An E Cient Deep Learning Approach For Prediction of Student Performance Using Neural Network
13 pages
Predicting Student Academic Performanceusing Support Vector Machineand Random Forest
No ratings yet
Predicting Student Academic Performanceusing Support Vector Machineand Random Forest
9 pages
Student Course Grade Prediction Using The Random Forest Algorithm - Analysis of Predictors' Importance
No ratings yet
Student Course Grade Prediction Using The Random Forest Algorithm - Analysis of Predictors' Importance
7 pages
Student Performance Prediction
No ratings yet
Student Performance Prediction
19 pages
Presentation II
No ratings yet
Presentation II
29 pages
Bhuma Devi's Synopsis
No ratings yet
Bhuma Devi's Synopsis
17 pages
A Novel Prediciting Students Performance Approach To Compentency & Hidden Risk Factor Identifier Using A Various Machine Learning Classifiers
No ratings yet
A Novel Prediciting Students Performance Approach To Compentency & Hidden Risk Factor Identifier Using A Various Machine Learning Classifiers
15 pages
2025 Proceedings of The International COnference On Decision Aid and Artificial Intelligence (ICODAI 2024)
No ratings yet
2025 Proceedings of The International COnference On Decision Aid and Artificial Intelligence (ICODAI 2024)
14 pages
AI Models for Online Student Success
No ratings yet
AI Models for Online Student Success
25 pages
Data Mining Approach To Predict Academic Performance of Students
No ratings yet
Data Mining Approach To Predict Academic Performance of Students
11 pages
Predicting Student Performance To
No ratings yet
Predicting Student Performance To
17 pages
PredictingStudentSuccess-AutoML PrePrint
No ratings yet
PredictingStudentSuccess-AutoML PrePrint
23 pages
Prediction of Students Performance With Learning Coefficients Using Regression Based Machine Learning Models
No ratings yet
Prediction of Students Performance With Learning Coefficients Using Regression Based Machine Learning Models
11 pages
Paper Predicting Student Scores
No ratings yet
Paper Predicting Student Scores
10 pages
Kamal 2018
No ratings yet
Kamal 2018
9 pages
Predicting Short-Term Electricity Demand by Combining The Advantages of Arma and Xgboost in Fog Computing Environment
No ratings yet
Predicting Short-Term Electricity Demand by Combining The Advantages of Arma and Xgboost in Fog Computing Environment
3 pages
Master Project
No ratings yet
Master Project
54 pages
Predicting Student Success
No ratings yet
Predicting Student Success
3 pages
PM Web 18058
No ratings yet
PM Web 18058
18 pages
Mining Educational Data To Predict Student's Academic Performance Using Ensemble Methods
No ratings yet
Mining Educational Data To Predict Student's Academic Performance Using Ensemble Methods
19 pages
Applying Machine Learning Approach To Predict Student's Performance I HE
No ratings yet
Applying Machine Learning Approach To Predict Student's Performance I HE
20 pages
Artificial Intelligent Approach To Predict The Student Behaviour and Performance
No ratings yet
Artificial Intelligent Approach To Predict The Student Behaviour and Performance
11 pages
Irjet V7i2688 PDF
No ratings yet
Irjet V7i2688 PDF
4 pages
Predicting At-Risk Students at Different Percentages of Course Length For Early Intervention Using Machine Learning Models
No ratings yet
Predicting At-Risk Students at Different Percentages of Course Length For Early Intervention Using Machine Learning Models
21 pages
Multi-Label Feature Aware XGBoost Model For Student Performance Assessment Using Behavior Data in Online Learning Environment
No ratings yet
Multi-Label Feature Aware XGBoost Model For Student Performance Assessment Using Behavior Data in Online Learning Environment
7 pages
14 Predicting Students Performance in Educational Data Mining
No ratings yet
14 Predicting Students Performance in Educational Data Mining
4 pages
Predicting Student Success with ML
No ratings yet
Predicting Student Success with ML
30 pages
Education 13 00313
No ratings yet
Education 13 00313
14 pages
Proj Report 4
No ratings yet
Proj Report 4
12 pages
Ai-Based Early Prediction and Intervention For Student Academic Performance in Higher Education
No ratings yet
Ai-Based Early Prediction and Intervention For Student Academic Performance in Higher Education
19 pages
1st Review.1
No ratings yet
1st Review.1
10 pages
Prediction Model For Students PDF
No ratings yet
Prediction Model For Students PDF
4 pages
Introduce and Related Work
No ratings yet
Introduce and Related Work
3 pages
Predicting Students Performance Through Data Mini
No ratings yet
Predicting Students Performance Through Data Mini
15 pages
Revue D'intelligence Artificielle: Received: 1 October 2021 Accepted: 21 October 2021
No ratings yet
Revue D'intelligence Artificielle: Received: 1 October 2021 Accepted: 21 October 2021
7 pages
Early Prediction of Student Performance in Face-To-Face Education Environments A Hybrid Deep Learning
No ratings yet
Early Prediction of Student Performance in Face-To-Face Education Environments A Hybrid Deep Learning
15 pages
2950-Article Text-5557-1-10-20210418
No ratings yet
2950-Article Text-5557-1-10-20210418
6 pages
1 s2.0 S2772503025000180 Main
No ratings yet
1 s2.0 S2772503025000180 Main
16 pages
An Integrated System Framework For Predicting Students' Academic Performance in Higher Educational Institutions
No ratings yet
An Integrated System Framework For Predicting Students' Academic Performance in Higher Educational Institutions
9 pages
Literature Review
No ratings yet
Literature Review
2 pages
Reporte Alumnos 2024-1 Ma030502 Metodos Estadisticos
No ratings yet
Reporte Alumnos 2024-1 Ma030502 Metodos Estadisticos
7 pages
Academic Performance and Behav
No ratings yet
Academic Performance and Behav
17 pages
Competency Learning and Student Centric
No ratings yet
Competency Learning and Student Centric
14 pages
Big Data Quiz Solutions
No ratings yet
Big Data Quiz Solutions
4 pages
Modern Information Retrieval Chapter 5 Query Operations
No ratings yet
Modern Information Retrieval Chapter 5 Query Operations
33 pages
ML Opp
No ratings yet
ML Opp
9 pages
Crime Data Analytics & Forecasting
No ratings yet
Crime Data Analytics & Forecasting
13 pages
Intents and Broadcasts
No ratings yet
Intents and Broadcasts
4 pages
A Sentiment Analysis System To Improve Teaching and Learning PDF
0% (1)
A Sentiment Analysis System To Improve Teaching and Learning PDF
8 pages
AI and Machine Learning For Risk Management
No ratings yet
AI and Machine Learning For Risk Management
19 pages
An Access Control Scheme For Big Data Processing: Vincent C. Hu, Tim Grance, David F. Ferraiolo, D. Rick Kuhn
No ratings yet
An Access Control Scheme For Big Data Processing: Vincent C. Hu, Tim Grance, David F. Ferraiolo, D. Rick Kuhn
7 pages
A Multidisciplinary Industrial Robot Approach
No ratings yet
A Multidisciplinary Industrial Robot Approach
8 pages
Engineering Students' Feedback Insights
No ratings yet
Engineering Students' Feedback Insights
9 pages
ML in Financial Crisis Prediction Survey
No ratings yet
ML in Financial Crisis Prediction Survey
16 pages
Precision Fish Farming Insights
No ratings yet
Precision Fish Farming Insights
10 pages
Precision Aquaculture
No ratings yet
Precision Aquaculture
5 pages
HCI NOTES (R16) Unit-I, II
No ratings yet
HCI NOTES (R16) Unit-I, II
59 pages
Design Process HCI Design: Structure and Feedback Are The Seven Principles Used in Interface Designing
No ratings yet
Design Process HCI Design: Structure and Feedback Are The Seven Principles Used in Interface Designing
7 pages
Data Visualization
No ratings yet
Data Visualization
140 pages
HCI Chapter 8 Evaluation Techniques
100% (1)
HCI Chapter 8 Evaluation Techniques
11 pages
Advanced ML for Researchers
No ratings yet
Advanced ML for Researchers
57 pages
Particle Filter Tutorial
No ratings yet
Particle Filter Tutorial
39 pages
Paper 114-Harnessing AI To Generate Indian Sign Language From Natural Speech
No ratings yet
Paper 114-Harnessing AI To Generate Indian Sign Language From Natural Speech
10 pages
A Tutorial On Hidden Markov Models and Selected Applications in Speech Recognition
No ratings yet
A Tutorial On Hidden Markov Models and Selected Applications in Speech Recognition
30 pages
E9 205 - Machine Learning For Signal Processing
No ratings yet
E9 205 - Machine Learning For Signal Processing
2 pages
CS221 Winter 2021 Exam 2 Instructions
No ratings yet
CS221 Winter 2021 Exam 2 Instructions
18 pages
ZUPT
No ratings yet
ZUPT
16 pages
Machine Learning Foundations: Supervised, Unsupervised, and Advanced Learning Taeho Jo Full Access
100% (2)
Machine Learning Foundations: Supervised, Unsupervised, and Advanced Learning Taeho Jo Full Access
57 pages
Multiple Sequence Alignment
No ratings yet
Multiple Sequence Alignment
89 pages
Modeling and Simulation in Python 1st Edition Jason M Kinser Online Reading
No ratings yet
Modeling and Simulation in Python 1st Edition Jason M Kinser Online Reading
84 pages
A Hindi Speech Recognition System For Connected Wo
No ratings yet
A Hindi Speech Recognition System For Connected Wo
8 pages
Ai Unit 5
No ratings yet
Ai Unit 5
16 pages
Wu Et Al 2022 How Is Mobile User Behavior Different A Hidden Markov Model of Cross Mobile Application Usage Dynamics
No ratings yet
Wu Et Al 2022 How Is Mobile User Behavior Different A Hidden Markov Model of Cross Mobile Application Usage Dynamics
21 pages
Context-Free Grammar Explained
No ratings yet
Context-Free Grammar Explained
14 pages
NLP Assignment-3 Solution
100% (1)
NLP Assignment-3 Solution
6 pages
PRML Handout
No ratings yet
PRML Handout
3 pages
Upi Fraud Detection Using Machine Learning
No ratings yet
Upi Fraud Detection Using Machine Learning
5 pages
Churn Analytics
No ratings yet
Churn Analytics
46 pages
Parts of Speech Tagging For Afaan Oromo
No ratings yet
Parts of Speech Tagging For Afaan Oromo
5 pages
Sarowar 2025 Ijca 924776
100% (1)
Sarowar 2025 Ijca 924776
34 pages
Machine Learning Exam Guide
No ratings yet
Machine Learning Exam Guide
11 pages
Model Hidden Markov Pada Prediksi Harga Beras Dan Perpindahan Konsumen Beras Di Kota Solok Provinsi Sumatera Barat Melsi Diansa Putri
No ratings yet
Model Hidden Markov Pada Prediksi Harga Beras Dan Perpindahan Konsumen Beras Di Kota Solok Provinsi Sumatera Barat Melsi Diansa Putri
10 pages
Probability and Statistics UIUC Luthuli
100% (2)
Probability and Statistics UIUC Luthuli
451 pages
Debit Card Fraud Detection
No ratings yet
Debit Card Fraud Detection
3 pages
Sign Language To Speech Conversion
No ratings yet
Sign Language To Speech Conversion
6 pages
A Comprehensive Review On Music Transcription
No ratings yet
A Comprehensive Review On Music Transcription
20 pages
IT8601-Computational Intelligence PDF
No ratings yet
IT8601-Computational Intelligence PDF
12 pages
Automatic Life-Logging: A Novel Approach To Sense Real-World Activities by Environmental Sound Cues and Common Sense
No ratings yet
Automatic Life-Logging: A Novel Approach To Sense Real-World Activities by Environmental Sound Cues and Common Sense
10 pages
Pairwise Cosine Similarity of Emission Probability Matrix As An Indicator of Prediction Accuracy of The Viterbi Algorithm
No ratings yet
Pairwise Cosine Similarity of Emission Probability Matrix As An Indicator of Prediction Accuracy of The Viterbi Algorithm
6 pages
Speech Recognition Overview
No ratings yet
Speech Recognition Overview
11 pages

Academic Performance

Uploaded by

Academic Performance

Uploaded by

This article has been accepted for publication in a future issue of this journal, but has not been

Academic Performance Prediction Based

Corresponding author: Liang Zhao (liang.zhao@mail.ccnu.edu.cn)

I. INTRODUCTION ⚫ Lifestyle Behaviors (e.g., eating, physical activity,

VOLUME XX, 2017 1

2 VOLUME XX, 2017

Consumption Clinic visit

Library interaction Learning

Data Module Prediction Module Feedback Module

II. RELATED WORK 1) BEHAVIORAL CHANGE-LINEAR (BC-LINEAR)

2 VOLUME XX, 2017

2 VOLUME XX, 2017

2 VOLUME XX, 2017

On- online study         

2 VOLUME XX, 2017

Behavioral Change-LSTM (BC-LSTM)

t a Keras LSTM network ➢LSTM-based Features

FIGURE 3. Flowchart of the data module.

Finally, the robustness of the algorithm is tested by 10-fold

2 VOLUME XX, 2017

VOLUME XX, 2017 9

accuracy value of SVM* is 0.866, which is much higher than

C. IDENTIFICATION OF AT-RISK STUDNETS BASED ON

delivered to one at risk student is shown in Fig. 6.

FIGURE 4. Comparisons of the SVM* performance of different data source 0.25

FIGURE 6. Example of a feedback given to one at risk student, including the

0.2 identify at-risk students. Traditionally, those features can be

selected by either statistical analysis (e.g. by ANOVA) or ML

algorithms (e.g. feature importance returned by RF). We

0 recall that in our study, multiple behaviors are involved, and

VOLUME XX, 2017 9

VOLUME XX, 2017 9

STEP 2. BEHAVIORAL DATA REPRESENTATION

Following the time data representation, the raw behavioral

VOLUME XX, 2017 9

VOLUME XX, 2017 9

VOLUME XX, 2017 9

You might also like