DT - Example 1
A thesis submitted to Birmingham City University in partial fulfillment of the requirements of the
degree of Bachelor of Science.
Abstract
Businesses encounter challenges when analysing their processes every day, and there are many
challenges with business processes that organisations are not even aware of. Process mining is an
umbrella of techniques that combines data science and process management to analyse businesses’
processes and diminish the challenges organisations face.
Although process mining is an emerging field, there is still a lack of knowledge and integration of
process mining in small and medium-sized enterprises (SMEs). This paper will
identify the challenges organisations face when analysing processes and the possible solutions will
be tested with scenarios using Celonis (2022). The results will demonstrate how process mining
could reduce the challenges SMEs face when analysing processes with a conceptual model being
produced to help further research.
Acknowledgements
First and foremost, I would like to thank my supervisor, Ensi Smajli, for his supervision, guidance, and
patience throughout the course of my final year project at Birmingham City University. I would also
like to thank Dr Gerald Feldman for his support on this project and throughout my entire university
course.
Table of Contents
Abstract
Acknowledgements
Table of Contents
List of Figures
List of Tables
1.0 Introduction
1.1 Background
2.0 Project Aims and Objectives
3.0 Literature Review
3.1 Business Processes
3.2 Business Process Management in SMEs
3.3 Challenges of Business Processes Management in SMEs
3.4 Process Mining
3.5 Process Mining Case Study – Healthcare
3.6 Process Mining in SMEs
4.0 Methodology
4.1 The Scientific Approach
4.2 Research Methodology
4.2.1 Research Design
4.2.2 Define Research Problem:
4.2.3 Review the Literature
4.2.4 Collect Data
4.2.5 Analyse Data
4.2.6 Interpret and Report
5.0 Results
5.1 The Dataset
5.2 Scenario One:
5.3 Scenario Two:
5.4 Scenario Three:
5.5 Scenario Four
5.6 Scenario Five
5.7 Scenario Six
6.0 Discussion
6.1 Conceptual Model
7.0 Conclusion
7.1 Limitations
7.2 Future Research and Recommendations
References
List of Figures
Figure 1: Process Model results derived from Heuristic mining algorithm
Figure 2: The Scientific Approach (Baker, 2000)
Figure 3: Proposed Changes to Scientific Approach Methodology
List of Tables
Table 1: Possible use cases of process mining (Faizan et al, 2021)
Table 2: Challenges identified with potential process mining solution
1.0 Introduction
Data is used every second within a business, meaning it is increasingly important for organisations to
ensure their data is being processed efficiently and effectively. When an organisation first collects data,
the data is unorganised and cannot be analysed easily. Data transformation “refers to the conversion of
the value of given data point, using some kind of consistent mathematical transformation” (Allen,
2017). This will allow organisations to analyse data and spot trends. Data transformation is
increasingly popular within organisations, more recently with an emphasis on process mining.
Businesses have a number of processes that they use every day to ensure their business is running
smoothly. Atrostic and Nguyen (2006) state that “order taking, inventory monitoring and logistic
tracking” are all processes that most businesses use daily. Processes are used to understand,
manage and coordinate activities within an organisation whilst also helping to resolve issues that arise
easily and efficiently (Bibiano et al, 2007).
Process Mining is a combination of data science and process management which supports the
analysis of business processes based on event logs. Munoz-Gama (2022) states that “Process mining
techniques can be used to analyse business processes using the data logged during their execution.”
The overall aim of process mining is to turn the event logs into event data which will give
organisations insights and analyses of their processes.
1.1 Background
Business process management (BPM) is consistently used within SMEs; however, it has many limitations
that restrict organisations’ ability to expand. BPM normally consists of manual processes
conducted by staff. Vugec et al (2018) state that the main issue and limitation is staff not following the
designed process models that the organisation is using. This means there will be inconsistencies and
potentially costly mistakes if staff are not following protocol, and processes can be missed. Another
limitation of traditional BPM is the gap between business and IT. For several small
businesses, BPM falls under either the business department or the IT department; however, the two need to
work together to be the most efficient with their processes (Ahrend, 2014).
Process mining is an emerging research discipline (Burattin, 2013) that discovers, controls, and
improves real processes by extracting information from an organisation’s event log (Corallo et al,
2020). This can eliminate the limitations previously stated as it can “discover and analyse business
processes based on raw event data” (Van Der Aalst and Dustar, 2012). This will make the change
from manual processes to fast data input treatment, automatic processes and outputs that give
accurate results quickly whilst discovering special cases and checks (Miclo et al, 2013 and Mishra et
al, 2018), thus reducing human error. It will also give staff a better understanding of
processes, which might motivate them more. It enables an organisation to understand what is
happening within its organisational processes (Lee et al, 2013).
This benefits organisations as it is “based on facts” (Aalst, 2012) and gives the opportunity to
“extend or improve an existing process model using the event log” (Aalst, 2012). It is “able to
graphically and logically present the pathway behaviour of customers” (Hwang and Jang, 2017).
Process mining combines historic event data and real-time event data to predict future problems
before they occur (Van Der Aalst, 2004).
However, many SMEs have either not introduced process mining into their organisations or are not
optimising data transformation techniques. Digitalisation is a major challenge for SMEs: many still
use manual processes and therefore need to move to digital transformation before they will be able
to utilise process mining. Melo and Machado (2019) state “Process mining can bring added value to
an SME, but the SME needs to be prepared diligently”. Another issue SMEs have when trying to
introduce process mining is lack of data. Process mining needs “voluminous data” which gives a
large number of cases and events for the algorithm to analyse (Bose et al, 2013). Many SMEs do not
have a dataset of the size needed, meaning the results would not be accurate.
SMEs are unlikely to have the financial capabilities to introduce basic digital data transformation
systems. The introduction of process mining is known to have an annual cost ranging from
thousands to millions of pounds, depending on the size of the organisation. This does not include the
initial cost to introduce process mining in the first place (Melo and Machado, 2019). Combined with the other
limitations discussed above, the added cost of a tool that may not work for the business would be
damaging.
Therefore, this project will compare and analyse how Process Mining techniques help SMEs. This will
be done by using sample data to run process mining analyses and comparing the results to the
results found in the literature review that will be conducted.
3.0 Literature Review
This literature review will discuss and explore current challenges organisations face when analysing
processes using business process management (BPM) and look at how process mining could be a
possible solution to those challenges identified. Although the two themes stated above are the main
themes of the report, the report will also discuss what business processes are, BPM in SMEs and
Process Mining in SMEs.
Business processes are used to adapt to the ever-changing environment in which an organisation
operates, as they enable it to be more responsive. However, this makes it essential for the
organisation to understand the nature of the processes (Leymann and Altenhuber, 1994). It is
therefore key for an organisation to understand what processes it uses, why it uses them,
and how they are used to benefit the business. Business process management (BPM) is “all efforts in
an organization to analyse and continually improve fundamental activities such as manufacturing,
marketing, communications and other major elements of company’s operations” (Trkman, 2010).
Each business process has events and activities which will most likely lead to a decision point. Events
happen automatically with no time duration; these then trigger activities, which are the logical flow of
information to complete the process (Marlon, 2018). This normally leads to decision points that
affect the outcome of the process. BPM uses capability frameworks that link areas that are most
important for the successful “implementation of process orientation” in an organisation
(Kerpedzihev, 2021). Therefore, an organisation needs to have awareness of important areas of their
organisation to successfully establish processes and use business process management. A challenge
that organisations find when analysing processes is that they, particularly SMEs, lack
adequate information and understanding of the current processes (Riss et al, 2005).
To manage processes, many organisations use business process management (BPM), which helps
organisations operate effectively and efficiently through the continuous discovery, execution,
analysis, and redesign of business processes (Dumas et al. 2018). However, organisations face
challenges when using BPM. The next section of this literature review will identify those challenges.
3.3 Challenges of Business Processes Management in SMEs
The success of introducing BPM into an organisation relies on the organisation being willing to look
at the organisation as a whole and not single entities. Gulledge and Sommer (2002) state that some
organisations focus solely on performance measures rather than how to link their business
objectives with processes that are needed. BPM normally consists of manual processes conducted by staff.
Vugec et al (2018) state that the main issue and limitation is staff not following the designed process
models that the organisation is using. This means there will be inconsistencies and potentially costly
mistakes if staff are not following protocol, and processes can be missed. Another challenge that
many organisations have is the misalignment between business strategies and IT strategies. Silva and
Chaix (2018) state “The lack of alignment or misalignment between business strategies and IT
results in the business failing to use the available IT support.” This also shows that a big challenge is
deriving IT goals from the business goals (Van Der Aalst et al, 2007). This shows that organisations
struggle to bridge the gap between business and IT when identifying processes for BPM. A business’s
processes will change as the business grows. This means they will need to be kept up to date, and
processes may have different stages as improvements are made. Alotaibi and Liu (2016) state that
“the transition from one BP stage to another BP stage can be slow and error prone”. This could lead
to more challenges down the line if errors are made in the updating of the business processes. Smart
et al (2005) state that “business processes are streams of activity that flow across functional
boundaries, for this reason, business processes are said to be fragmented and scattered.” Because
they are fragmented and scattered, organisations may not know all activities that occur within a
process.
A possible solution for the challenges SMEs face when using BPM is process mining. The next section
looks at what process mining is.
Conformance checking is “concerned with quantifying the quality of a business process model in
relation to event data that was logged during the execution of business process” (Syring et al, 2019).
It calculates metrics which determine the quality of process models (Dakic et al, 2019). The metrics
can identify how well “a process model allows for behaviour in an event log and metrics that
measure if a process model allows for behaviours beyond the one recorded in an event log” (Theis et
al, 2021). This could help an organisation identify new processes from the possible behaviours
identified. However, the behaviours could be negative for an organisation, which means conformance
checking can be utilised by organisations to find, distinguish, and clarify deviations. It also identifies
the seriousness of a deviation, as it checks to what degree the given process model accurately
describes the event data (Mishra et al, 2018 and Berti et al, 2019).
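The two kinds of metric described above can be illustrated with a minimal sketch. It assumes a heavily simplified setting in which the process model is just a set of allowed activity sequences. The trace data, the function names and the trace-level definitions of fitness and precision are illustrative assumptions, not the metrics used by the cited authors or by any particular tool.

```python
# Hypothetical target model: the set of activity sequences the model allows.
MODEL_TRACES = {
    ("Receive Order", "Confirm Order", "Ship Goods"),
    ("Receive Order", "Approve Credit Check", "Confirm Order", "Ship Goods"),
}

# Hypothetical event log, reduced to one activity sequence per case.
LOG_TRACES = [
    ("Receive Order", "Confirm Order", "Ship Goods"),
    ("Receive Order", "Confirm Order", "Ship Goods"),
    ("Receive Order", "Confirm Order", "Ship Goods"),
    ("Receive Order", "Cancel Order"),
]


def trace_fitness(log_traces, model_traces):
    """Share of logged cases whose sequence is allowed by the model."""
    allowed = sum(1 for trace in log_traces if trace in model_traces)
    return allowed / len(log_traces)


def trace_precision(log_traces, model_traces):
    """Share of model sequences that were actually observed in the log.

    A low value suggests the model allows behaviour beyond what was recorded.
    """
    observed = set(log_traces)
    return len(model_traces & observed) / len(model_traces)


def deviating_traces(log_traces, model_traces):
    """Distinct logged sequences that the model does not allow."""
    return sorted(set(log_traces) - model_traces)


fitness = trace_fitness(LOG_TRACES, MODEL_TRACES)
precision = trace_precision(LOG_TRACES, MODEL_TRACES)
deviations = deviating_traces(LOG_TRACES, MODEL_TRACES)
```

In this toy log three of the four cases fit the model, and only one of the two modelled sequences is ever observed, so the deviating sequence ending in Cancel Order is what an organisation would investigate further.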
Enhancement aims to extend or improve existing process models. To do this it uses information
about actual processes that have been recorded in an event log. For example, it can use a timestamp
in the event log to extend the model to show bottlenecks (Van Der Aalst, 2012). Enhancement
provides organisations with operational support and is argued to be the most “ambitious” form of
process mining as it also uses and combines statistics and machine learning to improve process
models. This is to gain a deeper understanding of an organisation’s processes and how to optimise
them (Caldeira and Abreu, 2016).
Within process mining, there is key terminology used when analysing processes. One term that is
frequently used is bottlenecks. These are subprocesses that deviate from the main process and can
easily be highlighted when using process mining (Caballero-Hernandez et al, 2018). Another term is
throughput time. This is the time a process takes from the start event to the end event. Throughput
time is a performance indicator that allows an organisation to identify areas that are
time-consuming or need further analysis (Geyer-Klingeberg et al, 2018).
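As a minimal sketch of how throughput time can be derived from an event log, the example below measures the time between the first and last event of each case. The event log, its column order (case ID, activity, timestamp) and the function name are hypothetical and are not taken from the project dataset.

```python
from collections import defaultdict
from datetime import datetime

# Hypothetical event log rows: (case ID, activity, timestamp).
EVENT_LOG = [
    ("C1", "Receive Order", "2022-01-03 09:00"),
    ("C1", "Confirm Order", "2022-01-05 09:00"),
    ("C1", "Ship Goods", "2022-01-10 09:00"),
    ("C2", "Receive Order", "2022-01-04 12:00"),
    ("C2", "Cancel Order", "2022-01-04 18:00"),
]


def throughput_days(event_log):
    """Return {case ID: days between the first and last event of the case}."""
    stamps_by_case = defaultdict(list)
    for case_id, _activity, stamp in event_log:
        stamps_by_case[case_id].append(datetime.strptime(stamp, "%Y-%m-%d %H:%M"))
    return {
        case_id: (max(stamps) - min(stamps)).total_seconds() / 86400
        for case_id, stamps in stamps_by_case.items()
    }


durations = throughput_days(EVENT_LOG)
# Average throughput time across all cases, in days.
average_days = sum(durations.values()) / len(durations)
```

Case C1 takes seven days from start to end, while C2 ends within six hours, which is the kind of contrast that would prompt further analysis.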
Table 1 shows possible use cases in which an organisation could use process mining and the technique
that would be used to gain results.
Table 1 Possible use cases of process mining (Faizan et al, 2021).
Table 1 identifies possible use cases for process mining. These use cases could be used further
into the project when testing scenarios for process mining. The use cases identified tackle problems
that have also been highlighted in the literature review. For example, the identification of real-world
business processes would help SMEs identify activities that take place within the process that they
currently are not aware of or do not understand how the activities may link. Searching for
bottlenecks in a business’s process will also help SMEs identify what activities occur that are not
included in the targeted process model.
The results were spaghetti-like and too complex to understand. This meant further breakdown of the
logs to form clusters was needed, which then allowed the Heuristic mining algorithm to be used on
the big clusters. This gives clearer results of processes. Using this technique is time-consuming
and may not be as accurate, meaning it might not be suitable for SMEs.
Introducing process mining into an organisation can be expensive and many SMEs will not have the
financial capabilities to introduce basic digital data transformation systems, let alone process mining,
which can cost an organisation thousands of pounds (Melo and Machado, 2019). The high prices
are due to “demanding requirements in terms of hardware and software infrastructure” (Guarda
et al, 2013).
Once process mining has been completed for a certain aspect of the data there are no “portable
solutions” (Rojas et al, 2016). This means it cannot be adapted if the business environment changes.
This could cause issues for SMEs as they might not have sufficient staff or capital to keep running
different process mining techniques for the changing environments.
SMEs that have already introduced process mining have stated that it has “reduced documentation
effort due to automatic documentation” and have gained an overall better understanding of the
processes, especially for different positions in the organisation hierarchy. This is due to a better
understanding of their processes which has led to better communication (Stertz et al, 2021).
Table 2 BPM challenges identified in SMEs with possible process mining solution
Table 2 gives a summary of the challenges SMEs may face when analysing processes with potential
solutions process mining can give. These will be considered further in the analysis.
To conclude, BPM helps SMEs move from manual processes to automatic processes and has
benefited SMEs in many ways. However, there are limitations or challenges faced when analysing
processes using business process management. The literature has found that process mining could
be a potential solution to some of the challenges. This will be looked at further in the project
analysis in the results section.
4.0 Methodology
To ensure the project’s objectives and the overall aim of the project are met, a clear and structured
methodology needs to be used. Within this section, different methodologies will be discussed before
the final methodology is decided. (Vaushnavi and Kuechler, 2012). These two process steps could be
used for objectives 1-3.
The advantages and disadvantages of using this methodology are shown in the table below.
Table 1 Advantages and Disadvantages of the scientific approach
Advantages: This standardised approach uses objectives when conducting experiments/projects. By
using objectives as a guide for the project it enables the investigator to stay on course. This
approach gives research a purpose to discover answers to the questions identified in the objectives
by applying scientific procedures which have been developed to ensure the research gathered is
relevant and unbiased to the initial question asked (Baker, 2000). In this project, it would be
linked to the overall aim of the project.
Disadvantages: Data and findings are down to the interpretation of the researcher. If the researcher
interprets the data wrong or differently than intended, this could lead to inaccurate or different
results.
This methodology mostly fits with the method of this project; however, to ensure that the objectives
are achieved and meet the overall aim of the project, it needs to be adjusted slightly. The next section
will show the final methodology.
4.2 Research Methodology
The proposed methodology is a slightly edited version of the scientific approach. It has been edited
to meet the needs of the project whilst also keeping the main functions of the original approach.
Figure 3 shows the updated methodology. The main difference is that an artefact is developed, in
this case a Conceptual Model.
The following section will look at each step of the methodology and how it will be used within the
project.
Figure 4 Process Mining Project Methodology (PMPM)
As this is only a sub-methodology for this section of the overall chosen methodology, not all sections
will be used. Each step that will be used will be shown in the table below.
This stage gives an opportunity to understand the data and answer the questions that were
identified in the scenarios. Analysing the data completes objective 4, as it allows testing of process
mining and of how an SME could analyse its data. To analyse the data the software Celonis is being
used.
5.0 Results
The results section covers the collecting and testing of data within the methodology whilst also
completing objective 4. This section is also where the sub-methodology begins, by using aspects of
the Scoping, Data Understanding and Process Mining steps in the PMPM. In this section a dataset
will be used to test how process mining can be used in different scenarios that will enhance the
overall business performance. The dataset being used is an open-source dataset from BPI Challenge
(2019). The dataset will be discussed below.
Figure 5 The secondary sample dataset being used for the project.
Figure 5 shows the dataset within Celonis. In this dataset’s case, the Case ID is the first column, the
timestamp is the Duration column, and the Activity is the number of activities column.
The event log is large, with over 900k cases, meaning it comes from a large organisation. This
project is focused on SMEs. Therefore, to ensure the overall aim of the project is not lost, scenarios
have been identified that SMEs could apply on a smaller scale with a smaller event log. Below, six
scenarios are identified where process mining could be an easy and beneficial solution for
SMEs.
Figure 6 All variants of the create order process
Figure 6 shows all variants and connections of the customer order process from start to finish. There
are over 500 variants and connections, making it impossible for the organisation to decipher and
analyse. The spaghetti-like results make it challenging to interpret any logical process flow, which could
severely limit the organisation’s understanding of the process. However, filtering the
results allows the organisation to discover the most common variants.
To start finding the most common activities, the data was filtered to the most common variant. Figure 7 shows
that the most common variant covers 41% of the overall cases (405k). This has identified six activities,
which are Receive Order, Confirm Order, Generate Delivery Document, Ship Goods, Send Invoice and
Clear Invoice. However, 41% of cases is still a low amount of coverage. To gain a better
understanding of the activities, the top five variants of the process were filtered together (Figure 8).
Figure 8 shows that 74% (735k) of cases are covered by the top five variants, with only five further
activities added to those in the top variant. These are approving credit checks, changing shipping
conditions, removing delivery blocks, returning goods, and cancelling orders. Filtering the different
variants together has helped identify that not all variants have the same process end. For example,
43,728 cases end with Cancel Order after the activity Return Goods. This could indicate to the
organisation that more analysis on the Cancel Order activity should be completed as it could be an
area that needs to be improved.
Identifying the top variants and activities allows the organisation to identify activities and parts
of the overall process that it should focus on and continuously improve in order to
improve business performance. Filtering the process to the top variants also allows the organisation
to identify how it could streamline all cases to fit the top variants, improving the organisation’s
performance.
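The variant filtering used in this scenario can be sketched outside of Celonis as follows. This is a minimal illustration under stated assumptions: the event log is hypothetical, integer step numbers stand in for timestamps, and the helper names are invented for this example rather than taken from any tool.

```python
from collections import Counter, defaultdict


def make_events(case_id, activities):
    """Build hypothetical event rows (case ID, activity, step number)."""
    return [(case_id, activity, step) for step, activity in enumerate(activities)]


STANDARD = ["Receive Order", "Confirm Order", "Ship Goods", "Send Invoice"]
RETURNED = ["Receive Order", "Confirm Order", "Ship Goods", "Return Goods", "Cancel Order"]
CANCELLED = ["Receive Order", "Cancel Order"]

# Hypothetical log: three standard cases, one returned case, one cancelled case.
EVENT_LOG = (
    make_events("C1", STANDARD) + make_events("C2", STANDARD)
    + make_events("C3", STANDARD) + make_events("C4", RETURNED)
    + make_events("C5", CANCELLED)
)


def count_variants(event_log):
    """Count how many cases follow each distinct activity sequence (variant)."""
    traces = defaultdict(list)
    # Sort by case and step so each trace is built in execution order.
    for case_id, activity, _step in sorted(event_log, key=lambda e: (e[0], e[2])):
        traces[case_id].append(activity)
    return Counter(tuple(trace) for trace in traces.values())


def coverage(variant_counts, top_n):
    """Share of all cases covered by the top_n most common variants."""
    total = sum(variant_counts.values())
    covered = sum(count for _variant, count in variant_counts.most_common(top_n))
    return covered / total


variant_counts = count_variants(EVENT_LOG)
top_variant, top_count = variant_counts.most_common(1)[0]
top_one_coverage = coverage(variant_counts, 1)
top_two_coverage = coverage(variant_counts, 2)
```

Raising `top_n` step by step mirrors the move from the single most common variant to the top five variants: coverage grows while the number of additional activities to interpret stays small.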
5.3 Scenario Two:
The organisation has identified that there has been an increase in complaints about late deliveries.
The organisation investigated this and found that there is often a delay in shipping the orders. The
company now wants to find the bottlenecks of the original process to identify the factors that
are leading to delayed shipping, so that changes are made and customer satisfaction is increased.
Figure 9 shows that there are five causes of late delivery. These are Slow Credit Check, Low
Credit Limit, Delayed Order Entry, Material Availability and Other. With Slow Credit Check being the
top reason for late delivery, it will be looked at in further detail as it is the most crucial for the
organisation. To do this the Key Performance Indicator (KPI) will be changed to average throughput
time (Figure 10).
Figure 10 Average throughput time of credit check process
Figure 10 shows that from the time the organisation receives an order from a customer it takes 6
days to approve it and a further 2 days to confirm the order, 3 days to Generate a Delivery
Document and 2 days to finally Ship the goods, totalling 13 days if the case follows the target
process model. However, once the order has been approved there are two other paths it could take. The first
is the change division path, which takes a total of 11 days from Approve Credit Check to the process
end. The second path takes a total of 63 days. This route is the main cause of the delays in delivery
and is where the organisation needs to concentrate on what is causing the average time to be so high.
There are two bottlenecks from the desired process route which show where delays are occurring.
Identifying the bottlenecks allows the organisation to investigate why there is a 22-day wait between
a delivery being released and an invoice being created. Identifying bottlenecks allows an
organisation to understand where the process is deviating from the target model and how it could
be improved.
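A comparable view of waiting times between activities can be sketched without a process mining tool. The example below averages the number of days between each pair of consecutive activities and picks the slowest transition. The event log and its values are hypothetical, and the function name is an assumption made for this illustration.

```python
from collections import defaultdict
from datetime import date

# Hypothetical event log rows: (case ID, activity, date of the event).
EVENT_LOG = [
    ("C1", "Receive Order", date(2022, 1, 1)),
    ("C1", "Approve Credit Check", date(2022, 1, 7)),
    ("C1", "Confirm Order", date(2022, 1, 9)),
    ("C2", "Receive Order", date(2022, 1, 2)),
    ("C2", "Approve Credit Check", date(2022, 1, 6)),
    ("C2", "Confirm Order", date(2022, 1, 10)),
]


def mean_wait_days(event_log):
    """Average days between each pair of directly consecutive activities."""
    events_by_case = defaultdict(list)
    for case_id, activity, day in event_log:
        events_by_case[case_id].append((day, activity))

    waits = defaultdict(list)
    for events in events_by_case.values():
        events.sort()  # order the events of one case by date
        for (day_a, act_a), (day_b, act_b) in zip(events, events[1:]):
            waits[(act_a, act_b)].append((day_b - day_a).days)

    return {pair: sum(days) / len(days) for pair, days in waits.items()}


waits = mean_wait_days(EVENT_LOG)
# The transition with the longest average wait is the first bottleneck candidate.
slowest_transition = max(waits, key=waits.get)
```

Here the wait between Receive Order and Approve Credit Check averages five days against three days for the next step, so the credit check would be the first place to look.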
Figure 11 Average Throughput Time
Figure 12 shows the shortest average throughput time; however, it covers 0% of the cases when rounded, as only
3272 cases out of the total 988K cases follow it. The total throughput time is 0 days as the only event
is receiving an order. This shows that although it is the shortest throughput time it does not give the
organisation insight into the throughput time of an order process as it only has one activity. The
organisation can use this to investigate why the process didn’t continue; however, that is currently
not what the organisation is looking for.
To gain a better understanding of the quickest throughput time of an order that goes through more
events the variants are sorted by most common variant (Figure 13).
Figure 13 shows that the most common variant has the fastest throughput time, at 11.8 days. This
covers 41% of all cases. The organisation can look at this information in more detail by looking at the
events these cases go through.
Figure 14 shows the time it takes for each event to take place. It shows that the longest amount of
time between events is between Send Invoice and Clear Invoice which is 12 days. This result shows
that 41% of cases follow these activities and throughput times.
The longest throughput time of an order process is 312.8 days (Figure 15). Although it has only
happened in 3 cases the organisation could investigate why it has been repeated more than once.
This could be a human error or something more serious that needs to be investigated.
Comparing 312.8 days to 11.8 days shows that there is a major issue with long throughput time as
there were multiple variants with similar averages. This has identified that there may be multiple
bottlenecks within the order process, and it is something that needs to be immediately looked at to
ensure the performance is continuously improving. If nothing is done it could lead to further
problems and a decrease in customer satisfaction. If the organisation does not frequently analyse
and update processes, it could lead to further issues.
5.5 Scenario Four
A company has noticed an increase in cancelled orders and wants to identify the stage of the order
process at which orders are being cancelled. This will help them to identify possible improvements
and solutions to stop cancellations from occurring so often.
To start the investigation the data was filtered to only show variants that included the Cancel Order
event (Figure 16).
Once the data has been filtered, it shows that there are 7 different variants in which orders have
been cancelled (Figure 17). Overall, 62.5k orders have been cancelled, which is 6% of total orders
(Figure 18).
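The Cancel Order filter described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the traces are invented, the activity names other than Cancel Order and Approve Credit Check are placeholders, and cancelled_variants is a hypothetical helper rather than a Celonis function.

```python
from collections import Counter

# Hypothetical traces keyed by case id (one ordered list of activities per case).
TRACES = {
    "A": ["Receive Order", "Confirm Order", "Ship Goods", "Return Goods", "Cancel Order"],
    "B": ["Receive Order", "Cancel Order"],
    "C": ["Receive Order", "Confirm Order", "Ship Goods", "Send Invoice", "Clear Invoice"],
    "D": ["Receive Order", "Confirm Order", "Ship Goods", "Return Goods", "Cancel Order"],
    "E": ["Receive Order", "Approve Credit Check", "Cancel Order"],
}


def cancelled_variants(traces, activity="Cancel Order"):
    """Keep only cases containing `activity`, then count the variants they follow."""
    selected = {case_id: trace for case_id, trace in traces.items() if activity in trace}
    variant_counts = Counter(tuple(trace) for trace in selected.values())
    share_of_all_cases = len(selected) / len(traces)
    return variant_counts.most_common(), share_of_all_cases


VARIANTS, SHARE = cancelled_variants(TRACES)
print(f"{len(VARIANTS)} variants contain Cancel Order ({SHARE:.0%} of all cases)")
for variant, count in VARIANTS:
    print(count, " > ".join(variant))
```

The first entry of the returned list plays the role of variant 1 in the scenario: the most frequent path that ends in a cancellation.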
The highest percentage of cases was in variant 1, which accounted for 70% of cancelled cases.
Figure 19 shows the flow this variant took.
Figure 19 Process flow of variant 1
Figure 19 shows that the highest percentage of orders (70% of all cancelled orders) are cancelled
because the goods are returned. The organisation may need to review this reason.
Highlighting an activity that repeatedly happens throughout the process will give
the organisation insight into the areas that need to be improved.
Figure 20 All variants that Cancel Order Activity appears in
Figure 20 shows all 7 variants, including the bottlenecks, in one view. It shows the organisation
that the other main reason an order is cancelled is that the order is received but then cancelled.
This indicates that the organisation may need to investigate why orders are being cancelled straight
away. However, it also shows that orders are cancelled during the credit check process; this
might be due to bad credit and be no fault of the organisation.
Figure 21 shows the organisation’s target model. This model is very basic, and it is likely that the
actual process does not follow it. Conformance checking allows the organisation to look at activities
that continuously occur outside of the target model.
Figure 22 Conformance Statistics
Figure 22 shows that 42% (419k) of cases conform to the target model, meaning that 58% (584k) of
cases in the data set do not conform, with a total of 38 violations. Process mining allows the
organisation to investigate the identified violations in more detail.
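To make the idea of a conformance rate concrete, the sketch below applies a deliberately simplified check: a case conforms only if its trace matches the target sequence exactly, and any activity outside the target model is counted as undesired. This is an assumption made for illustration and is much cruder than the conformance checker in Celonis. The target model, the traces and the function name check_conformance are all hypothetical.

```python
from collections import Counter

# Hypothetical target model: the desired sequence of activities.
TARGET = ["Receive Order", "Confirm Order", "Ship Goods", "Send Invoice", "Clear Invoice"]

# Hypothetical traces keyed by case id.
TRACES = {
    "A": list(TARGET),
    "B": ["Receive Order", "Approve Credit Check", "Confirm Order",
          "Ship Goods", "Send Invoice", "Clear Invoice"],
    "C": ["Receive Order", "Cancel Order"],
    "D": list(TARGET),
    "E": ["Receive Order", "Approve Credit Check", "Cancel Order"],
}


def check_conformance(traces, target):
    """Return the share of exactly conforming cases and undesired-activity counts."""
    allowed = set(target)
    conforming = 0
    undesired = Counter()
    for trace in traces.values():
        if trace == target:
            conforming += 1
        else:
            # Count each undesired activity once per non-conforming case.
            undesired.update({activity for activity in trace if activity not in allowed})
    return conforming / len(traces), undesired


RATE, UNDESIRED = check_conformance(TRACES, TARGET)
print(f"{RATE:.0%} of cases conform")
print(UNDESIRED.most_common())
```

The same ratio underlies the reported statistics: 419k conforming cases out of 419k + 584k cases is approximately 41.8%, which rounds to the 42% shown in Figure 22.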
Figure 23 shows the three highest violations of the target model. These activities are Approve Credit
Check, Remove Delivery Block and Cancel Order. Each of these violations has undesirable effects on
the overall process, e.g. added throughput time. Each violation can be looked at in more detail by
viewing the cases in the process explorer. Figure 24 shows the Approve Credit Check activity.
Figure 24 Approve Credit Check undesired activity
Viewing the undesired activity in more detail gives the organisation a further understanding of where
the activity fits into the process. Figure 24 shows that 168,999 cases flow into Approve Credit
Check from the start of the process (receiving the order), whilst 2,375 cases flow into it from
Confirm Order. This could indicate to the organisation that it needs to investigate further why a
credit check is completed for some but not all orders.
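The flow counts in Figure 24 are counts of which activity comes directly before Approve Credit Check. A minimal sketch of that calculation is given below, assuming invented traces and a hypothetical helper named predecessors. Note that it counts transitions, so a case that repeats the activity would be counted more than once.

```python
from collections import Counter

# Hypothetical traces keyed by case id.
TRACES = {
    "A": ["Receive Order", "Approve Credit Check", "Confirm Order"],
    "B": ["Receive Order", "Confirm Order", "Approve Credit Check"],
    "C": ["Receive Order", "Approve Credit Check", "Confirm Order"],
    "D": ["Receive Order", "Confirm Order"],
}


def predecessors(traces, activity, start_label="Process Start"):
    """Count the activities that directly precede `activity` across all traces."""
    counts = Counter()
    for trace in traces.values():
        previous = start_label  # label used when `activity` is the first event
        for current in trace:
            if current == activity:
                counts[previous] += 1
            previous = current
    return counts


INCOMING = predecessors(TRACES, "Approve Credit Check")
print(INCOMING.most_common())
```

Case D never reaches the credit check, which is the situation the scenario flags: the check is carried out for some orders but not all.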
To start this process, the months in which the net order value has decreased need to be identified.
Figure 25 Identifying decrease of net order value
Figure 25 highlights that between March and April 2017 the net order value falls. To analyse the
process, the results need to be filtered to only show the two months identified.
Figure 26 Filtered Results showing March and April net order value.
Figure 26 shows only the two months that are being analysed. To identify whether the process
deviates from the target model, the two process flows need to be compared.
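The month-based filtering used here can be illustrated with a short Python sketch. It totals net order value per month, finds where the total falls from one month to the next, and keeps only the cases from those two months. The case records, their values and the function names are assumptions for illustration and do not come from the project data.

```python
from collections import defaultdict
from datetime import date

# Hypothetical cases: (case id, order date, net order value).
CASES = [
    ("A", date(2017, 2, 10), 500.0),
    ("B", date(2017, 3, 5), 600.0),
    ("C", date(2017, 3, 20), 300.0),
    ("D", date(2017, 4, 8), 400.0),
]


def net_value_by_month(cases):
    """Sum net order value per (year, month), in chronological order."""
    totals = defaultdict(float)
    for _, order_date, net_value in cases:
        totals[(order_date.year, order_date.month)] += net_value
    return dict(sorted(totals.items()))


def falling_pairs(totals):
    """Return (previous month, month) pairs where the total decreases.

    Months are compared in the order they appear in `totals`; a month with no
    cases is simply absent, so its neighbours are compared with each other.
    """
    months = list(totals)
    return [(prev, cur) for prev, cur in zip(months, months[1:])
            if totals[cur] < totals[prev]]


def filter_months(cases, months):
    """Keep only the cases whose order date falls in one of `months`."""
    wanted = set(months)
    return [case for case in cases if (case[1].year, case[1].month) in wanted]


TOTALS = net_value_by_month(CASES)
PAIRS = falling_pairs(TOTALS)
SELECTED = filter_months(CASES, PAIRS[0])
print(TOTALS, PAIRS, [case[0] for case in SELECTED])
```

The filtered cases can then be passed to a variant or conformance analysis so that only the months of interest are compared with the target model.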
Figure 27 Target Process Model
Figure 28 Cases that follow the target process flow in March and April
Figure 29 Process Flow of selected two months with top 5 variants
Figure 27 shows that the target process flow should only flow through 6 activities from start to
finish. However, Figure 28 shows that in March and April 2017 only 86,000 of 198,000 cases
follow the target process flow. Figure 29 shows that the bottlenecks that have been identified are
approving the credit check, changing of shipping conditions, the return of goods and payment
reminders.
Figure 29 shows that 40,000 cases flow through approve credit check which could indicate that this
activity needs to be added to the target process model. It also shows that 11,000 orders were
returned and cancelled. This has highlighted a possible issue with the goods or quality of service
customers receive, indicating this bottleneck needs to be looked at by the organisation.
Each of the scenarios tackles challenges that SMEs may face daily when analysing their processes.
The next section will link the challenges identified in the literature review with the outcome of the
scenarios.
6.0 Discussion
This section will cover the Interpret and Report section of the methodology whilst also completing
objective 5. The discussion will investigate the understanding of the data and how the results of
running process mining on the dataset could help SMEs face challenges that were identified in the
literature review. Each scenario will be linked with a challenge, which will help develop the
conceptual model further in the discussion.
The literature review identified the issue of SMEs not knowing or understanding the activities a
specific process goes through. Scenario One takes this into consideration. It looks at a process from
start to finish with all activities involved. The initial result is unreadable as it has hundreds of
bottlenecks and variations, which gives the process a spaghetti-like look. A spaghetti-like process
model does not provide any meaningful information from the event logs, rendering it useless to
an organisation (Van der Aalst and Gunther, 2007). This led to further filtering of the event data to
explore what the most common variant was. Filtering and identifying the most common variant
allows the user to identify and analyse the most common activities of the process.
Variance-based filtering clusters the event logs to discover simpler process models (Eck et al, 2015).
Although SMEs may not have enough event data to have a complex spaghetti-like result, they may
still have bottlenecks and many different variants of the process. Filtering the results would allow
SMEs to streamline the data and identify different activities and potential variants. The results of
scenario one showed that SMEs could gain a better understanding of activities within a process by
using process discovery and then further filtering the results. It allows an SME to look at the process
from start to finish with statistics, allowing an organisation to view the entire process, not just
single activities. This was another challenge identified in the literature. Song and Van der Aalst
(2007) support this when stating that process mining shows the overall process events at a glance,
making it easy for an analysis of the entire process to be completed.
Scenario Six also helps the organisation understand its processes by narrowing the results down to
a specific subset. If an organisation notices a decrease in net sales, it can filter the results to
specific months. Scenario Six identifies and helps an organisation understand which cases deviate
from the target process flow and gives an organisation an indication of which aspects could be
improved to increase net sales. This gives the organisation a better understanding of the process
flow as filtering the event logs can enable an SME to extract important knowledge that they might
not have been aware of beforehand (De Weerdt et al, 2013). Zelst et al (2018) also state that
filtering results accurately extracts knowledge for organisations which will help them understand
and gain specific results e.g. looking at causes of bottlenecks.
Another challenge identified in the literature review is that an organisation’s staff do not follow the
designed process model, especially in SMEs where the process model is not clearly defined. Scenario
Five uses conformance checking to identify bottlenecks within the process where variants of the
process do not follow the designed process model. An SME would have the opportunity to insert
their target process model and then use conformance checking. It would allow an SME to identify
non-conforming cases within the event log. Scenario five shows that it would also give an
organisation statistics about each non-conforming event. This will show inconsistencies between the
process model and the actual event data (Rozinat and Van Der Aalst, 2008).
Scenario Two also looks at this challenge by investigating late deliveries. It identifies the bottlenecks
that are delaying the orders being shipped. Although the delay in shipping may not be relevant to all
SMEs and all processes the key part of this scenario is identifying the bottlenecks. It allows an
organisation to identify why there are delays or deviations in the process. This could be staff not
following the process model or it could have identified an area that needs to be looked at in more
detail. If SMEs identify bottlenecks and non-conforming activities, it will make staff aware of the
desired process model and give them perspective on how they can make changes to follow it
(Dawande et al, 2021).
Processes need to be continuously analysed and updated to ensure they are working efficiently. This
is a major challenge to SMEs as many manually analyse processes but do not keep them up to date.
Scenario Two identifies the bottlenecks of the process that are causing delayed shipping. Identifying
the bottlenecks allows an organisation to look at where it is deviating from the desired process
model. SMEs could use this to see if activities need to be added to the process model to ensure it is
working in its most efficient and up to date way. SMEs would benefit from identifying where
processes need to be changed and updated as it will allow them to work more efficiently and in turn
reduce the risks associated with non-conforming variants (Garcia et al, 2017).
Scenario Three could also help SMEs identify where processes may need updating. The scenario
looks at the average throughput time of a process from start to finish. SMEs could use this to
identify activities that are making the average time of the process increase. A decision could then
be made to change the activities or add different activities to the process model to decrease the
time. Identifying which activities are the slowest and determining delays between activities would
allow SMEs to continuously identify activities that could be added or deleted to ensure each activity
is working at its full potential (Ailenei et al, 2011).
Scenario Four also looks at a solution to this challenge. It identifies at what stages orders are being
cancelled. The results show that most orders are cancelled within a bottleneck of the target process
model. This indicates that there are activities that are causing issues within the process. SMEs could
easily find this information and immediately make changes to try to reduce the number of
cancelled orders. Scenario Six also contributes to this challenge by identifying bottlenecks of a
process for specific months. It was identified that several cases were flowing through a specific
activity. Detecting the root cause of variations could indicate that the activity needs to be updated in
the target process model (Vogelgesang et al, 2022).
Figure 30 Conceptual Model of Challenges of analysing processes with process mining as a solution.
Figure 30 shows four challenges that were identified in the literature review and how the scenarios
link to those challenges as possible solutions. Scenario One and Six could be a possible solution for
two challenges which will help SMEs look at the entire process not just activities as single entities
and will help give SMEs a better understanding of their processes. Scenarios Two, Three, Four, Five
and Six could all help ensure SMEs identify where they need to update their processes. Finally,
Scenarios Five and Six help to ensure that SMEs’ staff follow the target process model. The conceptual
model could help SMEs easily visualise how process mining could be beneficial to their organisation.
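The links described for Figure 30 can also be written down as a simple lookup structure. The sketch below encodes only the scenario-to-challenge links stated in the paragraph above; the challenge labels are shortened paraphrases and the names CONCEPTUAL_MODEL and challenges_for are assumptions made for illustration.

```python
# Challenge -> scenarios proposed as possible solutions (as described for Figure 30).
CONCEPTUAL_MODEL = {
    "Looking at the entire process, not single activities": ["One", "Six"],
    "Understanding the processes": ["One", "Six"],
    "Updating processes regularly": ["Two", "Three", "Four", "Five", "Six"],
    "Staff following the target process model": ["Five", "Six"],
}


def challenges_for(scenario, model=CONCEPTUAL_MODEL):
    """Return the challenges that a given scenario is linked to."""
    return [challenge for challenge, scenarios in model.items() if scenario in scenarios]


print(challenges_for("Six"))
print(challenges_for("Three"))
```

Reading the structure in this direction shows, for example, that Scenario Six is linked to all four challenges, while Scenario Three is linked only to keeping processes up to date.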
7.0 Conclusion
To conclude, a comparative analysis has been carried out to find whether implementing process mining
could be beneficial to SMEs. It was identified that SMEs currently face many challenges, not only in
analysing processes but also in managing them altogether. Based on the findings from the literature
review and the testing of process mining with scenarios, it is believed that process mining could be a
solution to the challenges identified and, overall, be beneficial to SMEs. It was found that the main
challenges faced by SMEs when analysing processes are a lack of understanding of the processes,
looking only at single activities rather than at the process as a whole, staff not following the
target process model and, finally, not updating their processes regularly. The project has found that
process discovery, conformance checking and identifying the bottlenecks of the process model could
be extremely beneficial to SMEs. To aid SMEs when implementing process mining, a conceptual
model linked with the challenges and scenarios within this project has been created. This will further
benefit SMEs in implementing process mining into their organisations.
The overall aim of the project has been achieved along with the five objectives that were created.
Each objective was a building block of the project to achieve the overall aim, ensuring this project
has a positive outcome.
7.1 Limitations
The project has been successful; however, there are limitations to the overall results. The main
limitation of the project is that secondary data was used. The secondary data was a
large data set, meaning it is only an assumption that the results of the project would work for SMEs.
Primary data from a smaller event log would have given more accurate overall results. However,
due to the time-consuming nature of creating an event log with SMEs, this was not achievable in the
time available and therefore secondary data was used.
Another limitation of the project is the use of scenarios. The scenarios were created from the
secondary research found in the literature review. However, primary research such as
questionnaires could have been used to find out first-hand the challenges SMEs face when analysing
processes. This does not affect the results of the overall project but could have enhanced and added
more value to the result.
A challenge that has been identified is that SMEs do not have enough data for event logs. Although
this could not be looked at during this project due to time constraints, SMEs could build an event
log from their data over a few months. Process mining is not an easy solution. SMEs will need to take
time to ensure that they meet all the requirements. Once they have done this, it is believed that the
implementation of process mining would be extremely beneficial to SMEs.
The final limitation identified is that there has been no validation with field experts for this project,
however, this could be a starting base for further work. Although there has been no validation of the
project it has been identified that process mining could be beneficial to SMEs if they took the time to
truly understand the technique.
7.2 Future Research and Recommendations
In future research, the project could be extended by collecting primary data from an SME and testing
similar scenarios. This would give more accurate results for the scenarios whilst also allowing SMEs to
view actual results for similar event log sizes, not just assumptions. Scenarios would also be
more accurate with primary research, as real-life challenges that SMEs face day to day could be
considered. The project could also be validated by field experts in future research, which could
provide more insight into the accuracy of the project, how it could be improved and how process
mining could be beneficial to SMEs. Future research will also be required into how SMEs can cope
with handling the amount of data for process mining and how it can be done cost-effectively, as this
was a challenge identified that could not be looked at within the scope of this project.
To conclude, there are areas of the project that have been limited and could be improved in future
work. However, the project has identified that SMEs would benefit from implementing process
mining and the conceptual model that has been developed could be a foundation for future research
on this topic.
References
Ahrend, N., 2014. Opportunities and limitations of BPM initiatives in public administrations across
levels and institutions.
Ailenei, I., Rozinat, A., Eckert, A. and van der Aalst, W.M., 2011, August. Definition and validation of
process mining use cases. In International Conference on Business Process Management (pp. 75-86).
Springer, Berlin, Heidelberg.
Alotaibi, Y. and Liu, F., 2017. Survey of business process management: challenges and
solutions. Enterprise Information Systems, 11(8), pp.1119-1153.
Baker, M.J., 2000. Selecting a research methodology. The marketing review, 1(3), pp.373-397.
Baskerville, R., Pries-Heje, J. and Venable, J., 2009, May. Soft design science methodology.
In Proceedings of the 4th international conference on design science research in information systems
and technology (pp. 1-11).
Baskerville, R., Baiyere, A., Gregor, S., Hevner, A. and Rossi, M., 2018. Design science research
contributions: Finding a balance between artefact and theory. Journal of the Association for
Information Systems, 19(5), p.3.
Bazhenova, E., Taratukhin, V. and Becker, J., 2012, September. Towards on business process
management on small-to medium enterprises in the emerging economies. In 2012 7th International
Forum on Strategic Technology (IFOST) (pp. 1-5). IEEE.
Belzer, A. and Ryan, S., 2013. Defining the problem of practice dissertation: Where's the
practice, what's the problem?. Planning & Changing, 44.
Berti, A., Van Zelst, S.J. and van der Aalst, W., 2019. Process mining for python (PM4Py): bridging the
gap between process-and data science. arXiv preprint arXiv:1905.06169.
Bolt, A., de Leoni, M. and van der Aalst, W.M., 2016. Scientific workflows for process mining: building
blocks, scenarios, and implementation. International Journal on Software Tools for Technology
Transfer, 18(6), pp.607-628.
Bose, R.J.C., Mans, R.S. and van der Aalst, W.M., 2013, April. Wanna improve process mining
results?. In 2013 IEEE symposium on computational intelligence and data mining (CIDM) (pp. 127-
134). IEEE.
Burattin, A., 2013. Applicability of process mining techniques in business environments.
Burnard, P., Gill, P., Stewart, K., Treasure, E. and Chadwick, B., 2008. Analysing and presenting
qualitative data. British dental journal, 204(8), pp.429-432.
Caballero-Hernández, J.A., Dodero, J.M., Ruiz-Rube, I., Palomo-Duarte, M., Argudo, J.F. and
Domínguez-Jiménez, J.J., 2018, September. Discovering bottlenecks in a computer science degree
through process mining techniques. In 2018 International Symposium on Computers in Education
(SIIE) (pp. 1-6). IEEE.
Caldeira, J. and e Abreu, F.B., 2016, September. Software development process mining: Discovery,
conformance checking and enhancement. In 2016 10th International Conference on the Quality of
Information and Communications Technology (QUATIC) (pp. 254-259). IEEE.
Creswell, J.W., 2003. A framework for design. Research design: Qualitative, quantitative, and mixed
methods approaches, pp.9-11.
Dawande, M., Feng, Z. and Janakiraman, G., 2021. On the structure of bottlenecks in processes.
Management Science, 67(6), pp.3853-3870.
De Weerdt, J., Vanden Broucke, S., Vanthienen, J. and Baesens, B., 2013. Active trace clustering for
improved process discovery. IEEE Transactions on Knowledge and Data Engineering, 25(12),
pp.2708-2720.
Dakic, D., Sladojevic, S., Lolic, T. and Stefanovic, D., 2019, September. Process mining possibilities
and challenges: a case study. In 2019 IEEE 17th International Symposium on Intelligent Systems and
Informatics (SISY) (pp. 000161-000166). IEEE.
Dobrosavljević, A. and Urošević, S., 2019. Analysis of business process management defining and
structuring activities in micro, small and medium–sized enterprises. Operational Research in
Engineering Sciences: Theory and Applications, 2(3), pp.40-54.
Dumas, M. et al., 2018. Fundamentals of Business Process Management. 2nd edition. Berlin:
Springer.
Eck, M.L.V., Lu, X., Leemans, S.J. and Van Der Aalst, W.M., 2015, June. PM²: a process mining
project methodology. In International conference on advanced information systems engineering (pp.
297-313). Springer, Cham.
Faizan, M., Zuhairi, M.F., binti Ismail, S. and Ahmed, R., 2021. Challenges and use cases of process
discovery in process mining.
García, L.M., Pardo-Hernandez, H., Superchi, C., de Guzman, E.N., Ballesteros, M., Roteta, N.I.,
McFarlane, E., Posso, M., i Figuls, M.R., Del Campo, R.R. and Sanabria, A.J., 2017. Methodological
systematic review identifies major limitations in prioritization processes for updating. Journal of
clinical epidemiology, 86, pp.11-24.
Geyer-Klingeberg, J., Nakladal, J., Baldauf, F. and Veit, F., 2018, July. Process Mining and Robotic
Process Automation: A Perfect Match. In BPM (Dissertation/Demos/Industry) (pp. 124-131).
Guarda, T., Santos, M.F., Augusto, M.F., Silva, C. and Pinto, F., 2013, June. Process mining: a
framework proposal for pervasive business intelligence. In 2013 8th Iberian Conference on
Information Systems and Technologies (CISTI) (pp. 1-4). IEEE.
Gulledge, T.R. and Sommer, R.A., 2002. Business process management: public sector implications.
Business process management journal.
Gupta, A. and McDaniel, J., 2002. Creating competitive advantage by effectively managing
knowledge: A framework for knowledge management. Journal of knowledge Management
practice, 3(2), pp.40-49.
Hevner, A. and Chatterjee, S., 2010. Design science research in information systems. In Design
research in information systems (pp. 9-22). Springer, Boston, MA.
Hox, J.J. and Boeije, H.R., 2005. Data collection, primary vs. secondary. Encyclopedia of social
measurement, 1(1), pp.593-599.
Kerpedzhiev, G.D., König, U.M., Röglinger, M. and Rosemann, M., 2021. An exploration into future
business process management capabilities in view of digitalization.
Knopf, J.W., 2006. Doing a literature review. PS: Political Science & Politics, 39(1), pp.127-132.
Kothari, C.R., 2004. Research methodology: Methods and techniques. New Age
International.
Kuechler, W. and Vaishnavi, V., 2012. A framework for theory development in design science
research: multiple perspectives. Journal of the Association for Information systems, 13(6), p.3.
Munoz-Gama, J., Martin, N., Fernandez-Llatas, C., Johnson, O.A., Sepúlveda, M., Helm, E., Galvez-
Yanjari, V., Rojas, E., Martinez-Millana, A., Aloini, D. and Amantea, I.A., 2022. Process mining for
healthcare: Characteristics and challenges. Journal of Biomedical Informatics, 127, p.103994.
Nayak, A. and Samanta, D., 2011. Synthesis of test scenarios using UML activity diagrams. Software
& Systems Modeling, 10(1), pp.63-89.
Offermann, P., Levina, O., Schönherr, M. and Bub, U., 2009, May. Outline of a design science
research process. In Proceedings of the 4th International Conference on Design Science Research in
Information Systems and Technology (pp. 1-11).
Pace, D.K., 2000. Ideas about simulation conceptual model development. Johns Hopkins APL
technical digest, 21(3), pp.327-336.
Perez-Castillo, R., Weber, B., Pinggera, J., Zugal, S., de Guzmán, I.G.R. and Piattini, M., 2011.
Generating event logs from non-process-aware systems enabling business process mining.
Enterprise Information Systems, 5(3), pp.301-335.
Pereira, G.B., Santos, E.A.P. and Maceno, M.M.C., 2020. Process mining project methodology in
healthcare: a case study in a tertiary hospital. Network Modeling Analysis in Health Informatics and
Bioinformatics, 9(1), pp.1-14.
Povlakic, V., 2021. The relations between CRM, BPM, and IT: A study done on Swedish SMEs.
Riss, U.V., Rickayzen, A., Maus, H. and van der Aalst, W.M., 2005. Challenges for business process
and task management. Journal of Universal Knowledge Management, 2, pp.77-100.
Rojas, E., Munoz-Gama, J., Sepúlveda, M. and Capurro, D., 2016. Process mining in healthcare: A
literature review. Journal of biomedical informatics, 61, pp.224-236.
Rozinat, A. and Van der Aalst, W.M., 2008. Conformance checking of processes based on monitoring
real behavior. Information Systems, 33(1), pp.64-95.
Ruël, H.J., Bondarouk, T. and Smink, S., 2010. The waterfall approach and requirement uncertainty:
An in-depth case study of an enterprise systems implementation at a major airline company.
International Journal of Information Technology Project Management (IJITPM), 1(2), pp.43-60.
Schlager, E., 2007. A comparison of frameworks. Theories of the policy process, pp.293-320.
Silva, E. and Chaix, Y., 2008, January. Business and IT governance alignment simulation essay on a
business process and IT service model. In Proceedings of the 41st Annual Hawaii International
Conference on System Sciences (HICSS 2008) (pp. 434-434). IEEE.
Smart, P.A., Maull, R.S., Karasneh, A.A.F., Radnor, Z.J. and Housel, T.J., 2003. An approach for
identifying value in business processes. Journal of Knowledge Management.
Snyder, H., 2019. Literature review as a research methodology: An overview and guidelines. Journal
of business research, 104, pp.333-339.
Song, M. and van der Aalst, W.M., 2007, December. Supporting process mining by showing events at
a glance. In Proceedings of the 17th Annual Workshop on Information Technologies and Systems
(WITS) (pp. 139-145).
Stertz, F., Mangler, J., Scheibel, B. and Rinderle-Ma, S., 2021, September. Expectations vs.
experiences–process mining in small and medium sized manufacturing companies. In International
Conference on Business Process Management (pp. 195-211). Springer, Cham.
Syring, A.F., Tax, N. and van der Aalst, W.M., 2019. Evaluating conformance measures in process
mining using conformance propositions. In Transactions on Petri Nets and Other Models of
Concurrency XIV (pp. 192-221). Springer, Berlin, Heidelberg.
Theis, J., Galanter, W., Boyd, A. and Darabi, H., 2021. Improving the In-Hospital Mortality Prediction
of Diabetes ICU Patients Using a Process Mining/Deep Learning Architecture. IEEE Journal of
Biomedical and Health Informatics.
Trkman, P., 2010. The critical success factors of business process management. International journal
of information management, 30(2), pp.125-134.
Turner, C.J., Tiwari, A., Olaiya, R. and Xu, Y., 2012. Process mining: from theory to practice. Business
Process Management Journal.
Van der Aalst, W.M., Benatallah, B., Casati, F., Curbera, F. and Verbeek, E., 2007. Business process
management: Where business processes and web services meet. Data & Knowledge
Engineering, 61(1), pp.1-5.
Van der Aalst, W.M., 2011. Process discovery: An introduction. In Process mining (pp. 125-156).
Springer, Berlin, Heidelberg.
Van Der Aalst, W., Adriansyah, A., De Medeiros, A.K.A., Arcieri, F., Baier, T., Blickle, T., Bose, J.C., Van
Den Brand, P., Brandtjen, R., Buijs, J. and Burattin, A., 2011, August. Process mining manifesto.
In International conference on business process management (pp. 169-194). Springer, Berlin,
Heidelberg.
Van Der Aalst, W., 2012. Process mining: Overview and opportunities. ACM Transactions on
Management Information Systems (TMIS), 3(2), pp.1-17.
Van Der Aalst, W., 2016. Data science in action. In Process mining (pp. 3-23). Springer, Berlin,
Heidelberg.
Van der Aalst, W.M. and Gunther, C.W., 2007, July. Finding structure in unstructured processes: The
case for process mining. In Seventh International Conference on Application of Concurrency to
System Design (ACSD 2007) (pp. 3-12). IEEE.
Vogelgesang, T., Ambrosy, J., Becher, D., Seilbeck, R., Geyer-Klingeberg, J. and Klenk, M., 2022.
Celonis PQL: A query language for process mining. In Process Querying Methods (pp. 377-408).
Springer, Cham.
Von Davier, M., 2008. A general diagnostic model applied to language testing data. British Journal
of Mathematical and Statistical Psychology, 61(2), pp.287-307.
Wen, L., Wang, J., van der Aalst, W.M., Huang, B. and Sun, J., 2009. A novel approach for process
mining based on event types. Journal of Intelligent Information Systems, 32(2), pp.163-190.
Wilcox, A.B., Gallagher, K.D., Boden-Albala, B. and Bakken, S.R., 2012. Research data collection
methods: from paper to tablet computers. Medical care, pp.S68-S73.
Zelst, S.J.V., Fani Sani, M., Ostovar, A., Conforti, R. and Rosa, M.L., 2018, June. Filtering spurious
events from event streams of business processes. In International Conference on Advanced
Information Systems Engineering (pp. 35-52). Springer, Cham.