CST 499 - Project Proposal
Jonathan Delgado
Tayler Mauk
Bodey Provansal
25 April 2020
EXECUTIVE SUMMARY
Our team is planning to develop a prototype product that will be able to encrypt and upload a
user-defined set of files. This encryption will be done using a secret key provided by the user.
After files are encrypted, they will either be transferred to a network location or uploaded to a
cloud service. Uploading will be done using either network or cloud credentials provided by the
user. The goal of this product is to provide a simple-to-use application that allows users to
further protect their important data. The hope is to make this application simple and reliable
enough to be used at many different technical skill levels, from individual users protecting their
personal data to power users like Information Technology professionals who work for
small-to-medium sized organizations. We expect, by the end of this project, to have produced an
application that can be installed and operated by our test group, and that the features we set out to
build will work as intended.
Delgado et al. 1
INTRODUCTION
This project will set out to develop a locally run application that will allow users to encrypt files
in a directory on their workstation or server before uploading that data to an online cloud storage
or network location. The intent is for this application to be quick, straightforward, and simple to
use, so that successful operation will not depend on the user's technical skill level. Our group
plans to focus-test this application with both novice and "power" users encrypting and backing up
their data.
In the last decade, there has been a growing concern with small and large businesses alike
regarding the increasing risk involved with a company’s data. With every new security measure
and backup solution, there are new threats to overcome. One of the most prevalent threats to gain
traction in the last 5 years has been ransomware. As a recent article in the Journal of
Cybersecurity notes, "Unfortunately, there is less evidence of individuals and organizations
taking the necessary measures (particularly, regular backups) to mitigate and possibly deter the
damage from attack. This means that ransomware is likely to remain a serious threat for many
years to come" (Cartwright). While the main source of profit for ransomware criminals is with large
organizations, small businesses and people who are self-employed can still be targeted by these
attacks. The best defense against these attacks is a reliable back-up strategy.
This requires multiple back-up locations and, in case one of those locations is compromised,
a way to encrypt the data so it cannot be used by anyone who is not authorized to access it.
With the wide availability of external and network storage devices, and cloud storage solutions
like Amazon S3, it has never been easier to find places to
store data. Unfortunately, for many small companies or individuals, the data stored in these
locations is often not protected in ways that reflect its importance. Critical data stored in a shared
Google Drive or Dropbox is only as safe as each user who has access to it. A laptop left behind
on the bus with an easy-to-guess login is all it can take for unprotected data to be stolen.
Leaving data unprotected is typically not a matter of negligence; backup appliances and services that
properly encrypt this data often come with a hefty price tag or require a dedicated IT professional
to manage. Our solution to this problem is to provide a simple, “one-click” method of encrypting
a user’s directory and uploading that secured data to a secondary back-up location.
Our team is proposing a project to develop an application that can be run on a user’s
computer that will allow them to encrypt files of a predetermined size. Afterwards, these files
can then be uploaded to online cloud storage locations like Google Drive, Microsoft OneDrive,
or Amazon Web Services. Back-ups are useless without a reliable way to restore data, so our
application will also allow users to decrypt the files after downloading them from an online
storage location.
The critical function of this program is the encryption of data. We will use information
provided by the user to improve the reliability of the backups we produce; this information can
then be saved as "default" settings until the user provides new values. Most
importantly, we hope to achieve this using a unique key derived from a password provided by
the user. The program will then encrypt all selected files in a directory given by the user. The
user will also provide the size of the storage location, allowing the program to
calculate a recommended data "chunk" size that we would split the data into. By segmenting data
into encrypted chunks, a user can be further assured that the information put in the cloud is safe.
There are potentially countless goals for the project, but a few considered important are
given in Figure 1. Note that the inclusion or exact implementation of nonessential goals may
change over the course of development.
Figure 1. Project goals:
- Create an abstracted, encrypted backup: the core of the application.
- Create profiles to store settings for various backup jobs: users define jobs containing the data to be included.
The objectives surrounding this project are largely focused on security without compromising
ease of use. The table in Figure 2 lists a few core objectives to accomplish this task.
Figure 2. Core objectives:
- Support encryption and decryption of backup data.
- Support remote management of backup data.
- Users can set a file (chunk) size to meet their needs, which requires access to the local filesystem.
- Users can interact with the program through an interface built on the Electron platform, prioritizing frequently accessed features.
The goals and objectives listed in this section are not exhaustive; rather, they reflect the general
agreement that these features are essential, or at least more so than others.
The process of encrypting data before uploading to cloud storage is referred to as “client-side
encryption”, or CSE. CSE requires three steps to be an effective and reliable back-up strategy:
encryption, upload and storage, and restoration. Non-CSE storage environments have seen a
massive rise in popularity in the last decade. These are solutions like DropBox, Google Drive,
OneDrive, and iCloud. However, even the most popular cloud storage solutions offer no
"guarantees regarding the confidentiality and integrity of the data stored" on their servers
(Henziger). It is safe to say that data on the most common cloud storage services remains
wide open for most users, protected only by each user's password.
There are a handful of smaller products that do offer CSE, the most popular products we
found were SpiderOak, Tresorit, and MegaSync. A few issues have been found with these
products so far, the most impactful being synchronization: compared to the non-CSE cloud
giants, CSE products have a difficult time syncing updated files between multiple
users effectively, a major selling point for products like Microsoft's OneDrive. In a study done
at Linköping University in Sweden, researchers found that delta encoding, the process used to
synchronize files on cloud storage servers, usually hurts the performance of CSE products
compared to their non-CSE counterparts. CSE services "typically have significantly higher resource usage on
the client” and SpiderOak, in particular, “comes with a higher storage footprint on the client and
on the servers, has higher bandwidth overhead for both uploaders and downloaders, and
implements less effective delta encoding than Dropbox and iCloud" (Henziger).
Furthermore, once this data is encrypted and stored, it is a black box. It is impossible to
extract, update, or search for any part of the data without first restoring the entire encrypted
block of data, finding the desired file(s), making updates, and performing the encryption and
upload process again. Also, since these services do not have any record of a user’s unencrypted
data, if a secret key is lost, effectively, so is the data (Zhang). This likely seems like too much of
a risk for many businesses. With all of these factors combined, it becomes clear why so much
data remains unprotected.
The stakeholders in this project are largely just the developers, as no other parties are involved
financially nor by any other means. However, the community is a much larger pool of
individuals who may benefit from such a project. This community consists of Information
Technology professionals, enthusiasts and hobbyists alike. The grounds for classifying such an
expansive set of people rest on the assumption that no individual would prefer to lose their data.
The community and stakeholders gain several advantages resulting from the development
of this project including, but not limited to: accessible, inexpensive backup solution; secured data
through means of both abstraction and encryption; and on-premises functionality. Large
businesses may not have a lot to gain from a project of this nature, as it is likely these have either
built or purchased comparable solutions already. However, those
without the financial or technical means will have an opportunity to securely back up data. The
same holds true for personal use in regards to financial accessibility and ease of implementation.
The only notable loss from investing into this product would be the loss of absolute
control over encryption and organized storage within the backup. The reasons behind this are the
fact that the software will attempt to organize files in such a way that data chunk sizes are as
close to one another as possible and only certain encryption standards may be supported. Only
enthusiasts and high-security organizations are feasibly seen to fall into this category.
The difference made by this project will be the innovation of an all-in-one solution for
securely backing up data to servers that may be considered “insecure”, such as Google Drive,
Dropbox, et cetera. (It is known that these services operate under certain security standards, but
the data stored is inherently not secure in the sense that what is stored is unencrypted.) As a
result, a secure backup solution can be made available to the general population, or at least those
who seek one out.
In order to complete this project efficiently and on time, we will employ some common software
development best practices. We have decided to work most closely to the Agile methodology in
order to accomplish as much as possible within the time allotted. Since this is not an existing
project, speed of initial development is the largest priority; thus, we decided to proceed with a
lightweight Agile process over something more structured, like Scrum, which carries additional
time requirements.
PROCESS APPROACH
In order for our team to be able to accomplish this project on time, we’re specifying some
process guidelines to ensure communication, code contributions and features are streamlined.
- We will have a weekly team meeting in which we will discuss outstanding issues
and reevaluate whether the existing priorities are still accurate. We will use Slack, as
well as pull requests and GitHub issue comments, to communicate with each other.
- We will leverage GitHub Issues to manage tasks and bugs. GitHub Issues
allows us to have a Kanban board similar to Pivotal Tracker, while being able to
strongly integrate individual commits and pull requests with these tasks.
- Immediately prior to starting this project, we will break out all related work into
individual issues and move work that is ready to start to the repo's GitHub Kanban
board. For each ticket, we will identify and outline the expected approach.
- Prior to starting the work, when breaking out tasks, we will plan out an MVP
(Minimum Viable Product), identifying which features are absolutely
required versus those which are "nice to have". When getting closer to the
completion of this project, if we have additional time, we may opt to add some of
these "nice to have" features; however, we expect to complete all of the essential
features.
TECHNOLOGY APPROACH
In terms of the technology side of this project, we’ve considered a few different programming
languages and technology stacks. One of the technologies that allows us to most rapidly
implement this product is going to be NodeJS, which already offers a lot of features we’ll need,
such as file system streaming, encryption and hashing, while any additional features may already
be available on the hugely popular NPM (Node Package Manager) ecosystem. Additionally,
we'll leverage ElectronJS in order to create a well-styled native desktop application that
can directly interface with the NodeJS runtime. Other languages offered alternatives to this
approach; however, it would take considerably more time to achieve a well-styled product with
something like Visual Basic or a Java GUI than with CSS. JavaScript isn't the most performant
language for processing like encryption or hashing; however, we can leverage native C++
modules as well as multiple threads to make up for that gap. With very little effort, we'll also be
able to generate an installer (which will be unsigned for the purposes of this class) to distribute
this application. Due to these technology choices, we anticipate having a very polished and
feature-rich application in a fraction of the development time. For the primary cloud provider,
we’ll be leveraging AWS’s S3 offering. However, we will abstract the project in such a way
where we should be capable of supporting most providers with future code additions.
In terms of the approach for programming, we’ll try to follow best practices as much as possible
to create a sustainable product, ripe for future development. Consider the following:
- Utilize unit testing and documentation, as much as possible within the deadline.
- Leverage Electron's IPC channels to separate the core logic on the NodeJS side
from the user interface in the renderer process.
- Abstract each provider into a separate class, allowing us to easily support multiple
cloud storage providers.
- Create handlers so that this application can still run on a machine even when
individual operations fail.
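To illustrate the provider abstraction described above, the core backup logic could depend only on a common interface, with each cloud service implemented as its own subclass. The class and method names below are hypothetical:

```javascript
// Base class every storage provider must implement. The backup core
// only ever calls upload() and download(), never a provider directly.
class StorageProvider {
  upload(chunkName, data) {
    throw new Error("upload() must be implemented by a concrete provider");
  }
  download(chunkName) {
    throw new Error("download() must be implemented by a concrete provider");
  }
}

// In-memory provider, useful for unit tests; an S3-backed provider, for
// example, would implement the same two methods against the AWS SDK.
class MemoryProvider extends StorageProvider {
  constructor() {
    super();
    this.store = new Map();
  }
  upload(chunkName, data) {
    this.store.set(chunkName, data);
  }
  download(chunkName) {
    return this.store.get(chunkName);
  }
}
```

This is what makes the "support most providers with future code additions" goal cheap: adding Google Drive or OneDrive later means writing one new subclass, not touching the core.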
ETHICAL CONSIDERATIONS
As our project is not a research study or experiment, the ethical considerations largely revolve
around the assumptions our team will make regarding the design, deployment, and user behavior
after the product is released. Namely, the assumptions our team will make about how the product
will be used, by whom it will be used, and how it will be accessed.
This product will act on a set of data decided by a user and sent to a secondary location
that is also decided by the user and protected with the user’s private credentials. That data will be
encrypted by a chosen password and can only be decrypted with that exact password. In this
example, the user has given over trust that the following are or are not occurring: the credentials
to the online storage location are not being saved, shared, or otherwise compromised, that the
chosen encryption password is not being compromised, that the encrypted data is not being
compromised, and that the data will be restored exactly as it was before encryption. In the best
case, where there is no intentional or accidental misuse of data or credentials on the
development end, our product is still handling a user's private credentials in order to encrypt
private data.
In the situation above, we are also assuming the “user” is synonymous with the “owner”
of the data or storage location. This is not true in all cases. Other situations to consider can
include when the data is being handled by a trusted second party, like an I.T. Managed Service
Provider. Furthermore, there are two potential ends for misuse by untrusted sources. If the
primary data is stolen, altered, or has malware unknowingly included in the user’s directory, the
tool will not know the difference. This can potentially infect the secondary location or give bad
actors access to the files stored there. Secondly, if saved settings or user profiles are
implemented in this program and those are unknowingly changed by a malicious user, good data
can be uploaded to an unknown secondary location that only the malicious user has access to.
The data is now effectively stolen and the intended user will not know unless they check their
intended secondary location for new files. We will need to consider the misuse of our application
and determine ways to mitigate malicious behavior, putting users' privacy and security first.
Assuming the product is working as intended and is not being misused after deployment,
there remains the consideration of accessibility. In design and development, our team will make
certain assumptions regarding how users will be able to access the executable, install and
configure the program, and ultimately, if they will be able to operate it effectively. Our team will
assume that intended users will meet a certain minimum specification of operating system,
hardware age, processing power, data transfer speed, and networking capabilities. Essentially, we
would be assuming that a user will have a “newer” computer that is internet-capable, and that the
user has a readily available and reliable internet connection. This is certainly not the case in
many personal and small business environments. On the same note, since we are utilizing the
Electron framework (built on Node.js) to implement a GUI, we must consider the visual
accessibility of our
program. Luckily there are also tools included in and compatible with Electron that help build
more accessible applications. These tools include the ability to let the user's native OS
accessibility options and assistive technologies interact with Electron apps. Not all users or user
environments will be able to interact with our product in the same way; our team will need to
account for these differences in design and testing.
LEGAL CONSIDERATIONS
The main liability that we may be open to when distributing this backup program is data loss.
Since this project directly interacts with sensitive files on the host’s file system, we need to take
extra precautions legally to prevent litigation. Data loss with our system can happen through
either a bug in which we traverse the local file system and delete files (perhaps if we were
attempting to delete files on the remote server instead) or if a file that is being uploaded becomes
corrupt but still passes validation. It’s possible that we could be liable for either one of those
issues, dependent upon our guarantees to the end-consumer. We need to include a software
license agreement with this project that explicitly states we do not guarantee data consistency as
well as an agreement that we hold no liability for issues, essentially the software is provided
as-is. Software user agreements are very common and help protect software developers from
litigation when there are bugs that may lead to unforeseen consequences.
Since this is a program that directly interacts with users' personal files, we may want to
consider distributing the code under some kind of open source license upon release. An open
source license would allow others to freely analyze and contribute to our project's codebase,
allowing us to both gain credibility and trust as well as gain additional code contributions from
outside the core programming group. A good license for this type of distribution would be the
MIT license, which is very popular among open source software. However, if we would like to
create a company out of this, we would most likely not release the code at all, or release it
without a license, so that the implicit “all rights reserved” set by copyright laws would be in
effect.
PROJECT SCOPE
The following section defines and clarifies various aspects of this project's scope.
TIMELINE/BUDGET
The only associated cost for this project, outside of normal development, is going to be for an S3
bucket. This bucket will allow us to do full integration tests to determine how our backup
program is working outside of unit tests. The costs associated with an S3 bucket are $0.023/GB
per month for storage and $0.09/GB for outbound transfer. Accordingly, we're putting a very generous
budget of $20 aside for inevitably paying that bill. There is a free tier where we could potentially
do this entire project for free, however, we’re opting to use an existing AWS account for
convenience.
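As a sanity check on that budget, the expected cost can be estimated directly from the per-GB prices above. The usage figures in the example are hypothetical:

```javascript
// Rough monthly S3 cost estimate from the prices quoted above.
const STORAGE_PER_GB = 0.023; // $/GB-month, S3 standard storage
const TRANSFER_PER_GB = 0.09; // $/GB, outbound data transfer

function estimateMonthlyCost(storedGB, downloadedGB) {
  return storedGB * STORAGE_PER_GB + downloadedGB * TRANSFER_PER_GB;
}
```

For instance, storing 50 GB of test backups and downloading all of it once for restore testing works out to roughly 50 × $0.023 + 50 × $0.09 ≈ $5.65 for the month, comfortably under the $20 budget.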
We’re expecting to complete primary development on this project within five weeks,
allotting the remainder of this course for testing, bug fixing and polishing. The first week we’re
planning on setting up the project as well as implementing a local filesystem scan and S3
adapter. The second and third week we’ll be using that scaffold to implement the actual meat of
this project, where we will process files, generate manifest files and create a chunking system.
From there, in week four we will focus on implementing encryption on the chunk system. Week
five will focus on implementing the restore feature. The remainder of the time will be to test, fix bugs
and polish the project. We’re planning on completing command line and user interface work
alongside those milestones, allowing all team members to have a good end-to-end understanding
of the entire system, rather than partitioning group members to a specific side, such as front end
or back end.
RESOURCES NEEDED
The biggest resource needed for this (and perhaps any project) is time.
Other than time, the usual software development resources such as computers, technical
skill and pristine project management will be required. The development environment, consisting
largely of the Electron framework, will need to be configured in order to begin. The product will
require little to no media resources until such time that a graphical user interface is
implemented. At that point, the project will require a fluid, intuitive design and a tasteful color
palette.
MILESTONES
The major milestones of our project can be broken down into two categories: implementing core
features into our product and goals we hope to achieve from user testing. We will measure our
overall progress during the design and implementation phase by how quickly we achieve the
following:
1) The ability for our program to scan the local file system of the machine where it is
installed.
2) The capability to upload those files to an online cloud storage location. For the initial
design, we will build our implementation around Amazon Web Services S3 buckets.
3) The ability to take the selected files and abstract them into chunks. Then unpack those
chunks.
4) The creation of a manifest file that will hold the information needed to encrypt and
decrypt the data.
5) The ability to download and restore the encrypted chunks from S3.
6) During the completion of the previous items, we will be developing a way to interact with
the program, whether through a command line or a graphical interface.
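As an illustration of what the manifest in item 4 might contain, a minimal structure could look like the following. All field names and values here are hypothetical and subject to change during implementation:

```javascript
// Illustrative backup manifest: one entry per encrypted chunk, plus the
// parameters needed to re-derive the key and reassemble the original data.
const manifest = {
  version: 1,
  chunkSizeBytes: 104857600, // recommended chunk size (100 MB)
  kdf: { algorithm: "scrypt", salt: "base64-encoded-salt" },
  cipher: "aes-256-cbc",
  chunks: [
    { name: "chunk-0000.enc", iv: "base64-encoded-iv", sha256: "hex-digest" },
    { name: "chunk-0001.enc", iv: "base64-encoded-iv", sha256: "hex-digest" }
  ]
};
```

Storing a per-chunk digest lets the restore step verify each downloaded chunk before attempting decryption, catching corruption early.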
Once enough of the above items have been verified by the development team, we will
move on to user testing these features. The milestones of user testing will be marked in order of
priority as:
1) Scheduling and guiding user tests. Our team will guide users through an initial round of
test runs of the program, recording any useful information gained from users or test
results.
2) Scheduling a second round of user tests after updates have been made. Document any
remaining issues and feedback.
RISKS AND DEPENDENCIES
The health of the team poses perhaps the most disastrous risk to the completion of this project.
Should a team member fall ill, it may be unlikely that that individual will recover in time. In
terms of other non-health-related risks, unforeseen difficulties and complex software bugs also
pose a strong threat to the project's completion. This would not be due to a lack of understanding
of the software itself, but of the concepts around it, such as cryptography, which are typically
seen as more demanding tasks. Another category of risk to consider is that too many features are
left in infancy and never fully developed; it is arguable that such events ultimately lead to bugs,
but this is still a point in its own right, differentiated from general software defects.
The dependencies for completing this project can be divided in two categories: the agility
of the development team and the availability of cloud storage application programming
interfaces (APIs).
As the project unfolds, issues and obstacles will begin to arise; however, these are easily
remediated by remaining focused and dedicated to the task at hand. As a result, tasks will be
done to their fullest, as opposed to being left behind in some form of an "it works" state, leading
to faster development due to fewer unforeseen software bugs. Having a strong pipeline such as this
is especially important when dealing with encryption and cryptography, because one mistake can
render backed-up data unrecoverable.
The availability of cloud storage APIs is straightforward: if a platform does not offer an
API, then this project has no way of directly interfacing with it.
FINAL DELIVERABLES
The final deliverables for this product include a functioning backup program. The program
should be able to first and foremost act on user input whether it be through command line or a
graphical interface (but not necessarily both). The actions performed should be correct and
benign in nature. Any other requirements needed to perform the requested actions should be
automatically evaluated and executed by the software. The product will be able to scan the file
system for the requested data and abstract and encrypt it to ensure that the intended security
measures are taken in order to maintain the primary focus of this application. The product should
be able to connect to a cloud service and upload the encrypted data to it, as well as retrieve data
from the cloud service and allow the user to view it, provided that user has produced the
proper credentials.
Beyond core elements, the product itself will have a graphical user interface, should time
permit. This interface will allow the user to configure various options within the program and
start backup and restore jobs.
USABILITY TESTING/EVALUATION
For our test groups, we will limit users to those who have a solid understanding of and make
regular use of cloud storage. We will separate those users into two self-described groups: casual
users and power users. We hope to make our product simple and effective enough for both
groups to use. After our internal development testing is finished, we will schedule times for
guided user testing sessions. These testing sessions will use a combination of monitoring users
while they use the program and getting feedback from users in the form of a short survey and
questionnaire.
Our plan for scheduled testing sessions will involve one-on-one meetings with a member
of our team and a test user. We will walk the user through the steps of installing and configuring
the application on their computer. Before testing the application we will confirm the data and
cloud storage location for the test. We first want to guarantee that the data used during the test is
not critical and will not be missed if something unexpected happens during testing. We also want
to make sure the user has the correct credentials and access to the cloud storage location. To do
this, a test upload from a local directory to the cloud storage will be made, and then the same file
will be downloaded from cloud storage to the same local directory. After this, a member of the
team will demonstrate how the product should be used on the machine. Finally, after the
demonstration, the test user will run through all the features of the program.
Once the user has finished the hands-on testing, we will document their feedback through
a short survey. We will expand upon the short and simple “System Usability Scale” survey in
order to make it more applicable to our program and any issues we faced during development.
We can then have a discussion with the users regarding the answers they gave on the survey.
Ideally, we will have enough time to schedule at least two meetings with test users; this would
would give us the chance to address the most critical issues found during testing. Depending on
the progress of the product, we may schedule “alpha” tests of specific features, like the UI,
before the rest of the product is ready for full user testing. Luckily, performing remote tests
should not impact our results as long as we are able to establish a screen-sharing session with
users.
TEAM MEMBERS
Rather than assigning each user to a task upfront, we’re planning on taking an Agile approach
where any team member can take the next highest priority ticket available from the Kanban
board. The development team would like to learn the entirety of this project's technology stack,
so we've opted to include everyone in development across the full stack rather than designating
one member responsible for the user interface and another for the backend. There may be some inherent risk
in this decision where tasks may be cherry picked or a team member may not carry their full
weight; however, we've worked very well together for several years and we're confident in the
trade-offs.
REFERENCES
Cartwright, E., Hernandez Castro, J., & Cartwright, A. (2019). To pay or not: Game theoretic
models of ransomware. Journal of Cybersecurity, 5(1).
Henziger, E., & Carlsson, N. (2019). Delta encoding overhead analysis of cloud storage systems
using client-side encryption.
Zhang, X., Tang, Y., Wang, H., Xu, C., Miao, Y., & Cheng, H. (2019). Lattice-based
proxy-oriented identity-based encryption with keyword search for cloud storage. Information
Sciences.
APPENDIX A
The System Usability Scale (SUS) is a short, standardized survey used to quickly and efficiently
have test users gauge the usability of a product. Its ten items are typically presented as shown
here, each rated on a five-point scale from Strongly Disagree to Strongly Agree
(from: https://measuringu.com/sus/):
1. I think that I would like to use this system frequently.
2. I found the system unnecessarily complex.
3. I thought the system was easy to use.
4. I think that I would need the support of a technical person to be able to use this system.
5. I found the various functions in this system were well integrated.
6. I thought there was too much inconsistency in this system.
7. I would imagine that most people would learn to use this system very quickly.
8. I found the system very cumbersome to use.
9. I felt very confident using the system.
10. I needed to learn a lot of things before I could get going with this system.