CloudVault : Encrypted Backup Solution
A Project Report
Submitted by:
Dhyey Chirag Patel (AU2040257)
in partial fulfillment for the award of the degree
of
BACHELOR OF TECHNOLOGY
in
(COMPUTER SCIENCE AND ENGINEERING)
at
School of Engineering and Applied Science (SEAS)
Ahmedabad, Gujarat
May, 2024
DECLARATION
I hereby declare that the project entitled “CloudVault : Encrypted Backup
Solution” submitted for the B. Tech. (Computer Science and Engineering) degree
is my original work and the project has not formed the basis for the award of any other
degree, diploma, fellowship or any other similar titles.
Signature of Student
Date: 2/5/2024
Place: School of Engineering and Applied Science, Ahmedabad University
CERTIFICATE
This is to certify that the project titled “CloudVault : Encrypted Backup Solution”
is the bona fide work carried out by Dhyey Chirag Patel, a student of B. Tech.
(Computer Science and Engineering) of School of Engineering and Applied Science
at Ahmedabad University during the academic year 2023-2024, in partial fulfillment of
the requirements for the award of the degree of Bachelor of Technology in Computer
Science and Engineering and that the project has not formed the basis for the award
previously of any other degree, diploma, fellowship or any other similar title.
This project was done under the supervision of the faculty mentor Mr Sanjay Chaudhary.
Signature of Faculty Mentor
Date: 02/5/2024
Place: School of Engineering and Applied Science, Ahmedabad University
Abstract
This project is focused on the development and implementation of a robust cloud
storage service that uses Nextcloud as its foundation and leverages HTTPS tunneling,
a technique used to establish a secure communication channel between a client
and a server over the internet, bypassing network firewalls and proxies. Other features
include a tailored backup policy that is implemented through customized Python scripts.
These scripts provide features like encryption, compression, logging and backup rotation
to optimize storage resources and security.
The backup mechanism ensures routine creation of encrypted and compressed backups.
The logging features give cloud administrators insight into all backup operations, with
timestamps, which facilitates troubleshooting of failures. Overall, the project is focused on
repurposing unused hardware to create a private cloud storage that provides security,
backup, and accessibility.
Table of Contents
Declaration
Certificate
Abstract
Table of Contents
List of Figures
Gantt Chart
1 Introduction
1.1 Project Definition
1.2 Project Objectives
2 Literature Survey
2.1 Related Work
2.2 Tools and Technologies
3 Methodology
3.1 Setting up Nextcloud
3.2 Exposing the server online
3.3 Programming the Backup policy
3.3.1 Incremental Backup
3.3.2 Encryption
3.3.3 Backup Rotation
3.3.4 Drive Backup and Email Notification
4 Results
4.1 Project Outcomes
4.2 Learning Outcomes
5 Conclusion
Bibliography
List of Figures
0.0.1 Rotated Image
3.2.1 Sequence Diagram
3.3.1 Incremental Backup Policy
3.3.2 Architecture
4.1.1 Localhost Nextcloud Server
4.1.2 Ngrok Connection Dashboard
4.1.3 Access from the client machine
4.1.4 Full Backup
4.1.5 Incremental Backup of File uploaded
4.1.6 Drive Backup
4.1.7 Email Notification for Backup Completed
Gantt Chart
Figure 0.0.1: Rotated Image
Chapter 1
Introduction
In an age that is heavily focused on data-driven operations, there is a critical
need for secure and efficient storage solutions. With the growing importance of data,
institutions want to rely less on third-party cloud storage providers and to store
data on their own premises and hardware.
This project covers the development and implementation of a cloud storage service
that uses idle hardware to create a well-rounded storage solution with encryption,
security and backup features. Nextcloud serves as the foundation for the cloud
interface. It provides a range of functionality, from file synchronization and sharing
to collaborative document editing. To get past network constraints such as firewalls
and proxies, the project uses HTTPS tunneling, which establishes secure communication
channels between clients and servers over the internet.
The next layer of the project provides an efficient backup scheme, which is programmed
in Python and executed periodically as a cron job. The script performs incremental
backups rather than a full backup every time to optimize storage, writes timestamped
logs for auditing, encrypts the backup data so that it can be accessed only with a key,
compresses it to save storage, and rotates backups to remove old ones. Overall, this
project uses Nextcloud and HTTPS tunneling to create a private cloud storage system.
1.1 | Project Definition
The goal is to use idle hardware to create a comprehensive cloud storage system.
Nextcloud is used to create the server, with its PHP configuration files edited to set
the trusted domains, memcache, hostname, etc., and HTTPS tunneling provided by Ngrok is
used for remote access. Additionally, a strong backup scheme is implemented in Python,
with features such as incremental backup, compression, encryption and backup rotation.
1.2 | Project Objectives
■ Learn to use idle resources to create a private cloud storage using open-source
software and tools
■ Learn to write and debug the PHP configuration files manually when creating the server
■ Understand how HTTPS tunneling works and bypasses network restrictions by creating
secure tunnels between the server, the client and the service provider
■ Explore Python libraries such as os, shutil, cryptography and hashlib, and learn
how the SHA-256 hash function works
■ Program and debug the incremental backup scheme manually in Python
Chapter 2
Literature Survey
Creating the private cloud server required a survey of the web and of related research
papers for plausible open-source options that could be configured and deployed on an idle
PC. Programming the incremental backup scheme involved studying various research papers
in the field of effective backup policies.
2.1 | Related Work
■ Research paper - Describes custom cloud storage implementation techniques with
Nextcloud, along with a case study.[1]
■ Official Nextcloud Documentation - Provides full documentation, including
installation guides, version releases, APIs and more.[2]
■ Ngrok.com - Provides an HTTPS tunneling service, free for up to 20000 requests, to
expose web services online.[3]
■ Official Python Documentation - Provides object and method definitions and package
installation guidelines for working with Python 3.[4]
2.2 | Tools and Technologies
Backend Languages
■ PHP - A scripting language used to write the configuration files that tailor the
server to customized needs.
■ Python - A high-level, dynamically typed programming language used to write the
incremental backup policy and other features such as encryption, compression
and backup rotation.
User Interface Layer
■ Nextcloud Software - An open-source suite of software and related applications that
facilitates creating and using file hosting and other services.
Server Layer
■ Apache2 HTTP Server - An open-source web server used to host Nextcloud.
It handles incoming HTTP requests from clients and serves the Nextcloud interface.
Secure Tunneling Service
■ Ngrok - A platform that provides tunneling services, reverse proxy, firewall etc. It is
used to expose the Apache2 server running Nextcloud to the Internet securely.
Database
■ MySQL - An open source RDBMS that stores server metadata information.
Protocols and APIs
■ SMTP (Simple Mail Transfer Protocol) - A standard protocol for transferring
electronic mail (email) over a network.
■ Drive API - The Google Drive API, used to communicate and interact with Google Drive.
Chapter 3
Methodology
3.1 | Setting up Nextcloud
The first phase of building the project consists of setting up the private cloud. This
involved learning the manual installation and configuration of the Apache2 web server and
hosting Nextcloud on it. The first step is to download the Nextcloud server archive
from https://download.nextcloud.com/server/releases/latest.zip. The classes from cloud
computing about web services, deployment and communication protocols helped me in
building this server.
The next part consists of setting up the MariaDB server. We execute the following
commands:
Accessing the MariaDB console
■ "sudo mariadb"
Creating the database and setting up the required permissions
■ "CREATE DATABASE nextcloud;"
■ "GRANT ALL PRIVILEGES ON nextcloud.* TO 'nextcloud'@'localhost'
IDENTIFIED BY 'mypassword';"
■ "FLUSH PRIVILEGES;"
The next part is the Apache web server setup. The first step is to install PHP together
with its extension packages:
■ "sudo apt install php php-apcu php-bcmath php-cli php-common php-curl
php-gd php-gmp php-imagick php-intl php-mbstring
php-mysql php-zip php-xml"
Check the status of Apache:
■ "systemctl status apache2"
Enable the recommended PHP extensions:
■ "sudo phpenmod bcmath gmp imagick intl"
The next part consists of setting the ownership of the nextcloud directory, moving it to
the location from which the website files will be served, and disabling the default
Apache site:
■ "sudo chown -R www-data:www-data nextcloud"
■ "sudo mv nextcloud /var/www"
■ "sudo a2dissite 000-default.conf"
The next step is to create a virtual host configuration file for Nextcloud,
nextcloud.conf, in the apache2/sites-available/ directory:
<VirtualHost *:80>
    DocumentRoot "/var/www/nextcloud"
    ServerName nextcloud
    <Directory "/var/www/nextcloud/">
        Options MultiViews FollowSymlinks
        AllowOverride All
        Order allow,deny
        Allow from all
    </Directory>
    TransferLog /var/log/apache2/nextcloud.log
    ErrorLog /var/log/apache2/nextcloud.log
</VirtualHost>
The next step is to enable the site:
■ "sudo a2ensite nextcloud.conf"
Configuring PHP
We have to configure some PHP options in the php.ini file. I set the following parameters
according to my desired configuration.
■ "memory_limit = 512M"
■ "upload_max_filesize = 200M"
■ "max_execution_time = 360"
■ "post_max_size = 200M"
■ "date.timezone = America/Detroit"
■ "opcache.enable=1"
■ "opcache.interned_strings_buffer=8"
■ "opcache.max_accelerated_files=10000"
■ "opcache.memory_consumption=128"
■ "opcache.save_comments=1"
■ "opcache.revalidate_freq=1"
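The settings listed above can be checked programmatically after editing php.ini. The
following is a minimal sketch, assuming the contents of php.ini have already been read into
a string. The helper names parse_ini_settings and find_mismatches are hypothetical, the
expected values are a subset of the ones listed above, and the parser is deliberately
simple (it does not handle a ";" inside a quoted value).

```python
# Expected values, copied from a subset of the settings listed above.
EXPECTED = {
    "memory_limit": "512M",
    "upload_max_filesize": "200M",
    "max_execution_time": "360",
    "post_max_size": "200M",
    "opcache.enable": "1",
    "opcache.save_comments": "1",
}


def parse_ini_settings(text):
    """Return {key: value} for every 'key = value' line, skipping comments and sections."""
    settings = {}
    for raw_line in text.splitlines():
        line = raw_line.strip()
        # php.ini comments start with ';' and section headers look like '[PHP]'.
        if not line or line.startswith(";") or line.startswith("["):
            continue
        if "=" not in line:
            continue
        key, _, value = line.partition("=")
        # Drop a trailing inline comment, then surrounding whitespace and quotes.
        value = value.split(";", 1)[0].strip().strip('"')
        settings[key.strip()] = value
    return settings


def find_mismatches(text, expected=EXPECTED):
    """Return {key: (expected, actual)} for settings that are missing or different."""
    actual = parse_ini_settings(text)
    return {
        key: (want, actual.get(key))
        for key, want in expected.items()
        if actual.get(key) != want
    }
```

To use it, read the php.ini file that Apache loads (its location depends on the PHP
version and distribution) and pass the text to find_mismatches. An empty result means
every checked setting has the expected value.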
Next, enable the following Apache modules:
■ "sudo a2enmod dir env headers mime rewrite ssl"
Restart Apache so that the new PHP settings and modules take effect:
■ "sudo systemctl restart apache2"
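Once Apache is serving Nextcloud, one way to confirm that the instance is up is to query
the status.php endpoint that Nextcloud exposes, which returns a small JSON document with
fields such as "installed" and "maintenance". The following is a minimal sketch, assuming
the instance is reachable at http://localhost; the function names is_ready and
fetch_status are hypothetical.

```python
import json
from urllib.request import urlopen


def is_ready(status):
    """Return True when a parsed status.php payload describes a usable instance."""
    return bool(status.get("installed")) and not status.get("maintenance", False)


def fetch_status(base_url="http://localhost", timeout=5):
    """Fetch and parse <base_url>/status.php (requires a running server)."""
    with urlopen(f"{base_url}/status.php", timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

Calling is_ready(fetch_status()) on the server machine returns True only when Nextcloud
reports that it is installed and not in maintenance mode.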
3.2 | Exposing the server online
The server is running on localhost, and now it has to be made reachable from the internet.
To achieve this, HTTPS tunneling is used as an alternative to port forwarding. It is a
technique that establishes a secure communication channel between a client and a server
over the internet, allowing data to be transmitted safely even across network
restrictions such as firewalls and proxies. Knowledge from the computer networks course
about network sequence diagrams, as well as the HTTP request methods covered in cloud
computing, assisted me in exposing my server online.
Client-Server Communication
In a typical client-server model, a client sends requests to the server over the internet
and the server, in turn, responds to those requests. But if the server is behind a
firewall or proxy, direct communication may be blocked.
Ngrok initialization
The client initializes Ngrok by running the Ngrok application on the local machine. The
application communicates with the Ngrok server, which acts as the relay between the
client and the server behind the firewall.
Tunnel Initialization
Ngrok authenticates the client and generates a unique tunnel identifier (normally a
subdomain or a random string) for each client's tunnel. This identifier forms a public
URL that external users can use to access the client's local server.
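The public URL assigned to a tunnel can also be read programmatically from the local API
that the Ngrok agent serves while it is running. The following is a minimal sketch,
assuming the agent's API is enabled on its default address http://127.0.0.1:4040; the
function names https_public_urls and fetch_tunnels are hypothetical.

```python
import json
from urllib.request import urlopen


def https_public_urls(payload):
    """Extract the public HTTPS URLs from a parsed /api/tunnels response."""
    return [
        tunnel["public_url"]
        for tunnel in payload.get("tunnels", [])
        if tunnel.get("public_url", "").startswith("https://")
    ]


def fetch_tunnels(api_url="http://127.0.0.1:4040/api/tunnels", timeout=5):
    """Ask the locally running agent for its active tunnels (requires the agent)."""
    with urlopen(api_url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the agent running, https_public_urls(fetch_tunnels()) returns the URLs that external
users would open to reach the local server.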
Secure Connection Establishment
Ngrok builds a secure WebSocket or HTTPS tunnel to the client and exchanges data in
encrypted form, which protects it from being viewed by other users of the Ngrok server.
This helps ensure that secret data such as credentials and sensitive content cannot be
eavesdropped on or maliciously altered.
Figure 3.2.1: Sequence Diagram
Forwarding to Server
Once the Ngrok server receives a request from an external user, it passes the request on
to the client's local server through the pre-established tunnel. Ngrok can forward
incoming HTTP/HTTPS or TCP traffic to the local server by tunneling the incoming data
past the client's firewall to the client's machine. As a result, an external user can
interact with the local server as if it were running on their own machine.
Response Transmission
The server's response is sent back through the tunnel. The Ngrok client picks up the
response and passes it to the Ngrok server through the secure connection.
Client Response
The Ngrok server receives the local server's response and forwards it to the client
application that initiated the request. The client application processes the response as
usual.
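The request and response path described in this section can be illustrated with a toy,
in-memory model. This is only a sketch of the relay idea and not how Ngrok is
implemented: the class names LocalServer, Agent and Relay are hypothetical, there is no
real networking, and encryption and authentication are left out.

```python
class LocalServer:
    """Stands in for the Nextcloud server that sits behind the firewall."""

    def handle(self, request):
        return f"200 OK for {request}"


class Agent:
    """Stands in for the tunnel client running next to the local server."""

    def __init__(self, server):
        self.server = server

    def forward(self, request):
        # Hand the request to the local server and return its reply.
        return self.server.handle(request)


class Relay:
    """Stands in for the public tunneling server."""

    def __init__(self):
        self._tunnels = {}
        self._next_id = 0

    def open_tunnel(self, agent):
        # The agent connects outward, so no inbound firewall rule is needed.
        self._next_id += 1
        tunnel_id = f"tunnel-{self._next_id}"
        self._tunnels[tunnel_id] = agent
        return tunnel_id

    def request(self, tunnel_id, request):
        # External users address the tunnel by its identifier.
        agent = self._tunnels.get(tunnel_id)
        if agent is None:
            return "404 tunnel not found"
        return agent.forward(request)
```

An external request such as relay.request(tunnel_id, "GET /") travels from the relay to
the agent, then to the local server, and the reply returns along the same path.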
3.3 | Programming the Backup policy
The next phase of the project is implementing a strong backup policy that focuses
on optimizing storage, along with other features such as encryption, compression, logging
and a backup retention scheme.
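Two of these features, compression and timestamped logging, can be sketched with the
Python standard library. The following is a minimal sketch and not the project's actual
script: the function names make_logger and compress_directory, the logger name
"cloudvault.backup" and the archive naming pattern are assumptions made for illustration.

```python
import logging
import tarfile
from datetime import datetime
from pathlib import Path


def make_logger(log_file):
    """Create a logger that prefixes every entry with a timestamp."""
    logger = logging.getLogger("cloudvault.backup")  # hypothetical logger name
    logger.setLevel(logging.INFO)
    if not logger.handlers:  # avoid adding a second handler on repeated calls
        handler = logging.FileHandler(log_file)
        handler.setFormatter(logging.Formatter("%(asctime)s %(levelname)s %(message)s"))
        logger.addHandler(handler)
    return logger


def compress_directory(source_dir, backup_dir, logger):
    """Pack source_dir into a timestamped .tar.gz in backup_dir and log the outcome."""
    source_dir, backup_dir = Path(source_dir), Path(backup_dir)
    backup_dir.mkdir(parents=True, exist_ok=True)
    stamp = datetime.now().strftime("%Y%m%d-%H%M%S")
    archive = backup_dir / f"backup-{stamp}.tar.gz"
    try:
        with tarfile.open(archive, "w:gz") as tar:
            tar.add(source_dir, arcname=source_dir.name)
    except OSError as error:
        # Failures are logged with a timestamp so they can be traced later.
        logger.error("backup of %s failed: %s", source_dir, error)
        raise
    logger.info("created %s (%d bytes)", archive, archive.stat().st_size)
    return archive
```

Each run appends one line to the log file, so the log doubles as a timestamped history of
backup operations.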
3.3.1 | Incremental Backup
Figure 3.3.1: Incremental Backup Policy
Incremental backup is a method used in data backup and recovery that focuses on
optimizing storage and reducing the time it takes to perform backups.
At first, a complete backup of all the data is made. It covers the whole list of files
and folders that are to be backed up. This full backup is the starting point for the
subsequent incremental backups.
After the full backup completes successfully, incremental backups are created at a given
interval (daily, weekly, etc.). Unlike a full backup, which stores all the data, an
incremental backup copies only the data that has changed since the last backup, whether
that last backup was full or incremental. This means that only the changes that occurred
after the last backup are recorded and stored, which reduces both the space taken and
the running time.
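One way to detect which files have changed is to compare a SHA-256 digest of each file
with the digest recorded at the previous backup. The following is a minimal sketch under
that assumption and not necessarily the scheme used in this project: the function names
sha256_of and incremental_backup and the JSON manifest file are hypothetical, the manifest
is assumed to live outside the source directory, and deleted files are not tracked.

```python
import hashlib
import json
import shutil
from pathlib import Path


def sha256_of(path, chunk_size=65536):
    """Return the SHA-256 hex digest of a file, read in chunks."""
    digest = hashlib.sha256()
    with open(path, "rb") as handle:
        for chunk in iter(lambda: handle.read(chunk_size), b""):
            digest.update(chunk)
    return digest.hexdigest()


def incremental_backup(source_dir, target_dir, manifest_path):
    """Copy files that are new or changed since the manifest was last written.

    Returns the list of relative paths that were copied.
    """
    source_dir, target_dir = Path(source_dir), Path(target_dir)
    manifest_path = Path(manifest_path)
    # The manifest maps each relative path to the digest seen at the last backup.
    previous = json.loads(manifest_path.read_text()) if manifest_path.exists() else {}
    current, copied = {}, []
    for path in sorted(source_dir.rglob("*")):
        if not path.is_file():
            continue
        relative = path.relative_to(source_dir).as_posix()
        current[relative] = sha256_of(path)
        if previous.get(relative) != current[relative]:
            destination = target_dir / relative
            destination.parent.mkdir(parents=True, exist_ok=True)
            shutil.copy2(path, destination)
            copied.append(relative)
    # Record the new state so the next run compares against this backup.
    manifest_path.write_text(json.dumps(current, indent=2))
    return copied
```

The first run, with no manifest present, copies every file and therefore acts as the full
backup. Later runs copy only the files whose digest differs from the previous run.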
Disadvantages:
Dependency on previous backups: Incremental backups rely on the previous backups
to reconstruct the data. If any of the earlier backups is damaged or deleted, the
integrity of the whole backup chain is affected.
Longer recovery time: Although an incremental backup takes less time to create, due
to the fact that all backups need to be applied sequentially to be able to restore data to