Code Tracker: A Light Weight Approach to Track and Protect Authorization Codes in
SMS Messages
Abstract
SMS authorization codes play an important role in the application ecosystem, as a number of
transactions (e.g., personal identification and online banking) require users to provide a code
for authorization purposes. However, authorization codes in SMS messages can be stolen and
forwarded by attackers, which introduces serious security concerns. In this paper, we propose
CodeTracker, a lightweight approach to track and protect SMS authorization codes.
Specifically, we leverage the taint tracking technique to mark the authorization code with taint
tags at the origin of the incoming SMS messages (taint sources), and then, we propagate the
tags in the system. To this end, we modify the related array structure, array operations, string
operations, IPC mechanism, and file operations for secondary storage of SMS authorization
codes to ensure that the taint tags cannot be removed. When the authorization code is sent out
via either SMS messages or network connections (taint sinks), we extract the taint tag of the
data and enforce pre-defined security policies to prevent the code from being leaked. We have
developed a prototype of CodeTracker on Android’s ART virtual machine and used 1,218
SMS-stealing Android malware samples to evaluate the system. The evaluation results show
that CodeTracker can effectively track and protect SMS authorization codes with a small
performance overhead (< 2% on average).
Index Terms—Data privacy, tags, Android, SMS authorization codes
CHAPTER 1
INTRODUCTION
Smartphones are widely used in our daily lives. More and more users rely on
smartphones for online transactions, bank transfers, and other operations. At the same time,
more and more websites and applications (apps for short) use codes delivered via
SMS messages to authorize users. We call this type of code an authorization code in this
paper. For instance, an SMS authorization code can be required when users log into a banking
application or reset their passwords. Leveraging SMS codes for authorization is convenient;
however, it may present security concerns. If the code is stolen by attackers, it can cause
financial losses to users.
Meanwhile, SMS-stealing malware is emerging [1, 2]. A research report from
the Qihoo 360 company [3] revealed that 6.1% of mobile malware steals information.
Among these information-stealing malware samples, 67.4% target SMS
messages. A research paper [4] noted that, among 49 malware families, 27
harvest user information, including user accounts and short messages. There is therefore
an urgent need to protect the SMS authorization codes in smartphones.
Before Android version 4.4 (KitKat) [5], malicious apps could intercept SMS
messages to retrieve authorization codes and then block the SMS broadcasting stealthily
without informing users. However, starting with Android version 4.4, the SMS mechanism
has been changed. Malicious apps are unable to block SMS broadcasting, and the system
SMS app will get the SMS messages. However, malicious apps can still steal SMS messages
by registering a broadcast receiver that listens to certain system events or requesting the
READ_SMS permission to retrieve SMS messages from the database.
We noted that a number of systems have been proposed to protect SMS authorization
codes. For instance, TISSA [6] can provide null or bogus values instead of real data, which
avoids data leakage (including SMS authorization codes). However, TISSA is currently
implemented on legacy Android’s Dalvik runtime and not the newly designed ART runtime.
SecureSMS [7] is another system used to protect SMS messages by changing the Android
framework. In particular, when an SMS message arrives, SecureSMS searches the message
text. If a pre-defined keyword is found in the message, it adjusts the apps’ receiving sequence
of text
messages in the system so that the default SMS app can get the text message first. Then, it
stops the SMS broadcasting to prevent malicious apps from getting the message. This system
works but may cause compatibility issues in some benign apps that rely on received text
messages. In addition, starting with Android version 4.4, the SMS broadcasting mechanism
has been changed, and the new unordered broadcasting cannot be blocked.
1.1 Problem Statement
From another perspective, because SMS authorization codes are a type of sensitive data in
smartphones, they can be protected with the well-known taint tracking technique. TaintDroid [8] is
such a system for real-time privacy monitoring that can be used to protect authorization
codes. However, TaintDroid is implemented on the Dalvik virtual machine under
Android version 4.4 and is not applicable to the ART runtime introduced in
Android 4.4. TaintART [9] implements a practical multilevel information-flow tracking
system on Android's ART virtual machine and can be used to track and protect private
data. However, its extensibility is an issue because the bit length of
a register (32 bits) for taint indication is limited. ARTist [10] is a system in Android that
tracks private data by instrumenting apps using a customized dex2oat
tool. ARTist is an excellent system; however, it only works for
intra-application tracking and lacks the inter-application tracking that is
necessary for SMS authorization code protection.
1.2 Objectives
To implement Authorization Code Identification.
Identifying Authorization Codes before SMS Broadcasting.
CHAPTER 2
LITERATURE SURVEY
2.1 Background
In this section, we will briefly introduce the key concepts of the Android SMS system,
as well as the Android runtime environment, to provide necessary background information for
our proposed approach.
2.1.1 Android SMS System
In Android, when a text message is received, the system passes it from the RIL
(Radio Interface Layer) to the framework layer. The framework layer then packs the text
message into an SMS PDU and sends a broadcast announcing the receipt of an SMS
message. All apps with the RECEIVE_SMS permission receive the broadcast, along with
the SMS message, if they have registered a receiver for the SMS_RECEIVED_ACTION action.
Before Android version 4.4, SMS broadcasting was ordered: apps with a higher priority
(declared in the manifest file) could access SMS messages first and then discard
them, which prevented lower-priority apps from ever receiving the messages. This
mechanism has been abused by malware to intercept SMS messages [4]. In addition, if a
malicious app has the permissions (READ_SMS or WRITE_SMS) to operate directly on the
SMS database, it could monitor the database continuously. Once an SMS authorization code is
received, it could steal the code and then delete it. Starting with Android version 4.4, the SMS
system has been changed. When the system receives a text message, the framework layer
encapsulates the text message into an SMS PDU and sends it with two types of broadcasting.
One type is ordered broadcasting, i.e., SMS_DELIVER_ACTION, which
only the default SMS app can receive. In other words, only the default SMS app has the
permission to delete text messages from and insert them into the SMS database. The other type is
unordered broadcasting, i.e., SMS_RECEIVED_ACTION, which cannot
be interrupted, and all apps can receive SMS messages by registering a receiver for it. Due
to this difference, malicious apps can no longer intercept and delete received SMS messages, but
they can still steal the SMS messages and forward them to remote servers.
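The difference between the two delivery modes can be illustrated with a toy model. The sketch below is plain Python rather than Android code; the Receiver class and the deliver_ordered and deliver_unordered functions are hypothetical names introduced only for illustration, and the model ignores permissions and the separate delivery to the default SMS app.

```python
from dataclasses import dataclass
from typing import Callable, List


@dataclass
class Receiver:
    """Hypothetical stand-in for an app's broadcast receiver."""
    name: str
    priority: int
    # The handler returns True when it wants to abort (swallow) the broadcast.
    handler: Callable[[str], bool]


def deliver_ordered(receivers: List[Receiver], message: str) -> List[str]:
    """Ordered delivery: highest priority first, and any receiver may abort."""
    reached = []
    for r in sorted(receivers, key=lambda r: r.priority, reverse=True):
        reached.append(r.name)
        if r.handler(message):
            break  # lower-priority receivers never see the message
    return reached


def deliver_unordered(receivers: List[Receiver], message: str) -> List[str]:
    """Unordered delivery: every receiver is reached; abort requests are ignored."""
    reached = []
    for r in receivers:
        reached.append(r.name)
        r.handler(message)  # return value deliberately ignored
    return reached


# A high-priority receiver that always tries to swallow the message.
malware = Receiver("malware", priority=999, handler=lambda m: True)
sms_app = Receiver("default_sms_app", priority=0, handler=lambda m: False)
```

In the ordered model the high-priority receiver hides the message from everyone else; in the unordered model it can no longer hide the message, but it still receives a copy, which is the leak that taint tracking is meant to catch.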
2.1.2 Android Runtime Environment
On an Android system, each app runs inside its own separate runtime environment.
This runtime environment was the Dalvik runtime
in old Android versions and is the ART runtime in Android versions 5.0 and above.
Dalvik is a register-based virtual machine that will translate a dex file into an odex file with
the dexopt command and then execute it. To further improve the performance of Android,
Google introduced a new Android runtime, i.e., ART (Android Runtime) [11], which adopts
the AOT (ahead of time) mechanism. When an Android app is being installed, the ART
virtual machine leverages the dex2oat tool to transform the app’s dex file into an oat file,
which actually compiles the bytecode into native machine code. When the app is running, the
machine code will be directly executed, which greatly improves the performance. The
transition from the Dalvik runtime to the ART runtime poses several challenges for taint tracking
systems. For instance, TaintDroid [8] is implemented in Dalvik and
stores the taint tags in extra space allocated adjacent to the variables in the stack of the Dalvik
virtual machine. In the ART runtime, some of the parameters are stored directly in
registers. To support taint tracking in the ART runtime, the method of storing taint tags must
be changed accordingly. This is only one of the challenges; the following sections illustrate how
taint tracking is implemented on the ART runtime.
2.2 Existing System
From another perspective, because SMS authorization codes are a type of sensitive data in
smartphones, they can be protected with the well-known taint tracking technique. TaintDroid
[8] is such a system for real-time privacy monitoring that can be used to protect authorization
codes. However, TaintDroid is implemented on the Dalvik virtual machine under Android
version 4.4 and is not applicable to the ART runtime introduced in Android 4.4.
TaintART [9] implements a practical multilevel information-flow tracking system on
Android’s ART virtual machine and can be used to track and protect private data. However,
its extensibility is an issue because the bit length of a register (32 bits) for taint indication is
limited. ARTist [10] is a system in Android that tracks private data by instrumenting apps
using a customized dex2oat tool. ARTist is an excellent system; however, it only works for
intra-application tracking and lacks the inter-application tracking that is necessary for SMS
authorization code protection.
2.2.1 Disadvantages of Existing System
Time-consuming
Lower accuracy
2.3 Proposed System
We propose a lightweight approach with data-flow tracking to protect SMS
authorization codes in Android smartphones, called CodeTracker.
We have implemented a prototype of CodeTracker in the Android ART runtime.
CodeTracker adds taint tags to the SMS authorization code at the very beginning of
the incoming SMS messages and ensures that the tags cannot be removed when
propagating through the system. When the authorization code is sent out, CodeTracker
protects the code by enforcing pre-defined security policies.
We have evaluated our system with a collection of 1,218 malware samples. The
evaluation results demonstrate the effectiveness and low performance overhead of our
system.
2.3.1 Advantages of Proposed System
Accuracy
Efficiency
2.4 Proposed Methodology
Taint tags can be lost in four cases: (1) IPC, (2) string operations, (3) single-element
processing in an array, and (4) secondary storage of the data. Each case is handled as follows.
For case (1), when data are sent between processes, they are packed
into a Parcel object, which might lead to the removal of the taint tag. This is indeed the case
when an SMS message is sent from the framework to the application layer via SMS
broadcasting. Therefore, we need to modify the structure of the Parcel class. Specifically,
when the array data are packed into a Parcel object, we extract the taint tag of the array and
save it into the Parcel object. Consequently, when the data are unpacked from the Parcel
object, we add the tag stored in the Parcel object to the corresponding array object.
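The handling of case (1) can be sketched with a small model. The code below is plain Python, not the modified Android Parcel class; TaintedArray, Parcel, SMS_TAG, and send_over_ipc are hypothetical names, and the tag is assumed to be a single integer attached to the whole array.

```python
SMS_TAG = 0x1  # hypothetical tag value for SMS authorization codes


class TaintedArray:
    """Hypothetical model of a heap array that carries one taint tag."""

    def __init__(self, data: bytes, tag: int = 0):
        self.data = data
        self.tag = tag  # 0 means untainted


class Parcel:
    """Toy parcel: each entry stores the payload together with its taint tag."""

    def __init__(self):
        self._entries = []
        self._pos = 0

    def write_array(self, arr: TaintedArray) -> None:
        # Packing: extract the tag and save it next to the raw bytes.
        self._entries.append((arr.data, arr.tag))

    def read_array(self) -> TaintedArray:
        # Unpacking: re-attach the saved tag to the newly created array.
        data, tag = self._entries[self._pos]
        self._pos += 1
        return TaintedArray(data, tag)


def send_over_ipc(arr: TaintedArray) -> TaintedArray:
    """Simulates one pack/unpack round trip between two processes."""
    parcel = Parcel()
    parcel.write_array(arr)
    return parcel.read_array()
```

The design point is that the tag travels inside the serialized form, so the receiving side does not need any knowledge of the sender to restore it.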
For case (2), because many string operations will create a new array by invoking the methods
in the native layer, corresponding changes must be made. Specifically, we extract the taint tag
of the source array and then add it to the new target array.
For case (3), to save memory, we add a taint tag for an array but not for each element in the
array. This could lead to a loss of tags when the data elements in an array are traversed and
might be assigned to a new array. Some malicious apps often encrypt and modify text
messages byte by byte, resulting in the loss of taint tags. Therefore, we need to instrument the
compiler for the ART runtime and the interpreter. In particular, when it executes the
instruction of fetching array elements, we save the tag of the array in the current thread
instance. Later, when it executes the instruction of storing array elements, we get the tag from
the current thread instance and add it to the target array.
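The per-thread hand-off described for case (3) can be sketched as follows. This is a Python model of the idea, not the instrumented ART compiler or interpreter; array_get and array_put stand in for the array-load and array-store instructions, and TaintedArray, SMS_TAG, and xor_copy are hypothetical names.

```python
import threading

SMS_TAG = 0x1  # hypothetical tag value

# Models the slot in the current thread instance that holds the last-read tag.
_thread_state = threading.local()


class TaintedArray:
    """Hypothetical array with a single tag for the whole array."""

    def __init__(self, size: int, tag: int = 0):
        self.data = [0] * size
        self.tag = tag


def array_get(arr: TaintedArray, index: int) -> int:
    """Models the instrumented array-load instruction."""
    _thread_state.tag = arr.tag  # save the source array's tag in the thread
    return arr.data[index]


def array_put(arr: TaintedArray, index: int, value: int) -> None:
    """Models the instrumented array-store instruction."""
    arr.data[index] = value
    # Add the tag saved by the most recent load to the target array.
    arr.tag |= getattr(_thread_state, "tag", 0)


def xor_copy(src: TaintedArray, key: int) -> TaintedArray:
    """Byte-by-byte 'encryption' of the kind used to shake off taint tags."""
    dst = TaintedArray(len(src.data))
    for i in range(len(src.data)):
        array_put(dst, i, array_get(src, i) ^ key)
    return dst
```

Even though xor_copy never copies the array as a whole, the target array ends up tagged because every store picks up the tag left behind by the preceding load.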
For case (4), when a user saves the tainted data into a file, the taint tag in the data could be
lost. To prevent this from occurring, we save the taint tag in the file’s extra extended attribute.
When the data are read from the file, we restore the tag back to the data.
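A minimal sketch of case (4) using extended attributes is shown below. It assumes a Linux system and a filesystem that supports user extended attributes; the attribute name user.taint_tag and the 4-byte little-endian encoding are assumptions made for this example and are not taken from the CodeTracker implementation.

```python
import os
import struct

TAINT_XATTR = "user.taint_tag"  # hypothetical attribute name


def encode_tag(tag: int) -> bytes:
    """Serialize the tag as a 4-byte little-endian unsigned integer."""
    return struct.pack("<I", tag)


def decode_tag(raw: bytes) -> int:
    """Inverse of encode_tag."""
    return struct.unpack("<I", raw)[0]


def write_tainted(path: str, data: bytes, tag: int) -> None:
    """Write the data, then record its tag in the file's extended attribute."""
    with open(path, "wb") as f:
        f.write(data)
    if tag:
        os.setxattr(path, TAINT_XATTR, encode_tag(tag))  # Linux only


def read_tainted(path: str):
    """Read the data and restore the tag; files without the attribute are untainted."""
    with open(path, "rb") as f:
        data = f.read()
    try:
        tag = decode_tag(os.getxattr(path, TAINT_XATTR))
    except OSError:
        tag = 0
    return data, tag
```

Storing the tag in an extended attribute keeps the file content itself unchanged, so apps that read the file normally see the same bytes.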
CHAPTER 3
SYSTEM REQUIREMENTS AND SPECIFICATION
SYSTEM REQUIREMENTS
One of the most important activities in software development is the preparation of the Software
Requirement Specification (SRS). As the problems being solved grow more
complex, it becomes increasingly difficult for developers to comprehend them
fully and to stay aligned with the intended goal throughout the work. Hence,
a more rigorous requirement analysis is needed. The analysis
phase is now considered the most critical and difficult phase.
3.1 Functional Requirements
Identify the SMS Authorization Code
To identify an SMS authorization code and then apply the taint tag, our system has to
determine whether an SMS message contains an authorization code. First, we need to decide
when to identify the authorization code. Note that the Android SMS system mainly obtains
SMS messages via SMS broadcasting or by reading from the SMS database. Therefore, we
only need to determine whether an SMS message contains an authorization code before the
SMS broadcasting and after the message is fetched from the SMS database. However, because
the framework layer of Android will not have decoded the message content before
the SMS broadcasting, it is difficult for us to recognize the authorization code by searching
the content of the message. Therefore, we leverage the sender address of the SMS message
to determine whether the message possibly contains an authorization code; if so, we mark it as
a potential SMS authorization code. We maintain a list of sender addresses of SMS
authorization codes, and we treat all the SMS messages that originate from these addresses as
messages potentially containing SMS authorization codes. Once the SMS message can be
read from the SMS database, we search its content for the string pattern
of an authorization code to confirm whether the message actually contains one.
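The two-stage identification can be sketched as follows. The sender list and the regular expression are assumptions chosen for this example (the source does not give the actual list or pattern), and the function names are hypothetical.

```python
import re
from typing import Optional

# Hypothetical list of sender addresses known to deliver authorization codes.
KNOWN_SENDERS = {"10001", "BANK-OTP"}

# Hypothetical pattern: a keyword followed within 20 non-digit characters
# by a run of 4 to 8 digits.
CODE_PATTERN = re.compile(
    r"\b(?:code|otp|pin)\b\D{0,20}(\d{4,8})\b", re.IGNORECASE
)


def is_potential_code_sender(sender: str) -> bool:
    """Stage 1 (before broadcasting): only the sender address is checked."""
    return sender in KNOWN_SENDERS


def extract_code(body: str) -> Optional[str]:
    """Stage 2 (after reading from the database): search the decoded content."""
    match = CODE_PATTERN.search(body)
    return match.group(1) if match else None
```

Stage 1 is deliberately coarse because the content is not yet decoded at that point; stage 2 refines the decision once the message text is available.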
Propagate Taint Tags
Ensuring that the taint tags cannot be removed during the internal processing of the system is
a challenge. Because the SMS message is saved in an array that is created in the heap, the
taint tag will not be removed during general operations, e.g., function calls. However, in
several cases, the Android system can lose a tag carried by an array. These
cases include (1) IPC, (2) string operations, (3) single-element processing in an array, and (4)
the secondary storage of the data.
3.2 Non Functional Requirements
Usability
Simplicity is the key here. The system must be simple enough that people like to use it, and not so
complex that they avoid using it. The user must be familiar with the user interfaces and
should not have problems in migrating to a new system with a new environment. The menus,
buttons and dialog boxes should be named in a manner that they provide clear understanding
of the functionality. Several users are going to use the system simultaneously, so the usability
of the system should not get affected with respect to individual users.
Reliability
The system should be trustworthy and reliable in providing the functionalities. Once a
user has made some changes, the changes must be made visible by the system. The changes
made by the Programmer should be visible both to the Project leader as well as the Test
engineer.
Security
Apart from bug tracking, the system must provide the necessary security and must protect
the whole process from crashing. As technology has grown rapidly, security
has become a major concern for organizations, and millions of dollars are invested in providing
it. Bug tracking delivers the maximum security available at the highest performance
rate possible, ensuring that unauthorized users cannot access vital issue information without
permission. The bug tracking system issues each authenticated user a secret password so
that functionality is restricted per user.
Performance
The system is going to be used by many employees simultaneously. Since the system
will be hosted on a single web server with a single database server in the background,
performance becomes a major concern. The system should not fail when many users
are using it simultaneously, and it should remain quickly accessible to all of them. For
example, if two test engineers simultaneously try to report the presence of a bug,
no inconsistency should result.
Scalability
The system should be scalable enough to add new functionalities at a later stage. There
should be a common channel, which can accommodate the new functionalities.
Maintainability
The system monitoring and maintenance should be simple and objective in its
approach. There should not be too many jobs running on different machines such that it gets
difficult to monitor whether the jobs are running without errors.
Portability
The system should be easily portable to another system. This is required when the web
server that is hosting the system fails due to some problem, so that the
system must be moved to another machine.
Reusability
The system should be divided into modules that can be used as part of
another system without requiring much work.
3.3 Minimum Hardware Requirements
H/W System Configuration:
Processor : Intel Core i3
Processor Speed : 2.6 GHz
RAM : 4 GB (minimum), 8 GB (recommended)
Disk Space : 160 GB
3.4 Minimum Software Requirements
S/W System Configuration:
Operating System : Windows 7 or higher
Tools : Android SDK, JDK
IDE : Android Studio 2.2 or higher
Application Server : WampServer
Smartphone Configuration:
OS : Android 2.3 (Gingerbread) or higher
RAM : 516 MB
GPS : GPS-supported Android smartphone
Internet Connection : speed up to 500 kbps
BIBLIOGRAPHY
[1] Twitter Usage/Company Facts, https://about.twitter.com/company
[2] Posting a tweet, https://support.twitter.com/articles/15367-posting-atweet
[3] King R. A., Racherla P. and Bush V. D., What We Know and Don't Know about Online
Word-of-Mouth: A Review and Synthesis of the Literature, Journal of Interactive Marketing,
vol. 28, issue 3, pp. 167- 183, August 2014
[4] Ministry of Human Resource Development, http://mhrd.gov.in/statist
[5] India’s Best Colleges, 2015, http://indiatoday.intoday.in/bestcolleges/2015/
[6] Arora D., Li K.F. and Neville S.W., Consumers’ sentiment analysis of popular phone
brands and operating system preference using Twitter data: A feasibility study, 29th IEEE
International Conference on Advanced Information Networking and Applications, pp. 680-
686, Gwangju, South Korea, March 2015
[7] Choi C., Lee J., Park G., Na J. and Cho W., Voice of customer analysis for internet
shopping malls, International Journal of Smart Home: IJSH, vol. 7, no. 5, pp. 291-304,
September 2013
[8] Kanakaraj M., Guddeti R M.R., Performance Analysis of Ensemble Methods on Twitter
Sentiment Analysis using NLP Techniques, 9th IEEE International Conference on Semantic
Computing, pp. 169-170, Anaheim, California, 2015
[9] Bahrainian S.-A., Dengel A., Sentiment Analysis and Summarization of Twitter Data,
16th IEEE International Conference on Computational Science and Engineering, pp. 227-234,
Sydney, Australia, December 2013
[10] Pak A. and Paroubek P., Twitter as a Corpus for Sentiment Analysis and Opinion
Mining, 7th International Conference on Language Resources and Evaluation, pp. 1320-1326,
Valletta, Malta, May 2010
[11] Shahheidari S., Dong H., Bin Daud M.N.R., Twitter sentiment mining: A multidomain
analysis, 7th IEEE International Conference on Complex, Intelligent and Software Intensive
Systems, pp.144-149, Taichung, Taiwan, July 2013
[12] Neethu M. S. and Rajasree R., Sentiment Analysis in Twitter using Machine Learning
Techniques, 4th IEEE International Conference on Computing, Communications and
Networking Technologies, pp. 1-5, Tiruchengode, India, 2013
[13] Bespalov D., Bai B., Qi Y., and Shokoufandeh A., Sentiment classification based on
supervised latent n-gram analysis, 20th ACM international conference on Information and
knowledge management, pp. 375-382, New York, USA, 2011
[14] Jotheeswaran J. and Koteeswaran S., Decision Tree Based Feature Selection and
Multilayer Perceptron for Sentiment Analysis, Journal of Engineering and Applied Sciences,
vol. 10, issue 14, pp. 5883-5894, January 2015
[15] Socher R., et al, Recursive Deep Models for Semantic Compositionality Over a
Sentiment Treebank, Conference on Empirical Methods in Natural Language Processing,
Seattle, Washington, October 2013.