
Expectation and Purpose: Understanding Users’ Mental Models of Mobile App Privacy through Crowdsourcing

Jialiu Lin1, Shahriyar Amini1, Jason I. Hong1, Norman Sadeh1, Janne Lindqvist2, Joy Zhang1
jialiul@cs.cmu.edu, samini@ece.cmu.edu, jasonh@cs.cmu.edu, sadeh@cs.cmu.edu, janne@winlab.rutgers.edu, joy.zhang@sv.cmu.edu
1 Carnegie Mellon University   2 Rutgers University

ABSTRACT
Smartphone security research has produced many useful tools to analyze the privacy-related behaviors of mobile apps. However, these automated tools cannot assess people’s perceptions of whether a given action is legitimate, or how that action makes them feel with respect to privacy. For example, automated tools might detect that a blackjack game and a map app both use one’s location information, but people would likely view the map’s use of that data as more legitimate than the game’s. Our work introduces a new model for privacy, namely privacy as expectations. We report on the results of using crowdsourcing to capture users’ expectations of what sensitive resources mobile apps use. We also report on a new privacy summary interface that prioritizes and highlights places where mobile apps break people’s expectations. We conclude with a discussion of implications for employing crowdsourcing as a privacy evaluation technique.

Author Keywords
Mental model, Privacy as expectations, Privacy summary, Crowdsourcing, Android permissions, Mobile app.

ACM Classification Keywords
H5.m. Information interfaces and presentation (e.g., HCI): Miscellaneous.

General Terms
Design, Human Factors.

INTRODUCTION
The number of smartphone apps has undergone tremendous growth since the inception of app markets. As of June 2012, the Android Market offered 460,000 apps with more than 10 billion downloads since the Market’s launch; the Apple App Store offered more than 650,000 apps with over 30 billion downloads since its launch. These mobile apps can make use of a smartphone’s numerous capabilities (such as users’ current location, call logs, and other information), providing users with more pertinent services and attractive features. However, access to these capabilities also opens the door to new kinds of security and privacy intrusions. Malware is an obvious problem [17], but a more prevalent problem is that a good number of legitimate apps gather sensitive personal information without users’ full awareness. For example, Facebook and Path were found uploading users’ contact lists to their servers, which greatly surprised their users and made them feel very uncomfortable [21, 34].

A number of research projects have looked at protecting mobile users’ privacy and security by leveraging application analysis [10, 13-15, 19] or by proposing security extensions that provide app-specific privacy controls to users [6, 22, 39]. These systems are useful for capturing and analyzing an app’s usage of sensitive resources. However, no purely automated technique today (and perhaps not ever) can assess people’s perceptions of whether an action is reasonable, or how that action makes users feel with respect to their privacy. For example, is a given app’s use of one’s location solely for the purpose of supporting its core functionality? It all depends on the context: for a blackjack game, probably not, but for a map application, very likely so. Currently, however, users have very little support in making good trust decisions regarding which apps to install.

In this paper, we frame mobile privacy in terms of people’s expectations about what an app does and does not do, focusing on where an app breaks people’s expectations. There has been a lot of discussion about expectations being an important aspect of privacy [33]. We framed our inquiry on the psychological notion of mental models, first introduced by Craik [11] and later applied in other domains [29]. Every person carries a simplified model that describes what they think an object does and how it works (in our case, the object is an app). Ideally, if a person’s mental model aligned with what the app actually does, there would be fewer privacy problems, since that person would be fully informed as to the app’s behavior. In practice, however, a person’s mental model is never perfect. We argue that by allowing people to see the most common misconceptions about an app, we can rectify people’s mental models and help them make better trust decisions regarding that app.

Permission to make digital or hard copies of all or part of this work for personal or classroom use is granted without fee provided that copies are not made or distributed for profit or commercial advantage and that copies bear this notice and the full citation on the first page. To copy otherwise, or republish, to post on servers or to redistribute to lists, requires prior specific permission and/or a fee.
UbiComp’12, Sep 5 – Sep 8, 2012, Pittsburgh, USA.
Copyright 2012 ACM 978-1-4503-1224-0/12/09...$15.00.

We believe that this notion of privacy as expectations can be operationalized by combining two ideas. The first is to use crowdsourcing to capture people’s mental models of an app’s privacy-related behaviors in a scalable manner. This requires some knowledge of an app’s actual behaviors, which can be obtained with app analysis tools such as TaintDroid. The second is to convey these expectations to users through better privacy summaries that emphasize the surprises that the crowd had about a given app.

Our long-term goal is to build a system that leverages crowdsourcing and traditional security approaches to evaluate the privacy-related behaviors of mobile apps. This paper presents the first step toward understanding the design space and the feasibility of our ideas.

We make the following research contributions:
• We demonstrate a way of capturing people’s expectations using crowdsourcing. More specifically, we conducted user studies on Amazon Mechanical Turk (AMT) with 179 Android users, surveying their expectations and subjective feelings about different apps accessing sensitive resources (such as location, contact lists, and unique ID) under different conditions.
• We identify two key factors that affect people’s mental model of a mobile app, namely expectation and purpose, and show how they impact users’ subjective feelings.
• We present an analysis which indicates that informing users of why a given resource is being used can allay their privacy concerns, since most users have difficulty figuring out these purposes.
• We present the design and evaluation of a new privacy summary that emphasizes behaviors that did not match the crowd’s expectations. Our results suggest that our interface significantly increases users’ privacy awareness and is easier to comprehend than Android’s current permission interface.

RELATED WORK
We have organized related work into three sections: an overview of the Android permission system; research on mobile app analysis and security extensions; and relevant work in mental model analysis and design for privacy-related user interfaces.

Android Permissions
The Android permission framework is intended to serve two purposes in protecting users: (1) to limit mobile apps’ access to sensitive resources, and (2) to assist users in making trust decisions before installing apps. Android apps can only access sensitive resources if they declare permissions in their manifest files and those permissions are approved by users at installation time. On the official Android Market, before installing an app, users are shown a permission screen listing the resources the app will access. Users can choose either to install the app with all the requested permissions or not to install the app at all. Once granted, permissions cannot be revoked unless users uninstall the app.

There have also been several user studies looking at usability issues of permission systems in warning users before they download apps. Kelley et al. [26] conducted semi-structured interviews with Android users, and found that users paid limited attention to permission screens and had a poor understanding of what these permissions imply. Permission screens generally lack adequate explanations and definitions. Felt et al. [18] found similar results from Internet surveys and lab studies: current Android permission warnings do not help most users make correct security decisions.

Our work leverages this past work investigating Android’s permissions. We extend their ideas in two new ways. The first is using crowdsourcing as a way of measuring people’s expectations regarding an app’s behavior, rather than relying solely on automated techniques. This allows us to capture a new aspect of mobile app privacy that past work has not. The second is the design and evaluation of a new privacy summary interface that emphasizes access to sensitive resources that people did not expect.

Mobile Application Analysis and Security Extensions
Researchers have also developed many useful techniques and tools to detect sensitive information leakage in mobile apps [3, 10, 12-16, 19, 35, 36], using permission analysis (e.g. [3, 16]), static code analysis (e.g. [12]), network analysis (e.g. [35]), or dynamic flow analysis (e.g. [14]). Their results identified the strong penetration of ad and analytics libraries, as well as other prevalent privacy violations, including excessive access to sensitive information. We used TaintDroid [14] in our work to investigate the ground truth of how and for what purpose the top 100 popular Android apps use sensitive resources. Amini et al. [2] offered a vision of a cloud-based service that leverages crowdsourcing and traditional security approaches to analyze mobile applications. Our work follows this vision and demonstrates the feasibility of incorporating crowdsourcing in application analysis.

Many security extensions have been developed to harden privacy and security. MockDroid [6], TISSA [39], and AppFence [22] substitute fake information into API calls made by apps, such that apps can still function but with zero disclosure of users’ private information. Nauman et al. [28] proposed Apex, which provides more fine-grained control over resource usage based on context and runtime constraints. To enable wide deployment, Jeon et al. proposed an alternative solution that rewrites the bytecode of mobile apps to enforce more privacy controls [24], instead of modifying the Android system as the previous solutions do.

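The declaration-and-approval mechanism described under Android Permissions is driven by the app’s manifest file. The fragment below is a hypothetical illustration (the package name is invented); the permission names are the standard Android identifiers for fine-grained location, the contact list, and phone state (which exposes the device ID):

```xml
<!-- Hypothetical manifest fragment. Each sensitive resource the app
     uses must be declared here; at install time users see the whole
     list and must accept all of it or abandon the install. -->
<manifest xmlns:android="http://schemas.android.com/apk/res/android"
    package="com.example.blackjack">
    <uses-permission android:name="android.permission.ACCESS_FINE_LOCATION" />
    <uses-permission android:name="android.permission.READ_CONTACTS" />
    <uses-permission android:name="android.permission.READ_PHONE_STATE" />
</manifest>
```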
Though app analysis provides us with a better understanding of apps’ behaviors, it cannot infer people’s perceptions of privacy or distinguish between behaviors which are necessary for an app’s functionality and behaviors which are privacy-intrusive. Similarly, while the security extensions above provide users with more control over their private data, it is unclear whether lay users can correctly configure these settings to reflect their real preferences. Our work complements this past work by suggesting an alternative way of looking at mobile privacy from the users’ perspective. We study users’ mental models of mobile privacy, aiming to identify the most pertinent information to help users make better privacy-related trust decisions.

Expectations of Privacy, Mental Model Studies and Privacy Interface Design
The notion of expectations is fairly common in discussions of privacy [33]. For example, in Katz v. United States, the Supreme Court put forward the “reasonable expectation of privacy” test for the reasonableness of legal privacy protections under the Fourth Amendment [1]. Palen and Dourish [30] and Barth et al. [4] discussed how expectations are governed by norms, past experiences, and technologies. Our notion of privacy as expectations is a narrower construct, focusing primarily on people’s mental models of what they think an app does and does not do. Our core contribution is in operationalizing privacy in this manner: using crowdsourcing to capture people’s expectations, and reflecting the crowd’s expectations directly in a privacy summary that emphasizes places where an app’s behavior did not match people’s expectations.

Past work has looked at understanding people’s mental models regarding computer security. For example, Camp [9] discussed five different high-level metaphors for how people think about computer security. Wash [38] identified eight mental models (‘folk models’) of security threats that users perceived, and how these models can justify why users ignore security advice. Bravo-Lillo et al. [8] conducted studies to explore the psychological processes users engage in when perceiving and responding to computer alerts. Sadeh et al. also studied the complexity of people’s location-sharing privacy preferences [5, 32]. This past research has a similar flavor to ours in trying to understand the mental models people use to make trust decisions. Our work extends this past work to a new domain, namely mobile app privacy.

Kelley et al. proposed simple visualizations called “privacy nutrition labels” [25] to inform users how their personal information is collected, used, and shared by a web site. Our proposed mobile privacy summary interface is inspired by their work. Our work differs in how we acquire privacy-related information. In their work, the expectation is that a ‘nutrition label’ would be generated by the owner of the web site. In our case, information is gathered both by crowdsourcing users’ mental models and by profiling mobile apps using dynamic taint analysis (e.g. using TaintDroid).

CROWDSOURCING USERS’ MENTAL MODELS
In this section, we present the design and results of our study using crowdsourcing to capture users’ mental models of a mobile app’s behavior.

Taking a step back, there are four reasons why crowdsourcing is a compelling technique for examining privacy. Past work has shown that few people read End-User License Agreements (EULAs) [20] or web privacy policies [23], because (a) there is an overriding desire to install the app or use the web site, (b) reading these policies is not part of the user’s main task (which is to use the app or web site), (c) reading these policies is complex, and (d) they carry a clear cost (i.e. time) with unclear benefit. Crowdsourcing nicely addresses these problems. It dissociates the act of examining permissions from the act of installing apps. By paying participants, we make reading these policies part of the main task and also offer a clear monetary benefit. Lastly, we can reduce the complexity of reading Android permissions by having participants examine just one permission at a time rather than all of the permissions, and by offering clearer explanations of what each permission means.

Study Design
We recruited participants using Amazon’s Mechanical Turk (AMT). We designed each Human Intelligence Task (HIT) as a short set of questions about a specific Android app and resource pair (see Figure 1). Participants were asked to read the provided screenshots and description of an app, as retrieved from the official Android Market. Then they were asked whether they had used this app before and what category the app belongs to. The categorization questions were designed as an easy check to detect whether participants were gaming our system (e.g., clicking through HITs without answering questions).

After these two questions, participants were shown one of two sets of follow-up questions. One of the conditions (referred to as the expectation condition) was designed to capture users’ perceptions of whether they expected a given app to access a sensitive resource and why they thought the app used this resource. Participants were also asked to specify how comfortable they felt letting this app access the resource, using a 4-point Likert scale ranging from very comfortable (+2) to very uncomfortable (-2). In the other condition (referred to as the purpose condition), we wanted to see how people felt when offered more fine-grained information. Participants were told that a certain resource would be accessed by this app and were given specific reasons, e.g. that the user’s location information is accessed for targeted advertising. We identified these reasons by examining TaintDroid logs and using knowledge about ad networks.

[Figure 1 shows a sample HIT for the game “Toss It”: the app’s market description and screenshots, followed by the study questions. After two screening questions (prior use of the app, and app category), participants in the expectation condition answered whether they expected the app to access their precise location, what reason(s) they could think of for the access (major functionality, targeted advertisement or market analysis, tagging, social sharing, other, or “I cannot think of any reason”), and how comfortable they felt about it. Participants in the purpose condition were instead told “Toss It accesses user’s precise location information for targeted advertising” and then rated their comfort on the same 4-point scale.]
Figure 1. Sample questions in our study to capture users’ mental models. Participants were randomly assigned to one of the conditions. In the expectation condition, participants were asked to specify their expectations and to speculate on the purpose of the resource access. In the purpose condition, the purpose of the resource access was given to participants. In both conditions, participants were asked to rate how comfortable they felt having the targeted app access their resources.
Participants were then asked to provide their comfort ratings as in the expectation condition. Finally, participants from both conditions were encouraged to provide optional comments on the apps in general. The separation of the two conditions let us compare users’ perceptions and subjective feelings when different information was provided.

We focused our data collection on four types of sensitive resources (as suggested by AppFence [22]): unique device ID, contact list, network location, and GPS location. We also restricted the pool of apps to the top 100 most downloaded mobile apps on the Android Market. Overall, 56 of these apps requested access to the unique phone ID, 25 to the contact list, 24 to GPS location, and 29 to network location. This resulted in 134 app and resource pairs, i.e. 134 distinct HITs. For each HIT, we recruited 40 unique participants to answer our questions (20 per condition).

We used the following qualification test to limit our participants to Android users, as well as to filter out people who were not serious. Crowd participants were asked to provide the Android OS version of their device, with instructions on where to find this information on their Android devices. When reviewing participants’ qualification requests, we also randomly assigned qualified participants to different conditions by giving them different qualification scores. In this way, we could ensure a between-subjects design where a participant would only be exposed to one condition.

To prevent other confounding factors such as cultural or language issues, we restricted our participants to those located within the U.S. To guarantee the quality of our data, we also required participants to have a lifetime approval rate higher than 75% (i.e. the rate of successfully completing previous tasks).

All the HITs of this study were completed over the course of six days. We collected a total of 5684 responses. 211 were discarded due to incomplete answers, and 113 were discarded due to failing the quality control question, yielding 5360 valid responses. There were 179 verified Android users in our study, with an average lifetime approval rate of 97% (SD=8.79%). The distribution of Android versions our participants used was very close to Google’s official numbers [37]. On average, participants spent about one minute per HIT (M=61.27, SD=29.03), and were paid at the rate of $0.12 per HIT.

The Most Unexpected and the Most Uncomfortable
Our first analysis looked at which sensitive resource usages were least expected by users, based on data from the expectation condition. For each app and resource pair, we aggregated the data by calculating the percentage of participants who expected the resource to be accessed, and averaging the self-reported comfort ratings (ranging from very comfortable (+2.0) to very uncomfortable (-2.0)).

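The per-pair aggregation described above can be sketched in a few lines. This is a minimal illustration, not the authors’ code; the record layout and the sample values below are hypothetical:

```python
from collections import defaultdict

# Hypothetical response records: (app, resource, expected, comfort),
# where comfort uses the study's 4-point scale: +2 very comfortable,
# +1 somewhat comfortable, -1 somewhat uncomfortable, -2 very uncomfortable.
responses = [
    ("Toss It", "gps", False, -2),
    ("Toss It", "gps", False, -1),
    ("Toss It", "gps", True, 1),
    ("Toss It", "gps", False, -2),
]

def aggregate(responses):
    """Group responses by (app, resource) pair and compute the percentage
    of participants who expected the access and the mean comfort rating."""
    groups = defaultdict(list)
    for app, resource, expected, comfort in responses:
        groups[(app, resource)].append((expected, comfort))
    summary = {}
    for pair, rows in groups.items():
        n = len(rows)
        pct_expected = 100.0 * sum(1 for e, _ in rows if e) / n
        avg_comfort = sum(c for _, c in rows) / n
        summary[pair] = (pct_expected, avg_comfort)
    return summary

print(aggregate(responses))
# → {("Toss It", "gps"): (25.0, -1.0)}
```

Pairs whose expected percentage falls at or below 20% are the ones that would surface in a summary like Table 1.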
Table 1 summarizes the resource usages that less than 20% of participants said they expected. For example, only 5% of participants expected the Brightest Flashlight app to access users’ network location information, and overall, participants felt uncomfortable about this resource usage (M=-1.25, SD=0.39). Similarly, only 10% of participants expected the Talking Tom app to access users’ device ID, and 20% of people expected Pandora to access their contact list.

Resource          App name                   % Expected   Avg Comfort
Network Location  Brightest Flashlight        5%          -1.25
                  Toss It                    10%          -1.15
                  Angry Birds                10%          -0.43
                  Air Control Lite           20%          -0.55
                  Horoscope                  20%          -1.05
GPS Location      Brightest Flashlight       10%          -0.95
                  Toss It                     5%          -0.95
                  Shazam                     20%          -0.05
Device ID         Brightest Flashlight        5%          -1.35
                  TalkingTom Free            10%          -0.78
                  Mouse Trap                 15%          -0.85
                  Dictionary                 15%          -0.69
                  Ant Smasher                20%          -1.13
                  Horoscope                  20%          -1.03
Contact List      Backgrounds HD Wallpapers  10%          -1.35
                  Pandora                    20%          -0.70
                  GO Launcher EX             20%          -0.75

Table 1. The most unexpected resource usages identified in the expectation condition, i.e. resource usages expected by no more than 20% of participants. Users felt uncomfortable with these unexpected app behaviors. For each app and resource pair, 20 participants were surveyed. Comfort ratings range from -2.0 (very uncomfortable) to +2.0 (very comfortable). Across all the apps we surveyed, there was a strong correlation (r=0.91) between people’s expectations and their subjective feelings.

Generally speaking, when participants were surprised by an access to a sensitive resource, they also found it hard to explain why this resource was needed. Note that in the expectation condition, participants were only informed about which resources were accessed; they were not informed about the purpose for which these resources were accessed. This is similar to what the existing Android permission list conveys to users. In this condition, we observed a very strong correlation (r=0.91) between the percentage of expectations and the average comfort ratings. In other words, the perceived necessity of the resource access was directly linked to participants’ subjective feelings, thus guiding the way users make trust decisions on mobile apps. As many participants also mentioned in their comments, these surprises prompted them to take different actions. For example, participant W27 said about the Brightest Flashlight app, “Why does a flashlight need to know my location? I love this app, but now I know it access my location, I may delete it.” W92 said, “I didn't know Pandora can read my phone book. But why? Can I turn it off? I'll search for other internet radio app.” Similarly, W56 showed a similar concern (for the Toss It game): “I do not feel that games should ever need access to your location. I will never download this game.”

Lay Users Have a Hard Time Identifying the Reason an App Accesses a Resource
Another way to look at the expectation condition is that it presented users with information comparable to what is provided by the Android permission system, namely what resources may be accessed. We wanted to see to what extent people understand the behaviors of apps in this optimal case, where they were paid to read the privacy summaries. Based on our results, even when users were fully aware of which resources were used, they still had a hard time understanding why these resources were needed.

We used TaintDroid [14] to analyze all the mobile apps in our study to identify the actions that triggered the sensitive resource access and where the sensitive information was sent. We then manually categorized each app and resource pair into three categories: (1) for major functionality, (2) for sharing and tagging (or supporting other minor functions), (3) for targeted advertising or market analysis. Many resource usages fell into more than one category. For example, the WeatherBug application uses location for retrieving local weather information as well as for targeted advertising.

We compared the reasons our participants provided in the expectation condition against the ground truth from our analysis, as shown in Table 2. In most cases, the majority of participants could not correctly state why a given app requested access to a given resource. When the resources were accessed for functionality purposes, participants generally had better answers; however, the accuracy never exceeded 80%. When sensitive resources were used for multiple purposes, the accuracies tended to be much lower. We also note that participants had slightly better answers about why their location information was needed, compared to the other two types of sensitive resources. Note that these results are for a situation where participants were paid to carefully read the description. Many of them had even already used some of these apps before. We believe that for general Android users, the ability to guess would be even worse. This also indicates that simply informing users of what resources are used (as today’s Android permission screen does) is not enough for users to make informed decisions.

Clarifying the Purpose May Ease Worries
Given the lack of clarity about why their resources are accessed, users have to deal with significant uncertainties when making trust decisions about installing and using a given mobile app.

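The correlation reported above between the percentage of participants expecting an access and the average comfort rating is a standard Pearson coefficient. A minimal sketch follows; the per-pair values are illustrative only (the study computed r=0.91 over all 134 app/resource pairs):

```python
import math

def pearson_r(xs, ys):
    """Pearson correlation coefficient between two equal-length lists."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    cov = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    sx = math.sqrt(sum((x - mx) ** 2 for x in xs))
    sy = math.sqrt(sum((y - my) ** 2 for y in ys))
    return cov / (sx * sy)

# Illustrative (% expected, average comfort) values per app/resource pair.
pct_expected = [5, 10, 20, 60, 90]
avg_comfort = [-1.25, -1.15, -0.55, 0.4, 1.1]
print(pearson_r(pct_expected, avg_comfort))
```

A positive r close to 1 means the less expected an access is, the less comfortable participants feel about it.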
Resource (apps)       Ground truth purpose   # apps   % accurate guess   % no idea
Contact List (25)     [1]                      20          56%               8%
                      [2]                       2          28%              35%
                      [1]+[2]                   2          19%              16%
                      [1]+[2]+[3]               1          27%              14%
GPS Location (24)     [1]                      14          74%              11%
                      [2]                       4          80%              10%
                      [3]                       2          35%              55%
                      [1]+[3]                   3          15%              27%
                      [2]+[3]                   1          15%              40%
Network Location (29) [1]                      15          77%               8%
                      [2]                       2          55%              10%
                      [3]                       7          29%              63%
                      [1]+[3]                   3          15%              22%
                      [2]+[3]                   2          13%              25%
Device ID (56)        [1]                       1          51%              29%
                      [3]                      30          22%              58%
                      [1]+[3]                  12           7%              55%

Table 2. Participants had a difficult time speculating on the purposes of their sensitive resource usages. The first column shows the type of resource accessed and the total number of apps accessing that resource. The second column shows the ground truth of why the resource is accessed: [1] major functionality, [2] tagging or sharing, [3] advertising or market analysis. The third column shows the number of apps in each category (e.g. 20 apps access the contact list for reason [1]). The fourth column shows the percentage of participants who stated the purpose correctly. The last column shows the percentage of participants who had no idea why the resource is accessed.

Resource Type     Comfort w/ purpose   Comfort w/o purpose   df     T      p
Device ID            0.47 (0.30)          -0.10 (0.41)       55   7.42   0.0001
Contact List         0.66 (0.22)           0.16 (0.54)       24   4.47   0.0002
Network Location     0.90 (0.53)           0.65 (0.55)       28   3.14   0.004
GPS Location         0.72 (0.62)           0.35 (0.73)       23   3.60   0.001

Table 3. Comparison of comfort ratings between the purpose condition (with purpose information) and the expectation condition (without it). Standard deviations are shown in parentheses. When participants were informed of the purpose of a resource access, they generally felt more comfortable. The differences were statistically significant for all four types of resources. Comfort ratings range from -2.0 (very uncomfortable) to +2.0 (very comfortable).

We wanted to see whether providing users with more fine-grained information, especially the purposes of resource access, would have any influence on users’ privacy-related subjective feelings. To answer this question, we compared the average comfort ratings from both conditions, for each mobile app and resource pair.

We observed that for all four types of sensitive resources (i.e. device ID, contact list, network location, and GPS location), participants felt more comfortable when they were informed of the purposes of a resource access (see Table 3). The differences between the comfort ratings were statistically significant in t-tests. For example, with regard to accessing the device ID, the average comfort rating in the purpose condition was 0.3 higher than in the expectation condition (t(55)=7.42, p<0.0001). For some apps, informing people of the purpose led to totally different feelings. For example, participants felt uneasy when told that the Dictionary app accessed their network location (Mcomfort=-0.83, SD=0.41). However, when they were informed that the location was only used to search for trending words that people nearby were looking up, they felt much less concerned (Mcomfort=0.80, SD=0.29). Similarly, Air Control Lite, eBuddy, Shazam, Antivirus, and 7 other apps all demonstrated a significant increase (>1.0) in comfort rating when the purpose of a resource access was explained.

This finding suggests that providing users with the reasons why their resources are used not only gives them more information to make better trust decisions, but can also ease their concerns caused by uncertainties. Note that informing users about the “purpose” for collecting their information is a common expectation in many legal and regulatory privacy frameworks. Our results confirm the importance of this information. This finding also provides us with a strong rationale for including the purpose(s) of resource access in our new design of the privacy summary interface.

Impact of Previously Using an App
We also wanted to see how previous experience with an app impacted participants’ expectations and level of comfort. To answer this question, we compared the responses between participants who had and had not used the app before. The ratio of people who had and had not used the apps in our study varied greatly. Some apps (such as Facebook and Twitter) saw high usage among our participants, while others (such as Kakao Talk Messenger and Horoscope) had fairly low usage. To make the comparison fair, we only examined apps that had at least 5 responses in both the used and not-used categories. In our data, the differences between participants who had and had not used these apps before were not statistically significant with respect to their expectations of sensitive resource access. Regarding comfort level, the only significant difference we observed was in the average comfort ratings for accessing the contact list. Participants who had used an app before felt more comfortable letting that app access their contact list (t(20)=2.68, p=0.015). For the other three types of resources, experience with the apps did not cause any statistically significant differences in participants’ subjective feelings.

This finding suggests that people who use an app do not
necessarily have a better understanding of what the app is
actually doing, in terms of accessing their sensitive
resources. It also suggests that, if we use crowdsourcing
to capture users’ mental models of certain apps, we do not
have to restrict our participants to people who are already familiar with these apps, allowing us access to a potentially larger crowd.

NEW PRIVACY SUMMARY INTERFACE
In the previous section, we identified purpose and expectation as two key factors that impact users' subjective feelings. Based on this finding, we present the design of a new privacy summary interface that highlights the purposes of sensitive resource usage and people's perceptions of apps' behaviors.

[Figure 2 (mockup screenshots); the screen text is not recoverable from the extraction. The figure caption appears below.]
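The surprise-based ordering and warning rule that this interface uses (described under Design Rationale) can be sketched as follows. The resource names, purpose strings, and surprise values below are hypothetical stand-ins for the crowdsourced data.

```python
# Hypothetical crowd data for one app: each sensitive resource with the
# purpose shown to users and the fraction of prior users who were surprised.
resources = [
    ("approximate location", "deliver targeted ads",               0.62),
    ("device ID",            "identify the device to ad networks", 0.55),
    ("SD card storage",      "cache downloaded content",           0.10),
]

WARN_THRESHOLD = 0.5  # warn when more than half of prior users were surprised

def summary_rows(resources):
    """Sort resources so the most surprising usages come first, and flag
    any whose surprise value exceeds the warning threshold."""
    ranked = sorted(resources, key=lambda r: r[2], reverse=True)
    return [(name, purpose, surprised, surprised > WARN_THRESHOLD)
            for name, purpose, surprised in ranked]

for name, purpose, surprised, warn in summary_rows(resources):
    prefix = "[!] " if warn else "    "
    print(f"{prefix}{name} - {purpose} ({surprised:.0%} of users were surprised)")
```

In this sketch, "approximate location" is listed first and flagged, while the low-surprise SD card access would sit behind a "See all" link.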
Design Rationale
Privacy summary interfaces, such as the permission screen in current Android, are designed for users to review before downloading mobile apps. By that time, users have limited information to form a mental model of the targeted mobile app, since they haven't had any interaction with it. In contrast with our crowdsourcing study, we cannot rely on general users to carefully examine an app's description or screenshots to understand how the app works in reality. In our new design, we directly leverage other users' mental models. The underlying rationale is similar to the idea of Patil et al. [31] in the sense of incorporating others' opinions in making privacy decisions. Our work differs from theirs by aggregating users' subjective feedback from crowds instead of from one's social circle, and by highlighting users' surprises. By presenting the most common misconceptions about an app, we can rectify people's mental models and help them make better trust decisions. We consider users' expectations and the purposes of resource access as the two key points that we want to convey to users in our new summary interface.

Previous research has discussed several problems with the existing Android permission screens [18, 26], including:
• The wording of the permission list contains too much technical jargon for lay users.
• It offers little explanation of, or insight into, the potential privacy risks.
• A long list of permissions makes users experience warning fatigue.

With these problems in mind, in addition to the two identified key features, we proposed several principles for our own design:
• Using simple terms to describe the relevant resources; e.g., instead of "coarse (Network) location", we use the term "approximate location".
• Only displaying the resources that have a greater impact on users' privacy, such as location, device ID, storage, and contact list. Users can check other low-risk resources by clicking "See all".
• Sorting the list based on expectation as captured through crowdsourcing. We order the list so that the more surprising resource usages are shown first.
• Highlighting important information. We bold the sensitive resources mentioned in the text, and use a warning sign and striking colors to highlight suspicious resource usages, i.e., those whose surprise value exceeds a certain threshold.

Figure 2: A mockup interface of our newly proposed privacy summary screen, taking the Brightest Flashlight and the Dictionary app as examples. The new interface provides extra information about why certain sensitive resources are needed and how other users feel about the resource usages. A warning sign appears if more than half of the previous users were surprised by a resource access.

Figure 2 shows two examples of our new privacy summary interface. To make the comparison more symmetric, our design uses the same background color and pattern as the current Android permission screen. The surprise numbers (i.e. "n% of users were surprised") used in these mockups were obtained from our crowdsourcing study where possible. The surprise numbers for other resources (such as the camera flashlight and SD card) were reasonable estimates made by our team.

Evaluation
We used AMT to conduct a between-subjects user study to evaluate our new privacy summary interface. Participants were randomly assigned to one of two conditions in the same way as in our previous study. In the permission condition, participants were shown the permission screen that the current Android Market uses; in the other condition (referred to as the new interface condition), participants were shown our new interfaces. We used the data we collected in our previously described crowdsourcing study to mock up the privacy summary

* p<0.05  ** p<0.005

App Name              | # Mentioning Privacy  | Accuracy (max=1.0)     | Time spent (sec)
                      | Concerns (out of 20)  |                        |
                      | Perm.   New Int.      | Perm.  New Int.   p    | Perm.   New Int.   p
Brightest Flashlight  |   4        6          | 0.58   0.86       **   | 74.59   65.11
Dictionary            |   1        3          | 0.73   0.91       **   | 68.21   43.92      **
Horoscope             |   3        7          | 0.75   0.95       *    | 68.41   48.72      *
Pandora               |   3        3          | 0.68   0.94       **   | 76.86   76.82
Toss it               |   4       13          | 0.61   0.88       **   | 67.43   57.10

Table 4. Comparisons between the existing Android permission screen (permission condition) and our newly proposed privacy summary (new interface condition). Our new interface makes users more aware of the privacy implications and is easier to understand. Users in general spent less time on these newly proposed interfaces but got more fine-grained information.
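The accuracy column in Table 4 measures how correctly participants specified the resources an app uses. One plausible way to score such an answer, shown here as an illustrative sketch rather than the paper's exact scoring rule, treats each candidate resource as a yes/no judgment:

```python
def accuracy(selected, actual, options):
    """Fraction of candidate resources the participant judged correctly,
    counting both correct inclusions and correct exclusions."""
    correct = sum((opt in selected) == (opt in actual) for opt in options)
    return correct / len(options)

options = ["GPS location", "network location", "contact list", "device ID"]
actual = {"GPS location", "device ID"}           # what the app really uses
answer = {"GPS location", "network location"}    # one participant's answer

score = accuracy(answer, actual, options)
print(f"accuracy = {score:.2f}")
```

Under this rule a participant also earns credit for correctly leaving out resources the app does not use, so recognizing absence counts toward understanding as well.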

interfaces for five mobile apps, namely Brightest Flashlight, Dictionary, Horoscope, Pandora, and Toss it.

In both conditions, the app's name, screenshots, description, and the quality-control question were presented in the same way as in the previous study. The privacy summary was then shown (either the current permission screen or our newly proposed interface). Participants were asked whether they would recommend this app to a friend who might be interested in it, and why (or why not). We used JavaScript to keep track of the time participants spent reading the privacy summary before making their recommendation choices. After this question, the privacy summary screens were covered by grey rectangles. Participants could recheck the privacy summaries by moving their mice over the grey rectangles. In this way, we could accurately record the additional time participants spent viewing the privacy summary screens by monitoring mouse-hover events. We then added up all these time fragments to compute the total time participants spent reading the privacy summary. Participants were then tested on their understanding of the presented privacy summary screen by specifying the resource usages it suggested.

For each condition per app, 20 unique participants were recruited. Participants could evaluate multiple apps within the same condition. A total of 237 responses were submitted, 19 of which were discarded due to incompletion and 18 of which were discarded due to failing the quality-control question. Sixty-seven Android users participated in this study, with an average lifetime approval rate of 96.31% (SD=6.27%). Thirty-five participants were assigned to the permission condition, and thirty-two were assigned to the new interface condition. Participants on average spent 2 min 41.4 sec (SD=77.3 sec) completing each evaluation task, and were paid at the rate of $0.20/HIT.

We evaluated the new privacy summary interface from three perspectives to test its effectiveness and usability. The first is privacy awareness, i.e. whether users are more aware of the privacy implications, measured by counting the number of participants who mentioned privacy concerns when justifying their recommendation decisions. The second is comprehensibility, i.e. how well users understood the privacy summary, measured by the accuracy in answering questions about the app's behavior. The third is efficiency, i.e. how long it took participants to understand the privacy summary, measured by the number of seconds they spent reading the privacy summary screens.

The comparisons between the two conditions are summarized in Table 4. Generally speaking, participants in the new interface condition weighted their privacy more when they made decisions about whether the app was worth recommending. More people in this condition mentioned privacy-related concerns when justifying their choices. When we asked people in both conditions to specify the resources used by the target apps, people in the new interface condition also demonstrated significantly higher accuracy compared to their counterparts. Furthermore, except for the Pandora app, participants in the new interface condition on average spent less time reading the privacy summaries, though the time difference was not always statistically significant. This finding suggests that we can provide more useful information without requiring users to spend more time to understand it.

In our future work, we plan to conduct lab studies to evaluate our new privacy summary interface in depth. We will focus on the effectiveness of the new interface when users only look at it briefly (e.g. for 5-10 secs), since in reality general users are not likely to devote a lot of time to reading.

DISCUSSION
In this section, we discuss the potential implications of our work and how it fits into our vision of leveraging crowdsourcing for application analysis.

Implications for Privacy Analysis
A Potential Win-Win. A major finding of our work is that users feel more comfortable when they are informed of the reasons why their sensitive resources are needed. In some cases, this might again be tied to users' expectations. For example, the "trending, popular and nearby search" functionality provided by the Dictionary app uses location information to retrieve the words that people nearby are looking up. It is a relatively minor function of this app and may not be expected even for users who are familiar

with this app. Therefore, when we asked participants to state the reasons for accessing location information, most of them thought it was for targeted advertising purposes, and hence rated the comfort level much lower than when they were informed about the actual reason. We also observed several cases (e.g. the Weather Channel, GasBuddy, Compass) where participants had correct answers as to why the app was using one's location, but still felt less comfortable compared to the condition where participants were directly given the purpose. This suggests that when dealing with uncertainties, users tend to be more concerned, or even paranoid, about their privacy. Our results provide evidence that properly informing users of the purposes of resource usage can actually ease their worries. In other words, it would potentially benefit all parties, including app developers, market owners, and advertisers.

Currently, the default Android permission screen doesn't contain any explanations. One possible approach for getting this information is to scale up our crowdsourcing approach, but there is the potential for errors, as we saw in Table 2. Another approach is to require app developers to include a rationale, but this is an optimistic approach that assumes developers won't lie. This also suggests that better tools are still needed for analyzing apps' behaviors in a more scalable and automated manner, as envisioned by Amini et al. [2].

Privacy Concerns of Mobile Advertising. We observed that mobile advertising services were a consistent privacy concern for most participants. For all four types of resources, users felt the least comfortable when the resources were used for advertising or market analysis. We understand that many developers rely on ads for income. However, there is still room for app developers and ad networks to improve the user experience, such as by providing users with more informed consent and more explanation of how and why their personal information is used. Other potential approaches include tweaking the sensitive resource usage to a coarser level, or using hashing or other methods to conceal users' identities. These technical methods can address users' privacy concerns without sacrificing too much of the ads' quality.

Leveraging the Crowd for Application Analysis
The long-term vision of our work is to design a scalable privacy evaluation system for mobile apps by combining automated application analysis with crowdsourcing techniques. The automated techniques are meant to capture an app's behaviors involving sensitive resources, whereas the crowdsourcing techniques capture people's perceptions and expectations of an app's behaviors.

One important contribution of this paper is to demonstrate the feasibility of using crowdsourcing to capture users' perceptions, and to identify the strengths and weaknesses of the crowd in evaluating privacy. Based on our data, users were not very good at speculating on the purpose of resource access, which is not surprising and might be compensated for by leveraging existing mobile app analysis techniques. However, specifying their expectations is a relatively easy job for most people, and one that cannot be addressed by existing app analysis tools.

As the first work of this kind, we simplified the problem by focusing only on privacy, although we realize that users may weigh utility over privacy when making decisions about installing an app. Future research will need to take utility into account in understanding how people make trust decisions.

We also only captured people's perceptions at a coarse granularity and with limited types of sensitive resources. We will extend our work to finer-grained interactions, e.g. whether users expect the Yelp app to send their location to yelp.com when they press the 'Search nearby restaurants' button. We envision that this level of analysis could provide more detailed information for evaluating mobile apps, and could possibly lead to better results when asking the crowd why an app accesses a given resource.

In our crowdsourcing study, it cost us $2.40 and about 20-25 minutes (deduced from the effective hourly rate reported by AMT) to examine one app and resource pair with input from 20 participants. There is ample room to improve the crowdsourcing efficiency, for example by extending the participant pool to all smartphone users, minimizing the number of questions, and so on. There are also several techniques suggested by previous crowdsourcing work [7, 27] that we can leverage to improve the overall efficiency, e.g. dynamically publishing HITs and adaptively adjusting the compensation rate and the number of required responses. Given that it only took about one minute for our participants to complete a crowdsourcing task, we believe this method would scale well, though formal scalability analysis is still an open issue and will be included in our future work.

Alternatively, crowdsourcing users' perceptions could be done in conjunction with the existing app rating mechanism. When users rate a mobile app, they could also optionally specify their expectations about one aspect of the target app. As the number of ratings grows, the aggregated perceptions will become more representative.

CONCLUSION & FUTURE WORK
A great deal of past work in mobile security and privacy research has focused on providing tools for automated analysis. However, there is still no easy way to distinguish whether accessing a certain sensitive resource is necessary, or how that action makes users feel with respect to their privacy. Our work demonstrates a new way of evaluating mobile apps' privacy. We explore users' mental models of mobile privacy by crowdsourcing

users' expectations of mobile apps' sensitive resource usage. Our results suggest that both users' expectations and the purposes for which sensitive resources are used have a major impact on users' subjective feelings and their trust decisions. Another major finding is that properly informing users of the purpose of resource access can ease users' privacy concerns to some extent. Based on our findings, we proposed a new privacy summary interface that highlights the common misconceptions other users have and the purpose of a resource access. Compared to the existing Android permission screen, our interface is much easier to understand and provides more pertinent information for users to make better trust decisions.

ACKNOWLEDGEMENT
This research was supported by CyLab at Carnegie Mellon under grants DAAD19-02-1-0389 and W911NF-09-1-0273 from the Army Research Office and by Google. Support was also provided by the National Science Foundation under Grants CNS-1012763 and CNS-0905562.

REFERENCES
[1] "Katz v. United States, 389 U.S. 347." Available: http://en.wikipedia.org/wiki/Katz_v._United_States
[2] S. Amini, et al., "Towards Scalable Evaluation of Mobile Applications through Crowdsourcing and Automation," CMU-CyLab-12-006, Carnegie Mellon University, 2012.
[3] D. Barrera, et al., "A methodology for empirical analysis of permission-based security models and its application to android," In Proc. CCS, 2010.
[4] A. Barth, et al., "Privacy and Contextual Integrity: Framework and Applications," In Proc. IEEE Symposium on Security and Privacy, 2006.
[5] M. Benisch, et al., "Capturing location-privacy preferences: quantifying accuracy and user-burden tradeoffs," Personal and Ubiquitous Computing, 2010.
[6] A. Beresford, et al., "MockDroid: trading privacy for application functionality on smartphones," In Proc. HotMobile, 2011.
[7] M. S. Bernstein, et al., "Soylent: a word processor with a crowd inside," In Proc. UIST, 2010.
[8] C. Bravo-Lillo, et al., "Bridging the gap in computer security warnings: a mental model approach," IEEE Security & Privacy Magazine, 2010.
[9] L. J. Camp, "Mental models of privacy and security," IEEE Technology and Society Magazine, vol. 28, 2009.
[10] E. Chin, et al., "Analyzing inter-application communication in Android," In Proc. MobiSys, 2011.
[11] K. Craik, The Nature of Explanation, Cambridge University Press, 1943.
[12] M. Egele, et al., "PiOS: Detecting Privacy Leaks in iOS Applications," In Proc. NDSS, 2011.
[13] W. Enck, "Defending Users against Smartphone Apps: Techniques and Future Directions," in LNCS, vol. 7093, 2011.
[14] W. Enck, et al., "TaintDroid: An Information-Flow Tracking System for Realtime Privacy Monitoring on Smartphones," In Proc. OSDI, 2010.
[15] W. Enck, et al., "A Study of Android Application Security," In Proc. USENIX Security Symposium, 2011.
[16] A. P. Felt, et al., "Android permissions demystified," In Proc. CCS, 2011.
[17] A. P. Felt, et al., "A survey of mobile malware in the wild," In Proc. SPSM, 2011.
[18] A. P. Felt, et al., "Android Permissions: User Attention, Comprehension, and Behavior," UCB/EECS-2012-26, University of California, Berkeley, 2012.
[19] A. P. Felt, et al., "Permission re-delegation: attacks and defenses," In Proc. USENIX Security Symposium, 2011.
[20] N. Good, et al., "Stopping spyware at the gate: a user study of privacy, notice and spyware," In Proc. SOUPS, 2005.
[21] S. Grobart, "The Facebook Scare That Wasn't." Available: http://gadgetwise.blogs.nytimes.com/2011/08/10/the-facebook-scare-that-wasnt/
[22] P. Hornyack, et al., "These aren't the droids you're looking for: retrofitting android to protect data from imperious applications," In Proc. CCS, 2011.
[23] C. Jensen and C. Potts, "Privacy policies as decision-making tools: an evaluation of online privacy notices," In Proc. CHI, 2004.
[24] J. Jeon, et al., "Dr. Android and Mr. Hide: Fine-grained security policies on unmodified Android," 2012.
[25] P. G. Kelley, et al., "A "nutrition label" for privacy," In Proc. SOUPS, 2009.
[26] P. G. Kelley, et al., "A Conundrum of Permissions: Installing Applications on an Android Smartphone," In Proc. USEC, 2012.
[27] G. Liu, et al., "Smartening the crowds: computational techniques for improving human verification to fight phishing scams," In Proc. SOUPS, 2011.
[28] M. Nauman, et al., "Apex: extending Android permission model and enforcement with user-defined runtime constraints," In Proc. ASIACCS, 2010.
[29] D. Norman, The Design of Everyday Things, Basic Books, 2002.
[30] L. Palen and P. Dourish, "Unpacking "privacy" for a networked world," In Proc. CHI, 2003.
[31] S. Patil, et al., "With a little help from my friends: can social navigation inform interpersonal privacy preferences?," In Proc. CSCW, 2011.
[32] N. Sadeh, et al., "Understanding and Capturing People's Privacy Policies in a Mobile Social Networking Application," Personal and Ubiquitous Computing, 2009.
[33] D. J. Solove, "A Taxonomy of Privacy," University of Pennsylvania Law Review, vol. 154, no. 3, January 2006.
[34] A. Thampi, "Path uploads your entire iPhone address book to its servers." Available: http://mclov.in/2012/02/08/path-uploads-your-entire-address-book-to-their-servers.html
[35] S. Thurm and Y. I. Kane, "Your Apps are Watching You," WSJ, 2011.
[36] T. Vidas, et al., "Curbing android permission creep," Proceedings of the Web, vol. 2, 2011.
[37] A. Wagner, "Google Posts Refreshed Android Distribution Numbers." Available: http://www.twylah.com/surfingislander/tweets/177040176181288960
[38] R. Wash, "Folk models of home computer security," In Proc. SOUPS, 2010.
[39] Y. Zhou, et al., "Taming Information-Stealing Smartphone Applications (on Android)," In Proc. TRUST, 2011.

