CampusLife

Explore mental-emotional-social context with smartphones

Mobile Health Ubiquitous

Introduction

Together with Dartmouth College, Carnegie Mellon University and Cornell University, we at Georgia Tech are interested in extending the seminal work of StudentLife, started by Dartmouth's Andrew Campbell a few years ago. Campbell sought to determine whether mental health and academic performance could be correlated with, or even predicted from, a student's digital footprint. The CampusLife project is a logical extension that aims to collect data from relevant subsets of a campus community through their interactions with mobile and wearable technology, social media and the environment itself.
In the Spring of 2016, the CampusLife team at Georgia Tech performed a campus-wide study similar to Dartmouth's. The data collected from the study, along with follow-up studies with participants, brought to light many pitfalls of the deployment. It was therefore necessary to understand why these problems occurred and how we could mitigate them for a future deployment in 2017.

My Role

This is a large project involving multiple universities. The team at Georgia Tech consists of 5 PIs and 8 student researchers - each student working under a PI and focusing on a specific area of the project. The areas can be broadly divided into privacy, social, health, informatics and technology. My work was primarily guided by Gregory Abowd on the technology side of HCI, but it was influenced by the other areas too.

Development Engineer
I joined the team because of my background in Android development and experience in collecting sensor data passively from smartphones. I was the "go-to guy" when it came to how the technology for the study was expected to work and what its limits were. This role involved testing, unpacking and modifying the existing tools - mainly the AWARE framework used to collect context data. As a result, I also stepped in as a liaison between the CampusLife team and the AWARE team.
Data Capture Research
In this position, I needed to investigate the data that was collected in the previous study for accuracy, consistency and usefulness - and subsequently the tools used for pooling this data needed to be tweaked to get better results. As this data involves quantifying humans, it was important to understand how individuals behave in order to understand what the data means and how we retrieve it remotely. I was also responsible for exploring other kinds of data that could be collected and new methods to acquire them. This position also enabled me to conduct smaller studies to test improvements and changes, as well as lead the ongoing pilot study for the next deployment.
Note: These roles are not necessarily exclusive to me, nor are they exhaustive in describing everything I did on the project.

Past Work

Original Dartmouth Study

StudentLife used passive and automatic sensing data from the phones of a class of 48 Dartmouth students over a 10 week term to assess their mental health (e.g., depression, loneliness, stress), academic performance (grades across all their classes, term GPA and cumulative GPA) and behavioral trends (e.g., how stress, sleep, visits to the gym, etc. change in response to college workload -- i.e., assignments, midterms, finals -- as the term progresses).

They used computational methods and machine learning algorithms on the phone to assess sensor data and make higher-level inferences (e.g., sleep, sociability, activity). The StudentLife app that ran on students' phones automatically measured the following human behaviors 24/7:

Sleep Timings
Number of Conversations
Physical Activity
Location
Colocated Students
Indoor/Outdoor Mobility
Stress Level
Positive Affect
Eating Habits
App Usage

Some of the results from the study have been published on their website along with the dataset that they used.

Some trends from the original study by Dartmouth

CampusLife Spring 2016

CampusLife is Georgia Tech's effort to emulate and improve upon the work done at Dartmouth, in the context of its own campus. The researchers at GT used two different tools in this study for data acquisition:

AWARE is an open-source Android and iOS instrumentation framework for logging, sharing and reusing mobile context. The mobile application lets you enable or disable sensors and plugins on smartphones that will unobtrusively collect data in the background. The data is saved locally on the phone. The AWARE dashboard lets researchers set up studies in the cloud and request participant data through periodic uploads.
Samples of different data types AWARE is capable of collecting from phones
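To give a concrete sense of what the framework abstracts away, below is a minimal sketch of the kind of background accelerometer logging AWARE handles on Android, written against the standard SensorManager API. It is illustrative only - the class name and the logging call are mine, not AWARE's own code, which additionally buffers readings to a local database and syncs them on a schedule.

```kotlin
import android.app.Service
import android.content.Context
import android.content.Intent
import android.hardware.Sensor
import android.hardware.SensorEvent
import android.hardware.SensorEventListener
import android.hardware.SensorManager
import android.os.IBinder
import android.util.Log

// Illustrative background service: listens to the accelerometer for as long as it runs.
class AccelerometerLoggerService : Service(), SensorEventListener {

    private lateinit var sensorManager: SensorManager

    override fun onCreate() {
        super.onCreate()
        sensorManager = getSystemService(Context.SENSOR_SERVICE) as SensorManager
        sensorManager.getDefaultSensor(Sensor.TYPE_ACCELEROMETER)?.let {
            // Coarse sampling rate here; a study configuration would tune this per sensor.
            sensorManager.registerListener(this, it, SensorManager.SENSOR_DELAY_UI)
        }
    }

    override fun onSensorChanged(event: SensorEvent) {
        // A real deployment would write these values to local storage and upload them
        // later, rather than logging them.
        val (x, y, z) = event.values
        Log.d("AccelLogger", "t=${event.timestamp} x=$x y=$y z=$z")
    }

    override fun onAccuracyChanged(sensor: Sensor?, accuracy: Int) = Unit

    override fun onDestroy() {
        sensorManager.unregisterListener(this)
        super.onDestroy()
    }

    override fun onBind(intent: Intent?): IBinder? = null
}
```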
Quedget is a proprietary utility app for marketing teams and research groups to collect self-reported data. Quedget replaces the standard lock screen of a smartphone with a question widget that must be either answered or skipped to unlock the phone. This is especially useful for disseminating ecological momentary assessment (EMA) questions to participants of a study remotely and regularly.
Rough sketch of the Quedget lock-screen question widget

In order to meet the needs of CampusLife, both these tools were modified accordingly. Since AWARE is open source, researchers from the team were directly involved in making changes to its core mechanics. On the other hand, since Quedget is a commercial product, any requirements regarding it were outsourced.
Please note that CampusLife collects some information differently than the StudentLife study did (e.g. the self-reported information is far broader), as well as some new kinds of information (e.g. passive activity recognition, mobile communication, etc.).

38 days
From April 6, 2016 - May 6, 2016
51 participants
Of the 63 that were enrolled, 12 were removed

The data that was collected from these studies brought forth fresh requirements for the various stakeholders in this project. Why these were necessary and how we addressed these pain points will be discussed in the following sections.

New Requirements

Stakeholders

The major parties relevant to the overall project are:

Participants
This study has a long duration and regularly asks participants for a large volume and variety of information. Understandably, those who take part want to ensure the tools installed on their phones don't disrupt or interfere with their routine use beyond what they signed up for. While they are concerned about their privacy, they also want to stay motivated to remain in the study and provide good data.
Researchers
The backbone of this study is collecting contextual data remotely over an extended period of time. In light of this, it is of utmost importance to the researchers that this data is reliable, valid and drawn from a sample large enough to run meaningful analytics on. There is also a need to understand what kind of data is important and how feasible it is to obtain given other constraints.
Makers
The way to get the highest-quality data over a long study period is to ensure that the technology used to retrieve it is usable and blends seamlessly into participants' daily lives. Thus, there is a need to mould the tools to be nuanced enough to collect the data the researchers want without frustrating users or violating their consent.
Privacy Experts
As is the norm, studies involving human subjects have certain procedural and ethical constraints. Apart from ensuring the privacy of the data collected from participants, it is also important to ensure that useful data can be collected and meaningful analytics performed without breaching the protocol established with the IRB.

Passive Data

A huge amount of the data being collected comes from the background; it is almost unnoticeable to the participant. However, this leads to a few different problems. The first is that while the process of data collection is invisible, it can have repercussions - side effects, if I may - that users experience. The next is deciding which kinds of data we as researchers need to collect to reach the goal we're aiming at. And lastly, due to the sensitivity of the data, a balance needed to be struck between the privacy and security of the information and its efficiency and quality.
To be specific, these are some of the major requirements that came out of the Spring study:

Energy Considerations
Although on paper it seems that data can be retrieved from the user and uploaded to the server ubiquitously, in practice it doesn't blend into a participant's everyday life as well as you'd like to believe. The battery on participants' phones took a major hit during the study. Having all the sensors turned on while uploading large amounts of data from the phone to the cloud was overburdening the devices' batteries. This led to major frustration for the participants.
Application Attention
In order to understand participant activity better, app usage on the phone was being monitored. This included installations, notifications and which application was in the foreground. However, there was a need to understand which applications the user actively interacts with. Moreover, there was a desire to understand what we can learn about the user from how they communicate with an application - beyond the fact that they are using it.
Improved Context
Currently, the data we use to explore a user's context beyond their use of the phone itself consists of location, WiFi and accelerometry data. Yet this information alone might not be enough to understand a user's context at the scope we need.
Emotional State
Strictly speaking, none of the passive data intrinsically tells us about a user's emotional state, their stress or their feelings. As researchers, we hope to examine the correlation between the self-reported data and the passive data to ascertain whether certain emotional states can be inferred from the passive data we are collecting, such as accelerometry. Having said that, there might be better forms of data we should be looking for that can support this assumption.

Self-Report Data

The user research analysts conducted focus groups with the participants of the Spring '16 study to break down and decipher the problems they faced in responding to the EMA questions asked throughout the day. While there were communication-design issues in the question options, the strongest issue highlighted was that users felt they were asked certain questions at times when they wanted to either answer a different question or not answer any at all.

Changes and Suggestions

Improvements

Battery Relief
To relieve the burden on the user's battery, we first investigated which of the various background services running on top of the AWARE framework drained the battery most. After performing multiple smaller tests with an improved AWARE client (which already had new optimisations), we learnt that the major culprit was the data service that periodically pushes data to the cloud. This was mitigated by changing the settings to only upload data when the user's phone is connected to WiFi and charging. When you come to think of it, there is no urgency in retrieving the data in real time, so it is sensible to pull these large chunks of data when the user's phone is either at 100% or at an equilibrium state. Another factor to consider is that many users don't use the phone when it satisfies the two conditions under which we plan to pull the data - thus waiting for that phase of the phone's lifecycle is the most unobtrusive way to collect it.
The new configurations are specifically designed to improve battery life
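As a rough illustration of the new policy, the sketch below gates uploads on the two conditions using standard Android APIs (BatteryManager.isCharging needs API 23+, and the ConnectivityManager call shown is the older, now-deprecated route). It is a simplified stand-in for AWARE's actual WiFi-only and charge-only settings, not the study client's code.

```kotlin
import android.content.Context
import android.net.ConnectivityManager
import android.os.BatteryManager

// Returns true only in the unobtrusive window: unmetered WiFi connection and charging.
fun shouldUploadNow(context: Context): Boolean {
    val connectivity = context.getSystemService(Context.CONNECTIVITY_SERVICE) as ConnectivityManager
    @Suppress("DEPRECATION")
    val onWifi = connectivity.activeNetworkInfo?.type == ConnectivityManager.TYPE_WIFI

    val battery = context.getSystemService(Context.BATTERY_SERVICE) as BatteryManager
    val charging = battery.isCharging

    // Only sync the locally buffered sensor data when both conditions hold.
    return onWifi && charging
}
```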
Short-Range and Localized Context
Apart from all the data that we are collecting for context, we realized that the information from the Bluetooth adapter on phones can be extremely meaningful and relevant given what we're looking for. The data we get from the WiFi adapter does improve our ability to understand where a student is - this can be learnt from the access points they were around. But the Bluetooth adapter can tell us who or what the user is around: it sheds interesting light on the people a user is surrounded by, by virtue of the devices they own that are detected over Bluetooth.
The challenge of utilising Bluetooth is the privacy issue of storing device names of other individuals and third parties who have not consented to it. The solution was to anonymize any personally identifiable information locally. In spite of this, there are still some technical glitches around using this sensor, particularly with the wireless audio devices users connect to. These need to be worked out before the new deployment.
Replacing the device names with aliases for the purpose of the study would be the sensible thing to do
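A minimal sketch of what such aliasing could look like, assuming a per-study secret salt and a one-way hash applied before anything is written to local storage; the constant and function names are hypothetical, not the code shipped in the study client.

```kotlin
import java.security.MessageDigest

// Hypothetical per-study secret; in a real deployment this would never be hard-coded.
private const val STUDY_SALT = "replace-with-per-study-secret"

// Maps a detected device's address (or name) to a stable, non-reversible alias.
fun aliasFor(bluetoothAddress: String): String {
    val digest = MessageDigest.getInstance("SHA-256")
    val bytes = digest.digest((STUDY_SALT + bluetoothAddress).toByteArray(Charsets.UTF_8))
    // The same device always maps to the same alias, so co-location patterns
    // survive while the underlying identity does not.
    return bytes.joinToString("") { "%02x".format(it) }.take(16)
}
```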
Measuring User Engagement
There is a difference between knowing that a user has an app running on their phone and learning how the user utilises that application. To explain with an anecdote, a user might be scrolling through their Facebook feed reflexively because they're bored, versus commenting on an old photo. The way our data collection tools worked earlier, there was no way of distinguishing the two - each of which means something different when analysing the state of the user.
To separate the two situations, we began logging the user's keystrokes. Knowing whether the user is typing, how fast they're typing, how much they're writing and in which app they're typing - these are key features for understanding how a user is interacting. Naturally, this raises major privacy questions, many of which would be hard to convince the IRB of and, even worse, to get new participants to agree to. Thus, we decided to obscure the characters we record and to never store any password information. This level of abstraction is maintained locally, even before the data is pushed to our study servers.
For the pilot study, every character logged is replaced and stored as '*' to protect privacy
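The sketch below illustrates the masking idea under the assumption that only metadata is kept - the app package, the timing and the length of the text - with every character replaced by '*'. The record structure and field names are illustrative, not the study's actual schema.

```kotlin
// Illustrative record of one masked keystroke event.
data class KeystrokeRecord(
    val packageName: String, // which app the user was typing in
    val timestamp: Long,     // when the text changed
    val maskedText: String   // same length as the input, every character replaced
)

fun maskKeystroke(
    packageName: String,
    timestamp: Long,
    typedText: String,
    isPasswordField: Boolean
): KeystrokeRecord? {
    // Never record anything from password fields, not even the length.
    if (isPasswordField) return null
    return KeystrokeRecord(
        packageName = packageName,
        timestamp = timestamp,
        maskedText = "*".repeat(typedText.length)
    )
}
```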

Explorations

Emotions on Wearables
With the development of wearable devices, there are now ways to measure or detect a user's emotional state. The market for these devices is hot, and many commercial-grade products can show users stats about their physiological state. The primary signals that let us tap into this kind of information are heart-rate variability (HRV) and galvanic skin response (GSR). Even the accelerometry data from a wearable is far more accurate than that from a phone, and because of where it sits on the body it can tell you more about the user.
The first device that we sought to work with was the Microsoft Band. Not only did it have all the sensors we required, it also gave developer access via SDKs available for both Android and iOS. Unfortunately, as we toiled to build a plugin for AWARE that could harness the power of the Band, Microsoft announced that it would be ending production of the device, and also took down the developer SDKs. Since our requirement was to use an off-the-shelf device, the Band, despite its sophistication, was rendered virtually obsolete for our purposes.
This led us to investigate the other alternatives on the market that not only have the sensors we need but also provide the endpoints needed for third parties to retrieve that data. A survey of these devices led to a competitive analysis of the products we are considering for the future. A summary of the analysis can be seen in the table on the right. As of now, the two front-runners are the Jawbone UP and the Empatica E4. The only drawback is the limited developer access; this problem should solve itself given enough time.
Competitive analysis of the sensors on wearable trackers
Revamping Self-Logging
Currently, the development of the tools used for collecting self-reported data is outsourced to the Quedget team as per contract (it is a commercial application, after all). This leads to delays in development and a lack of control when it comes to making quick changes and failing fast over multiple iterations.
One of the alternatives we have considered is an unlock journaling utility developed by James Fogarty et al. at the University of Washington. Prof. Fogarty graciously gave us open access to the tool and we've been tinkering with it since. This journaling app relies on a single fluid motion (without requiring the user to lift their finger from the touchscreen) to take the user through multiple states. We hypothesise that this interaction is far more usable. Moreover, it lets the user actually select which kind of question they want to answer at a given time.
A single instance lets the user pick from two different types of questions to answer. The utility was modified to test the usability of EMAs asking parents about their confidence in parenting.

Pilot Study

On December 13, 2016, we began a pilot test of the new AWARE client with the changes we've made. This pilot is almost identical to the Spring study in terms of sensors, but listens for some additional information too. In some sense it is a reliability measure, a stress test and a benchmarking exercise.
All participation in this study is internal, i.e. the researchers are volunteering their own phones. The devices cover both dominant smartphone operating systems, iOS and Android. For Android specifically, we've ensured that the devices run different OS versions so that we get a better breadth of data.

78,876,059 records
8.432 GB
As of January 4, 2017, this is the amount of data the pilot study has aggregated from 5 Android and 2 iOS smartphones

During the study, the participants were briefed to follow these instructions:
  1. Keep an eye on your device's battery - note roughly how many hours pass before you need to charge the device
  2. Make note of any odd behaviours on your device (e.g. sensors turning ON/OFF automatically, disturbances in communication, etc.)
  3. Have a general idea/awareness of the activity on your phone (e.g. you didn't receive/send any calls between X date and Y date)
The hypotheses of the pilot study that will be verified from the analytics are:
Data should be reliable
Every participant device should have regular values for every type of passive data being measured (a sketch of this kind of coverage check follows the list)
Every participant should have some values for event-driven data within reasonable assumptions (e.g. a participant may not have installed any new app over the study period, but they must have turned on the screen of the device at least once)
Data should be valid
Compare inferences from the data with retrospective accounts from participants (e.g. validate location data with the participant via a one-on-one interview)
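As referenced above, here is a minimal sketch of the coverage side of the reliability check, assuming records have been exported as (device, sensor, timestamp) triples; the types and function name are hypothetical, not part of our analysis pipeline.

```kotlin
import java.time.Instant
import java.time.LocalDate
import java.time.ZoneId

data class Record(val deviceId: String, val sensor: String, val timestampMs: Long)

// Returns, for each (device, sensor) pair seen in the export, the study days with no
// samples at all. Pairs that never logged anything would need a separate presence check.
fun missingDays(
    records: List<Record>,
    start: LocalDate,
    end: LocalDate,
    zone: ZoneId = ZoneId.systemDefault()
): Map<Pair<String, String>, List<LocalDate>> {
    val studyDays = generateSequence(start) { it.plusDays(1) }
        .takeWhile { !it.isAfter(end) }
        .toList()

    // Calendar days on which each (device, sensor) pair actually produced data.
    val observed = records.groupBy(
        { it.deviceId to it.sensor },
        { Instant.ofEpochMilli(it.timestampMs).atZone(zone).toLocalDate() }
    )

    return observed
        .mapValues { (_, days) -> studyDays - days.toSet() }
        .filterValues { it.isNotEmpty() }
}
```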

This pilot study ceased on January 08, 2017. Based on the analysis of this data, we have the following details to share.

Data Reliability
Based on data collected over three weeks with multiple sensors activated, we learnt that certain sensors are not available across all devices. At the same time, certain sensors sometimes do not collect data reliably or continuously.
Sensor data coverage for an Android user and an iPhone user
Inconsistency Across Devices
Although all the devices are set to sample data at the same rate for all the sensors, we see a difference in the sensor data collected. This could be due to data transmission losses, storage glitches or even inherent firmware constraints.
Sampling inconsistency for an iPhone user
Battery Burden
Depending on the number of sensors activated for data collection, we analysed the trend of battery use across different charging cycles. Certain device models can become extremely overburdened depending on which sensors are activated.
Battery trend for an iPhone user

These observations can help improve the configurations of future sensor studies that are performed over a longer period of time.

Further analysis of the self-reporting in CampusLife through EMAs can be found at CampusLife II