Supported in Android & iOS.
This data source measures the precise location of the device using GPS sources.
Each GPS record includes the following:
The time at which this GPS record was received. Internally stored as
Provider: Determines how this GPS record was acquired. It can contain one of the following options:
- GPS: means the Avicenna app explicitly requested the user's location and received it from GPS satellites.
- Network: means the Avicenna app explicitly requested the user's location and received it from nearby cell towers and Wi-Fi access points (only in Android) (technical details).
- Fused: means the Avicenna app is using a location API of Google Play service named fused location provider API. This API intelligently combines different signals to provide location information.
- GPS-Passive: means other apps running on the participant's device requested location from GPS satellites, and Avicenna received a copy as well (only in Android).
- Network-Passive: means other apps running on the participant's device requested location from nearby cell towers and Wi-Fi access points, and Avicenna received a copy as well (only in Android).
- Fused-Passive: means other apps running on the participant's device requested location from fused location provider API, and Avicenna received a copy as well.
- GPS-Reuse: means the app detected the participant did not move since the last GPS reading, and therefore the GPS records for the previous cycle are reused.
- Network-Reuse: means the app detected the participant did not move since the last reading, and therefore the records of the previous cycle provided by the network provider are reused.
- Fused-Reuse: means the app detected the participant did not move since the
last reading, and therefore the records of the previous cycle provided by the
fused provider are reused. This is internally stored as a
The latitude of this record, in degrees. Internally stored as
The longitude of this record, in degrees. Internally stored as
The altitude of this record, in meters above the WGS 84 reference ellipsoid. Internally
The speed of this record, in meters/second over the ground. Internally stored as
The bearing of this record, in degrees. Internally stored as
The accuracy of this reading, in meters. Internally stored as
When participants join a study that requires GPS, the Avicenna app will request permission to access their location data. Participants will have three options:
- Decline the permission request.
- Grant the permission, but only while the app is open and in use.
- Grant permission once.
If the participant declines this permission, or grants it only while the app is in use, Avicenna will not collect any GPS data and will show the participant a notification that the missing GPS permission is interrupting their study participation.
If the participant chooses "grant permission once," Avicenna is allowed to collect GPS data for a limited time, likely one or two days. This period is determined by the Android or iOS operating system. After this period, the operating system automatically revokes the permission, and Avicenna asks the participant for permission again. The second time Avicenna asks for the GPS permission, the participant has a fourth option in addition to the three listed above: "Allow Always". Choosing this option grants Avicenna permanent permission to access GPS data until the participant explicitly revokes it.
Data collection continues as long as the participant is actively participating in the study. Participants can revoke the GPS permission at any time, turn off their device’s GPS (accessible from the home screen of most smartphones), or simply terminate the Avicenna app. In all of these cases, Avicenna will not collect any GPS data and will notify the participant that their study participation has been interrupted. The missing data will be visible to researchers on the Participation Page of the Researcher Dashboard. At any time, participants can turn on their GPS, grant the missing permissions, or restart the app. Each of these actions immediately resumes data collection in Avicenna and removes the related notifications from the app.
Data Collection Behavior
Compared to other data sources, monitoring GPS requires considerable power and can drain a participant's battery rapidly. To reduce this impact, the GPS data source in Avicenna uses several methods to collect as much data as possible while keeping resource consumption low. Knowing these methods helps you interpret and analyze the collected GPS data.
Collecting GPS Records
Avicenna starts collecting fresh GPS records every 5 minutes. In each cycle, it keeps collecting until it has read 3 accurate data points or 60 seconds have elapsed, whichever comes first.
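The stop condition of a single cycle can be sketched as follows. This is a minimal illustration, not Avicenna's implementation: the accuracy cutoff `ACCURACY_THRESHOLD_M`, the `Fix` tuple, and the choice to keep inaccurate fixes in the result are all assumptions, since the documentation does not define what counts as "accurate".

```python
from typing import Iterable, List, Tuple

MAX_COLLECTION_S = 60.0        # from the text: collect for at most 60 seconds
REQUIRED_ACCURATE_FIXES = 3    # from the text: stop after 3 accurate data points
ACCURACY_THRESHOLD_M = 20.0    # ASSUMPTION: the cutoff for "accurate" is not documented

# (seconds since the cycle started, reported accuracy in meters)
Fix = Tuple[float, float]


def collect_cycle(fixes: Iterable[Fix]) -> List[Fix]:
    """Keep fixes until 3 accurate ones arrive or 60 seconds have elapsed."""
    collected: List[Fix] = []
    accurate = 0
    for elapsed_s, accuracy_m in fixes:
        if elapsed_s > MAX_COLLECTION_S:
            break  # the time budget for this cycle is used up
        collected.append((elapsed_s, accuracy_m))
        if accuracy_m <= ACCURACY_THRESHOLD_M:
            accurate += 1
            if accurate == REQUIRED_ACCURATE_FIXES:
                break  # enough accurate readings; stop early to save battery
    return collected


# Simulated fixes: the third accurate reading arrives 5 seconds into the cycle.
simulated = [(1.0, 50.0), (2.0, 15.0), (3.0, 10.0), (4.0, 30.0), (5.0, 8.0), (6.0, 5.0)]
cycle = collect_cycle(simulated)
print(len(cycle), cycle[-1])
```

Under these assumptions, a cycle that never sees three accurate fixes simply ends at the 60-second mark with whatever it has collected.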
Reusing GPS Records When Detecting Stationary State
Collecting fresh GPS data is resource-intensive. That's why, at the beginning of each cycle, before Avicenna starts the GPS data collection, it checks whether it can confidently conclude that the participant has been stationary since the last GPS reading. If so, the app simply reuses the GPS records from the last cycle. To detect whether the participant is stationary, Avicenna uses motion-based activity recognition (MBAR) and Wi-Fi data.
In both Android and iOS, Avicenna considers the participant as stationary if all of the following conditions are met since the last cycle:
- The app has been able to collect MBAR data with a high confidence value (CMMotionActivityConfidence.high in iOS and 100% confidence in Android).
- All MBAR data collected indicate the phone has been stationary (CMMotionActivity.stationary in iOS and STILL or TILTING in Android).
If both of the above are true, the app assumes the device has been stationary during the last cycle.
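A minimal sketch of this MBAR check, under one reading of the two conditions: at least one sample has high confidence, and every sample reports a stationary activity. The `MbarSample` type, the normalized activity labels, and the 0.0 to 1.0 confidence scale are hypothetical stand-ins for the platform values named above.

```python
from dataclasses import dataclass
from typing import Iterable

# Hypothetical normalized labels for CMMotionActivity.stationary (iOS)
# and STILL / TILTING (Android).
STATIONARY_ACTIVITIES = {"stationary", "still", "tilting"}
HIGH_CONFIDENCE = 1.0  # ASSUMPTION: 1.0 stands for "high" on iOS and 100% on Android


@dataclass(frozen=True)
class MbarSample:
    activity: str      # normalized activity label (hypothetical)
    confidence: float  # 0.0 to 1.0 (hypothetical scale)


def mbar_says_stationary(samples: Iterable[MbarSample]) -> bool:
    """True if MBAR data since the last cycle indicates a stationary device."""
    samples = list(samples)
    has_high_confidence = any(s.confidence >= HIGH_CONFIDENCE for s in samples)
    all_stationary = all(s.activity in STATIONARY_ACTIVITIES for s in samples)
    # With no samples at all, has_high_confidence is False, so the result is False.
    return has_high_confidence and all_stationary


resting = [MbarSample("still", 1.0), MbarSample("tilting", 1.0)]
moving = [MbarSample("still", 1.0), MbarSample("walking", 1.0)]
print(mbar_says_stationary(resting), mbar_says_stationary(moving))
```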
In addition to MBAR data, Avicenna on Android uses Wi-Fi data as well. Based on Wi-Fi data, Avicenna considers a device stationary if all the following conditions are true:
- At least 3 Wi-Fi networks were detected in proximity (based on BSSID) in the previous cycle.
- At least 3 Wi-Fi networks are detected in proximity (based on BSSID) in the current cycle.
- The Wi-Fi network sets for the current cycle and the previous cycle have at least 30% similarity.
If all of the above 3 conditions are met, Avicenna for Android concludes that Wi-Fi data suggest the participant has been stationary.
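The three Wi-Fi conditions can be sketched as a set comparison. The documentation does not say how "similarity" is measured, so this example assumes the Jaccard index (shared BSSIDs divided by all distinct BSSIDs seen in either cycle); the function name and parameters are illustrative only.

```python
from typing import Iterable


def wifi_says_stationary(
    previous_bssids: Iterable[str],
    current_bssids: Iterable[str],
    min_networks: int = 3,         # from the text: at least 3 networks per cycle
    min_similarity: float = 0.30,  # from the text: at least 30% similarity
) -> bool:
    """Compare the BSSID sets of two consecutive cycles."""
    prev, curr = set(previous_bssids), set(current_bssids)
    if len(prev) < min_networks or len(curr) < min_networks:
        return False  # too few networks to draw a conclusion
    # ASSUMPTION: similarity is the Jaccard index of the two BSSID sets.
    similarity = len(prev & curr) / len(prev | curr)
    return similarity >= min_similarity


previous = ["aa:01", "aa:02", "aa:03", "aa:04"]
current = ["aa:03", "aa:04", "aa:05"]
print(wifi_says_stationary(previous, current))
```

In this example two of the five distinct networks appear in both cycles, which gives a similarity of 40% under the assumed metric.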
Now, in each cycle, the Avicenna app skips collecting new GPS data and reuses data from the previous cycle if enough GPS data was collected in the previous cycle and at least one of the following conditions is met:
- MBAR data exists and shows the participant has been stationary; or
- (Android only) Wi-Fi data exists and shows the participant has been stationary.
In this case, Avicenna finds the best GPS reading from the previous cycle, makes a copy of it, updates the provider value of the copied record by appending -reuse to it, updates the record_time of the record to the current time, and uploads it as the GPS reading for this cycle. Note that in this case:
- satellite_time still refers to the time the data was collected originally, but record_time refers to the time the stationary state was detected by the app and the previous records were reused.
- Only one record is uploaded, which is the most accurate reading from the cycle where fresh GPS data was collected.
For example, consider the following GPS record:
It shows a GPS record that was collected at Tuesday, October 27, 2020 23:44:41 (identified by satellite_time), had an accuracy of 3.7 meters, and was reused at Wednesday, October 28, 2020 10:47:10.265 (identified by record_time). In this example, the GPS record was collected nearly 11 hours before it was reused. During these 11 hours, the Avicenna app sent the same "reused" GPS record once for each cycle.
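The reuse decision and the construction of the reused record can be sketched as below, using the timestamps and accuracy from the example. The `GpsRecord` type, the `gps` provider string, the zero coordinates, and the reading of "enough GPS data" as "at least one record" are assumptions made for illustration.

```python
from dataclasses import dataclass, replace
from datetime import datetime
from typing import List, Optional


@dataclass(frozen=True)
class GpsRecord:
    provider: str
    satellite_time: datetime  # when the fix was originally collected
    record_time: datetime     # when the record was stored or reused
    accuracy: float           # meters; smaller is better
    latitude: float
    longitude: float


def should_reuse(previous_cycle: List[GpsRecord],
                 mbar_stationary: Optional[bool],
                 wifi_stationary: Optional[bool]) -> bool:
    """None means the corresponding data source is unavailable."""
    # ASSUMPTION: "enough GPS data" is read as "at least one record".
    return bool(previous_cycle) and (mbar_stationary is True or wifi_stationary is True)


def make_reused_record(previous_cycle: List[GpsRecord], now: datetime) -> GpsRecord:
    """Copy the most accurate previous record, tag it as reused, and restamp it."""
    best = min(previous_cycle, key=lambda r: r.accuracy)
    # Avoid stacking suffixes when the previous cycle was itself a reuse.
    provider = best.provider if best.provider.endswith("-reuse") else best.provider + "-reuse"
    return replace(best, provider=provider, record_time=now)


collected = datetime(2020, 10, 27, 23, 44, 41)
previous = [
    GpsRecord("gps", collected, collected, 3.7, 0.0, 0.0),   # coordinates are placeholders
    GpsRecord("gps", collected, collected, 12.0, 0.0, 0.0),
]
now = datetime(2020, 10, 28, 10, 47, 10, 265000)

if should_reuse(previous, mbar_stationary=True, wifi_stationary=None):
    reused = make_reused_record(previous, now)
    print(reused.provider, reused.satellite_time, reused.record_time)
```

Only `provider` and `record_time` change in the copy; `satellite_time` and the position fields keep their original values.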
Also, it is worth mentioning that the above GPS data reuse only works if the study includes the MBAR and/or Wi-Fi data source and the participant has granted the necessary permissions. Otherwise, no MBAR or Wi-Fi data will be available to Avicenna's GPS component, and the app will record fresh GPS data in each cycle.
Passive Data Collection in Android
Android allows apps to listen for GPS records passively. It means an app like Avicenna does not have to actively ask for fresh GPS records. Instead, it can sign up to receive a copy of the GPS record that is requested by other apps (given that Avicenna holds required permissions to collect GPS data). This way, Avicenna can collect GPS data without causing additional battery consumption.
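Note that this behavior does not interfere with Avicenna's periodic GPS data collection. Avicenna still collects GPS data periodically; the provider of these records is one of the non-passive values described earlier (for example, fused-reuse). It also listens to and records GPS records requested and received by other apps; the provider of these records is one of the passive values described earlier (GPS-Passive, Network-Passive, or Fused-Passive).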
For example, assume the participant is navigating from Point A to Point B using Google Maps, and the navigation takes 1 hour. During this hour, Avicenna receives every GPS record requested by Google Maps and records them as passive data. It also collects its own GPS data (fresh data, as the person is moving and is not stationary) and stores them alongside the passive data.
It is worth mentioning that while GPS records collected by Avicenna are still collected in 5-minute intervals, you might find passive GPS records captured between intervals as well.
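When analyzing exported data, it can help to separate passive records from Avicenna's own periodic records. The sketch below groups records by the suffix of their provider value; it assumes providers are stored as lowercase strings such as `gps`, `fused-passive`, or `network-reuse`, which should be verified against your own export.

```python
from collections import defaultdict
from typing import Dict, Iterable, List


def provider_kind(provider: str) -> str:
    """Classify a provider value as 'passive', 'reuse', or 'fresh'."""
    p = provider.strip().lower()
    if p.endswith("-passive"):
        return "passive"  # requested by another app, copied to Avicenna
    if p.endswith("-reuse"):
        return "reuse"    # copied from a previous cycle while stationary
    return "fresh"        # actively requested by Avicenna in this cycle


def group_by_kind(providers: Iterable[str]) -> Dict[str, List[str]]:
    groups: Dict[str, List[str]] = defaultdict(list)
    for p in providers:
        groups[provider_kind(p)].append(p)
    return dict(groups)


sample = ["fused", "gps-passive", "gps-passive", "fused-reuse", "Network-Passive"]
print(group_by_kind(sample))
```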
Avicenna trained a machine-learning model named "GPS mobility mode classification" to detect the mobility mode of GPS data points.
The algorithm preprocesses the GPS data to clean it, remove noise, and extract kinematic features. Then, it predicts the mobility mode of GPS data points using the extracted features. We first define the terminology and then explain the steps of the algorithm in detail.
GPS Trajectory: A sequence of time-stamped GPS points for one user.
GPS Segment: A subdivision of a user’s trajectory that is traveled using only one mobility mode (e.g., stationary, walking, driving).
The steps of the algorithm are as follows:
1. Cleaning and Feature Extraction
- Remove data points whose latitude or longitude is outside the acceptable range.
- Remove duplicated data points.
- Downsample the data to one point per 5-second interval: where an interval contains more than one data point, keep one point and remove the rest.
- Calculate kinematic features (i.e., distance, speed, acceleration, jerk, bearing, and bearing rate).
2. Segmentation
- Segment each GPS trajectory into GPS segments using a trained machine-learning model.
- Split GPS segments into smaller ones at points where the travel time between two consecutive GPS points exceeds 20 minutes.
- Split GPS segments into smaller ones at points where the number of GPS data points exceeds a predefined value (i.e., 50).
- Finally, small GPS segments with fewer than 5 data points are skipped with mobility_mode = N/A.
3. Mobility Mode Classification
- In the end, Avicenna utilizes its trained machine-learning model to predict the mobility mode of each GPS segment and appends the new data fields to the raw GPS data source. The model classifies each GPS segment as one of the stationary, walking, and driving modes (internally stored as mobility_mode).
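The rule-based parts of the steps above (range filtering, de-duplication, 5-second downsampling, and the gap and size splits) can be sketched as follows. The machine-learning segmentation and classification are not reproduced here. The `Point` type, the coordinate ranges used for "acceptable", and the choice to keep the earliest point in each 5-second interval are assumptions.

```python
from typing import List, NamedTuple


class Point(NamedTuple):
    t: float    # Unix timestamp in seconds
    lat: float  # degrees
    lon: float  # degrees


def clean(points: List[Point], bucket_s: float = 5.0) -> List[Point]:
    """Drop out-of-range and duplicate points, then keep one point per 5 s."""
    # ASSUMPTION: "acceptable range" means valid latitude/longitude bounds.
    valid = [p for p in points if -90 <= p.lat <= 90 and -180 <= p.lon <= 180]
    unique = sorted(set(valid))  # removes exact duplicates and orders by time
    kept, seen = [], set()
    for p in unique:
        bucket = int(p.t // bucket_s)
        if bucket not in seen:   # ASSUMPTION: keep the earliest point per interval
            seen.add(bucket)
            kept.append(p)
    return kept


def split_segment(segment: List[Point],
                  max_gap_s: float = 20 * 60,
                  max_points: int = 50) -> List[List[Point]]:
    """Split at gaps longer than 20 minutes and cap each piece at 50 points."""
    pieces: List[List[Point]] = []
    current: List[Point] = []
    for p in segment:
        gap_too_long = bool(current) and (p.t - current[-1].t > max_gap_s)
        if gap_too_long or len(current) >= max_points:
            pieces.append(current)
            current = []
        current.append(p)
    if current:
        pieces.append(current)
    return pieces


def classifiable(pieces: List[List[Point]], min_points: int = 5) -> List[bool]:
    """Pieces with fewer than 5 points would be labeled mobility_mode = N/A."""
    return [len(piece) >= min_points for piece in pieces]


raw = [Point(0, 43.0, -79.0), Point(0, 43.0, -79.0), Point(1, 43.0, -79.0),
       Point(5, 43.0, -79.0), Point(10, 43.0, -79.0), Point(15, 43.0, -79.0),
       Point(20, 43.0, -79.0), Point(30, 95.0, -79.0),
       Point(3020, 43.0, -79.0), Point(3025, 43.0, -79.0)]
cleaned = clean(raw)
pieces = split_segment(cleaned)
print([len(p) for p in pieces], classifiable(pieces))
```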
The machine-learning algorithm adds the following data fields to the GPS data source:
Distance-Feature Extraction: The distance of this record from the previous
record, calculated by our feature extraction algorithm. Internally stored as
Speed-Feature Extraction: The speed of this record, in m/s, calculated by
our feature extraction algorithm. Internally stored as
Acceleration-Feature Extraction: The acceleration of this record, in m/s², calculated by our feature extraction algorithm. Internally
Jerk-Feature Extraction: The jerk of this record, in m/s³, calculated by our feature extraction algorithm. Jerk is the rate at which an object's acceleration changes with respect to time. Internally stored as
Bearing-Feature Extraction: The bearing of this record, in degrees,
calculated by our feature extraction algorithm. Internally stored as
Bearing rate-Feature Extraction: The bearing rate of this record which is
the rate of bearing changes with respect to time, calculated by our feature
extraction algorithm. Internally stored as
Segment ID: Unique identifier of a GPS segment in a GPS trajectory. Internally stored as segment_id. GPS trajectories with fewer than 5 data points are excluded from the segmentation and mobility mode classification steps, and their segment_id field is labeled as N/A. Note that the segment id is
unique among data points of a GPS trajectory with unique (
device_id) during a given month.
Mobility mode: The mobility mode of this record (e.g.
car) predicted by our machine-learning model. Internally stored as
mobility_mode. GPS segments with fewer than 5 data points are skipped in our mobility mode classification and their mobility_mode field is labeled as N/A.
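The kinematic fields listed above can be derived from consecutive points roughly as follows. This is a sketch using the haversine distance and the initial great-circle bearing; Avicenna's actual formulas are not documented here, and the dictionary keys are hypothetical names, not the internal field names.

```python
import math
from typing import Dict, List, Tuple

EARTH_RADIUS_M = 6_371_000.0  # mean Earth radius, a common approximation


def haversine_m(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Great-circle distance between two points, in meters."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dphi = phi2 - phi1
    dlam = math.radians(lon2 - lon1)
    a = math.sin(dphi / 2) ** 2 + math.cos(phi1) * math.cos(phi2) * math.sin(dlam / 2) ** 2
    return 2 * EARTH_RADIUS_M * math.asin(min(1.0, math.sqrt(a)))


def bearing_deg(lat1: float, lon1: float, lat2: float, lon2: float) -> float:
    """Initial bearing from point 1 to point 2, in degrees clockwise from north."""
    phi1, phi2 = math.radians(lat1), math.radians(lat2)
    dlam = math.radians(lon2 - lon1)
    x = math.sin(dlam) * math.cos(phi2)
    y = math.cos(phi1) * math.sin(phi2) - math.sin(phi1) * math.cos(phi2) * math.cos(dlam)
    return (math.degrees(math.atan2(x, y)) + 360.0) % 360.0


def kinematic_features(points: List[Tuple[float, float, float]]) -> List[Dict[str, float]]:
    """points: (t_seconds, lat, lon) tuples sorted by strictly increasing time."""
    zero = dict(distance=0.0, speed=0.0, acceleration=0.0,
                jerk=0.0, bearing=0.0, bearing_rate=0.0)
    rows: List[Dict[str, float]] = []
    for i, (t, lat, lon) in enumerate(points):
        if i == 0:
            rows.append(dict(zero))  # no previous point to compare against
            continue
        t0, lat0, lon0 = points[i - 1]
        prev, dt = rows[-1], t - t0
        distance = haversine_m(lat0, lon0, lat, lon)
        speed = distance / dt
        acceleration = (speed - prev["speed"]) / dt
        jerk = (acceleration - prev["acceleration"]) / dt
        bearing = bearing_deg(lat0, lon0, lat, lon)
        turn = abs(bearing - prev["bearing"])
        turn = min(turn, 360.0 - turn)  # take the smaller angle between headings
        rows.append(dict(distance=distance, speed=speed, acceleration=acceleration,
                         jerk=jerk, bearing=bearing, bearing_rate=turn / dt))
    return rows


# Three points heading due north, 10 seconds apart.
for row in kinematic_features([(0.0, 0.0, 0.0), (10.0, 0.001, 0.0), (20.0, 0.002, 0.0)]):
    print(row)
```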
Supported in Android.
Monitors Wi-Fi signals in the surrounding environment. This data source scans different frequency channels and records the Wi-Fi networks available.
Each Wi-Fi record includes the following:
Service Set Identifier, or the name of the network. Internally stored as
Basic Service Set Identifier, or the address of the access point in proximity. Internally
Describes the authentication, key management, and encryption schemes supported by
the access point. Internally stored as
The frequency (in MHz) of the channel over which the client is communicating with
the access point. Internally stored as
The detected signal level in dBm, also known as the RSSI. Internally stored as
Data Collection Behavior
Android devices collect Wi-Fi data in the following steps:
- Avicenna asks the OS to scan and send a list of all SSIDs in proximity.
- The OS sends the list, usually immediately.
- Avicenna puts each SSID in one record.
After getting the first batch of SSIDs, Avicenna continues scanning for 1 minute, and the OS keeps sending new data points.
During data collection, the number of collected data records correlates with the proximity of the network. If a network is nearby, Avicenna will get many records for its SSID (often with slightly varying RSSI levels). If a network is far away, Avicenna collects fewer records, because the OS sees this network less often in its scans.
Note that Avicenna collects data points with different RSSI levels, frequencies, and BSSIDs for the same SSID and the same record time.
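Because one scan window can yield many records per SSID, a per-SSID summary is often a useful first step in analysis. The sketch below counts records, lists distinct BSSIDs, and averages the RSSI for each SSID; the tuple layout and the SSID names are hypothetical and do not reflect Avicenna's export format.

```python
from collections import defaultdict
from statistics import mean
from typing import Dict, Iterable, List, Tuple

# (ssid, bssid, frequency in MHz, RSSI in dBm) -- hypothetical layout
WifiRecord = Tuple[str, str, int, int]


def summarize_scan(records: Iterable[WifiRecord]) -> Dict[str, dict]:
    """Group one scan window's records by SSID."""
    by_ssid: Dict[str, List[WifiRecord]] = defaultdict(list)
    for rec in records:
        by_ssid[rec[0]].append(rec)
    summary = {}
    for ssid, rows in by_ssid.items():
        summary[ssid] = {
            "records": len(rows),  # more records suggests a closer network
            "bssids": sorted({bssid for _, bssid, _, _ in rows}),
            "mean_rssi_dbm": mean(rssi for _, _, _, rssi in rows),
        }
    return summary


scan = [
    ("HomeNet", "aa:01", 2437, -40),
    ("HomeNet", "aa:01", 2437, -42),
    ("HomeNet", "aa:02", 5180, -44),
    ("CafeGuest", "bb:01", 2412, -80),
]
print(summarize_scan(scan))
```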