
Data Sources

Data collection in Avicenna falls into two broad categories. The first, called Activities, covers the data that participants provide by actively interacting with the Avicenna app, such as responding to a survey or completing a cognitive task. The second, called Data Sources, covers the data that is collected automatically without the participant being directly engaged, such as GPS data and step count.

Data sources can refer to different sensors, such as GPS, or be in the form of digital footprints, such as screen time, or be collected from wearables such as Google Fit devices. The common attribute between all of them is that participants don’t have to actively engage in collecting this data. They provide the necessary permissions initially when they join the study, and the rest happens automatically.

In this section, we explain how you can view, add, or modify data sources in your Avicenna Study. We also explain what data sources are available and what kind of data each of them collects.

Accessing Data Sources

In order to access the list of data sources currently monitored as part of your study, go to the Researcher Dashboard and navigate to the Data Sources page:

Accessing list of data sources for a given Avicenna study

Here you can add or remove data sources from your study. To add a new data source, click on the Add New Data Source button. On the dialog that opens, you can see the list of all data sources Avicenna supports. Scroll through the list and click on the data source you are interested in. This will take you to another dialog to enter some details about this data source:

Add a new data source

In this dialog, first specify whether providing this data source is mandatory or optional for your study participants. If a data source is marked as optional, the Avicenna app allows participants to opt out of this data source within the app. Note that in most cases, participants can simply revoke the permissions Avicenna needs to collect the requested data source. When that happens, the missing permission is reported via the Audit Trail.

You should also choose a Name and a Description for your data source. These values are shown to the participant to explain what is being collected and why. You may add more details on why your study collects certain data sources within the informed consent, but the description here can also help participants better understand why a specific data source is needed for your study.

After completing these fields, press Add to add your new data source and set up its data table. You will then be taken back to the study’s Data Sources, where you can see the list of data sources along with their configurations in your study. If you click on each data source's menu, you can see a few options:

Menu options for a data source

In this menu, pressing Go to Data Export will take you to the Data Export page where you can export the data collected by this data source.

To remove the data source from your study, press the Remove button and confirm your intent. This will stop collecting that data for your study immediately. If you want to delete the data for this data source as well, check the Delete the data from the data source checkbox. If you removed the data source with that checkbox left unchecked and later decide to delete the data, please contact Avicenna support staff.

To edit a data source, simply press Edit and apply your modifications.

Common Data Fields

You can access the collected data either by exporting it via the Data Export page, or by querying it directly using Kibana. The data format depends on the data source. For example, GPS data contains location coordinates, while Pedometer data contains the number of steps taken. Regardless of the data source, every record contains some common fields, which we explain below.

Study ID: The unique ID of the study that provided the data. Internally stored as study_id.

User ID: The unique ID of the participant who provided the data. Internally stored as user_id.

Device ID: The unique ID of the smart device that provided the data. Internally stored as device_id.

Record Time: The time this record was captured. Internally stored as record_time.

Relative Record Time: The time this record was captured, relative to the participation period's start time, in milliseconds. For example, 3,600,000 indicates the record was captured 1 hour after the participant joined the study. Internally stored as rel_record_time. Please note that this field won't be updated if you change a participant's start time.
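The relationship between Record Time and Relative Record Time can be illustrated with a short Python sketch. It is not Avicenna's implementation: the function name, the participation_start argument, and the use of timezone-aware datetime values are assumptions made for illustration, and the actual format of record_time in an export may differ.

```python
from datetime import datetime, timedelta, timezone


def rel_record_time_ms(record_time: datetime, participation_start: datetime) -> int:
    """Whole milliseconds elapsed between participation start and the record.

    Hypothetical helper: mirrors the description of rel_record_time,
    not Avicenna's internal code.
    """
    return (record_time - participation_start) // timedelta(milliseconds=1)


# Assumed example values: the participant joined at 09:00 UTC,
# and a record was captured at 10:00 UTC the same day.
start = datetime(2024, 1, 1, 9, 0, 0, tzinfo=timezone.utc)
record = datetime(2024, 1, 1, 10, 0, 0, tzinfo=timezone.utc)

print(rel_record_time_ms(record, start))  # 3600000, i.e. 1 hour
```

Because rel_record_time is not updated when a participant's start time changes, a calculation along these lines, applied to record_time and the new start time, is one way to derive an up-to-date relative time during analysis.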

Data Collection Behavior of Avicenna

Avicenna supports collecting data from Android, iOS, and wearable devices.

Collecting data from some data sources needs specific permissions. If participants join a study that has any of those data sources and the corresponding permissions are not granted already, participants will see a message at the top of the study homepage stating that the study setup is incomplete. They should either grant necessary permissions or select Don't have this device. Choosing the latter option, which is available for wearable data sources only, excludes that data source for that particular participant. The participant can still grant permissions later by visiting the Data Sources page of the study. The Avicenna app also lets the participants revoke such permissions.

If the participant takes part in the study using only the web app, smartphone sensor data won't be collected, but wearable data will still be collected. That's because Avicenna collects wearable data from the OEM's data servers (e.g., Garmin's servers) instead of connecting directly to the physical device. Avicenna asks for the participant's permission to access their account on the OEM's servers, and then collects data from those servers at the end of every day. That's why you can configure such data sources using the web app too.

For sensors embedded in Android and iOS devices, such as GPS or Pedometer, Avicenna requests data from the OS once every 5 minutes. iOS guarantees this 5-minute interval, while Android does not and might provide data either less often or more often than once every 5 minutes.

The operating system of Android and iOS devices collects sensor data using one of two approaches: Continuous (e.g. pedometer) or Episodic (e.g. GPS). In the continuous approach, the device's operating system continuously collects data. The OS then provides all the collected data to the Avicenna app when Avicenna queries it. For example, Android and iPhone devices continuously count the participant's steps. If a study has the Pedometer sensor enabled, the Avicenna app queries the pedometer data once every 5 minutes, and each query returns the total number of steps taken since the last request. So even though the Avicenna app queries data only once every 5 minutes, it collects all steps taken by the participant. Similarly, Android and iPhone devices always track whether the screen is on or off. When the screen state changes, the OS notifies the Avicenna app, regardless of the 5-minute data query interval.
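The following Python sketch illustrates why the continuous approach loses no steps even though data is queried only every 5 minutes. It is a simplified simulation, not the Avicenna app's code: the function name, the step timestamps, and the exact 300-second windows are assumptions for illustration.

```python
def steps_per_query(step_timestamps_s, query_interval_s=300, duration_s=900):
    """Count the steps that fall into each query window.

    Each count stands for what one query would return: the steps taken
    since the previous query. All names and values are hypothetical.
    """
    counts = []
    for window_start in range(0, duration_s, query_interval_s):
        window_end = window_start + query_interval_s
        counts.append(
            sum(window_start <= t < window_end for t in step_timestamps_s)
        )
    return counts


# Assumed example: seconds (since the first query window opened) at which steps occurred.
steps = [10, 20, 299, 300, 450, 899]

counts = steps_per_query(steps)
print(counts)       # [3, 2, 1]
print(sum(counts))  # 6, the same as the number of steps taken
```

Summing the per-query counts gives the participant's total steps for the period, which is the property the continuous approach relies on.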

In the episodic approach, Avicenna asks the OS every 5 minutes to collect data for a certain period, called Burst Length. The burst length is different for different data sources. For example, GPS keeps collecting data until it reads three accurate data points in a maximum time of 60 seconds. For battery, Avicenna collects one record in each cycle. For the accelerometer, Avicenna collects data for 60 seconds.
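The GPS burst rule described above (stop after three accurate points or after 60 seconds, whichever comes first) can be sketched in Python as follows. This is an illustration under stated assumptions, not Avicenna's implementation: the function name, the 20-meter accuracy threshold, the choice to keep only accurate readings, and the sample readings are all hypothetical, since the documentation does not specify them.

```python
from typing import Iterable, List, Tuple

# (seconds since the burst started, reported accuracy in meters)
Reading = Tuple[float, float]


def gps_burst(
    readings: Iterable[Reading],
    accuracy_threshold_m: float = 20.0,  # assumed value, not from the docs
    required_points: int = 3,
    max_burst_s: float = 60.0,
) -> List[Reading]:
    """Return the accurate readings gathered in one burst.

    Assumes readings arrive in chronological order. The burst ends once
    enough accurate points are read or the maximum burst length passes.
    """
    accepted: List[Reading] = []
    for elapsed_s, accuracy_m in readings:
        if elapsed_s > max_burst_s:
            break  # burst length exhausted
        if accuracy_m <= accuracy_threshold_m:
            accepted.append((elapsed_s, accuracy_m))
            if len(accepted) == required_points:
                break  # enough accurate points, stop early
    return accepted


# Burst that ends early: the third accurate point arrives after 5 seconds.
early = gps_burst([(1, 50), (2, 10), (3, 15), (4, 80), (5, 12), (6, 5)])
print(early)  # [(2, 10), (3, 15), (5, 12)]

# Burst that times out: only one accurate point within 60 seconds.
timed_out = gps_burst([(10, 100), (30, 10), (61, 5)])
print(timed_out)  # [(30, 10)]
```

In this sketch a burst can yield fewer than three points when the signal is poor, which is one reason the number of GPS records per 5-minute cycle may vary.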

For details on the data collection behavior of each data source, please refer to the relevant documentation page following this section.