Big Data is increasingly recognized for its substantial potential to enhance research on human behavior. Data from diverse sources, such as social media platforms, telecommunications companies, and mapping services like OpenStreetMap (OSM), provide a wealth of information. However, merely having this data is insufficient to fully grasp individuals’ socio-economic and personal contexts. While Big Data is effective for in-depth studies of specific aspects of human behavior, it often falls short of capturing the overall complexity of daily life.
For example, social media data supports sophisticated analyses of interactions within a platform but does not adequately reflect the intricacies of real-world activities. Conversely, traditional social science methods (such as questionnaires, interviews, and participant observation) capture the nuances of daily routines and individual perspectives, but at a far smaller scale than Big Data.
In this context, the DataScientia (DS) initiative provides tools for gathering real-time behavioral data, combining insights from both individual interactions and sensor streams from personal devices, particularly smartphones. This approach is beneficial for researchers across various fields, from social sciences to computer science. The impact of this data is further amplified by integrating it with other secondary data, such as merging GPS data from smartphones with location descriptions and Points of Interest obtained from OSM.
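The kind of integration mentioned above, attaching OSM Points of Interest to smartphone GPS fixes, can be sketched as a nearest-neighbour lookup. This is a minimal illustration under stated assumptions, not the DS pipeline: the record layout (`t`, `lat`, `lon`, `name`), the 100 m threshold, and the sample coordinates are all hypothetical, and the POIs are assumed to have already been exported from OSM into a plain list.

```python
from math import asin, cos, radians, sin, sqrt

EARTH_RADIUS_M = 6_371_000  # mean Earth radius in metres


def haversine_m(lat1, lon1, lat2, lon2):
    """Great-circle distance in metres between two latitude/longitude points."""
    phi1, phi2 = radians(lat1), radians(lat2)
    dphi = phi2 - phi1
    dlmb = radians(lon2 - lon1)
    a = sin(dphi / 2) ** 2 + cos(phi1) * cos(phi2) * sin(dlmb / 2) ** 2
    return 2 * EARTH_RADIUS_M * asin(sqrt(a))


def nearest_poi(fix, pois, max_dist_m=100.0):
    """Return (poi_name, distance_m) for the closest POI within max_dist_m, else None."""
    best = None
    for poi in pois:
        d = haversine_m(fix["lat"], fix["lon"], poi["lat"], poi["lon"])
        if d <= max_dist_m and (best is None or d < best[1]):
            best = (poi["name"], d)
    return best


# Hypothetical POIs, standing in for an OSM export.
pois = [
    {"name": "cafe", "lat": 46.0670, "lon": 11.1500},
    {"name": "library", "lat": 46.0700, "lon": 11.1200},
]

# Hypothetical GPS fixes, standing in for smartphone sensor data.
fixes = [
    {"t": "2024-01-01T09:00:00", "lat": 46.0671, "lon": 11.1501},
    {"t": "2024-01-01T12:00:00", "lat": 46.2000, "lon": 11.3000},
]

# Attach the nearest POI (or None when nothing is close enough) to each fix.
labelled = [{**f, "poi": nearest_poi(f, pois)} for f in fixes]
```

The linear scan keeps the sketch dependency-free; with realistic data volumes a spatial index would likely be preferable. The distance threshold is a design choice: a fix with no POI nearby stays unlabelled instead of being forced onto a distant match.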
However, collecting this type of data presents several challenges, including technological issues and privacy concerns, particularly regarding GDPR compliance. Researchers must also navigate potential errors due to participant interactions, such as response bias and participant burden, as well as prepare the data for distribution, addressing issues like data loss and lack of anonymization.
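To help address these challenges, DS offers several tools, each covered in the schedule below: the DS Community for registration, project management, and participant recruitment; the GDPR-compliant iLog app for smartphone data collection; a dashboard for monitoring data collection in real time; a data preparation pipeline; and the LivePeople catalog for accessing and requesting data.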
This tutorial proposal aims to introduce interested researchers to the services offered by DS through an interactive and hands-on approach. Participants will have the opportunity to experiment with the various services, replicating the user journey of a participant in a data collection process to better understand the impact and potential of these technologies.
Time | Title | Activity | Presenter |
9:30 – 9:50 | Introduction | | Matteo |
9:50 – 10:20 | The DataScientia Community | Community registration and project joining: Participants will explore the DS Community features, such as profile creation, project management, and participant recruitment. Those interested can register in the Community and, from there, access the data collection project designed for the tutorial. | Ali |
10:20 – 10:40 | Register in the Community and download the iLog app | | Ali |
10:40 – 11:10 | The data collection system and services | Download the iLog app and collect data: Participants will be invited to download the GDPR-compliant iLog app, start collecting their smartphone data, and interact with the app to explore its capabilities. Monitor data collection (demo): During the collection, participants will explore the dashboard, which is designed for researchers to monitor the collected data in real time. | Leonardo |
11:10 – 11:25 | BREAK | | ALL |
11:25 – 12:05 | The data preparation pipeline | Data transformation: Participants will learn the pipeline for transforming and preparing the data and consolidating it from a privacy point of view. After the tutorial, the participants can download their data. | Andrea |
11:25 – 12:05 | The LivePeople Catalog | Participants will be guided through the LivePeople catalog and its organization, including how to access the catalog and request data. | Andrea |
12:05 – 12:20 | Wrap up and conclusion | | Matteo |
12:20 – 12:30 | Q&A Session | | ALL |
University of Trento – DISI matteo.busso@unitn.it
University of Trento – DISI andrea.bontempelli@unitn.it
University of Trento – DISI ali.hamza@unitn.it
University of Trento – DISI leonardo.malcoltti@unitn.it