The paper titled “Keep Sensors in Check: Disentangling Cross-Country Generalization Issues in Mobile Sensor-Based Models with Diversity Scores”, by our WeNet consortium partners at Idiap Research Institute (Alexandre Nanchen, Lakmal Meegahapola, William Droz, and Daniel Gatica-Perez), will be presented at the AAAI/ACM Conference on Artificial Intelligence, Ethics, and Society (AIES). The conference will be held in Montreal, Canada, from August 8 to 10, 2023.
Co-author Alexandre Nanchen will present it at the conference. Below is the abstract of the paper:
Machine learning models trained with passive sensor data from mobile devices can be used to perform various inferences pertaining to activity recognition, context awareness, and health and well-being. Prior work has improved inference performance through the use of multimodal sensors (inertial, GPS, proximity, app usage, etc.) or improved machine learning. In this context, a few studies shed light on critical issues relating to the poor cross-country generalization of models due to distributional shifts across countries. However, these studies have largely relied on inference performance as a means of studying generalization issues, failing to investigate whether the root cause of the problem is linked to specific sensor modalities (independent variables) or the target attribute (dependent variable). In this paper, we study this issue in complex activities of daily living (ADL) inference task, involving 12 classes, by using a multimodal, multi-country dataset collected from 689 participants across eight countries. We first show that the ‘country of origin’ of data is captured by sensors and can be inferred from each modality separately, with an average accuracy of 65%. We then propose two diversity scores (DS) that measure how a country differentiates from others w.r.t. sensor modalities or activities. Using these diversity scores, we observed that both individual sensor modalities and activities have the ability to differentiate countries. However, while many activities capture country differences, only the ‘App usage’ and ‘Location’ sensors can do so. By dissecting country-level diversity across dependent and independent variables, we provide a framework to better understand model generalization issues across countries and country-level diversity of sensing modalities.