Exploiting Foursquare and cellular data to infer user activity in urban environments

Anastasios Noulas, Cecilia Mascolo, Enrique Frias-Martinez

2013 IEEE 14th International Conference on Mobile Data Management (MDM), June 2013

 

Abstract

Inferring the types of activity that take place in the neighborhoods of urban centers can be helpful in a number of contexts, including urban planning, content delivery, and activity recommendations for mobile web users, and may even lead to a deeper understanding of the geographical evolution of social life in the city. During the past few years, the analysis of mobile phone usage patterns, or of social media with longitudinal attributes, has aided the automatic characterization of the dynamics of the urban environment.

In this work, we combine a dataset sourced from a telecommunication provider in Spain with a database of millions of geotagged venues from Foursquare, and we formulate the problem of urban activity inference in a supervised learning framework. In particular, we exploit user communication patterns observed at the base station level in order to predict the activity of Foursquare users who check in at nearby venues. First, we mine a set of machine learning features that allow us to encode the input telecommunication signal of a tower. Subsequently, we evaluate a diverse set of supervised learning algorithms using labels extracted from Foursquare place categories, and we consider two application scenarios. In the first scenario, we assess how hard it is to predict a specific urban activity in an area, showing that Nightlife and Entertainment spots are the easiest to infer, whereas College and Shopping areas feature the lowest accuracy rates. In the second scenario, given a candidate set of activity types in a geographic area, we aim to select the most prominent one. We demonstrate how the difficulty of the problem increases with the number of classes incorporated in the prediction task, yet the classifiers still perform considerably better than a random guess even as the set of candidate classes grows.
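To make the supervised setup more concrete, the following is a minimal sketch of the second scenario: given a feature vector for a tower, pick one activity type from a candidate set and compare accuracy against a random guess (1 divided by the number of classes). It is not the authors' method. The 24-bin hourly profile, the peak hours in `PEAK_HOUR`, the synthetic data generator, and the nearest-centroid classifier are all assumptions chosen only for illustration; the abstract does not specify the features or algorithms used.

```python
import random

HOURS = 24

# Hypothetical peak hour of communication activity per category.
# These values are illustrative assumptions, not taken from the paper.
PEAK_HOUR = {"Nightlife": 23, "Shopping": 17, "College": 11}


def synthetic_profile(category, rng):
    """Return a normalized 24-bin hourly activity profile for one tower."""
    peak = PEAK_HOUR[category]
    raw = []
    for hour in range(HOURS):
        # Circular distance, so that hour 23 and hour 0 are neighbours.
        dist = min(abs(hour - peak), HOURS - abs(hour - peak))
        raw.append(max(0.0, 10.0 - 2.0 * dist) + rng.uniform(0.0, 3.0))
    total = sum(raw)
    return [value / total for value in raw]


def centroid(profiles):
    """Average a list of profiles bin by bin."""
    count = len(profiles)
    return [sum(p[hour] for p in profiles) / count for hour in range(HOURS)]


def predict(profile, centroids):
    """Return the category whose centroid is closest to the profile."""
    def squared_distance(a, b):
        return sum((x - y) ** 2 for x, y in zip(a, b))

    return min(centroids, key=lambda cat: squared_distance(profile, centroids[cat]))


def evaluate(categories, n_train=30, n_test=30, seed=0):
    """Train on synthetic towers; return (accuracy, random-guess baseline)."""
    rng = random.Random(seed)
    centroids = {
        cat: centroid([synthetic_profile(cat, rng) for _ in range(n_train)])
        for cat in categories
    }
    correct = 0
    total = 0
    for cat in categories:
        for _ in range(n_test):
            if predict(synthetic_profile(cat, rng), centroids) == cat:
                correct += 1
            total += 1
    return correct / total, 1.0 / len(categories)


if __name__ == "__main__":
    accuracy, baseline = evaluate(["Nightlife", "Shopping", "College"])
    print(f"accuracy={accuracy:.2f}, random-guess baseline={baseline:.2f}")
```

Because the synthetic categories are deliberately well separated, this toy problem is far easier than the real task; the sketch only shows the shape of the evaluation, in which the baseline shrinks as more candidate classes are added.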

 
