Inferring social relationships in a phone call from a single party’s speech
S.H. Yella, X. Anguera, J. Luque, "Inferring social relationships in a phone call from a single party's speech", IEEE International Conference on Acoustics, Speech, and Signal Processing (ICASSP'14), May 2014
People usually speak differently depending on whom they talk to. Based on this hypothesis, in this paper we propose an automatic method to detect the social relationship between two people based solely on a set of acoustic and conversational characteristics. We argue that changes in these features of an individual reflect her social relationship with the other person. To infer the relationship we only require the speech of one of the conversation partners and the interaction patterns between both speakers. We validate the proposed system using a real-life telephone database with calls made by several speakers to close family members and to their partners. We train a classifier using a boosting algorithm on a set of conversational and acoustic features and use it to classify calls according to the social relationship between both speakers. Tests performed on models trained on a single speaker's data show that for most people such prediction is feasible. We also show that these characteristics generalize quite well across speakers, achieving around 75% accuracy when both sets of features are combined.
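The classification step described above can be sketched with a minimal boosting implementation. This is purely illustrative: the paper does not specify which boosting variant or feature set it uses, so the code below assumes a standard AdaBoost with one-feature decision stumps and a hypothetical two-dimensional feature vector per call (e.g. a pitch statistic and a turn-taking statistic).

```python
import math

def train_adaboost(X, y, rounds=10):
    """Minimal AdaBoost with single-feature threshold stumps.
    Illustrative sketch only; the paper's actual boosting variant and
    acoustic/conversational features are not specified here.
    X: list of feature vectors, y: labels in {-1, +1}
    (e.g. -1 = family member, +1 = partner)."""
    n = len(X)
    w = [1.0 / n] * n  # uniform initial example weights
    ensemble = []
    for _ in range(rounds):
        best = None  # (weighted error, feature index, threshold, polarity)
        for f in range(len(X[0])):
            for t in sorted({x[f] for x in X}):
                for pol in (1, -1):
                    err = sum(wi for wi, x, yi in zip(w, X, y)
                              if (pol if x[f] > t else -pol) != yi)
                    if best is None or err < best[0]:
                        best = (err, f, t, pol)
        err, f, t, pol = best
        err = max(err, 1e-10)  # avoid division by zero on a perfect stump
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((alpha, f, t, pol))
        # reweight: misclassified examples gain weight for the next round
        w = [wi * math.exp(-alpha * yi * (pol if x[f] > t else -pol))
             for wi, x, yi in zip(w, X, y)]
        s = sum(w)
        w = [wi / s for wi in w]
    return ensemble

def predict(ensemble, x):
    """Sign of the alpha-weighted vote of all stumps."""
    score = sum(alpha * (pol if x[f] > t else -pol)
                for alpha, f, t, pol in ensemble)
    return 1 if score >= 0 else -1
```

In practice one would extract per-call feature vectors from a single speaker's channel, train per-speaker or pooled models, and evaluate held-out calls, mirroring the single-speaker and cross-speaker experiments reported in the paper.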