When Text and Speech are Not Enough: A Multimodal Dataset of Collaboration in a Situated Task

Abstract

Adequately modeling the information exchanged in real human-human interactions requires more than speech or text alone; considering these channels in isolation leaves out many critical modalities. The channels contributing to the "making of sense" in human-human interaction include, but are not limited to, gesture, speech, user-interaction modeling, gaze, joint attention, and involvement/engagement, all of which must be adequately modeled to automatically extract correct and meaningful information. In this paper, we present a multimodal dataset of a novel situated and shared collaborative task, with the above channels annotated to encode these different aspects of the participants' situated and embodied involvement in the joint activity.

DOI: https://doi.org/10.5334/johd.168 | Journal eISSN: 2059-481X
Language: English
Submitted on: Oct 14, 2023
Accepted on: Dec 5, 2023
Published on: Jan 17, 2024
Published by: Ubiquity Press

© 2024 Ibrahim Khebour, Richard Brutti, Indrani Dey, Rachel Dickler, Kelsey Sikes, Kenneth Lai, Mariah Bradford, Brittany Cates, Paige Hansen, Changsoo Jung, Brett Wisniewski, Corbyn Terpstra, Leanne Hirshfield, Sadhana Puntambekar, Nathaniel Blanchard, James Pustejovsky, Nikhil Krishnaswamy, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.