A Corpus of Stories by Members of Civil Society in Rennes, Bretagne, France

Open Access
|Nov 2024

(1) Overview

Repository location

https://nakala.fr/10.34847/nkl.03b6h61y

Content

The corpus contains project information (.txt), project ethical and methodological documentation (.pdf), and a series of interviews with associations in Rennes, France, each of which includes audio segments (.mp3), images (.jpg or .png), and transcriptions and prose versions (.docx) of participant stories.

Context

This corpus of interview data is developed in the context of the ‘Ce qui nous concerne’ (‘What matters to us’) project, a participatory narrative ethnographic research project funded by Rennes Métropole, the Bretagne Region and Zone Atelier Armorique. The main aim of the project is the collection, storage, and valorisation of stories of personal experience collected with associative or third sector actors in Rennes.

This corpus derives from the results of qualitative methods, including research interviews, and the isolation and transcription of interactional narratives.

(2) Method

The ‘Ce qui nous concerne’ project collects stories of personal experience with associations and third sector actors in Rennes, France. The overarching approach is one of qualitative sociolinguistic narrative research.

Steps

Firstly, participants are contacted on the basis of their involvement with associations, non-governmental organizations and social initiatives in Rennes. Secondly, if a participant wishes to become involved in the project, she or he is invited to a 30–60 minute interview. Participants are volunteers, association members, committee members and/or association directors or trustees. The interviews discuss their biographical trajectories, their roles within the association or third sector collective, and how they conceive of Rennes’ civil society. Next, the interviews are transcribed using ELAN (The Language Archive, Max Planck Institute for Psycholinguistics, 2023), a tool for annotating audio and video data that is particularly suited to psycho- and socio-linguistic research. Stories are isolated from talk using the following criteria: i) the existence of a sustained discourse unit over several turns at talk; ii) the use of embedded characters and events; and iii) an orientation to process and change (Bamberg & Georgakopoulou, 2008; De Fina & Georgakopoulou, 2008; Georgakopoulou, 2007).

Data editing is then carried out by importing an .mp3 file into Audacity [1], free, open-source audio editing software, and exporting a .wav audio file. This .wav file is used to create a .csv file with linguistic annotations using ELAN. Should sensitive information be found in the .csv file, it is removed from the audio using Audacity. The ELAN annotations are exported in .txt format as a linguistic transcription. This .txt file is used to generate a .docx file of the transcription and a further .docx file of the story rendered as prose. Lastly, the files are uploaded to the Nakala multimodal corpus platform (Briand, Chávez Herrera, Djagbre, Kelleher & N’gnete Kouakou, 2022) and included in the open-access interactive library on the Rennes 2 University WorkAdventure 2D collaborative platform (Kelleher & Ramella, 2024), a virtual workspace that uses 2D map environments to recreate physical spaces in a virtual setting.
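The step from the ELAN .txt export to a readable transcription can be sketched as follows. This is an illustrative sketch only, not the project's own tooling: it assumes a tab-delimited export whose columns are tier name, begin time in milliseconds, end time in milliseconds and annotation text (the actual column layout depends on the options chosen in the ELAN export dialog), and the function name `elan_export_to_transcript` is hypothetical.

```python
import csv
import io


def elan_export_to_transcript(tsv_text):
    """Turn a tab-delimited ELAN export into 'TIER: utterance' lines.

    Assumed column order (an assumption, not documented by the project):
    tier name, begin time (ms), end time (ms), annotation text.
    """
    timed_lines = []
    reader = csv.reader(io.StringIO(tsv_text), delimiter="\t", quoting=csv.QUOTE_NONE)
    for record in reader:
        if len(record) < 4:
            continue  # incomplete row
        tier, begin_ms, _end_ms, text = (cell.strip() for cell in record[:4])
        if not begin_ms.isdigit() or not text:
            continue  # header row or empty annotation
        timed_lines.append((int(begin_ms), tier, text))
    # Order turns by start time so the transcript follows the audio.
    timed_lines.sort(key=lambda item: item[0])
    return [f"{tier}: {text}" for _begin, tier, text in timed_lines]


# Invented sample rows, for illustration only.
sample = (
    "Tier\tBegin\tEnd\tAnnotation\n"
    "PARTICIPANT\t1500\t4000\tbonjour\n"
    "RESEARCHER\t0\t1400\tbonjour\n"
)
transcript = elan_export_to_transcript(sample)
```

The resulting lines could then be pasted or written into the .docx transcription; the prose rendering of each story remains an editorial task.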

These steps imply decisions about i) the inclusion of participant stories, rather than the interview as a whole, and ii) the working definitional criteria of narrative that are applied to the data. Concerning i), the decision to cut participant stories from talk and to include only those stories in the Nakala dataset is motivated by considerations of length and pertinence of data. Participant stories are on task (they respond to researcher inquiry) and they display more regular turns at talk, with a weighting in turn distribution that favours the participant. Talk, generally, tends to be less on task and contains more researcher turns. Further, whilst the inclusion of preceding talk could, potentially, allow the appreciation of narrative occasioning and researcher rapport, it would also lead to overly lengthy and indiscriminate audio and transcription files that would reduce reuse potential. The cuts applied to stories as included in the dataset attempt in each case to include story prefacing and occasioning in preceding and contiguous talk.

Concerning ii), the working definitional criteria of narrative refer to the small stories paradigm (Bamberg & Georgakopoulou, 2008; De Fina & Georgakopoulou, 2008; Georgakopoulou, 2007). Small stories research departs from the conversation analytic tradition in that stories are very much seen as in-situ, emergent productions between interlocutors. It also considers that narratives are rich, multi-layered sites for identity work, which include spatio-temporal orientation and embedded speech, characters and events. Small stories are strongly linked to personal epistemologies and praxes, and these are key characteristics of interest to the ‘Ce qui nous concerne’ project.

Quality control

Nakala metadata use the DCMI (Dublin Core Metadata Initiative) [2] standard, whilst ELAN metadata correspond to ISO 12620:2019. A protocol has been established for the storage architecture, conservation and editing of data in both the Nakala and WorkAdventure datasets. During the verification process, we use the secure Sharedocs platform [3], a university file storage and sharing solution that is part of the Huma-Num Research Infrastructure (2021), to verify each researcher’s workflow. Huma-Num (2021) is an open inter-university platform that provides services, assessment and tools for digital research data.
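To illustrate what a Dublin Core description of the dataset might look like, the sketch below serialises a few fields as XML. The element names and namespace are those of the Dublin Core element set; the values are taken from the dataset description in this article. The wrapper element `record` and the helper `dublin_core_record` are assumptions made for illustration, and the form actually used when depositing on Nakala may differ.

```python
import xml.etree.ElementTree as ET

# Namespace of the Dublin Core Metadata Element Set, version 1.1.
DC_NS = "http://purl.org/dc/elements/1.1/"


def dublin_core_record(fields):
    """Serialise a mapping of Dublin Core element names to values as XML.

    The <record> wrapper is a hypothetical container, not a Nakala format.
    """
    ET.register_namespace("dc", DC_NS)
    root = ET.Element("record")
    for name, value in fields.items():
        ET.SubElement(root, f"{{{DC_NS}}}{name}").text = value
    return ET.tostring(root, encoding="unicode")


record = dublin_core_record({
    "title": "Ce qui nous concerne",
    "language": "French",
    "date": "2022-11-19",
    "rights": "CC-BY-NC-4.0",
})
```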

Workflow verification consists of: i) file and extension architecture in which audio files, image files, ELAN files and text files are verified against the project file extension schedule, ii) ELAN annotation and transcription accuracy and editing in which the ELAN .eaf files are verified for the correct identification of story boundaries, and for correct annotation of linguistic data, iii) audio and transcription file creation and editing in which final .mp3 files are checked against ELAN annotations and in which sensitive information is edited out, iv) story transcription and rendering in which the representativity of stories is checked against the story as transcribed and in which language choices are revised, and v) participant information and feedback in which the files that have been prepared for Nakala are checked with participants for any comments or changes before uploading to the virtual archive.
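Step i) of this verification lends itself to a simple automated check. The following is a minimal sketch under stated assumptions: the project's file extension schedule is not published in this article, so the set below is built from the formats listed under "Format names and versions", and the function name `unexpected_files` is hypothetical.

```python
from pathlib import Path

# Assumed schedule: the deposit formats listed in the dataset description.
# Working files such as ELAN .eaf or intermediate .wav files would be flagged.
EXTENSION_SCHEDULE = {".csv", ".txt", ".docx", ".pdf", ".jpg", ".png", ".mp3"}


def unexpected_files(root, schedule=EXTENSION_SCHEDULE):
    """Return, in sorted order, the files under root whose extension is not in the schedule."""
    return sorted(
        path
        for path in Path(root).rglob("*")
        if path.is_file() and path.suffix.lower() not in schedule
    )
```

Run against a folder prepared for upload, an empty result would suggest that every file matches the schedule; any paths returned would need manual review before deposit.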

(3) Dataset Description

Repository name

Nakala

Object name

Ce qui nous concerne

Format names and versions

.CSV, .TXT, .DOCX, .PDF, .JPG, .PNG, .MP3

Project documentation (.txt, .pdf) and a series of interviews filed by association name and containing metadata (.csv), interview audio segments (.mp3), interview transcript (.docx) and associated images (.jpg or .png).

Creation dates

From 2022-11-19

Dataset creators

See the Readme file in the repository: https://nakala.fr/10.34847/nkl.03b6h61y

Language

French

License

Creative Commons Attribution-NonCommercial 4.0 International (CC-BY-NC-4.0)

Publication date

2022-11-19

(4) Reuse Potential

In France, there are corpora of multilingual oral data (Abouda, Baude & Michaud, 2010; Agence Nationale de la Recherche & Multicultural Paris French, 2021). These corpora include narrative oral data and argumentative data (Gadet & Guerin, 2016, p. 4). Increasingly, Huma-Num (2021) is combining corpora such as these, working on their interoperability and standardisation. It is for this reason that ‘Ce qui nous concerne’ is hosted on the Huma-Num Nakala platform. Our corpus also contributes to the extant repositories by focusing on a specific socio-economic field, that of third-sector or associative actors.

The dataset stems from a participatory project and, as such, it is designed for reuse by both researchers and members of civil society. Researchers can use the data for reference and further analysis. Members of civil society can use the data for collaborative initiatives and participation.

Researchers can use the dataset to: i) investigate orality and the speech characteristics of Rennes French, influenced by regional and migration dynamics, ii) document the socio-history and socio-politics of Rennes’ civil society and trace its evolution within broader movements, such as the social and solidarity economy, and iii) rely on the oral historiographies to complement text archives, such as those curated by the Institut Français du Monde Associatif [4].

Participants and members of civil society can use the dataset to: i) gain insight into the third sector, ii) listen again to their own narratives and reflect on changes, whether past or forthcoming, and on what their stories mean both to themselves and to others, iii) benefit from the experiences of others and gain a source of further ideas, and iv) use the stories as a resource, embedding them in websites, papers and flyers or referring to them during meetings in order to initiate discussions.

The trade-off between ease of access for non-specialists and usefulness for researchers leads to some limitations on reuse. We have chosen proprietary formats for ease of layout and formatting. We have also chosen to present transcriptions as text documents rather than ELAN .eaf files, which means that, although they can be easily integrated into websites and emails, they are further removed from the underlying annotated data.

The trade-off between usefulness to research and the privacy of participants (see also Kelleher & Bays, 2022) has led us to exclude from the metadata any information related to age, gender, and other potentially discriminatory personal information. However, the metadata do include the function and address of the association and the role of the participant. Those particularly interested in a specific collection of stories, or in the study of specific associative fields, can therefore follow up on the information presented in the corpus by researching the Rennes associations themselves.
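The metadata decision described above amounts to removing certain fields from each record before publication. A minimal sketch follows, assuming that records are held as simple key-value mappings; the field names (`age`, `gender`, `association`, `role`) and the function `public_metadata` are hypothetical and do not reflect the project's actual metadata schema.

```python
# Hypothetical names for the fields withheld from the public metadata.
EXCLUDED_FIELDS = {"age", "gender"}


def public_metadata(record, excluded=EXCLUDED_FIELDS):
    """Return a copy of record without the excluded personal fields."""
    return {
        key: value
        for key, value in record.items()
        if key.lower() not in excluded
    }


# Invented example record, for illustration only.
example = {"association": "Example association", "role": "volunteer", "Age": "42"}
published = public_metadata(example)
```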

Other Huma-Num (2021) collections/corpora (see for instance Abouda, Baude & Michaud, 2010) are used to geolocate, structure and store corpora of oral data. ‘Ce qui nous concerne’ aims primarily for communication with, and re-use by, associative actors (Kelleher, 2024a) in addition to the scientific community (Kelleher, 2024b).

Data Accessibility Statement

The data associated with this article are available at: https://nakala.fr/10.34847/nkl.03b6h61y.

The dataset contains a readme, project documentation (ethics forms and DMP) and participant interviews filed by association name that include audio segments, transcriptions, narrative versions and associated images.

Notes

[1] https://www.audacityteam.org/ (last accessed: 23/09/2024).

[3] https://sharedocs.huma-num.fr (last accessed: 28/06/2024).

[4] https://institutfrancaisdumondeassociatif.org (last accessed: 12/09/2024).

Ethics and Consent

Ethics clearance for the project ‘Ce qui nous concerne’ was obtained from the Rennes 2 Research Ethics Committee under number 2022-012.

Acknowledgements

We would like to thank Stéphane Pele and Pierre-André Souville from the Maison de Quartier La Touche (Rennes) for their support in the ‘Ce qui nous concerne’ project.

Funding Information

This research has been funded by Rennes Métropole (22CO909), the region of Bretagne (SAD 2023_UR2_CQNC_Ce qui nous concerne), Zone Atelier Armorique (ZAAr) and the LIDILE department (University of Rennes 2).

Competing interests

The authors have no competing interests to declare.

Author Contributions

Conceptualisation: W.K.; Data curation: W.K., E.CH.H., B.B., D.R.G.; Formal Analysis: W.K., E.CH.H., B.B.; Writing – original draft: W.K., E.CH.H., B.B.; Writing – review & editing: W.K., E.CH.H.

DOI: https://doi.org/10.5334/johd.244 | Journal eISSN: 2059-481X
Language: English
Submitted on: Sep 16, 2024
Accepted on: Oct 11, 2024
Published on: Nov 11, 2024
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2024 Eduardo Chávez Herrera, William Kelleher, Bérénice Briand, Dolly Ramella Georget, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.