Fine-Tuning South Tyrolean Dialect-to-Standard German ASR with AlpiLinK

Open Access
|Jun 2026

References

  1. Adda-Decker, M., Lamel, L., Adda, G., & Lavergne, T. (2014). A First LVCSR System for Luxembourgish, a Low-Resourced European Language. In Z. Vetulani & J. Mariani (Eds.), Human Language Technology Challenges for Computer Science and Linguistics (pp. 479–490). Springer International Publishing. 10.1007/978-3-319-08958-4_39
  2. Blaschke, V., Winkler, M., Förster, C., Wenger-Glemser, G., & Plank, B. (2025). A Multi-Dialectal Dataset for German Dialect ASR and Dialect-to-Standard Speech Translation. In Proceedings of Interspeech 2025 (pp. 913–917). ISCA. 10.21437/Interspeech.2025-318
  3. Colletti, N., & Lombardo, S. (2025). Südtiroler Sprachbarometer: Sprachgebrauch und Sprachidentität in Südtirol. (Tech. Rep.), Autonome Provinz Bozen-Südtirol, Landesinstitut für Statistik – ASTAT. Retrieved 2026-05-06, from https://assets-eu-01.kc-usercontent.com/b5376750-8076-01cf-17d2-d343e29778a7/936599a9-e59d-4c7b-bb13-f84efd0718c8/Sprachbarometer
  4. Dehak, N., Dumouchel, P., & Kenny, P. (2007). Modeling Prosodic Features With Joint Factor Analysis for Speaker Verification. In IEEE Transactions on Audio, Speech and Language Processing (pp. 2095–2103). 10.1109/TASL.2007.902758
  5. Dogan-Schönberger, P., Mäder, J., & Hofmann, T. (2021). SwissDial: Parallel Multidialectal Corpus of Spoken Swiss German. (Working Paper 2103.11401). Cornell University.
  6. Ducceschi, L., & Franzini, G. (2025). Speech transcription from South Tyrolean Dialect to Standard German with Whisper. In Proceedings of Interspeech 2025 (pp. 1–5). ISCA. 10.21437/Interspeech.2025-1976
  7. Fant, G. (1960). Acoustic Theory of Speech Production. Mouton.
  8. FFmpeg Developers (2016). ffmpeg tool. https://ffmpeg.org/
  9. Franzini, G., & Ducceschi, L. (to appear). South Tyrolean Dialect-to-Standard Speech Translation: A Resource. In Proceedings of the Workshop on Dialects in NLP — A Resource Perspective (DialRes-LREC26), co-located with the Language Resources and Evaluation Conference (LREC). Palma de Mallorca, Spain.
  10. Gilles, P., Hillah, L., & Hosseini-Kivanani, N. (2023). ASRLUX: Automatic Speech Recognition for the low-resource language Luxembourgish. In R. Skarnitzl & J. Volín (Eds.), Proceedings of the 20th International Congress of Phonetic Sciences (pp. 3091–3095). https://hdl.handle.net/10993/55819
  11. Hollenstein, N., & Aepli, N. (2014). Compilation of a Swiss German Dialect Corpus and its Application to PoS Tagging. In M. Zampieri, L. Tan, N. Ljubešić, & J. Tiedemann (Eds.), Proceedings of the First Workshop on Applying NLP Tools to Similar Languages, Varieties and Dialects (pp. 85–94). Association for Computational Linguistics and Dublin City University. 10.3115/v1/W14-5310
  12. Hosseini-Kivanani, N., Schommer, C., & Gilles, P. (2025). Voices of Luxembourg: Tackling Dialect Diversity in a Low-Resource Setting. In Š. A. Holdt, N. Ilinykh, B. Scalvini, M. Bruton, I. N. Debess, & C. M. Tudor (Eds.), Proceedings of the Third Workshop on Resources and Representations for Under-Resourced Languages and Domains (RESOURCEFUL-2025) (pp. 143–152). University of Tartu Library, Estonia. https://aclanthology.org/2025.resourceful-1.29/
  13. Kent, R., & Read, C. (2002). The Acoustic Analysis of Speech. Singular Publishing Group.
  14. Kerle, L. K., Pucher, M., & Schuppler, B. (2023). Speaker interpolation based data augmentation for automatic speech recognition. In Proceedings of the 20th International Congress of Phonetic Sciences (pp. 3126–3130). International Phonetic Association. https://phaidra.kug.ac.at/o:127503
  15. Kruijt, A., & Rabanus, S. (2025). From VinKo to AlpiLinK: web-based long-term storage and accessibility of information. Korpus im Text, 17. https://www.kit.gwi.uni-muenchen.de/?p=106395&v=1
  16. Linke, J., Winkler, J., & Schuppler, B. (2025). Context is all you need? Low-resource conversational ASR profits from context, coming from the same or from the other speaker. In Proceedings of the Annual Conference of the International Speech Communication Association, INTERSPEECH (pp. 3199–3203). ISCA, International Speech Communication Association. 10.21437/Interspeech.2025-1824
  17. Plüss, M., Deriu, J., Schraner, Y., Paonessa, C., Hartmann, J., Schmidt, L., …, Cieliebak, M. (2023). STT4SG-350: A Speech Corpus for All Swiss German Dialect Regions. In A. Rogers, J. Boyd-Graber, & N. Okazaki (Eds.), Proceedings of the 61st Annual Meeting of the Association for Computational Linguistics (Volume 2: Short Papers) (pp. 1763–1772). Association for Computational Linguistics. 10.18653/v1/2023.acl-short.150
  18. Plüss, M., Hürlimann, M., Cuny, M., Stöckli, A., Kapotis, N., Hartmann, J., …, Vogel, M. (2022). SDS-200: A Swiss German Speech to Standard German Text Corpus. In N. Calzolari, et al. (Eds.), Proceedings of the Thirteenth Language Resources and Evaluation Conference (pp. 3250–3256). European Language Resources Association. Retrieved 2026-05-06, from https://aclanthology.org/2022.lrec-1.347/
  19. Radford, A., Kim, J. W., Xu, T., Brockman, G., McLeavey, C., & Sutskever, I. (2023). Robust speech recognition via large-scale weak supervision. In International Conference on Machine Learning (pp. 28492–28518). PMLR. https://arxiv.org/abs/2212.04356
  20. Samardžić, T., Scherrer, Y., & Glaser, E. (2015). Normalising orthographic and dialectal variants for the automatic processing of Swiss German. In Proceedings of the 4th Biennial Workshop on Less-resourced Languages (pp. 294–298). ELRA. Retrieved 2026-05-06, from https://ltc.amu.edu.pl/a2015/book/papers/LRL-2.pdf
  21. Sicard, C., Gillioz, V., & Pyszkowski, K. (2023). Spaiche: Extending State-of-the-Art ASR Models to Swiss German Dialects. In H. Ghorbel, M. Sokhn, M. Cieliebak, M. Hürlimann, E. de Salis, & J. Guerne (Eds.), Proceedings of the 8th Edition of the Swiss Text Analytics Conference (pp. 76–83). Association for Computational Linguistics. Retrieved 2026-05-06, from https://aclanthology.org/2023.swisstext-1.8/
  22. Solberg, E., Ortiz, P., Parsons, P., Svendsen, T., & Salvi, G. (2023). Improving Generalization of Norwegian ASR with Limited Linguistic Resources. In Proceedings of the 24th Nordic Conference on Computational Linguistics (NoDaLiDa) (pp. 508–517). University of Tartu Library. Retrieved 2026-05-06, from https://aclanthology.org/2023.nodalida-1.51/
  23. Stöckle, P., & Vergeiner, P. (2025). Geographical patterns in the Bavarian dialects of Austria and South Tyrol. A real-time comparison using dialectometric methods. Zeitschrift für Sprachvariation und Soziolinguistik - Journal of Language Variation and Sociolinguistics (pp. 20–41). 10.5282/jlvs/7
  24. Titze, I. (1994). Principles of Voice Production. Prentice Hall.
  25. Volgger, J., Röggla, M., Ganthaler, D., Iacopino, T., Mühlberger, M., & Mariz, C. (2024). Autonomy Dashboard South Tyrol. Eurac Research. Retrieved from https://www.eurac.edu/doi/10-57749-m70n-ms30
  26. Vásquez-Correa, J. C., Orozco-Arroyave, J. R., Bocklet, T., & Nöth, E. (2018). Towards an automatic evaluation of the dysarthria level of patients with Parkinson’s disease. Journal of Communication Disorders, 21–26. 10.1016/j.jcomdis.2018.08.002
  27. Wiesinger, P. (1983). Die Einteilung der deutschen Dialekte. In W. Besch, U. Knoop, W. Putschke, & H. Wiegand (Eds.), Dialektologie (pp. 807–900). de Gruyter. 10.1515/9783110203332.807
  28. Zhang, Y., Han, W., Qin, J., Wang, Y., Bapna, A., Chen, Z., …, others. (2023). Google USM: Scaling automatic speech recognition beyond 100 languages. 10.48550/arXiv.2303.01037
DOI: https://doi.org/10.5334/johd.533 | Journal eISSN: 2059-481X
Language: English
Page range: 74 - 74
Submitted on: Mar 1, 2026
Accepted on: May 5, 2026
Published on: Jun 8, 2026
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2026 Greta H. Franzini, Luca Ducceschi, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.