Abstract
WikiTextGraph is an open-source Python package that extracts and processes text from Wikipedia dumps and constructs internal link networks across multiple language editions. It combines efficient parsing, redirect resolution, and multilingual graph construction to handle Wikipedia’s scale, structure, and inherent noise. A modular architecture and a simple graphical user interface (GUI) make it suitable for both technical and non-technical users. Built for scalability and reproducibility, WikiTextGraph supports interdisciplinary research in network science, computational linguistics, and digital humanities. Its flexible design enables easy adaptation to tasks involving low-resource languages or cross-lingual studies.
