Abstract
This work presents a novel computational system for the automated digitization of image-based data from seals of the ancient Indus Valley Civilization (IVC). The system is designed to automatically extract and archive key information from images of seals, including the script and motifs. It operates as a pipeline of three deep learning models integrated with a custom-designed database. Two models form the Ancient Script Recognition network (ASR-net), which digitizes sequences of graphemes from Indus seals, much as Optical Character Recognition does for modern languages. The third model, the Motif Identification network (MI-net), identifies recurring motifs: distinctive symbols or iconographic elements with specific functional significance in the IVC. The database stores the extracted information in a structured format, linking it to the respective seal images. This end-to-end pipeline has been fully implemented, from image input to database archival. The overarching aim of this work is to support the application of automated statistical methods in the ongoing efforts to decipher the Indus script.
