Contextualized Vision Transformers (CVT): Adaptive Spectral Embedding and Feature Gating for Precise Text-Graphics Classification
Authors
Mridul Ghosh
Department of Computer Science, Shyampur Siddheswari Mahavidyalaya, Howrah, India
Risk and location analysis, Fraunhofer IIS, Nuremberg, Germany
Konrad Dürrbeck
konrad.duerrbeck@iis.fraunhofer.de
Risk and location analysis, Fraunhofer IIS, Nuremberg, Germany
Roland Fischer
roland.fischer@iis.fraunhofer.de
Risk and location analysis, Fraunhofer IIS, Nuremberg, Germany
Mária Ždímalová
Department of Mathematics and Descriptive Geometry, Slovak University of Technology in Bratislava, Bratislava, Slovakia
Tonmoy Mete
Department of Computer Science, Asutosh College, Kolkata, India
Language: English
Submitted on: Oct 8, 2025
Accepted on: Nov 27, 2025
Published on: Mar 17, 2026
Published by: Slovak Academy of Sciences, Mathematical Institute
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year
© 2026 Mridul Ghosh, Konrad Dürrbeck, Roland Fischer, Mária Ždímalová, Tonmoy Mete, published by Slovak Academy of Sciences, Mathematical Institute
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.