References
- 1. Baró, A., Riba, P., and Fornés, A. (2018). A starting point for handwritten music recognition. In 1st International Workshop on Reading Music Systems. DOI: 10.1016/j.patrec.2019.02.029
- 2. Bellini, P., Bruno, I., and Nesi, P. (2001). Optical music sheet segmentation. In Proceedings First International Conference on WEB Delivering of Music. WEDELMUSIC 2001, pages 183–190. DOI: 10.1109/WDM.2001.990175
- 3. Calvo-Zaragoza, J., Hajič Jr, J., and Pacha, A. (2020). Understanding optical music recognition. ACM Computing Surveys (CSUR), 53(4):1–35. DOI: 10.1145/3397499
- 4. Calvo-Zaragoza, J. and Rizo, D. (2018a). Camera-PrIMuS: Neural end-to-end optical music recognition on realistic monophonic scores. In 19th International Society for Music Information Retrieval Conference (ISMIR), pages 248–255.
- 5. Calvo-Zaragoza, J. and Rizo, D. (2018b). End-to-end neural optical music recognition of monophonic scores. Applied Sciences, 8(4):606. DOI: 10.3390/app8040606
- 6. Castellanos, F. J., Gallego, A.-J., and Calvo-Zaragoza, J. (2021). Unsupervised domain adaptation for document analysis of music score images. In 22nd International Society for Music Information Retrieval Conference (ISMIR), pages 81–87.
- 7. Chen, Y., Li, W., Sakaridis, C., Dai, D., and Van Gool, L. (2018). Domain adaptive faster R-CNN for object detection in the wild. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3339–3348. DOI: 10.1109/CVPR.2018.00352
- 8. Chiu, C.-C., Sainath, T. N., Wu, Y., Prabhavalkar, R., Nguyen, P., Chen, Z., Kannan, A., Weiss, R. J., Rao, K., Gonina, K., Jaitly, N., Li, B., Chorowski, J., and Bacchiani, M. (2018). State-of-the-art speech recognition with sequence-to-sequence models. In 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 4774–4778. DOI: 10.1109/ICASSP.2018.8462105
- 9. Chowdhury, A. and Vig, L. (2018). An efficient end-to-end neural model for handwritten text recognition. In British Machine Vision Conference 2018 (BMVC), page 202.
- 10. Ciregan, D., Meier, U., and Schmidhuber, J. (2012). Multi-column deep neural networks for image classification. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3642–3649. IEEE. DOI: 10.1109/CVPR.2012.6248110
- 11. Cubuk, E. D., Zoph, B., Mane, D., Vasudevan, V., and Le, Q. V. (2019). AutoAugment: Learning augmentation strategies from data. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 113–123. DOI: 10.1109/CVPR.2019.00020
- 12. Dalitz, C., Droettboom, M., Pranzas, B., and Fujinaga, I. (2008). A comparative study of staff removal algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(5):753–766. DOI: 10.1109/TPAMI.2007.70749
- 13. Dietterich, T. G. (2000). Ensemble methods in machine learning. In International Workshop on Multiple Classifier Systems, pages 1–15. DOI: 10.1007/3-540-45014-9_1
- 14. Durasov, N., Bagautdinov, T., Baque, P., and Fua, P. (2021). Masksembles for uncertainty estimation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 13539–13548. DOI: 10.1109/CVPR46437.2021.01333
- 15. Dürr, O., Sick, B., and Murina, E. (2020). Probabilistic Deep Learning: With Python, Keras and TensorFlow Probability. Manning Publications.
- 16. Elezi, I., Tuggener, L., Pelillo, M., and Stadelmann, T. (2018). DeepScores and deep watershed detection: Current state and open issues. In 1st International Workshop on Reading Music Systems.
- 17. Fujinaga, I. (2004). Staff detection and removal. In George, S. E., editor, Visual Perception of Music Notation: On-Line and Off-Line Recognition, pages 1–39. IGI Global. DOI: 10.4018/978-1-59140-298-5.ch001
- 18. Gallego, A.-J. and Calvo-Zaragoza, J. (2017). Staff-line removal with selectional auto-encoders. Expert Systems with Applications, 89:138–148. DOI: 10.1016/j.eswa.2017.07.002
- 19. Ganin, Y. and Lempitsky, V. (2015). Unsupervised domain adaptation by backpropagation. In International Conference on Machine Learning (ICML), pages 1180–1189.
- 20. Ganin, Y., Ustinova, E., Ajakan, H., Germain, P., Larochelle, H., Laviolette, F., Marchand, M., and Lempitsky, V. (2016). Domain-adversarial training of neural networks. Journal of Machine Learning Research, 17(1):2096–2030.
- 21. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2020). Generative adversarial networks. Communications of the ACM, 63(11):139–144. DOI: 10.1145/3422622
- 22. Gustafsson, F. K., Danelljan, M., and Schon, T. B. (2020). Evaluating scalable Bayesian deep learning methods for robust computer vision. In IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 318–319. DOI: 10.1109/CVPRW50498.2020.00167
- 23. Hajič Jr, J., Dorfer, M., Widmer, G., and Pecina, P. (2018). Towards full-pipeline handwritten OMR with musical symbol detection by U-Nets. In 19th International Society for Music Information Retrieval Conference (ISMIR), pages 225–232.
- 24. Hajič Jr, J. and Pecina, P. (2017). The MUSCIMA++ dataset for handwritten optical music recognition. In 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), pages 39–46. DOI: 10.1109/ICDAR.2017.16
- 25. Han, J., Ding, J., Li, J., and Xia, G.-S. (2021). Align deep features for oriented object detection. IEEE Transactions on Geoscience and Remote Sensing, 60:1–11. DOI: 10.1109/TGRS.2021.3062048
- 26. He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778. DOI: 10.1109/CVPR.2016.90
- 27. Huang, G., Li, Y., Pleiss, G., Liu, Z., Hopcroft, J. E., and Weinberger, K. Q. (2017). Snapshot ensembles: Train 1, get M for free. In 5th International Conference on Learning Representations (ICLR).
- 28. Ioffe, S. and Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning (ICML), pages 448–456.
- 29. Kingma, D. P. and Ba, J. (2021). Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations (ICLR).
- 30. Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2017). ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6):84–90. DOI: 10.1145/3065386
- 31. Lehner, A., Gasperini, S., Marcos-Ramiro, A., Schmidt, M., Mahani, M.-A. N., Navab, N., Busam, B., and Tombari, F. (2022). 3D-VField: Adversarial augmentation of point clouds for domain generalization in 3D object detection. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 17295–17304. DOI: 10.1109/CVPR52688.2022.01678
- 32. Li, Y.-J., Dai, X., Ma, C.-Y., Liu, Y.-C., Chen, K., Wu, B., He, Z., Kitani, K., and Vajda, P. (2022). Cross-domain adaptive teacher for object detection. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 7581–7590. DOI: 10.1109/CVPR52688.2022.00743
- 33. Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017). Feature pyramid networks for object detection. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2117–2125. DOI: 10.1109/CVPR.2017.106
- 34. Mateiu, T. N., Gallego, A.-J., and Calvo-Zaragoza, J. (2019). Domain adaptation for handwritten symbol recognition: A case of study in old music manuscripts. In Iberian Conference on Pattern Recognition and Image Analysis, pages 135–146. DOI: 10.1007/978-3-030-31321-0_12
- 35. Nguyen, A., Yosinski, J., and Clune, J. (2015). Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 427–436. DOI: 10.1109/CVPR.2015.7298640
- 36. Pacha, A. and Calvo-Zaragoza, J. (2018). Optical music recognition in mensural notation with region-based convolutional neural networks. In 19th International Society for Music Information Retrieval Conference (ISMIR), pages 240–247.
- 37. Pacha, A., Choi, K.-Y., Coüasnon, B., Ricquebourg, Y., Zanibbi, R., and Eidenberger, H. (2018a). Handwritten music object detection: Open issues and baseline results. In 13th IAPR International Workshop on Document Analysis Systems (DAS), pages 163–168. IEEE. DOI: 10.1109/DAS.2018.51
- 38. Pacha, A., Hajič Jr, J., and Calvo-Zaragoza, J. (2018b). A baseline for general music object detection with deep learning. Applied Sciences, 8(9):1488. DOI: 10.3390/app8091488
- 39. Pugin, L. (2006). Optical music recognition of early typographic prints using hidden Markov models. In Proceedings of the 7th International Conference on Music Information Retrieval (ISMIR), pages 53–56.
- 40. Rebelo, A., Fujinaga, I., Paszkiewicz, F., Marcal, A. R., Guedes, C., and Cardoso, J. S. (2012). Optical music recognition: State-of-the-art and open issues. International Journal of Multimedia Information Retrieval, 1:173–190. DOI: 10.1007/s13735-012-0004-6
- 41. Sato, I., Nishimura, H., and Yokoi, K. (2015). APAC: Augmented pattern classification with neural networks. arXiv preprint arXiv:1505.03229.
- 42. Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural Networks, 61:85–117. DOI: 10.1016/j.neunet.2014.09.003
- 43. Simmler, N., Sager, P., Andermatt, P., Chavarriaga, R., Schilling, F.-P., Rosenthal, M., and Stadelmann, T. (2021). A survey of un-, weakly-, and semi-supervised learning methods for noisy, missing and partial labels in industrial vision applications. In 2021 8th Swiss Conference on Data Science (SDS), pages 26–31. DOI: 10.1109/SDS51136.2021.00012
- 44. Smith, L. N. (2017). Cyclical learning rates for training neural networks. In 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 464–472. DOI: 10.1109/WACV.2017.58
- 45. Solovyev, R., Wang, W., and Gabruseva, T. (2021). Weighted boxes fusion: Ensembling boxes from different object detection models. Image and Vision Computing, 107:104117. DOI: 10.1016/j.imavis.2021.104117
- 46. Stadelmann, T., Amirian, M., Arabaci, I., Arnold, M., Duivesteijn, G. F., Elezi, I., Geiger, M., Lorwald, S., Meier, B. B., Rombach, K., et al. (2018). Deep learning in the wild. In IAPR Workshop on Artificial Neural Networks in Pattern Recognition, pages 17–38. DOI: 10.1007/978-3-319-99978-4_2
- 47. Toyama, F., Shoji, K., and Miyamichi, J. (2006). Symbol recognition of printed piano scores with touching symbols. In 18th International Conference on Pattern Recognition (ICPR), volume 2, pages 480–483. DOI: 10.1109/ICPR.2006.1099
- 48. Tuggener, L., Elezi, I., Schmidhuber, J., Pelillo, M., and Stadelmann, T. (2018a). DeepScores: A dataset for segmentation, detection and classification of tiny objects. In 24th International Conference on Pattern Recognition (ICPR), pages 3704–3709. DOI: 10.1109/ICPR.2018.8545307
- 49. Tuggener, L., Elezi, I., Schmidhuber, J., and Stadelmann, T. (2018b). Deep watershed detector for music object recognition. In 19th International Society for Music Information Retrieval Conference (ISMIR), pages 271–278.
- 50. Tuggener, L., Satyawan, Y. P., Pacha, A., Schmidhuber, J., and Stadelmann, T. (2021). The DeepScoresV2 dataset and benchmark for music object detection. In 2020 25th International Conference on Pattern Recognition (ICPR), pages 9188–9195. DOI: 10.1109/ICPR48806.2021.9412290
- 51. Tzeng, E., Hoffman, J., Saenko, K., and Darrell, T. (2017). Adversarial discriminative domain adaptation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 7167–7176. DOI: 10.1109/CVPR.2017.316
- 52. van der Wel, E. and Ullrich, K. (2017). Optical music recognition with convolutional sequence-to-sequence models. In 18th International Society for Music Information Retrieval Conference (ISMIR), pages 731–737.
- 53. von Oswald, J., Kobayashi, S., Sacramento, J., Meulemans, A., Henning, C., and Grewe, B. F. (2021). Neural networks with late-phase weights. In 9th International Conference on Learning Representations (ICLR).
- 54. Wen, Y., Tran, D., and Ba, J. (2020). BatchEnsemble: An alternative approach to efficient ensemble and lifelong learning. In 8th International Conference on Learning Representations (ICLR).
- 55. Wenzel, F., Snoek, J., Tran, D., and Jenatton, R. (2020). Hyperparameter ensembles for robustness and uncertainty quantification. In 34th International Conference on Neural Information Processing Systems (NeurIPS), pages 6514–6527.
- 56. Xia, Y., Zhang, J., Jiang, T., Gong, Z., Yao, W., and Feng, L. (2021). HatchEnsemble: An efficient and practical uncertainty quantification method for deep neural networks. Complex & Intelligent Systems, 7:2855–2869. DOI: 10.1007/s40747-021-00463-1
- 57. Zhu, X., Liu, Y., Qin, Z., and Li, J. (2017). Data augmentation in emotion classification using generative adversarial networks. arXiv preprint arXiv:1711.00648. DOI: 10.1007/978-3-319-93040-4_28
- 58. Zhu, X., Pang, J., Yang, C., Shi, J., and Lin, D. (2019). Adapting object detectors via selective cross-domain alignment. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 687–696. DOI: 10.1109/CVPR.2019.00078
