References

  1. Baró, A., Riba, P., and Fornés, A. (2018). A starting point for handwritten music recognition. In 1st International Workshop on Reading Music Systems. DOI: 10.1016/j.patrec.2019.02.029
  2. Bellini, P., Bruno, I., and Nesi, P. (2001). Optical music sheet segmentation. In Proceedings First International Conference on WEB Delivering of Music. WEDELMUSIC 2001, pages 183–190. DOI: 10.1109/WDM.2001.990175
  3. Calvo-Zaragoza, J., Hajič Jr, J., and Pacha, A. (2020). Understanding optical music recognition. ACM Computing Surveys (CSUR), 53(4):1–35. DOI: 10.1145/3397499
  4. Calvo-Zaragoza, J. and Rizo, D. (2018a). Camera-PrIMuS: Neural end-to-end optical music recognition on realistic monophonic scores. In 19th International Society for Music Information Retrieval Conference (ISMIR), pages 248–255. DOI: 10.3390/app8040606
  5. Calvo-Zaragoza, J. and Rizo, D. (2018b). End-to-end neural optical music recognition of monophonic scores. Applied Sciences, 8(4):606. DOI: 10.3390/app8040606
  6. Castellanos, F. J., Gallego, A.-J., and Calvo-Zaragoza, J. (2021). Unsupervised domain adaptation for document analysis of music score images. In 22nd International Society for Music Information Retrieval Conference (ISMIR), pages 81–87.
  7. Chen, Y., Li, W., Sakaridis, C., Dai, D., and Van Gool, L. (2018). Domain adaptive faster R-CNN for object detection in the wild. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3339–3348. DOI: 10.1109/CVPR.2018.00352
  8. Chiu, C.-C., Sainath, T. N., Wu, Y., Prabhavalkar, R., Nguyen, P., Chen, Z., Kannan, A., Weiss, R. J., Rao, K., Gonina, K., Jaitly, N., Li, B., Chorowski, J., and Bacchiani, M. (2018). State-of-the-art speech recognition with sequence-to-sequence models. 2018 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 4774–4778. DOI: 10.1109/ICASSP.2018.8462105
  9. Chowdhury, A. and Vig, L. (2018). An efficient end-to-end neural model for handwritten text recognition. In British Machine Vision Conference 2018 (BMVC), page 202.
  10. Ciregan, D., Meier, U., and Schmidhuber, J. (2012). Multi-column deep neural networks for image classification. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 3642–3649. IEEE. DOI: 10.1109/CVPR.2012.6248110
  11. Cubuk, E. D., Zoph, B., Mane, D., Vasudevan, V., and Le, Q. V. (2019). AutoAugment: Learning augmentation strategies from data. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 113–123. DOI: 10.1109/CVPR.2019.00020
  12. Dalitz, C., Droettboom, M., Pranzas, B., and Fujinaga, I. (2008). A comparative study of staff removal algorithms. IEEE Transactions on Pattern Analysis and Machine Intelligence, 30(5):753–766. DOI: 10.1109/TPAMI.2007.70749
  13. Dietterich, T. G. (2000). Ensemble methods in machine learning. In International Workshop on Multiple Classifier Systems, pages 1–15. DOI: 10.1007/3-540-45014-9_1
  14. Durasov, N., Bagautdinov, T., Baque, P., and Fua, P. (2021). Masksembles for uncertainty estimation. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 13539–13548. DOI: 10.1109/CVPR46437.2021.01333
  15. Dürr, O., Sick, B., and Murina, E. (2020). Probabilistic Deep Learning: With Python, Keras and TensorFlow Probability. Manning Publications.
  16. Elezi, I., Tuggener, L., Pelillo, M., and Stadelmann, T. (2018). DeepScores and deep watershed detection: Current state and open issues. In 1st International Workshop on Reading Music Systems.
  17. Fujinaga, I. (2004). Staff detection and removal. In George, S. E., editor, Visual Perception of Music Notation: On-Line and Off Line Recognition, pages 1–39. IGI Global. DOI: 10.4018/978-1-59140-298-5.ch001
  18. Gallego, A.-J. and Calvo-Zaragoza, J. (2017). Staff-line removal with selectional auto-encoders. Expert Systems with Applications, 89:138–148. DOI: 10.1016/j.eswa.2017.07.002
  19. Ganin, Y. and Lempitsky, V. (2015). Unsupervised domain adaptation by backpropagation. In International Conference on Machine Learning (ICML), pages 1180–1189.
  20. Ganin, Y., Ustinova, E., Ajakan, H., Germain, P., Larochelle, H., Laviolette, F., Marchand, M., and Lempitsky, V. (2016). Domain-adversarial training of neural networks. Journal of Machine Learning Research, 17(1):2096–2030.
  21. Goodfellow, I., Pouget-Abadie, J., Mirza, M., Xu, B., Warde-Farley, D., Ozair, S., Courville, A., and Bengio, Y. (2020). Generative adversarial networks. Communications of the ACM, 63(11):139–144. DOI: 10.1145/3422622
  22. Gustafsson, F. K., Danelljan, M., and Schon, T. B. (2020). Evaluating scalable Bayesian deep learning methods for robust computer vision. In IEEE/CVF Conference on Computer Vision and Pattern Recognition Workshops, pages 318–319. DOI: 10.1109/CVPRW50498.2020.00167
  23. Hajič Jr, J., Dorfer, M., Widmer, G., and Pecina, P. (2018). Towards full-pipeline handwritten OMR with musical symbol detection by U-Nets. In 19th International Society for Music Information Retrieval Conference (ISMIR), pages 225–232.
  24. Hajič Jr, J. and Pecina, P. (2017). The MUSCIMA++ dataset for handwritten optical music recognition. In 14th IAPR International Conference on Document Analysis and Recognition (ICDAR), pages 39–46. DOI: 10.1109/ICDAR.2017.16
  25. Han, J., Ding, J., Li, J., and Xia, G.-S. (2021). Align deep features for oriented object detection. IEEE Transactions on Geoscience and Remote Sensing, 60:1–11. DOI: 10.1109/TGRS.2021.3062048
  26. He, K., Zhang, X., Ren, S., and Sun, J. (2016). Deep residual learning for image recognition. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 770–778. DOI: 10.1109/CVPR.2016.90
  27. Huang, G., Li, Y., Pleiss, G., Liu, Z., Hopcroft, J. E., and Weinberger, K. Q. (2017). Snapshot ensembles: Train 1, get M for free. In 5th International Conference on Learning Representations (ICLR).
  28. Ioffe, S. and Szegedy, C. (2015). Batch normalization: Accelerating deep network training by reducing internal covariate shift. In International Conference on Machine Learning (ICML), pages 448–456.
  29. Kingma, D. P. and Ba, J. (2015). Adam: A method for stochastic optimization. In 3rd International Conference on Learning Representations (ICLR).
  30. Krizhevsky, A., Sutskever, I., and Hinton, G. E. (2017). ImageNet classification with deep convolutional neural networks. Communications of the ACM, 60(6):84–90. DOI: 10.1145/3065386
  31. Lehner, A., Gasperini, S., Marcos-Ramiro, A., Schmidt, M., Mahani, M.-A. N., Navab, N., Busam, B., and Tombari, F. (2022). 3D-VField: Adversarial augmentation of point clouds for domain generalization in 3D object detection. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 17295–17304. DOI: 10.1109/CVPR52688.2022.01678
  32. Li, Y.-J., Dai, X., Ma, C.-Y., Liu, Y.-C., Chen, K., Wu, B., He, Z., Kitani, K., and Vajda, P. (2022). Cross-domain adaptive teacher for object detection. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 7581–7590. DOI: 10.1109/CVPR52688.2022.00743
  33. Lin, T.-Y., Dollár, P., Girshick, R., He, K., Hariharan, B., and Belongie, S. (2017). Feature pyramid networks for object detection. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 2117–2125. DOI: 10.1109/CVPR.2017.106
  34. Mateiu, T. N., Gallego, A.-J., and Calvo-Zaragoza, J. (2019). Domain adaptation for handwritten symbol recognition: A case of study in old music manuscripts. In Iberian Conference on Pattern Recognition and Image Analysis, pages 135–146. DOI: 10.1007/978-3-030-31321-0_12
  35. Nguyen, A., Yosinski, J., and Clune, J. (2015). Deep neural networks are easily fooled: High confidence predictions for unrecognizable images. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 427–436. DOI: 10.1109/CVPR.2015.7298640
  36. Pacha, A. and Calvo-Zaragoza, J. (2018). Optical music recognition in mensural notation with region-based convolutional neural networks. In 19th International Society for Music Information Retrieval Conference (ISMIR), pages 240–247.
  37. Pacha, A., Choi, K.-Y., Coüasnon, B., Ricquebourg, Y., Zanibbi, R., and Eidenberger, H. (2018a). Handwritten music object detection: Open issues and baseline results. In 13th IAPR International Workshop on Document Analysis Systems (DAS), pages 163–168. IEEE. DOI: 10.1109/DAS.2018.51
  38. Pacha, A., Hajič Jr, J., and Calvo-Zaragoza, J. (2018b). A baseline for general music object detection with deep learning. Applied Sciences, 8(9):1488. DOI: 10.3390/app8091488
  39. Pugin, L. (2006). Optical music recognition of early typographic prints using hidden Markov models. In Proceedings of the 7th International Conference on Music Information Retrieval (ISMIR), pages 53–56.
  40. Rebelo, A., Fujinaga, I., Paszkiewicz, F., Marcal, A. R., Guedes, C., and Cardoso, J. S. (2012). Optical music recognition: state-of-the-art and open issues. International Journal of Multimedia Information Retrieval, 1:173–190. DOI: 10.1007/s13735-012-0004-6
  41. Sato, I., Nishimura, H., and Yokoi, K. (2015). APAC: Augmented pattern classification with neural networks. arXiv preprint arXiv:1505.03229.
  42. Schmidhuber, J. (2015). Deep learning in neural networks: An overview. Neural Networks, 61:85–117. DOI: 10.1016/j.neunet.2014.09.003
  43. Simmler, N., Sager, P., Andermatt, P., Chavarriaga, R., Schilling, F.-P., Rosenthal, M., and Stadelmann, T. (2021). A survey of un-, weakly-, and semi-supervised learning methods for noisy, missing and partial labels in industrial vision applications. In 2021 8th Swiss Conference on Data Science (SDS), pages 26–31. DOI: 10.1109/SDS51136.2021.00012
  44. Smith, L. N. (2017). Cyclical learning rates for training neural networks. In 2017 IEEE Winter Conference on Applications of Computer Vision (WACV), pages 464–472. DOI: 10.1109/WACV.2017.58
  45. Solovyev, R., Wang, W., and Gabruseva, T. (2021). Weighted boxes fusion: Ensembling boxes from different object detection models. Image and Vision Computing, 107:104117. DOI: 10.1016/j.imavis.2021.104117
  46. Stadelmann, T., Amirian, M., Arabaci, I., Arnold, M., Duivesteijn, G. F., Elezi, I., Geiger, M., Lorwald, S., Meier, B. B., Rombach, K., et al. (2018). Deep learning in the wild. In IAPR Workshop on Artificial Neural Networks in Pattern Recognition, pages 17–38. DOI: 10.1007/978-3-319-99978-4_2
  47. Toyama, F., Shoji, K., and Miyamichi, J. (2006). Symbol recognition of printed piano scores with touching symbols. In 18th International Conference on Pattern Recognition (ICPR), volume 2, pages 480–483. DOI: 10.1109/ICPR.2006.1099
  48. Tuggener, L., Elezi, I., Schmidhuber, J., Pelillo, M., and Stadelmann, T. (2018a). DeepScores: A dataset for segmentation, detection and classification of tiny objects. In 24th International Conference on Pattern Recognition (ICPR), pages 3704–3709. DOI: 10.1109/ICPR.2018.8545307
  49. Tuggener, L., Elezi, I., Schmidhuber, J., and Stadelmann, T. (2018b). Deep watershed detector for music object recognition. In 19th International Society for Music Information Retrieval Conference (ISMIR), pages 271–278.
  50. Tuggener, L., Satyawan, Y. P., Pacha, A., Schmidhuber, J., and Stadelmann, T. (2021). The DeepScoresV2 dataset and benchmark for music object detection. In 2020 25th International Conference on Pattern Recognition (ICPR), pages 9188–9195. DOI: 10.1109/ICPR48806.2021.9412290
  51. Tzeng, E., Hoffman, J., Saenko, K., and Darrell, T. (2017). Adversarial discriminative domain adaptation. In IEEE Conference on Computer Vision and Pattern Recognition (CVPR), pages 7167–7176. DOI: 10.1109/CVPR.2017.316
  52. van der Wel, E. and Ullrich, K. (2017). Optical music recognition with convolutional sequence-to-sequence models. In 18th International Society for Music Information Retrieval Conference (ISMIR), pages 731–737.
  53. von Oswald, J., Kobayashi, S., Sacramento, J., Meulemans, A., Henning, C., and Grewe, B. F. (2021). Neural networks with late-phase weights. In 9th International Conference on Learning Representations (ICLR).
  54. Wen, Y., Tran, D., and Ba, J. (2020). BatchEnsemble: An alternative approach to efficient ensemble and lifelong learning. In 8th International Conference on Learning Representations (ICLR).
  55. Wenzel, F., Snoek, J., Tran, D., and Jenatton, R. (2020). Hyperparameter ensembles for robustness and uncertainty quantification. In 34th International Conference on Neural Information Processing Systems (NeurIPS), pages 6514–6527.
  56. Xia, Y., Zhang, J., Jiang, T., Gong, Z., Yao, W., and Feng, L. (2021). HatchEnsemble: An efficient and practical uncertainty quantification method for deep neural networks. Complex & Intelligent Systems, 7:2855–2869. DOI: 10.1007/s40747-021-00463-1
  57. Zhu, X., Liu, Y., Qin, Z., and Li, J. (2017). Data augmentation in emotion classification using generative adversarial networks. arXiv preprint arXiv:1711.00648. DOI: 10.1007/978-3-319-93040-4_28
  58. Zhu, X., Pang, J., Yang, C., Shi, J., and Lin, D. (2019). Adapting object detectors via selective cross-domain alignment. In IEEE/CVF Conference on Computer Vision and Pattern Recognition (CVPR), pages 687–696. DOI: 10.1109/CVPR.2019.00078
DOI: https://doi.org/10.5334/tismir.157 | Journal eISSN: 2514-3298
Language: English
Submitted on: Dec 6, 2022
Accepted on: Jul 31, 2023
Published on: Jan 11, 2024
Published by: Ubiquity Press
In partnership with: Paradigm Publishing Services
Publication frequency: 1 issue per year

© 2024 Lukas Tuggener, Raphael Emberger, Adhiraj Ghosh, Pascal Sager, Yvan Putra Satyawan, Javier Montoya, Simon Goldschagg, Florian Seibold, Urs Gut, Philipp Ackermann, Jürgen Schmidhuber, Thilo Stadelmann, published by Ubiquity Press
This work is licensed under the Creative Commons Attribution 4.0 License.