Combination of Resnet and Spatial Pyramid Pooling for Musical Instrument Identification

Open Access | Apr 2022

DOI: https://doi.org/10.2478/cait-2022-0007 | Journal eISSN: 1314-4081 | Journal ISSN: 1311-9702
Language: English
Page range: 104 - 116
Submitted on: Nov 16, 2021
Accepted on: Feb 25, 2022
Published on: Apr 10, 2022
Published by: Bulgarian Academy of Sciences, Institute of Information and Communication Technologies
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2022 Christine Dewi, Rung-Ching Chen, published by Bulgarian Academy of Sciences, Institute of Information and Communication Technologies
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.