
An Optimized Parallel Implementation of Non-Iteratively Trained Recurrent Neural Networks

Open Access | Dec 2020

References

[1] Yoshua Bengio, Patrice Simard, and Paolo Frasconi. Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2):157–166, 1994. doi:10.1109/72.279181
[2] Stephen A Billings. Nonlinear System Identification: NARMAX Methods in the Time, Frequency, and Spatio-Temporal Domains. John Wiley & Sons, 2013. doi:10.1002/9781118535561
[3] Armando Blanco, Miguel Delgado, and Maria C Pegalajar. A real-coded genetic algorithm for training recurrent neural networks. Neural Networks, 14(1):93–105, 2001. doi:10.1016/S0893-6080(00)00081-2
[4] Kyunghyun Cho, Bart Van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio. On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259, 2014.
[5] Kyunghyun Cho, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078, 2014.
[6] Jan K Chorowski, Dzmitry Bahdanau, Dmitriy Serdyuk, Kyunghyun Cho, and Yoshua Bengio. Attention-based models for speech recognition. In Advances in Neural Information Processing Systems, pages 577–585, 2015.
[7] Junyoung Chung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555, 2014.
[8] Jerome T Connor, R Douglas Martin, and Les E Atlas. Recurrent neural networks and robust time series prediction. IEEE Transactions on Neural Networks, 5(2):240–254, 1994. doi:10.1109/72.279188. PMID: 18267794
[9] Jeffrey L Elman. Finding structure in time. Cognitive Science, 14(2):179–211, 1990. doi:10.1207/s15516709cog1402_1
[10] Ömer Faruk Ertugrul. Forecasting electricity load by a novel recurrent extreme learning machines approach. International Journal of Electrical Power & Energy Systems, 78:429–435, 2016. doi:10.1016/j.ijepes.2015.12.006
[11] Martín Abadi et al. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org.
[12] Alex Graves, Navdeep Jaitly, and Abdel-rahman Mohamed. Hybrid speech recognition with deep bidirectional LSTM. In 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, pages 273–278. IEEE, 2013. doi:10.1109/ASRU.2013.6707742
[13] Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton. Speech recognition with deep recurrent neural networks. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 6645–6649. IEEE, 2013. doi:10.1109/ICASSP.2013.6638947
[14] Qing He, Tianfeng Shang, Fuzhen Zhuang, and Zhongzhi Shi. Parallel extreme learning machine for regression based on MapReduce. Neurocomputing, 102:52–58, 2013. doi:10.1016/j.neucom.2012.01.040
[15] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997. doi:10.1162/neco.1997.9.8.1735. PMID: 9377276
[16] Guang-Bin Huang, Qin-Yu Zhu, and Chee-Kheong Siew. Extreme learning machine: a new learning scheme of feedforward neural networks. In 2004 IEEE International Joint Conference on Neural Networks (IJCNN), volume 2, pages 985–990. IEEE, 2004.
[17] Shan Huang, Botao Wang, Junhao Qiu, Jitao Yao, Guoren Wang, and Ge Yu. Parallel ensemble of online sequential extreme learning machine based on MapReduce. Neurocomputing, 174:352–367, 2016. doi:10.1016/j.neucom.2015.04.105
[18] Weikuan Jia, Dean Zhao, Yuanjie Zheng, and Sujuan Hou. A novel optimized GA-Elman neural network algorithm. Neural Computing and Applications, 31(2):449–459, 2019. doi:10.1007/s00521-017-3076-7
[19] Michael I Jordan. Serial order: A parallel distributed processing approach. In Advances in Psychology, volume 121, pages 471–495. Elsevier, 1997. doi:10.1016/S0166-4115(97)80111-2
[20] Viacheslav Khomenko, Oleg Shyshkov, Olga Radyvonenko, and Kostiantyn Bokhan. Accelerating recurrent neural network training using sequence bucketing and multi-GPU data parallelization. In IEEE First International Conference on Data Stream Mining & Processing, pages 100–103. IEEE, 2016. doi:10.1109/DSMP.2016.7583516
[21] Siu Kwan Lam, Antoine Pitrou, and Stanley Seibert. Numba: An LLVM-based Python JIT compiler. In Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, pages 1–6. ACM, 2015.
[22] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 2015. doi:10.1038/nature14539
[23] Jun Liu, Amir Shahroudy, Dong Xu, and Gang Wang. Spatio-temporal LSTM with trust gates for 3D human action recognition. In European Conference on Computer Vision, pages 816–833. Springer, 2016. doi:10.1007/978-3-319-46487-9_50
[24] Jun Liu, Gang Wang, Ling-Yu Duan, Kamila Abdiyeva, and Alex C Kot. Skeleton-based human action recognition with global context-aware attention LSTM networks. IEEE Transactions on Image Processing, 27(4):1586–1599, 2017. doi:10.1109/TIP.2017.2785279
[25] James Martens and Ilya Sutskever. Learning recurrent neural networks with Hessian-free optimization. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), pages 1033–1040. Citeseer, 2011.
[26] Travis Oliphant. Guide to NumPy. January 2006. doi:10.1142/S1793048006000136
[27] Peng Ouyang, Shouyi Yin, and Shaojun Wei. A fast and power efficient architecture to parallelize LSTM based RNN for cognitive intelligence applications. In Proceedings of the 54th Annual Design Automation Conference 2017, pages 1–6. ACM, 2017. doi:10.1145/3061639.3062187
[28] Yoh-Han Pao, Gwang-Hoon Park, and Dejan J Sobajic. Learning and generalization characteristics of the random vector functional-link net. Neurocomputing, 6(2):163–180, 1994. doi:10.1016/0925-2312(94)90053-1
[29] Jin-Man Park and Jong-Hwan Kim. Online recurrent extreme learning machine and its application to time-series prediction. In 2017 International Joint Conference on Neural Networks (IJCNN), pages 1983–1990. IEEE, 2017. doi:10.1109/IJCNN.2017.7966094
[30] Yara Rizk and Mariette Awad. On extreme learning machines in sequential and time series prediction: A non-iterative and approximate training algorithm for recurrent neural networks. Neurocomputing, 325:1–19, 2019. doi:10.1016/j.neucom.2018.09.012
[31] Jürgen Schmidhuber. Deep learning in neural networks: An overview. Neural Networks, 61:85–117, 2015. doi:10.1016/j.neunet.2014.09.003
[32] Wouter F Schmidt, Martin A Kraaijveld, and Robert PW Duin. Feedforward neural networks with random weights. In 11th IAPR International Conference on Pattern Recognition, Vol. II: Conference B: Pattern Recognition Methodology and Systems, pages 1–4. IEEE, 1992.
[33] Xavier Sierra-Canto, Francisco Madera-Ramirez, and Victor Uc-Cetina. Parallel training of a back-propagation neural network using CUDA. In 2010 Ninth International Conference on Machine Learning and Applications, pages 307–312. IEEE, 2010. doi:10.1109/ICMLA.2010.52
[34] Zhiyuan Tang, Ying Shi, Dong Wang, Yang Feng, and Shiyue Zhang. Memory visualization for gated recurrent neural networks in speech recognition. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2736–2740. IEEE, 2017. doi:10.1109/ICASSP.2017.7952654
[35] Hubert AB Te Braake and Gerrit Van Straten. Random activation weight neural net (RAWN) for fast non-iterative training. Engineering Applications of Artificial Intelligence, 8(1):71–80, 1995. doi:10.1016/0952-1976(94)00056-S
[36] Mark Van Heeswijk, Yoan Miche, Erkki Oja, and Amaury Lendasse. GPU-accelerated and parallelized ELM ensembles for large-scale regression. Neurocomputing, 74(16):2430–2437, 2011. doi:10.1016/j.neucom.2010.11.034
[37] Botao Wang, Shan Huang, Junhao Qiu, Yu Liu, and Guoren Wang. Parallel online sequential extreme learning machine based on MapReduce. Neurocomputing, 149:224–232, 2015. doi:10.1016/j.neucom.2014.03.076
[38] Shang Wang, Yifan Bai, and Gennady Pekhimenko. Scaling back-propagation by parallel scan algorithm. arXiv preprint arXiv:1907.10134, 2019.
[39] Xiaoyu Wang and Yong Huang. Convergence study in extended Kalman filter-based training of recurrent neural networks. IEEE Transactions on Neural Networks, 22(4):588–600, 2011. doi:10.1109/TNN.2011.2109737. PMID: 21402512
[40] Paul J Werbos. Backpropagation through time: what it does and how to do it. Proceedings of the IEEE, 78(10):1550–1560, 1990. doi:10.1109/5.58337
[41] Ronald J Williams and David Zipser. Gradient-based learning algorithms for recurrent networks and their computational complexity. In Backpropagation: Theory, Architectures, and Applications, page 433. 1995.
[42] Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, et al. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144, 2016.
[43] Feng Zhang, Jidong Zhai, Marc Snir, Hai Jin, Hironori Kasahara, and Mateo Valero. Guest editorial: Special issue on network and parallel computing for emerging architectures and applications, 2019. doi:10.1007/s10766-019-00634-1
[44] Shunlu Zhang, Pavan Gunupudi, and Qi-Jun Zhang. Parallel back-propagation neural network training technique using CUDA on multiple GPUs. In IEEE MTT-S International Conference on Numerical Electromagnetic and Multiphysics Modeling and Optimization, pages 1–3. IEEE, 2015. doi:10.1109/NEMO.2015.7415056
Language: English
Page range: 33 - 50
Submitted on: May 7, 2020
Accepted on: Sep 14, 2020
Published on: Dec 3, 2020
Published by: SAN University
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2020 Julia El Zini, Yara Rizk, Mariette Awad, published by SAN University
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.