
An Optimized Parallel Implementation of Non-Iteratively Trained Recurrent Neural Networks

Open Access | Dec 2020

References

[1] Yoshua Bengio, Patrice Simard, and Paolo Frasconi. Learning long-term dependencies with gradient descent is difficult. IEEE Transactions on Neural Networks, 5(2):157–166, 1994. doi:10.1109/72.279181
[2] Stephen A Billings. Nonlinear System Identification: NARMAX Methods in the Time, Frequency, and Spatio-Temporal Domains. John Wiley & Sons, 2013. doi:10.1002/9781118535561
[3] Armando Blanco, Miguel Delgado, and Maria C Pegalajar. A real-coded genetic algorithm for training recurrent neural networks. Neural Networks, 14(1):93–105, 2001. doi:10.1016/S0893-6080(00)00081-2
[4] Kyunghyun Cho, Bart Van Merriënboer, Dzmitry Bahdanau, and Yoshua Bengio. On the properties of neural machine translation: Encoder-decoder approaches. arXiv preprint arXiv:1409.1259, 2014.
[5] Kyunghyun Cho, Bart Van Merriënboer, Caglar Gulcehre, Dzmitry Bahdanau, Fethi Bougares, Holger Schwenk, and Yoshua Bengio. Learning phrase representations using RNN encoder-decoder for statistical machine translation. arXiv preprint arXiv:1406.1078, 2014.
[6] Jan K Chorowski, Dzmitry Bahdanau, Dmitriy Serdyuk, Kyunghyun Cho, and Yoshua Bengio. Attention-based models for speech recognition. In Advances in Neural Information Processing Systems, pages 577–585, 2015.
[7] Junyoung Chung, Caglar Gulcehre, KyungHyun Cho, and Yoshua Bengio. Empirical evaluation of gated recurrent neural networks on sequence modeling. arXiv preprint arXiv:1412.3555, 2014.
[8] Jerome T Connor, R Douglas Martin, and Les E Atlas. Recurrent neural networks and robust time series prediction. IEEE Transactions on Neural Networks, 5(2):240–254, 1994. doi:10.1109/72.279188. PMID: 18267794
[9] Jeffrey L Elman. Finding structure in time. Cognitive Science, 14(2):179–211, 1990. doi:10.1207/s15516709cog1402_1
[10] Ömer Faruk Ertugrul. Forecasting electricity load by a novel recurrent extreme learning machines approach. International Journal of Electrical Power & Energy Systems, 78:429–435, 2016. doi:10.1016/j.ijepes.2015.12.006
[11] Martín Abadi et al. TensorFlow: Large-scale machine learning on heterogeneous systems, 2015. Software available from tensorflow.org.
[12] Alex Graves, Navdeep Jaitly, and Abdel-rahman Mohamed. Hybrid speech recognition with deep bidirectional LSTM. In 2013 IEEE Workshop on Automatic Speech Recognition and Understanding, pages 273–278. IEEE, 2013. doi:10.1109/ASRU.2013.6707742
[13] Alex Graves, Abdel-rahman Mohamed, and Geoffrey Hinton. Speech recognition with deep recurrent neural networks. In 2013 IEEE International Conference on Acoustics, Speech and Signal Processing, pages 6645–6649. IEEE, 2013. doi:10.1109/ICASSP.2013.6638947
[14] Qing He, Tianfeng Shang, Fuzhen Zhuang, and Zhongzhi Shi. Parallel extreme learning machine for regression based on MapReduce. Neurocomputing, 102:52–58, 2013. doi:10.1016/j.neucom.2012.01.040
[15] Sepp Hochreiter and Jürgen Schmidhuber. Long short-term memory. Neural Computation, 9(8):1735–1780, 1997. doi:10.1162/neco.1997.9.8.1735. PMID: 9377276
[16] Guang-Bin Huang, Qin-Yu Zhu, and Chee-Kheong Siew. Extreme learning machine: a new learning scheme of feedforward neural networks. In 2004 IEEE International Joint Conference on Neural Networks (IJCNN), volume 2, pages 985–990. IEEE, 2004.
[17] Shan Huang, Botao Wang, Junhao Qiu, Jitao Yao, Guoren Wang, and Ge Yu. Parallel ensemble of online sequential extreme learning machine based on MapReduce. Neurocomputing, 174:352–367, 2016. doi:10.1016/j.neucom.2015.04.105
[18] Weikuan Jia, Dean Zhao, Yuanjie Zheng, and Sujuan Hou. A novel optimized GA-Elman neural network algorithm. Neural Computing and Applications, 31(2):449–459, 2019. doi:10.1007/s00521-017-3076-7
[19] Michael I Jordan. Serial order: A parallel distributed processing approach. In Advances in Psychology, volume 121, pages 471–495. Elsevier, 1997. doi:10.1016/S0166-4115(97)80111-2
[20] Viacheslav Khomenko, Oleg Shyshkov, Olga Radyvonenko, and Kostiantyn Bokhan. Accelerating recurrent neural network training using sequence bucketing and multi-GPU data parallelization. In IEEE First International Conference on Data Stream Mining & Processing, pages 100–103. IEEE, 2016. doi:10.1109/DSMP.2016.7583516
[21] Siu Kwan Lam, Antoine Pitrou, and Stanley Seibert. Numba: An LLVM-based Python JIT compiler. In Proceedings of the Second Workshop on the LLVM Compiler Infrastructure in HPC, pages 1–6. ACM, 2015.
[22] Yann LeCun, Yoshua Bengio, and Geoffrey Hinton. Deep learning. Nature, 521(7553):436–444, 2015. doi:10.1038/nature14539
[23] Jun Liu, Amir Shahroudy, Dong Xu, and Gang Wang. Spatio-temporal LSTM with trust gates for 3D human action recognition. In European Conference on Computer Vision, pages 816–833. Springer, 2016. doi:10.1007/978-3-319-46487-9_50
[24] Jun Liu, Gang Wang, Ling-Yu Duan, Kamila Abdiyeva, and Alex C Kot. Skeleton-based human action recognition with global context-aware attention LSTM networks. IEEE Transactions on Image Processing, 27(4):1586–1599, 2017. doi:10.1109/TIP.2017.2785279
[25] James Martens and Ilya Sutskever. Learning recurrent neural networks with Hessian-free optimization. In Proceedings of the 28th International Conference on Machine Learning (ICML-11), pages 1033–1040. Citeseer, 2011.
[26] Travis Oliphant. Guide to NumPy. January 2006. doi:10.1142/S1793048006000136
[27] Peng Ouyang, Shouyi Yin, and Shaojun Wei. A fast and power efficient architecture to parallelize LSTM based RNN for cognitive intelligence applications. In Proceedings of the 54th Annual Design Automation Conference 2017, pages 1–6. ACM, 2017. doi:10.1145/3061639.3062187
[28] Yoh-Han Pao, Gwang-Hoon Park, and Dejan J Sobajic. Learning and generalization characteristics of the random vector functional-link net. Neurocomputing, 6(2):163–180, 1994. doi:10.1016/0925-2312(94)90053-1
[29] Jin-Man Park and Jong-Hwan Kim. Online recurrent extreme learning machine and its application to time-series prediction. In 2017 International Joint Conference on Neural Networks (IJCNN), pages 1983–1990. IEEE, 2017. doi:10.1109/IJCNN.2017.7966094
[30] Yara Rizk and Mariette Awad. On extreme learning machines in sequential and time series prediction: A non-iterative and approximate training algorithm for recurrent neural networks. Neurocomputing, 325:1–19, 2019. doi:10.1016/j.neucom.2018.09.012
[31] Jürgen Schmidhuber. Deep learning in neural networks: An overview. Neural Networks, 61:85–117, 2015. doi:10.1016/j.neunet.2014.09.003
[32] Wouter F Schmidt, Martin A Kraaijveld, and Robert PW Duin. Feedforward neural networks with random weights. In 11th IAPR International Conference on Pattern Recognition, Vol. II: Conference B: Pattern Recognition Methodology and Systems, pages 1–4. IEEE, 1992.
[33] Xavier Sierra-Canto, Francisco Madera-Ramirez, and Victor Uc-Cetina. Parallel training of a back-propagation neural network using CUDA. In 2010 Ninth International Conference on Machine Learning and Applications, pages 307–312. IEEE, 2010. doi:10.1109/ICMLA.2010.52
[34] Zhiyuan Tang, Ying Shi, Dong Wang, Yang Feng, and Shiyue Zhang. Memory visualization for gated recurrent neural networks in speech recognition. In 2017 IEEE International Conference on Acoustics, Speech and Signal Processing (ICASSP), pages 2736–2740. IEEE, 2017. doi:10.1109/ICASSP.2017.7952654
[35] Hubert AB Te Braake and Gerrit Van Straten. Random activation weight neural net (RAWN) for fast non-iterative training. Engineering Applications of Artificial Intelligence, 8(1):71–80, 1995. doi:10.1016/0952-1976(94)00056-S
[36] Mark Van Heeswijk, Yoan Miche, Erkki Oja, and Amaury Lendasse. GPU-accelerated and parallelized ELM ensembles for large-scale regression. Neurocomputing, 74(16):2430–2437, 2011. doi:10.1016/j.neucom.2010.11.034
[37] Botao Wang, Shan Huang, Junhao Qiu, Yu Liu, and Guoren Wang. Parallel online sequential extreme learning machine based on MapReduce. Neurocomputing, 149:224–232, 2015. doi:10.1016/j.neucom.2014.03.076
[38] Shang Wang, Yifan Bai, and Gennady Pekhimenko. Scaling back-propagation by parallel scan algorithm. arXiv preprint arXiv:1907.10134, 2019.
[39] Xiaoyu Wang and Yong Huang. Convergence study in extended Kalman filter-based training of recurrent neural networks. IEEE Transactions on Neural Networks, 22(4):588–600, 2011. doi:10.1109/TNN.2011.2109737. PMID: 21402512
[40] Paul J Werbos. Backpropagation through time: what it does and how to do it. Proceedings of the IEEE, 78(10):1550–1560, 1990. doi:10.1109/5.58337
[41] Ronald J Williams and David Zipser. Gradient-based learning algorithms for recurrent networks and their computational complexity. In Backpropagation: Theory, Architectures, and Applications, page 433. 1995.
[42] Yonghui Wu, Mike Schuster, Zhifeng Chen, Quoc V Le, Mohammad Norouzi, Wolfgang Macherey, Maxim Krikun, Yuan Cao, Qin Gao, Klaus Macherey, et al. Google's neural machine translation system: Bridging the gap between human and machine translation. arXiv preprint arXiv:1609.08144, 2016.
[43] Feng Zhang, Jidong Zhai, Marc Snir, Hai Jin, Hironori Kasahara, and Mateo Valero. Guest editorial: Special issue on network and parallel computing for emerging architectures and applications, 2019. doi:10.1007/s10766-019-00634-1
[44] Shunlu Zhang, Pavan Gunupudi, and Qi-Jun Zhang. Parallel back-propagation neural network training technique using CUDA on multiple GPUs. In IEEE MTT-S International Conference on Numerical Electromagnetic and Multiphysics Modeling and Optimization, pages 1–3. IEEE, 2015. doi:10.1109/NEMO.2015.7415056
Language: English
Page range: 33 - 50
Submitted on: May 7, 2020
Accepted on: Sep 14, 2020
Published on: Dec 3, 2020
Published by: SAN University
In partnership with: Paradigm Publishing Services
Publication frequency: 4 issues per year

© 2020 Julia El Zini, Yara Rizk, Mariette Awad, published by SAN University
This work is licensed under the Creative Commons Attribution-NonCommercial-NoDerivatives 4.0 License.