Accelerate the Training Process of BP neural Network with CUDA Technology
Authors
Yinfen Xie
- School of Information Science and Technology, Linyi University, Linyi, Shandong 276005, P. R. China
Abstract
NVIDIA GPUs are typical stream-processor devices with high floating-point performance. CUDA provides a new computing architecture that offers far greater throughput than the CPU for large-scale data-parallel applications. The learning algorithm of the BP neural network is compute-intensive and highly regular, which makes it well suited to the stream-processor architecture. Using CUDA, the CUBLAS mathematical library, and self-written kernels, with an NVIDIA GeForce GTX280 as the hardware platform, we parallelize the learning algorithm, define a parallel data structure, and describe the mechanism for mapping the computing tasks and the key algorithm onto CUDA. A simulation experiment compares the parallel learning algorithm on the GTX280 with the serial algorithm on the CPU: the parallel version shortens training time by a factor of nearly 15.
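The reason BP training maps well onto CUBLAS is that, in batch form, both the forward and backward passes reduce to dense matrix products. A minimal NumPy sketch of one batch update for a single hidden layer (hypothetical layer sizes and learning rate, not taken from the paper; on the GPU each `@` product would become a `cublasSgemm` call):

```python
import numpy as np

# Hypothetical sizes: batch of 64 samples, 8 inputs, 16 hidden units, 4 outputs.
rng = np.random.default_rng(0)
X = rng.standard_normal((64, 8)).astype(np.float32)    # inputs
T = rng.standard_normal((64, 4)).astype(np.float32)    # targets
W1 = 0.1 * rng.standard_normal((8, 16)).astype(np.float32)
W2 = 0.1 * rng.standard_normal((16, 4)).astype(np.float32)
lr = np.float32(0.01)

def sigmoid(z):
    return 1.0 / (1.0 + np.exp(-z))

# Forward pass: two dense matrix products (sgemm on the GPU).
H = sigmoid(X @ W1)
Y = sigmoid(H @ W2)

# Backward pass: error signals, then two more dense products.
dY = (Y - T) * Y * (1.0 - Y)        # output-layer delta
dH = (dY @ W2.T) * H * (1.0 - H)    # hidden-layer delta back-propagated

# Weight updates are again matrix products, so they too run as gemm.
W2 -= lr * (H.T @ dY)
W1 -= lr * (X.T @ dH)
```

Because every step is a gemm over the whole batch, the work per kernel launch is large enough to keep the stream processors busy, which is the property the paper exploits.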
Share and Cite
ISRP Style
Yinfen Xie, Accelerate the Training Process of BP neural Network with CUDA Technology, Journal of Mathematics and Computer Science, 18 (2018), no. 1, 1--10
AMA Style
Xie Yinfen, Accelerate the Training Process of BP neural Network with CUDA Technology. J Math Comput SCI-JM. (2018); 18(1):1--10
Chicago/Turabian Style
Xie, Yinfen. "Accelerate the Training Process of BP neural Network with CUDA Technology." Journal of Mathematics and Computer Science, 18, no. 1 (2018): 1--10
Keywords
- Stream processor
- GTX280
- CUDA
- CUBLAS
- BP neural network
References
-
[1]
X. Chu, K. Zhao, M. Wang , Massively parallel network coding on GPUs, In Proceedings of IEEE Int’l Symp. On Performance Computing and Communications Conference, Austin, Texas, (2008), 144–151
-
[2]
K. Huang, Z. Xu, Extensible Parallel Computing: Technique, Structure and Programming, Beijing Industry Press, Beijing (2000)
-
[3]
D. Luebke , CUDA: Scalable parallel programming for high-performance scientific computing , In Proceedings of the 5th IEEE Int’l Symp. on Biomedical Imaging: From Nano to Macro. Paris, France, (2008), 836–838
-
[4]
NVIDIA Corporation, NVIDIA CUDA Compute Unified Device Architecture Programming Guide 2.0[EB/OL], http://developer.download.nvidia.com/compute/cuda/2 0/docs/NVIDIA CUDA Programming Guide 2.0.pdf. , (2008)
-
[5]
J. D. Owens, M. Houston, D. Luebke, S. Green, J. E. Stone, J. C. Phillips, GPU computing, Proceedings of the IEEE, 96 (2008), 879–899
-
[6]
M. Wen, Research on key technology of stream architecture, National University of Defense Technology, Changsha, 9 (2006), 1–15
-
[7]
X. Yang, X. Yan, T. Tang , Research and development of Stream Processor technology, Comput. Eng. Sci., 30 (2008), 114–117
-
[8]
J. Yi, Y. Hou, Intelligent control technique, Beijing Industry Press, Beijing (2004)
-
[9]
C. Yu, Y. Tang, To improve the training time of BP neural networks, In Proceedings of IEEE Int’l Symp. on Info-tech and Info-net. Beijing, China, 3 (2001), 473–479
-
[10]
Y. Zhang, X.-J. Yang, G.-B. Wang, I. Rogers, G. Li, Y. Deng, X.-B. Yan , Scientific computing applications on a stream processor, IEEE International Symposium on Performance Analysis of Systems and software (ISPASS), Austin, TX, USA, (2008), 105–114