TensorFlow Best Practices - Deep Learning Best Practices - High Performance Computing - Alibaba Cloud

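Alibaba Cloud HPC servers are delivered with TensorFlow (version 0.8rc) already installed, so users can run it directly without any additional setup. For example, running the MNIST convolutional model from the TensorFlow package: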

  # /disk1/deeplearning/anaconda2/bin/python -m "tensorflow.models.image.mnist.convolutional"
  I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcublas.so locally
  I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcudnn.so locally
  I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcufft.so locally
  I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcuda.so.1 locally
  I tensorflow/stream_executor/dso_loader.cc:105] successfully opened CUDA library libcurand.so locally
  Successfully downloaded train-images-idx3-ubyte.gz 9912422 bytes.
  Successfully downloaded train-labels-idx1-ubyte.gz 28881 bytes.
  Successfully downloaded t10k-images-idx3-ubyte.gz 1648877 bytes.
  Successfully downloaded t10k-labels-idx1-ubyte.gz 4542 bytes.
  Extracting data/train-images-idx3-ubyte.gz
  Extracting data/train-labels-idx1-ubyte.gz
  Extracting data/t10k-images-idx3-ubyte.gz
  Extracting data/t10k-labels-idx1-ubyte.gz
  I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 0 with properties:
  name: Tesla M40
  major: 5 minor: 2 memoryClockRate (GHz) 1.112
  pciBusID 0000:06:00.0
  Total memory: 11.25GiB
  Free memory: 11.09GiB
  I tensorflow/stream_executor/cuda/cuda_gpu_executor.cc:900] successful NUMA node read from SysFS had negative value (-1), but there must be at least one NUMA node, so returning NUMA node zero
  I tensorflow/core/common_runtime/gpu/gpu_init.cc:102] Found device 1 with properties:
  name: Tesla M40
  major: 5 minor: 2 memoryClockRate (GHz) 1.112
  pciBusID 0000:87:00.0
  Total memory: 11.25GiB
  Free memory: 11.08GiB
  I tensorflow/core/common_runtime/gpu/gpu_init.cc:126] DMA: 0 1
  I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 0: Y Y
  I tensorflow/core/common_runtime/gpu/gpu_init.cc:136] 1: Y Y
  I tensorflow/core/common_runtime/gpu/gpu_device.cc:755] Creating TensorFlow device (/gpu:0) -> (device: 0, name: Tesla M40, pci bus id: 0000:06:00.0)
  I tensorflow/core/common_runtime/gpu/gpu_device.cc:755] Creating TensorFlow device (/gpu:1) -> (device: 1, name: Tesla M40, pci bus id: 0000:87:00.0)
  Initialized!
  Step 0 (epoch 0.00), 54.9 ms
  Minibatch loss: 12.054, learning rate: 0.010000
  Minibatch error: 90.6%
  Validation error: 84.6%
  Step 100 (epoch 0.12), 7.2 ms
  Minibatch loss: 3.269, learning rate: 0.010000
  Minibatch error: 6.2%
  Validation error: 6.7%
  Step 200 (epoch 0.23), 7.1 ms
  Minibatch loss: 3.488, learning rate: 0.010000
  Minibatch error: 14.1%
  Validation error: 3.8%
  Step 300 (epoch 0.35), 7.1 ms
  Minibatch loss: 3.186, learning rate: 0.010000
  Minibatch error: 7.8%
  Validation error: 3.3%
  Step 400 (epoch 0.47), 7.1 ms
  Minibatch loss: 3.229, learning rate: 0.010000
  Minibatch error: 9.4%
  Validation error: 2.7%
  Step 500 (epoch 0.58), 7.1 ms
  Minibatch loss: 3.296, learning rate: 0.010000
  Minibatch error: 7.8%
  Validation error: 2.7%
  Step 600 (epoch 0.70), 7.1 ms
  Minibatch loss: 3.166, learning rate: 0.010000
  Minibatch error: 7.8%
  Validation error: 2.8%
  Step 700 (epoch 0.81), 7.0 ms
  Minibatch loss: 2.999, learning rate: 0.010000
  Minibatch error: 3.1%
  Validation error: 2.2%
  Step 800 (epoch 0.93), 7.0 ms
  Minibatch loss: 3.076, learning rate: 0.010000
  Minibatch error: 6.2%
  Validation error: 2.1%
  Step 900 (epoch 1.05), 7.0 ms
  Minibatch loss: 2.937, learning rate: 0.009500
  Minibatch error: 3.1%
  Validation error: 1.5%
  Step 1000 (epoch 1.16), 7.0 ms
  Minibatch loss: 2.864, learning rate: 0.009500
  Minibatch error: 1.6%
  Validation error: 1.8%
  Step 1100 (epoch 1.28), 7.0 ms
  Minibatch loss: 2.812, learning rate: 0.009500
  Minibatch error: 0.0%
  Validation error: 1.6%
  Step 1200 (epoch 1.40), 7.0 ms
  Minibatch loss: 2.943, learning rate: 0.009500
  Minibatch error: 4.7%
  Validation error: 1.5%
  Step 1300 (epoch 1.51), 7.0 ms
  Minibatch loss: 2.767, learning rate: 0.009500
  Minibatch error: 0.0%
  Validation error: 1.6%
  Step 1400 (epoch 1.63), 7.0 ms
  Minibatch loss: 2.787, learning rate: 0.009500
  Minibatch error: 3.1%
  Validation error: 1.4%
  Step 1500 (epoch 1.75), 7.0 ms
  Minibatch loss: 2.875, learning rate: 0.009500
  Minibatch error: 6.2%
  Validation error: 1.2%
  Step 1600 (epoch 1.86), 7.0 ms
  Minibatch loss: 2.712, learning rate: 0.009500
  Minibatch error: 1.6%
  Validation error: 1.4%
  Step 1700 (epoch 1.98), 7.0 ms
  Minibatch loss: 2.652, learning rate: 0.009500
  Minibatch error: 0.0%
  Validation error: 1.5%
  Step 1800 (epoch 2.09), 7.0 ms
  Minibatch loss: 2.670, learning rate: 0.009025
  Minibatch error: 1.6%
  Validation error: 1.4%
  Step 1900 (epoch 2.21), 7.0 ms
  Minibatch loss: 2.657, learning rate: 0.009025
  Minibatch error: 3.1%
  Validation error: 1.2%
  Step 2000 (epoch 2.33), 7.0 ms
  Minibatch loss: 2.648, learning rate: 0.009025
  Minibatch error: 3.1%
  Validation error: 1.2%
  Step 2100 (epoch 2.44), 7.0 ms
  Minibatch loss: 2.575, learning rate: 0.009025
  Minibatch error: 1.6%
  Validation error: 1.1%
  Step 2200 (epoch 2.56), 7.0 ms
  Minibatch loss: 2.565, learning rate: 0.009025
  Minibatch error: 0.0%
  Validation error: 1.2%
  Step 2300 (epoch 2.68), 7.0 ms
  Minibatch loss: 2.561, learning rate: 0.009025
  Minibatch error: 1.6%
  Validation error: 1.1%
  Step 2400 (epoch 2.79), 7.0 ms
  Minibatch loss: 2.508, learning rate: 0.009025
  Minibatch error: 0.0%
  Validation error: 1.1%
  Step 2500 (epoch 2.91), 7.0 ms
  Minibatch loss: 2.472, learning rate: 0.009025
  Minibatch error: 0.0%
  Validation error: 1.2%
  Step 2600 (epoch 3.03), 7.0 ms
  Minibatch loss: 2.460, learning rate: 0.008574
  Minibatch error: 0.0%
  Validation error: 1.2%
  Step 2700 (epoch 3.14), 7.0 ms
  Minibatch loss: 2.488, learning rate: 0.008574
  Minibatch error: 1.6%
  Validation error: 1.0%
  Step 2800 (epoch 3.26), 7.0 ms
  Minibatch loss: 2.420, learning rate: 0.008574
  Minibatch error: 1.6%
  Validation error: 1.2%
  Step 2900 (epoch 3.37), 7.0 ms
  Minibatch loss: 2.433, learning rate: 0.008574
  Minibatch error: 4.7%
  Validation error: 1.1%
  Step 3000 (epoch 3.49), 7.0 ms
  Minibatch loss: 2.398, learning rate: 0.008574
  Minibatch error: 1.6%
  Validation error: 1.2%
  Step 3100 (epoch 3.61), 7.0 ms
  Minibatch loss: 2.376, learning rate: 0.008574
  Minibatch error: 1.6%
  Validation error: 1.0%
  Step 3200 (epoch 3.72), 7.0 ms
  Minibatch loss: 2.335, learning rate: 0.008574
  Minibatch error: 0.0%
  Validation error: 1.1%
  Step 3300 (epoch 3.84), 7.1 ms
  Minibatch loss: 2.320, learning rate: 0.008574
  Minibatch error: 0.0%
  Validation error: 1.2%
  Step 3400 (epoch 3.96), 7.1 ms
  Minibatch loss: 2.297, learning rate: 0.008574
  Minibatch error: 1.6%
  Validation error: 1.2%
  Step 3500 (epoch 4.07), 7.1 ms
  Minibatch loss: 2.274, learning rate: 0.008145
  Minibatch error: 0.0%
  Validation error: 1.2%
  Step 3600 (epoch 4.19), 7.1 ms
  Minibatch loss: 2.258, learning rate: 0.008145
  Minibatch error: 0.0%
  Validation error: 1.0%
  Step 3700 (epoch 4.31), 7.1 ms
  Minibatch loss: 2.232, learning rate: 0.008145
  Minibatch error: 0.0%
  Validation error: 0.9%
  Step 3800 (epoch 4.42), 7.0 ms
  Minibatch loss: 2.244, learning rate: 0.008145
  Minibatch error: 1.6%
  Validation error: 0.9%
  Step 3900 (epoch 4.54), 7.0 ms
  Minibatch loss: 2.315, learning rate: 0.008145
  Minibatch error: 3.1%
  Validation error: 0.9%
  Step 4000 (epoch 4.65), 7.0 ms
  Minibatch loss: 2.200, learning rate: 0.008145
  Minibatch error: 0.0%
  Validation error: 1.0%
  Step 4100 (epoch 4.77), 7.1 ms
  Minibatch loss: 2.201, learning rate: 0.008145
  Minibatch error: 1.6%
  Validation error: 0.9%
  Step 4200 (epoch 4.89), 7.1 ms
  Minibatch loss: 2.208, learning rate: 0.008145
  Minibatch error: 1.6%
  Validation error: 1.1%
  Step 4300 (epoch 5.00), 7.1 ms
  Minibatch loss: 2.185, learning rate: 0.007738
  Minibatch error: 1.6%
  Validation error: 1.0%
  Step 4400 (epoch 5.12), 7.1 ms
  Minibatch loss: 2.147, learning rate: 0.007738
  Minibatch error: 1.6%
  Validation error: 1.0%
  Step 4500 (epoch 5.24), 7.1 ms
  Minibatch loss: 2.191, learning rate: 0.007738
  Minibatch error: 6.2%
  Validation error: 1.0%
  Step 4600 (epoch 5.35), 7.1 ms
  Minibatch loss: 2.102, learning rate: 0.007738
  Minibatch error: 0.0%
  Validation error: 0.9%
  Step 4700 (epoch 5.47), 7.1 ms
  Minibatch loss: 2.102, learning rate: 0.007738
  Minibatch error: 1.6%
  Validation error: 0.8%
  Step 4800 (epoch 5.59), 7.1 ms
  Minibatch loss: 2.051, learning rate: 0.007738
  Minibatch error: 0.0%
  Validation error: 1.0%
  Step 4900 (epoch 5.70), 7.1 ms
  Minibatch loss: 2.039, learning rate: 0.007738
  Minibatch error: 0.0%
  Validation error: 0.9%
  Step 5000 (epoch 5.82), 7.1 ms
  Minibatch loss: 2.134, learning rate: 0.007738
  Minibatch error: 3.1%
  Validation error: 1.0%
  Step 5100 (epoch 5.93), 7.1 ms
  Minibatch loss: 2.007, learning rate: 0.007738
  Minibatch error: 0.0%
  Validation error: 1.0%
  Step 5200 (epoch 6.05), 7.1 ms
  Minibatch loss: 2.097, learning rate: 0.007351
  Minibatch error: 4.7%
  Validation error: 0.9%
  Step 5300 (epoch 6.17), 7.1 ms
  Minibatch loss: 1.987, learning rate: 0.007351
  Minibatch error: 0.0%
  Validation error: 1.0%
  Step 5400 (epoch 6.28), 7.1 ms
  Minibatch loss: 1.958, learning rate: 0.007351
  Minibatch error: 0.0%
  Validation error: 0.8%
  Step 5500 (epoch 6.40), 7.1 ms
  Minibatch loss: 1.959, learning rate: 0.007351
  Minibatch error: 1.6%
  Validation error: 0.9%
  Step 5600 (epoch 6.52), 7.0 ms
  Minibatch loss: 1.929, learning rate: 0.007351
  Minibatch error: 0.0%
  Validation error: 0.8%
  Step 5700 (epoch 6.63), 7.0 ms
  Minibatch loss: 1.914, learning rate: 0.007351
  Minibatch error: 0.0%
  Validation error: 1.0%
  Step 5800 (epoch 6.75), 7.1 ms
  Minibatch loss: 1.903, learning rate: 0.007351
  Minibatch error: 0.0%
  Validation error: 0.8%
  Step 5900 (epoch 6.87), 7.0 ms
  Minibatch loss: 1.887, learning rate: 0.007351
  Minibatch error: 0.0%
  Validation error: 0.8%
  Step 6000 (epoch 6.98), 7.1 ms
  Minibatch loss: 1.876, learning rate: 0.007351
  Minibatch error: 0.0%
  Validation error: 1.0%
  Step 6100 (epoch 7.10), 7.0 ms
  Minibatch loss: 1.858, learning rate: 0.006983
  Minibatch error: 0.0%
  Validation error: 0.9%
  Step 6200 (epoch 7.21), 7.1 ms
  Minibatch loss: 1.843, learning rate: 0.006983
  Minibatch error: 0.0%
  Validation error: 0.9%
  Step 6300 (epoch 7.33), 7.1 ms
  Minibatch loss: 1.840, learning rate: 0.006983
  Minibatch error: 0.0%
  Validation error: 0.9%
  Step 6400 (epoch 7.45), 7.1 ms
  Minibatch loss: 1.871, learning rate: 0.006983
  Minibatch error: 3.1%
  Validation error: 0.8%
  Step 6500 (epoch 7.56), 7.1 ms
  Minibatch loss: 1.807, learning rate: 0.006983
  Minibatch error: 0.0%
  Validation error: 0.9%
  Step 6600 (epoch 7.68), 7.1 ms
  Minibatch loss: 1.810, learning rate: 0.006983
  Minibatch error: 1.6%
  Validation error: 1.0%
  Step 6700 (epoch 7.80), 7.1 ms
  Minibatch loss: 1.782, learning rate: 0.006983
  Minibatch error: 0.0%
  Validation error: 0.8%
  Step 6800 (epoch 7.91), 7.1 ms
  Minibatch loss: 1.769, learning rate: 0.006983
  Minibatch error: 0.0%
  Validation error: 0.9%
  Step 6900 (epoch 8.03), 7.1 ms
  Minibatch loss: 1.762, learning rate: 0.006634
  Minibatch error: 0.0%
  Validation error: 0.9%
  Step 7000 (epoch 8.15), 7.1 ms
  Minibatch loss: 1.769, learning rate: 0.006634
  Minibatch error: 1.6%
  Validation error: 0.9%
  Step 7100 (epoch 8.26), 7.0 ms
  Minibatch loss: 1.739, learning rate: 0.006634
  Minibatch error: 0.0%
  Validation error: 0.9%
  Step 7200 (epoch 8.38), 7.0 ms
  Minibatch loss: 1.757, learning rate: 0.006634
  Minibatch error: 1.6%
  Validation error: 0.9%
  Step 7300 (epoch 8.49), 7.1 ms
  Minibatch loss: 1.740, learning rate: 0.006634
  Minibatch error: 1.6%
  Validation error: 0.8%
  Step 7400 (epoch 8.61), 7.0 ms
  Minibatch loss: 1.702, learning rate: 0.006634
  Minibatch error: 0.0%
  Validation error: 0.8%
  Step 7500 (epoch 8.73), 7.1 ms
  Minibatch loss: 1.696, learning rate: 0.006634
  Minibatch error: 0.0%
  Validation error: 0.8%
  Step 7600 (epoch 8.84), 7.1 ms
  Minibatch loss: 1.806, learning rate: 0.006634
  Minibatch error: 1.6%
  Validation error: 0.8%
  Step 7700 (epoch 8.96), 7.1 ms
  Minibatch loss: 1.670, learning rate: 0.006634
  Minibatch error: 0.0%
  Validation error: 0.9%
  Step 7800 (epoch 9.08), 7.1 ms
  Minibatch loss: 1.657, learning rate: 0.006302
  Minibatch error: 0.0%
  Validation error: 0.8%
  Step 7900 (epoch 9.19), 7.1 ms
  Minibatch loss: 1.648, learning rate: 0.006302
  Minibatch error: 0.0%
  Validation error: 0.9%
  Step 8000 (epoch 9.31), 7.1 ms
  Minibatch loss: 1.666, learning rate: 0.006302
  Minibatch error: 0.0%
  Validation error: 0.8%
  Step 8100 (epoch 9.43), 7.1 ms
  Minibatch loss: 1.626, learning rate: 0.006302
  Minibatch error: 0.0%
  Validation error: 0.9%
  Step 8200 (epoch 9.54), 7.1 ms
  Minibatch loss: 1.630, learning rate: 0.006302
  Minibatch error: 0.0%
  Validation error: 0.9%
  Step 8300 (epoch 9.66), 7.1 ms
  Minibatch loss: 1.609, learning rate: 0.006302
  Minibatch error: 0.0%
  Validation error: 0.8%
  Step 8400 (epoch 9.77), 7.1 ms
  Minibatch loss: 1.598, learning rate: 0.006302
  Minibatch error: 0.0%
  Validation error: 0.7%
  Step 8500 (epoch 9.89), 7.1 ms
  Minibatch loss: 1.605, learning rate: 0.006302
  Minibatch error: 1.6%
  Validation error: 0.9%
  Test error: 0.8%
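
The epoch and learning-rate columns in the output above are consistent with a staircase exponential decay: the rate starts at 0.01 and is multiplied by 0.95 each time a full pass over the training set completes. The sketch below reproduces those two columns in plain Python without calling TensorFlow. The batch size of 64 and training-set size of 55,000 are assumptions inferred from the log (the epoch figures, and minibatch errors that move in steps of about 1.6%, i.e. 1/64), and the function names are hypothetical.

```python
# Sketch only: reproduces the "epoch" and "learning rate" columns of the log.
# BATCH_SIZE and TRAIN_SIZE are assumptions inferred from the log, not values
# stated in the article.
from __future__ import division  # true division under Python 2 as well

BATCH_SIZE = 64
TRAIN_SIZE = 55000
BASE_RATE = 0.01    # learning rate printed at step 0
DECAY_RATE = 0.95   # ratio between the rates of consecutive epochs in the log


def epoch_at(step):
    """Fractional number of passes over the training set after `step` steps."""
    return step * BATCH_SIZE / TRAIN_SIZE


def learning_rate_at(step):
    """Staircase decay: the rate drops once per completed epoch."""
    completed_epochs = (step * BATCH_SIZE) // TRAIN_SIZE
    return BASE_RATE * DECAY_RATE ** completed_epochs


if __name__ == "__main__":
    for step in (0, 900, 1800, 4300, 8500):
        print("Step %d (epoch %.2f), learning rate: %.6f"
              % (step, epoch_at(step), learning_rate_at(step)))
```

For the steps listed, the printed values match the corresponding log lines; for instance, step 900 gives epoch 1.05 and a rate of 0.009500.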

Last updated: 2016-11-23 17:16:10
