{"id":761,"date":"2020-02-03T16:14:11","date_gmt":"2020-02-04T00:14:11","guid":{"rendered":"http:\/\/209.126.2.187\/?p=761"},"modified":"2020-02-03T16:14:11","modified_gmt":"2020-02-04T00:14:11","slug":"compute-capabilities-and-thoughputs-on-nvidias-gpus","status":"publish","type":"post","link":"https:\/\/nanzhou.cc\/index.php\/2020\/02\/03\/compute-capabilities-and-thoughputs-on-nvidias-gpus\/","title":{"rendered":"Compute Capabilities and Throughputs on NVIDIA&#8217;s GPUs"},"content":{"rendered":"<h2>Summary<\/h2>\n<p>This post introduces throughput and compute capability on NVIDIA&#8217;s GPUs. It does not cover hardware details.<\/p>\n<h2>Conclusion<\/h2>\n<p>It is commonly assumed that half-precision floats run faster on GPUs, as <a href=\"https:\/\/software.intel.com\/en-us\/articles\/performance-benefits-of-half-precision-floats\">this post<\/a> by Intel suggests.<\/p>\n<p><img decoding=\"async\" src=\"https:\/\/software.intel.com\/sites\/default\/files\/m\/5\/e\/2\/8\/5\/45632-f-4.jpg\" alt=\"Intel Half Precision Floats\" title=\"Intel Half Precision Floats\" \/><\/p>\n<p>However, the story is different on NVIDIA&#8217;s GPUs. For example, the GeForce 10 series delivers high GFLOPS with single-precision floats but poor GFLOPS with half-precision floats.<\/p>\n<p><img decoding=\"async\" src=\"http:\/\/161.97.122.139\/wp-content\/uploads\/2020\/02\/Screenshot-from-2020-02-03-16-00-57.png\" alt=\"\" \/><\/p>\n<p>The GeForce 20 series, by contrast, improves half-precision performance.<\/p>\n<p><img decoding=\"async\" src=\"http:\/\/161.97.122.139\/wp-content\/uploads\/2020\/02\/Screenshot-from-2020-02-03-16-05-09.png\" alt=\"\" \/><\/p>\n<p>In fact, the compute capability determines the throughput. 
The compute capability version number identifies the features supported by the GPU hardware and is used by applications at runtime to determine which hardware features and\/or instructions are available on the present GPU.<\/p>\n<p><img decoding=\"async\" src=\"http:\/\/161.97.122.139\/wp-content\/uploads\/2020\/02\/Screenshot-from-2020-02-03-16-09-39-1024x141.png\" alt=\"\" \/><\/p>\n<p>See <a href=\"https:\/\/docs.nvidia.com\/cuda\/cuda-c-programming-guide\/index.html#arithmetic-instructions__throughput-native-arithmetic-instructions\">the official documentation<\/a> for more details.<\/p>\n","protected":false},"excerpt":{"rendered":"<p>Summary This post introduces throughput and compute capability on NVIDIA&#8217;s GPUs. It does not cover hardware details. Conclusion It is commonly assumed that half-precision floats run faster on GPUs, as this post by Intel suggests. However, it is a different story on NVIDIA&#8217;s GPUs. For example, you may&#8230;<\/p>\n","protected":false},"author":1,"featured_media":0,"comment_status":"open","ping_status":"open","sticky":false,"template":"","format":"standard","meta":{"footnotes":""},"categories":[29,31],"tags":[],"class_list":["post-761","post","type-post","status-publish","format-standard","hentry","category-gpu","category-parallel-computation"],"_links":{"self":[{"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/posts\/761","targetHints":{"allow":["GET"]}}],"collection":[{"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/posts"}],"about":[{"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/types\/post"}],"author":[{"embeddable":true,"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/users\/1"}],"replies":[{"embeddable":true,"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/comments?post=761"}],"version-history":[{"count":1,"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/posts\/761\/revisions"}],"predecessor-version":[{"id":765,"href":"https:\/\/n
anzhou.cc\/index.php\/wp-json\/wp\/v2\/posts\/761\/revisions\/765"}],"wp:attachment":[{"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/media?parent=761"}],"wp:term":[{"taxonomy":"category","embeddable":true,"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/categories?post=761"},{"taxonomy":"post_tag","embeddable":true,"href":"https:\/\/nanzhou.cc\/index.php\/wp-json\/wp\/v2\/tags?post=761"}],"curies":[{"name":"wp","href":"https:\/\/api.w.org\/{rel}","templated":true}]}}