{
  "version": "https://jsonfeed.org/version/1", 
  "title": "CUDA", 
  "description": "\u6765\u81ea NVIDIA \u7684\u5e76\u884c\u8fd0\u7b97\u6846\u67b6", 
  "home_page_url": "https://www.v2ex.com/go/cuda", 
  "feed_url": "https://www.v2ex.com/feed/cuda.json", 
  "icon": "https://cdn.v2ex.com/navatar/0614/12e4/896_large.png?m=1597325130", 
  "favicon": "https://cdn.v2ex.com/navatar/0614/12e4/896_normal.png?m=1597325130", 
  "items": [
    {
      "author": {
        "url": "https://www.v2ex.com/member/zoe1016aaa", 
        "name": "zoe1016aaa", 
        "avatar": "https://cdn.v2ex.com/avatar/343f/f1da/447597_large.png?m=1769069846"
      }, 
      "url": "https://www.v2ex.com/t/1082256", 
      "title": "[\u82f1\u4f1f\u8fbe] [\u5317\u4eac\u6216\u8005\u4e0a\u6d77] [\u6df1\u5ea6\u5b66\u4e60\u6027\u80fd\u4f18\u5316-CUDA]", 
      "id": "https://www.v2ex.com/t/1082256", 
      "date_published": "2024-10-21T09:05:57+00:00", 
      "content_html": "<p>\u6211\u4eec\u76ee\u524d\u6b63\u5728\u5bfb\u627e\u4e00\u540d\u6df1\u5ea6\u5b66\u4e60\u6027\u80fd\u8f6f\u4ef6\u5de5\u7a0b\u5e08\uff01\u6211\u4eec\u6b63\u5728\u6269\u5c55\u6211\u4eec\u7684\u63a8\u7406\u7814\u7a76\u4e0e\u5f00\u53d1\u3002\u6211\u4eec\u5bfb\u6c42\u4f18\u79c0\u7684\u8f6f\u4ef6\u5de5\u7a0b\u5e08\u548c\u9ad8\u7ea7\u8f6f\u4ef6\u5de5\u7a0b\u5e08\u52a0\u5165\u6211\u4eec\u7684\u56e2\u961f\u3002\u6211\u4eec\u4e13\u6ce8\u4e8e\u5f00\u53d1 GPU \u52a0\u901f\u7684\u6df1\u5ea6\u5b66\u4e60\u8f6f\u4ef6\u3002\u5168\u7403\u7684\u7814\u7a76\u4eba\u5458\u6b63\u5728\u4f7f\u7528 NVIDIA GPU \u63a8\u52a8\u6df1\u5ea6\u5b66\u4e60\u7684\u9769\u547d\uff0c\u8fd9\u5728\u4f17\u591a\u9886\u57df\u5b9e\u73b0\u4e86\u7a81\u7834\u3002\u52a0\u5165\u6211\u4eec\u7684\u56e2\u961f\uff0c\u6784\u5efa\u4f7f\u65b0\u89e3\u51b3\u65b9\u6848\u6210\u4e3a\u53ef\u80fd\u7684\u8f6f\u4ef6\u3002\u4e0e\u6df1\u5ea6\u5b66\u4e60\u793e\u533a\u5408\u4f5c\uff0c\u5728 Tensor-RT \u4e2d\u5b9e\u73b0\u6700\u65b0\u7b97\u6cd5\u7684\u516c\u5f00\u53d1\u5e03\u3002\u6211\u4eec\u9700\u8981\u4f60\u80fd\u591f\u5728\u5feb\u8282\u594f\u3001\u4ee5\u5ba2\u6237\u4e3a\u4e2d\u5fc3\u7684\u56e2\u961f\u4e2d\u5de5\u4f5c\uff0c\u5e76\u4e14\u5177\u5907\u51fa\u8272\u7684\u6c9f\u901a\u6280\u5de7\u3002</p>\n<p>\u4f60\u5c06\u8981\u505a\u7684\u5de5\u4f5c\u5305\u62ec\uff1a</p>\n<ul>\n<li>\u5f00\u53d1\u9ad8\u5ea6\u4f18\u5316\u7684\u63a8\u7406\u6df1\u5ea6\u5b66\u4e60\u5185\u6838</li>\n<li>\u8fdb\u884c\u6027\u80fd\u4f18\u5316\u3001\u5206\u6790\u548c\u8c03\u6574</li>\n<li>\u4e0e\u6c7d\u8f66\u3001\u56fe\u50cf\u7406\u89e3\u548c\u8bed\u97f3\u7406\u89e3\u7b49\u9886\u57df\u7684\u8de8\u534f\u4f5c\u56e2\u961f\u5408\u4f5c\uff0c\u5f00\u53d1\u521b\u65b0\u89e3\u51b3\u65b9\u6848</li>\n<li>\u5076\u5c14\u51fa\u5dee\u53c2\u52a0\u4f1a\u8bae\u548c\u4e3a\u5ba2\u6237\u8fdb\u884c\u6280\u672f\u54a8\u8be2\u548c\u57f9\u8bad</li>\n</ul>\n<p>\u6211\u4eec\u5e0c\u671b\u770b\u5230\u7684\u8d44\u8d28\uff1a</p>\n<ul>\n<li>\u76f8\u5173\u5b66\u79d1\uff08\u8ba1\u7b97\u673a\u5de5\u7a0b\u3001\u8ba1\u7b97\u673a\u79d1\u5b66\u4e0e\u5de5\u7a0b\u3001\u8ba1\u7b97\u673a\u79d1\u5b66\u3001\u4eba\u5de5\u667a\u80fd\uff09\u7684\u7855\u58eb\u6216\u535a\u58eb\u5b66\u4f4d\u6216\u540c\u7b49\u7ecf\u9a8c</li>\n<li>\u6709\u5e2e\u52a9\u7684\u8f6f\u4ef6\u654f\u6377\u6280\u80fd</li>\n<li>\u51fa\u8272\u7684 C/C++\u7f16\u7a0b\u548c\u8f6f\u4ef6\u8bbe\u8ba1\u6280\u80fd</li>\n<li>Python \u7ecf\u9a8c\u8005\u4f18\u5148</li>\n<li>\u5177\u6709\u6027\u80fd\u5efa\u6a21\u3001\u5206\u6790\u3001\u8c03\u8bd5\u548c\u4ee3\u7801\u4f18\u5316\u7684\u77e5\u8bc6\uff0c\u6216\u5bf9 CPU \u548c GPU \u7684\u67b6\u6784\u77e5\u8bc6</li>\n<li>\u5e0c\u671b\u6709 GPU \u7f16\u7a0b\u7ecf\u9a8c\uff08 CUDA \u6216 OpenCL \uff09</li>\n<li>5 \u5e74\u76f8\u5173\u5de5\u4f5c\u7ecf\u9a8c</li>\n</ul>\n<p>\u5982\u679c\u611f\u5174\u8da3\u8bf7\u8054\u7cfb\uff1a</p>\n<p>\u5fae\u4fe1\uff1a18867144803\n\u7b80\u5386\u6295\u9012\uff1a <a href=\"mailto:xiaozhao@nvidia.com\">xiaozhao@nvidia.com</a></p>\n<p>\u8bf7\u5907\u6ce8\u6295\u9012\u7684\u5c97\u4f4d\u65b9\u5411\u5982\uff1a\u59d3\u540d+\u6df1\u5ea6\u5b66\u4e60\u6027\u80fd\u4f18\u5316</p>\n"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/cwjwgg", 
        "name": "cwjwgg", 
        "avatar": "https://cdn.v2ex.com/gravatar/5bf0a4af3e7eca9f77152858ac3dfa26?s=73&d=retro"
      }, 
      "url": "https://www.v2ex.com/t/1050060", 
      "title": "\u8bad\u7ec3 SVC \u58f0\u97f3\u6a21\u578b 2060 12g \u548c 8G \u7684 3060TI \u54ea\u4e2a\u5feb", 
      "id": "https://www.v2ex.com/t/1050060", 
      "date_published": "2024-06-17T01:41:01+00:00", 
      "content_html": "\u6211\u770b 2060 \u62e5\u6709 1920 \u4e2a CUDA \u6838\u5fc3  \u4f46 12G<br />3060TI \u642d\u8f7d\u4e86 4864 \u4e2a CUDA \u6838  8G<br />\u5e94\u8be5\u662f 3060TI \u66f4\u5feb\u5427  \u80fd\u5757\u591a\u5c11\u5462"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/luckyc", 
        "name": "luckyc", 
        "avatar": "https://cdn.v2ex.com/avatar/4105/08ab/73070_large.png?m=1762737894"
      }, 
      "url": "https://www.v2ex.com/t/998477", 
      "title": "win11 \u4f7f\u7528 cuda \u8c03\u7528\u4e24\u4e2a gpu \u8ba1\u7b97\u65f6\uff0c\u81ea\u5e26\u4efb\u52a1\u7ba1\u7406\u5668\u770b\u4e0d\u5230 gpu2 \u7684\u4f7f\u7528\u7387\uff1f", 
      "id": "https://www.v2ex.com/t/998477", 
      "date_published": "2023-12-07T12:53:35+00:00", 
      "content_html": "\u7528 nvdia-smi \u53ef\u770b\u5230\u529f\u8017\u660e\u663e\u589e\u52a0\u4e86\uff0c<br />\u7528 gpu-z \u4e5f\u53ef\u4ee5\u770b\u5230<br /><br />\u662f\u4e0d\u662f win11 \u7684 bug \uff1f"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/TimeNewRome", 
        "name": "TimeNewRome", 
        "avatar": "https://cdn.v2ex.com/gravatar/06d23ac29662f44881e2f64ea971973d?s=73&d=retro"
      }, 
      "url": "https://www.v2ex.com/t/990897", 
      "date_modified": "2023-11-11T03:57:35+00:00", 
      "content_html": "\u201cstatic inline CUresult cuGetProcAddress_v2_ptsz(const char *symbol, void **funcPtr, int driverVersion, cuuint64_t flags, CUdriverProcAddressQueryResult *symbolStatus) {<br />    const int procAddressMask = (CU_GET_PROC_ADDRESS_LEGACY_STREAM|<br />                                 CU_GET_PROC_ADDRESS_PER_THREAD_DEFAULT_STREAM);<br />    if ((flags &amp; procAddressMask) == 0) {<br />        flags |= CU_GET_PROC_ADDRESS_PER_THREAD_DEFAULT_STREAM;<br />    }<br />    return cuGetPr\u201d<br />    <br />\u8fd9\u4e2a\u5c31\u662f\u65b0\u589e\u7684\u51fd\u6570\u7ed3\u6784\u3002\u4f3c\u4e4e\u8ddf\u4e4b\u524d\u7684 cuGetProcAddress \u51fd\u6570\u5dee\u4e0d\u591a\uff0c\u53ea\u662f\u65b0\u589e\u4e86 CUdriverProcAddressQueryResult \u8fd9\u4e2a\u7ed3\u6784\u4f53\u3002\u8bf7\u95ee\u8fd9\u4e2a\u51fd\u6570\u8be5\u5982\u4f55\u52ab\u6301\u5462\uff1f", 
      "date_published": "2023-11-11T03:41:57+00:00", 
      "title": "[cuda \u51fd\u6570\u52ab\u6301] cuda12.2 \u7248\u672c\u65b0\u589e\u4e86\u4e00\u4e2a\u51fd\u6570 cuGetProcAddress_v2\uff0c\u8bf7\u95ee\u5982\u4f55\u8fdb\u884c\u52ab\u6301\uff1f", 
      "id": "https://www.v2ex.com/t/990897"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/GopherDaily", 
        "name": "GopherDaily", 
        "avatar": "https://cdn.v2ex.com/gravatar/89208b8c3493547fe544b2d5142dc131?s=73&d=retro"
      }, 
      "url": "https://www.v2ex.com/t/959842", 
      "title": "Set Max_split_size_mb To Avoid Oom In Pytorch", 
      "id": "https://www.v2ex.com/t/959842", 
      "date_published": "2023-07-26T04:25:01+00:00", 
      "content_html": "<p>\u6211\u4eec\u5728 k8s \u4e2d\u90e8\u7f72\u4e86 <a href=\"https://github.com/AUTOMATIC1111/stable-diffusion-webui\" rel=\"nofollow\">stable-diffusion-webui</a>\n\u4f9b\u4efb\u4f55\u60f3\u8981\u4f53\u9a8c\u7684 Stable Diffusion Model \u7684\u7528\u6237\u4f7f\u7528.\n\u968f\u7740\u4e00\u4e2a\u53c8\u4e00\u4e2a\u7684\u8bf7\u6c42, \u6211\u4eec\u9891\u7e41\u7684\u9047\u5230 CUDA \u7684 OOM \u9519\u8bef.\n\u5176\u4e2d\u7684\u4e00\u5c0f\u90e8\u5206\u786e\u5b9e\u662f\u56e0\u4e3a\u7528\u6237\u8bf7\u6c42\u9700\u8981\u7684\u8d44\u6e90\u8d85\u8fc7\u4e86\u5bf9\u5e94 GPU \u80fd\u591f\u63d0\u4f9b\u7684\u5185\u5b58.</p>\n<p>\u5269\u4e0b\u7684, \u5360\u5927\u90e8\u5206\u7684, \u662f\u7c7b\u4f3c\u5982\u4e0b\u7684\u4ee4\u4eba\u56f0\u60d1\u7684\u573a\u666f.</p>\n<pre><code class=\"language-json\">{\"error\": \"OutOfMemoryError\", \"detail\": \"\", \"body\": \"\", \"errors\": \"CUDA out of memory. Tried to allocate 1024.00 MiB (GPU 0; 11.76 GiB total capacity; 7.92 GiB already allocated; 784.31 MiB free; 10.63 GiB reserved in total by PyTorch) If reserved memory is &gt;&gt; allocated memory try setting max_split_size_mb to avoid fragmentation.  See documentation for Memory Management and PYTORCH_CUDA_ALLOC_CONF\"}\n</code></pre>\n<p>\u6839\u636e\u5bf9 <a href=\"https://pytorch.org/docs/stable/generated/torch.cuda.memory_stats.html#torch.cuda.memory_stats\" rel=\"nofollow\">memory_stats</a> \u7684\u7406\u89e3:</p>\n<ul>\n<li>GPU \u7684\u5185\u5b58\u662f 11.76G</li>\n<li>pytorch \u5df2\u7ecf\u4ece GPU \u51fa\u8bf7\u6c42\u7684\u5185\u5b58\u662f 10.63G</li>\n<li>pytorch \u5df2\u7ecf\u5206\u914d\u7ed9\u7528\u6237\u7684\u5185\u5b58\u662f 7.92G</li>\n<li>pytorch \u8fd8\u53ef\u4ee5\u5206\u914d\u7684\u5185\u5b58\u4e3a 784.31M, \u8fdc\u5c0f\u4e8e reserved \u51cf allocated \u7684 2.71G</li>\n</ul>\n<p>\u8fd9\u90e8\u5206\u5185\u5b58\u53bb\u54ea\u513f\u4e86\u5462? \u4e3a\u4ec0\u4e48\u5728\u7528\u6237\u7533\u8bf7\u7684\u65f6\u5019\u4f9d\u7136\u6ca1\u6709\u88ab\u56de\u6536\u5462?</p>\n<h2>pytorch \u662f\u5982\u4f55\u5206\u914d\u5185\u5b58\u7684?.</h2>\n<p>\u5f53\u7528\u6237\u8bf7\u6c42\u5185\u5b58\u65f6, pytorch \u7684\u5904\u7406\u6d41\u7a0b\u53ef\u4ee5\u7b80\u5316\u4e3a:</p>\n<ol>\n<li>\u5c1d\u8bd5\u901a\u8fc7 <code>get_free_block</code> \u53bb\u5bfb\u627e\u6ee1\u8db3\u8981\u6c42\u7684\u7a7a\u95f2 Block</li>\n<li>\u5982\u679c\u5931\u8d25, \u5219\u901a\u8fc7 <code>trigger_free_memory_callbacks</code> \u53bb\u56de\u6536\u5df2\u5206\u914d\u4f46\u4e0d\u518d\u4f7f\u7528\u7684 Block \u540e, \u518d\u6b21\u5c1d\u8bd5 <code>get_free_block</code></li>\n<li>\u5982\u679c\u5931\u8d25, \u5219\u901a\u8fc7 <code>alloc_block</code> \u53bb\u5411 GPU \u7533\u8bf7\u65b0\u7684 Block</li>\n<li>\u5982\u679c\u5931\u8d25, \u5219\u901a\u8fc7 <code>release_available_cached_blocks</code> \u5c06\u5df2\u7533\u8bf7\u4f46\u672a\u5206\u914d\u7684 Block \u91ca\u653e\u540e\u518d\u6b21\u5c1d\u8bd5 <code>alloc_block</code></li>\n<li>\u5982\u679c\u5931\u8d25, \u5219\u901a\u8fc7 <code>release_cached_blocks</code> \u5c06\u6240\u6709\u5df2\u7533\u8bf7\u4f46\u672a\u5206\u914d\u7684 Block \u91ca\u653e, \u518d\u6b21\u5c1d\u8bd5 <code>alloc_block</code></li>\n</ol>\n<p>\u6211\u4eec\u6ce8\u610f\u5230 pytorch \u5411 GPU \u7533\u8bf7\u548c\u5206\u914d\u7ed9\u7528\u6237\u7684\u5185\u5b58\u90fd\u4ee5 Block \u4e3a\u5355\u4f4d.\npytorch \u5411 GPU \u7533\u8bf7\u7684 Block \u5927\u5c0f\u5e76\u4e0d\u56fa\u5b9a, \u53d7\u5f53\u65f6\u7528\u6237\u8bf7\u6c42\u5185\u5b58\u5927\u5c0f\u7684\u5f71\u54cd.\n\u7528\u6237\u91ca\u653e\u5185\u5b58\u540e, Block \u8fd4\u56de\u7ed9 pytorch \u5e76\u6210\u4e3a\u7a7a\u95f2\u72b6\u6001.\n\u7528\u6237\u4e0b\u6b21\u7533\u8bf7\u65f6\u4f18\u5148\u4f1a\u590d\u7528\u7a7a\u95f2 Block, \u800c\u4e0d\u662f\u76f4\u63a5\u5411 GPU \u7533\u8bf7.</p>\n<p>\u5982\u679c\u7528\u6237\u7533\u8bf7\u7684\u5185\u5b58\u5927\u5c0f\u5c0f\u4e8e\u6ee1\u8db3\u8981\u6c42\u7684\u7a7a\u95f2 Block, pytorch \u4f1a\u8fdb\u884c\u4e00\u6b21 split \u64cd\u4f5c.\n\u5c06 Block \u5206\u5272\u6210\u4e24\u4e2a Block, \u9664\u53bb\u7528\u6237\u8bf7\u6c42\u5927\u5c0f\u7684\u5185\u5b58\u4f1a\u88ab\u5206\u5272\u6210\u4e00\u4e2a\u72ec\u7acb\u7684 Block,\n\u7559\u5f85\u540e\u7528\u5e76\u901a\u8fc7\u53cc\u5411\u94fe\u8868\u548c\u5206\u914d\u7ed9\u7528\u6237\u7684 Block \u76f8\u5173\u8054.</p>\n<p><code>trigger_free_memory_callbacks</code> \u7684\u56de\u6536\u8fc7\u7a0b\u4f1a\u5c06\u76f8\u90bb\u7684\u7a7a\u95f2 Block \u5408\u5e76, \u63d0\u9ad8\u540e\u7eed\u5206\u914d\u7684\u7075\u6d3b\u6027.</p>\n<p>\u76f8\u8f83\u4e8e\u5176\u4ed6\u5185\u5b58\u7ba1\u7406\u673a\u5236, pytorch \u7684\u5185\u5b58\u7ba1\u7406\u76f8\u5bf9\u7b80\u7565:</p>\n<ul>\n<li>pytorch \u56de\u6536 Block, \u53ea\u5c1d\u8bd5\u5408\u5e76\u76f8\u90bb\u7684\u7a7a\u95f2\u7684 Block, \u5e76\u4e0d\u4f1a\u8fdb\u884c\u642c\u8fd0\u64cd\u4f5c\u6765\u5904\u7406\u4e0d\u76f8\u8fde\u7684\u7a7a\u95f2 Block</li>\n<li>\u4e00\u65e6 Block \u88ab\u5206\u5272, \u5219 pytorch \u65e0\u6cd5\u5c06\u5176\u91ca\u653e. cudaMalloc \u548c cudaFree \u662f\u5bf9\u79f0\u7684, \u4f60\u65e0\u6cd5\u4ec5\u91ca\u653e\u67d0\u6b21\u5206\u914d\u7684\u4e00\u90e8\u5206\u5185\u5b58.</li>\n</ul>\n<p>\u4e0a\u8ff0\u7684\u4e24\u70b9, \u9020\u6210\u4e86 pytorch \u53ef\u80fd\u56e0\u4e3a Block \u788e\u7247\u5316, \u5bfc\u81f4\u5927\u91cf\u5185\u5b58\u65e0\u6cd5\u88ab\u4f7f\u7528.</p>\n<p>\u5047\u8bbe\u5728\u67d0\u6b21\u5206\u914d\u5185\u5b58\u65f6, pytorch \u6839\u636e\u7528\u6237\u8bf7\u6c42\u5411 GPU \u7533\u8bf7\u4e86\u4e00\u4e2a 256M \u7684 Block.<br/>\n&lt;-------------------------- 256M -----------------------------&gt;</p>\n<p>\u7ecf\u8fc7\u591a\u6b21\u5206\u914d\u548c\u56de\u6536, \u5176\u4f7f\u7528\u60c5\u51b5\u53ef\u80fd\u53d8\u6210\u5982\u4e0b.<br/>\n&lt;-- 28M(allocated) --&gt;&lt;-- 100M(free) --&gt;&lt;-- 28M(allocated) --&gt;&lt;-- 100M(free) --&gt;</p>\n<p>\u6b64\u65f6\u5982\u679c\u7528\u6237\u7533\u8bf7 160M \u5185\u5b58:</p>\n<ul>\n<li>\u867d\u7136\u7a7a\u95f2\u7684\u603b\u5185\u5b58\u5927\u4e8e 160M, \u4f46\u662f\u56e0\u4e3a\u6ca1\u6709\u5927\u4e8e 160M \u7684 Block, \u6240\u4ee5\u65e0\u6cd5\u5206\u914d</li>\n<li>pytorch \u4e5f\u65e0\u6cd5\u5c06\u7a7a\u95f2\u7684 100M+100M \u5185\u5b58\u8fd4\u56de\u7ed9 GPU, \u5bfc\u81f4\u4e5f\u65e0\u6cd5\u5411 GPU \u7533\u8bf7 160M \u5185\u5b58.</li>\n</ul>\n<h2>max_split_size_mb \u7684\u4f5c\u7528</h2>\n<p>max_split_size_mb \u7684\u4f5c\u7528\u5728\u4e8e\u7981\u6b62 pytorch \u5bf9\u4efb\u4f55\u5927\u4e8e\u8be5\u5927\u5c0f\u7684 Block \u8fdb\u884c\u5206\u5272\u64cd\u4f5c, \u4ece\u800c\u63a7\u5236\u788e\u7247\u5316\u7684\u7a0b\u5ea6.\n\u6211\u4eec\u4e0a\u6587\u8bb2\u8bc9\u7684\u90fd\u662f\u5728\u672a\u4e3b\u52a8\u8bbe\u7f6e max_split_size_mb \u7684\u60c5\u51b5\u4e0b\u7684\u903b\u8f91, \u6b64\u65f6 max_split_size_mb \u53d6\u9ed8\u8ba4\u503c MAX_INT.</p>\n<p>\u6211\u4eec\u5e76\u6ca1\u6709\u627e\u5230\u5b98\u65b9\u63a8\u8350\u7684 max_split_size_mb, \u6211\u4eec\u4e5f\u4e0d\u719f\u6089 pytorch \u548c nvida, \u5f88\u96be\u7ed9\u51fa\u4e00\u4e2a\u5f88\u597d\u7684\u63a8\u8350\u503c.\n\u4ece\u5b9e\u9645\u4f7f\u7528\u6765\u548c\u76f4\u89c2\u903b\u8f91\u6765\u8bf4, 128/256/512 \u4e4b\u7c7b\u7684\u503c\u90fd\u662f\u53ef\u9009\u7684, \u5207\u5b9e\u7684\u907f\u514d\u4e86 OOM, \u4e5f\u6ca1\u6709\u5bfc\u81f4\u660e\u663e\u7684\u6027\u80fd\u8d1f\u62c5.</p>\n<h2>garbage_collection_threshold</h2>\n<p>pytorch \u9ed8\u8ba4\u4ec5\u5728\u65e0\u6cd5\u83b7\u53d6\u5230\u5408\u9002\u7684\u7a7a\u95f2 Block \u65f6\u89e6\u53d1\u56de\u6536,\n\u8fd9\u4e2a\u503c\u53ef\u4ee5\u63a7\u5236\u5f53 allocated/capacity \u8d85\u8fc7\u6b64\u503c\u65f6\u89e6\u53d1\u4e3b\u52a8\u7684\u56de\u6536.</p>\n<h2>Expandable Segments</h2>\n<p>pytorch \u6700\u65b0(&gt;v2.0.1)\u7684 master \u5206\u652f\u4e2d\u6dfb\u52a0\u4e86 <a href=\"https://github.com/pytorch/pytorch/blob/main/c10/cuda/CUDACachingAllocator.cpp#L267\" rel=\"nofollow\">Expandable Segments</a>,\n\u53ef\u80fd\u4e5f\u53ef\u4ee5\u7f13\u89e3\u788e\u7247\u5316\u7684\u95ee\u9898.</p>\n<h2>References</h2>\n<ul>\n<li><a href=\"https://civitai.com/articles/194/basic-things-you-might-not-know-how-to-avoid-cuda-out-of-memory\" rel=\"nofollow\">Basic things you might not know: How to avoid CUDA OUT OF MEMORY?</a></li>\n<li><a href=\"https://blog.csdn.net/MirageTanker/article/details/127998036\" rel=\"nofollow\">\u901a\u8fc7\u8bbe\u7f6e PYTORCH_CUDA_ALLOC_CONF \u4e2d\u7684 max_split_size_mb \u89e3\u51b3 Pytorch \u7684\u663e\u5b58\u788e\u7247\u5316\u5bfc\u81f4\u7684 CUDA:Out Of Memory \u95ee\u9898</a></li>\n<li><a href=\"https://zhuanlan.zhihu.com/p/486360176\" rel=\"nofollow\">\u4e00\u6587\u8bfb\u61c2 PyTorch \u663e\u5b58\u7ba1\u7406\u673a\u5236</a></li>\n<li><a href=\"https://pytorch.org/docs/stable/notes/cuda.html#memory-management\" rel=\"nofollow\">pytorch's Memory management</a></li>\n</ul>\n"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/zoe1016aaa", 
        "name": "zoe1016aaa", 
        "avatar": "https://cdn.v2ex.com/avatar/343f/f1da/447597_large.png?m=1769069846"
      }, 
      "url": "https://www.v2ex.com/t/947399", 
      "title": "[\u82f1\u4f1f\u8fbe NVIDIA] [\u4e0a\u6d77/\u5317\u4eac/\u6df1\u5733] [CUDA \u76f8\u5173\u5c97\u4f4d]", 
      "id": "https://www.v2ex.com/t/947399", 
      "date_published": "2023-06-09T10:09:29+00:00", 
      "content_html": "<p>[ \u5730\u70b9 ] \uff1aShanghai/Beijing/Shenzhen</p>\n<p>[ \u53d1\u9001\u7b80\u5386\u5230 ] :xiaozhao@nvidia.com</p>\n<p>[ WeChat \u53ef\u52a0\u5fae\u4fe1 ] \uff1a18867144803</p>\n<p>\u4ee3\u7801\u80fd\u529b\u3009\u5de5\u4f5c\u5e74\u9650</p>\n<p>Deep Learning Performance Architect-Compiler/LLM-TensorRT</p>\n<p>\u4e3b\u8981\u505a\u7684\u662f\u56f4\u7ed5\u6df1\u5ea6\u5b66\u4e60\u7aef\u5230\u7aef\u7684 AI \u8f6f\u4ef6\u5168\u6808\uff0c\u5305\u62ec\u4f46\u4e0d\u9650\u4e8e\u8bad\u7ec3\u6846\u67b6\u3001\u6838\u5fc3\u8ba1\u7b97\u5e93\u3001\u63a8\u7406\u4f18\u5316\u5de5\u5177\uff08\u6bd4\u5982 TensorRT \uff09\uff0cAI \u7f16\u8bd1\u5668\uff0c\u6a21\u578b\u538b\u7f29\u7b49\u5168\u6808\u8f6f\u4ef6\u6808\u3002\u4ee5\u53ca\u53ef\u4ee5\u5728 AI \u8f6f\u4ef6\u5168\u6808\u57fa\u7840\u4e0a\u5f71\u54cd\u5230\u4e0b\u4e00\u4ee3\u751a\u81f3\u4e0b\u4e24\u4ee3\u786c\u4ef6\u67b6\u6784\u7684\u7279\u6027\u8bbe\u8ba1\u3002</p>\n<p>Required skills: \u826f\u597d C++\u7f16\u7a0b\uff0c\u719f\u6089 AI \u8f6f\u4ef6\u6808\u5e95\u5c42\u6216\u8005\u8ba1\u7b97\u673a\u4f53\u7cfb\u7ed3\u6784\uff0c\u719f\u6089\u4e0a\u5c42\u7b97\u6cd5\u4e0e Python \u662f\u52a0\u5206\u9879\u3002</p>\n<p>\u5730\u70b9\uff1a\u5317\u4eac\u4e0e\u4e0a\u6d77</p>\n<p>Deep Learning Performance Architect-TensorRT</p>\n<p>\u8d1f\u8d23 NVIDIA \u6df1\u5ea6\u5b66\u4e60\u63a8\u7406\u5f15\u64ce TensorRT \u7684\u8bbe\u8ba1\u3001\u5f00\u53d1\u548c\u7ef4\u62a4\u5de5\u4f5c(e.g. TensorRT \u6a21\u578b\u5bfc\u5165\u7684\u6d41\u7a0b\u548c\u76f8\u5173\u5de5\u5177\uff0c\u56fe\u4f18\u5316\uff0c\u7b97\u5b50\u7684 CUDA \u5b9e\u73b0\u53ca\u4ee3\u7801\u751f\u6210\uff0c\u7b97\u5b50\u6027\u80fd\u4f18\u5316\u7b49)\uff0c\u4ee5\u53ca\u5bf9\u5f53\u524d\u4e3b\u6d41\u7684\u6df1\u5ea6\u5b66\u4e60\u6a21\u578b\u4f7f\u7528 TensorRT \u8fdb\u884c\u63a8\u7406\u7684\u6027\u80fd\u8fdb\u884c\u5206\u6790\u548c\u4f18\u5316\u3002\u540c\u65f6\uff0c\u8fd8\u5c06\u4e0e NVIDIA GPU \u4f53\u7cfb\u7ed3\u6784\u8bbe\u8ba1\u56e2\u961f\u5408\u4f5c\u6765\u63a8\u52a8 NVIDIA \u6df1\u5ea6\u5b66\u4e60\u89e3\u51b3\u65b9\u6848\u7684\u8f6f\u786c\u4ef6\u534f\u540c\u8bbe\u8ba1\u548c\u7814\u53d1\u3002</p>\n<p>\u5c97\u4f4d\u57fa\u672c\u8981\u6c42: \u719f\u7ec3\u638c\u63e1 C++\u7f16\u7a0b</p>\n<p>\u5176\u5b83\u5bc6\u5207\u76f8\u5173\u7684\u6280\u80fd /\u7ecf\u9a8c: \u6df1\u5ea6\u5b66\u4e60\u6846\u67b6 /\u6df1\u5ea6\u5b66\u4e60\u7f16\u8bd1\u5668\u5f00\u53d1\uff0c\u6027\u80fd\u5206\u6790 /\u5efa\u6a21 /\u4f18\u5316\u76f8\u5173\u7684\u65b9\u6cd5\u8bba /\u5de5\u5177\uff0c\u8ba1\u7b97\u673a\u4f53\u7cfb\u7ed3\u6784\u76f8\u5173\u77e5\u8bc6\uff0cCUDA kernel \u5f00\u53d1 /\u4f18\u5316</p>\n<p>\u5730\u70b9\uff1a\u5317\u4eac\u4e0e\u4e0a\u6d77</p>\n<p>Deep Learning Performance Architect-Operator</p>\n<p>\u4e3b\u8981\u505a\u7684\u662f\u9488\u5bf9\u4e0d\u540c GPU \u67b6\u6784\u4e3a TensorRT, cuDNN, cuBLAS, cuSPARSE \u7b49\u6df1\u5ea6\u5b66\u4e60\u7b97\u5b50\u5e93\u63d0\u4f9b\u9ad8\u6027\u80fd\u57fa\u7840\u7b97\u5b50\u4ee5\u53ca\u7b97\u5b50\u878d\u5408\u5b9e\u73b0\uff0c\u5305\u542b\u5728\u7ebf\u4ee3\u7801\u751f\u6210\uff0c\u4ee3\u7801\u878d\u5408\u7b49\u76f8\u5173\u5f00\u53d1\u5de5\u4f5c\uff0c\u4ee5\u53ca\u6839\u636e\u5f53\u4ee3 GPU \u4f18\u5316\u74f6\u9888\u5f71\u54cd\u540e\u7eed\u786c\u4ef6\u67b6\u6784\u7279\u5f81\u8bbe\u8ba1\u548c\u9a8c\u8bc1\u5de5\u4f5c\u3002</p>\n<p>Required skills: \u826f\u597d C++\u7f16\u7a0b\uff0c\u719f\u6089\u8ba1\u7b97\u673a\u4f53\u7cfb\u7ed3\u6784\uff0c \u6709 TVM, MLIR \u76f8\u5173\u5f00\u53d1\u7ecf\u9a8c\u662f\u52a0\u5206\u9879\u3002</p>\n<p>\u5730\u70b9\uff1a\u4e0a\u6d77\u4e0e\u5317\u4eac</p>\n<p>Deep Learning Performance Architect</p>\n<p>\u4e3b\u8981\u505a\u7684\u662f\u56f4\u7ed5\u8fd0\u7b97\u67b6\u6784\u7684\u5168\u6808\u4f18\u5316\uff0c\u5305\u62ec\u4f46\u4e0d\u9650\u4e8e\u6df1\u5ea6\u5b66\u4e60\u6a21\u578b\u5206\u6790\u4e0e\u9884\u6d4b\uff0c\u67b6\u6784\u7684\u6027\u80fd\u5206\u6790\uff0c\u7f16\u8bd1\u5668\u6027\u80fd\u5206\u6790\u4ee5\u53ca\u5bf9\u4e3b\u6d41\u8fd0\u7b97\u67b6\u6784\uff0c\u8f6f\u4ef6\u751f\u6001\u7684\u5206\u6790\u3002\u4f7f NVIDIA \u8f6f\u4ef6\u751f\u6001\u4e0e\u8ba1\u7b97\u67b6\u6784\u66f4\u597d\u7684\u652f\u6301\u4e3b\u6d41\u5e94\u7528\u3002</p>\n<p>Required skills: \u826f\u597d C++/Python \uff0c\u719f\u6089 AI \u8f6f\u4ef6\u6216\u8005\u8ba1\u7b97\u673a\u4f53\u7cfb\u7ed3\u6784\u3002</p>\n<p>\u5730\u70b9\uff1a\u5317\u4eac\u4e0e\u4e0a\u6d77</p>\n<p>Developer Technology Engineer-AI</p>\n<p>\u5ba2\u6237\u7684\u6df1\u5ea6\u5b66\u4e60\u548c\u9ad8\u80fd\u6027\u8ba1\u7b97\u5e94\u7528\u5728 NVIDIA \u751f\u6001\u4e0a\u7684\u79fb\u690d\u548c\u4f18\u5316\u3002\u8fd9\u4e9b\u5e94\u7528\u5305\u62ec\u5927\u8bed\u8a00\u6a21\u578b\uff0cCV \uff0cSpeech,\u63a8\u8350\u7cfb\u7edf\u548c\u5206\u5b50\u52a8\u529b\u5b66\uff0c\u8ba1\u7b97\u529b\u5b66\uff0c\u8ba1\u7b97\u91cf\u5b50\u5316\u5b66\u7b49\u3002\u901a\u8fc7\u7b97\u6cd5\u548c\u5de5\u7a0b\u4f18\u5316\uff0c\u63d0\u4f9b\u7cfb\u7edf\u7ea7\u7684\u4f18\u5316\u65b9\u6848\u3002\u6df1\u5ea6\u4e0e\u5185\u90e8\u67b6\u6784\u548c\u4ea7\u54c1\u56e2\u961f\u5408\u4f5c\uff0c\u6784\u5efa\u548c\u5b8c\u5584 NVIDIA \u8f6f\u786c\u4ef6\u52a0\u901f\u751f\u6001\u3002</p>\n<p>Required skills: Required Skills: \u826f\u597d C/C++\u7f16\u7a0b\u80fd\u529b\uff0c\u5206\u6790\u80fd\u529b\u548c\u6c9f\u901a\u80fd\u529b\uff0c\u719f\u6089\u6df1\u5ea6\u5b66\u4e60\u6216 GPU \u52a0\u901f\u8ba1\u7b97\u8f6f\u4ef6\u6808\uff0c\u624e\u5b9e\u7684\u6df1\u5ea6\u5b66\u4e60\u7406\u8bba\u57fa\u7840\u6216\u7cbe\u901a GPU \u67b6\u6784\u548c\u4f18\u5316\u3002</p>\n<p>\u5730\u70b9\uff1a\u5317\u4eac\uff0c\u4e0a\u6d77\u4e0e\u6df1\u5733</p>\n"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/leven87", 
        "name": "leven87", 
        "avatar": "https://cdn.v2ex.com/gravatar/b4497986025202f2280dc8497ab80cb7?s=73&d=retro"
      }, 
      "url": "https://www.v2ex.com/t/837601", 
      "title": "\u5982\u679c\u5b9e\u73b0 openmpi \u548c cuda \u7f16\u7a0b\u7684\u7ed3\u5408", 
      "id": "https://www.v2ex.com/t/837601", 
      "date_published": "2022-03-03T02:04:27+00:00", 
      "content_html": "<p>\u5404\u4f4d V \u53cb\u597d\uff0c\u6211\u521a\u63a5\u89e6 cuda \u7f16\u7a0b\u3002\u73b0\u5728\u53ef\u4ee5\u5b9e\u73b0\u5229\u7528\u5355 cpu \u548c gpu \u6765\u52a0\u901f\u8fd0\u7b97\u3002 \u73b0\u5728\u9700\u8981\u5b9e\u73b0\u591a cpu \u548c gpu \u6765\u8fdb\u4e00\u6b65\u52a0\u901f\u8fd0\u7b97\uff0c \u770b\u7f51\u4e0a\u4f8b\u5b50\uff0c\u9700\u8981\u7528\u5230 openmpi, \u8fd8\u8981\u5f00\u542f\u5b83\u7684 cuda \u652f\u6301\u3002 \u8bf7\u95ee\uff1a\n\u8fd9\u6761\u9053\u8def\u662f\u5426\u6b63\u786e\uff1f\n\u8fd8\u6709\u54ea\u4e9b\u9700\u8981\u6ce8\u610f\u7684\u5730\u65b9\uff0ccuda \u4ee3\u7801\u7684\u4fee\u6539\uff0c\u6216\u8005\u914d\u7f6e\u5565\u7684\uff1f</p>\n"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/gouchaoer", 
        "name": "gouchaoer", 
        "avatar": "https://cdn.v2ex.com/avatar/58d5/5587/189082_large.png?m=1480987620"
      }, 
      "url": "https://www.v2ex.com/t/831540", 
      "date_modified": "2022-01-31T12:29:28+00:00", 
      "content_html": "<p>\u95ee\u4e2a\u6280\u672f\u95ee\u9898\uff0c\u5f00\u59cb\u89c9\u5f97\u5f88\u7b80\u5355\uff0c\u4f46\u662f\u641c\u4e86\u5f88\u4e45\u6ca1\u7ed3\u679c\u3002\u3002\u3002\u6211\u7528 NVIDIA \u7684 nvdec \u628a 4 \u4e2a h264 \u7684\u89c6\u9891\u89e3\u7801\u51fa\u6765\u6210\u4e86 rgba \u7684 4 \u5f20 raw \u56fe\u50cf\uff0c\u8bf7\u95ee\u6211\u600e\u4e48\u628a\u5b83\u8f93\u51fa\u5230\u663e\u5361\u7684 4 \u4e2a dp \u53e3\uff1f\u6700\u597d\u662f NVIDIA \u7684 api \uff0cOpenGL \u554a drm \u4e4b\u7c7b\u7684\u5305\u88c5\u8fc7\u7684\u4e5f\u884c</p>\n", 
      "date_published": "2022-01-31T12:27:55+00:00", 
      "title": "\u600e\u4e48\u628a\u663e\u5361\u663e\u5b58\u4e2d\u7684 rgba \u56fe\u50cf\u6e32\u67d3\u8f93\u51fa\uff1f", 
      "id": "https://www.v2ex.com/t/831540"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/wangx0102", 
        "name": "wangx0102", 
        "avatar": "https://cdn.v2ex.com/avatar/2964/0b5a/432993_large.png?m=1695724205"
      }, 
      "url": "https://www.v2ex.com/t/794158", 
      "date_modified": "2021-08-06T11:44:19+00:00", 
      "content_html": "<p>\u5bfc\u5e08\u7ed9\u4e86\u4e00\u4e2a\u7a0b\u5e8f\uff0c\u5b9e\u73b0\u4e86\u4e00\u4e2a\u4e2d\u95f4\u4ef6\u53ef\u4ee5\u5b9e\u73b0 CPU \u548c GPU \u8fd0\u7b97\u7684\u8d1f\u8f7d\u5747\u8861\u3002</p>\n<p>\u6211\u7684\u521d\u6b65\u60f3\u6cd5\u662f\u628a CUDA \u7a0b\u5e8f\u6253\u5305\u6210 exe <a href=\"http://\u6216\u8005.so\" rel=\"nofollow\">\u6216\u8005.so</a> \u5565\u7684\uff0c\u7136\u540e\u7528 Python \u8c03\u7528\uff0c\u4f7f\u7528 Celery \u5b9e\u73b0\u5206\u5e03\u5f0f\u96c6\u7fa4\u3002</p>\n<p>\u5e0c\u671b\u5927\u5bb6\u80fd\u6709\u66f4\u597d\u7684\u60f3\u6cd5</p>\n", 
      "date_published": "2021-08-06T11:38:23+00:00", 
      "title": "\u5982\u4f55\u5b9e\u73b0 CUDA \u7684\u5206\u5e03\u5f0f\u5e76\u884c\u8fd0\u7b97\uff1f", 
      "id": "https://www.v2ex.com/t/794158"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/huzhikuizainali", 
        "name": "huzhikuizainali", 
        "avatar": "https://cdn.v2ex.com/avatar/1869/a390/522912_large.png?m=1752498684"
      }, 
      "url": "https://www.v2ex.com/t/775344", 
      "title": "\u6e38\u620f\u672c\u4e0a\u7528 cuda \u662f\u4ec0\u4e48\u4f53\u9a8c\uff1f", 
      "id": "https://www.v2ex.com/t/775344", 
      "date_published": "2021-05-07T01:48:50+00:00", 
      "content_html": "1 \u3001\u6709\u4eba\u5728\u6e38\u620f\u672c\u4e0a\u7528\u8fc7 cuda \u4e48\uff1f\u4f53\u9a8c\u5982\u4f55\uff1f\u8003\u8651\u5230\u91cd\u91cf\u589e\u5927\uff0c\u5f85\u673a\u53d8\u77ed\uff0c\u53d1\u70ed\u3002\u540c\u7b97\u529b\u7684\u589e\u5f3a\u76f8\u6bd4\u3002\u7efc\u5408\u5229\u5f0a\u5f97\u5931\uff0c\u5e26\u4e2a\u6e38\u620f\u672c\u8dd1 cuda \u662f\u5426\u503c\u5f97\uff1f<br /><br />2 \u3001\u8003\u8651\u5230\u79fb\u52a8\u529e\u516c\uff0c\u6570\u636e\u5b89\u5168\uff08\u6570\u636e\u548c\u4ee3\u7801\u4e0a\u4f20\u5230\u4e91\u7aef\u4e0d\u592a\u653e\u5fc3\uff09\uff0c\u5982\u679c\u7528\u6e38\u620f\u672c+cuda\uff0c\u662f\u5426\u8fd8\u6709\u5176\u4ed6\u4e1d\u6ed1\u4f53\u9a8c\u7684\u65b9\u6848\uff1f<br /><br />3 \u3001\u5404\u4f4d\u5927\u795e\u73b0\u5728\u662f\u600e\u4e48\u89e3\u51b3\u79fb\u52a8\u5f00\u53d1\u7684\u5462\uff1f<br /><br />4 \u3001\u5de5\u4f5c\u5f53\u4e2d\u7528 cuda \u548c\u4e0d\u7528 cuda \u5bf9\u6bd4\uff0c\u51fa\u7ed3\u679c\u7684\u65f6\u95f4\u80fd\u8282\u7701\u591a\u5c11\uff1f\u53ef\u5426\u5206\u4eab\u4e00\u4e0b\u201c\u771f\u5b9e\u201d\u5bf9\u6bd4\u6848\u4f8b\uff1f"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/BillHuang", 
        "name": "BillHuang", 
        "avatar": "https://cdn.v2ex.com/avatar/343f/5feb/460327_large.png?m=1610344655"
      }, 
      "url": "https://www.v2ex.com/t/743810", 
      "title": "Tesla k20m \u4f7f\u7528\u95ee\u9898", 
      "id": "https://www.v2ex.com/t/743810", 
      "date_published": "2021-01-11T05:59:26+00:00", 
      "content_html": "\u6709\u4eba\u7528\u8fc7\u8fd9\u663e\u5361\u5417\uff1f\u7f51\u4e0a\u82b1 300 \u6dd8\u5230\u7684\uff0c\u63d2\u4e0a\u540e\uff0c\u5f00\u673a\u4e3b\u677f\u62a5\u663e\u5361\u9519\u4e86\uff0c\u662f\u6211\u64cd\u4f5c\u4e0d\u5bf9\uff0c\u8fd8\u662f\u8fd9\u5361\u6709\u95ee\u9898\uff1f"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/Kai", 
        "name": "Kai", 
        "avatar": "https://cdn.v2ex.com/avatar/021b/bc7e/1024_large.png?m=1657391813"
      }, 
      "url": "https://www.v2ex.com/t/682893", 
      "title": "CUDA on WSL", 
      "id": "https://www.v2ex.com/t/682893", 
      "date_published": "2020-06-18T19:40:48+00:00", 
      "content_html": "<p><a href=\"https://devblogs.nvidia.com/announcing-cuda-on-windows-subsystem-for-linux-2/\" rel=\"nofollow\">https://devblogs.nvidia.com/announcing-cuda-on-windows-subsystem-for-linux-2/</a></p>\n"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/different", 
        "name": "different", 
        "avatar": "https://cdn.v2ex.com/avatar/94f6/7670/374456_large.png?m=1661142546"
      }, 
      "url": "https://www.v2ex.com/t/591013", 
      "title": "\u5173\u4e8e\u4f7f\u7528 GPU \u751f\u6210\u968f\u673a\u6570\uff08cuda/opencl\uff09", 
      "id": "https://www.v2ex.com/t/591013", 
      "date_published": "2019-08-11T14:16:02+00:00", 
      "content_html": "<p>\u7531\u4e8e\u7279\u6b8a\u539f\u56e0\uff08\u539f\u56e0\u5f88\u7279\u6b8a\uff08\u624b\u52a8\u72d7\u5934\uff09\uff09\u5e76\u4e0d\u80fd\u4f7f\u7528 cuda \u81ea\u5e26\u7684\u968f\u673a\u51fd\u6570\u3002</p>\n<p>\u56e0\u6b64\uff0c\u7ffb\u8f66\u4e86....\u3002</p>\n<p>\u76ee\u7684\uff1a\u5728\u4e0d\u4f7f\u7528 cuda \u81ea\u5e26\u7684\u968f\u673a\u51fd\u6570\u524d\u63d0\u4e0b\uff0c\u4f7f\u7528 cuda/opencl \u7684\u4e00\u4e2a\u5185\u6838\u51fd\u6570\u751f\u6210 10000 \u4e2a\u9ad8\u65af\u5206\u5e03\u7684\u968f\u673a\u6570\u3002</p>\n<p>\u672c\u4eba\u5df2\u5c1d\u8bd5\u4e00\u4e0b\u6b65\u9aa4\uff1a</p>\n<p>1.\u5728 cpu \u751f\u6210 10000 \u7684\u968f\u673a\u51fd\u6570\uff08\u5e94\u8be5\u662f\u7ebf\u6027\u540c\u4f59\u7b97\u6cd5\uff09</p>\n<p>2.\u5728 cpu \u4f7f\u7528 The Box \u2013 Muller transform \uff08\u542c\u8bf4\u548c\u7ebf\u6027\u540c\u4f59\u7b97\u6cd5\u4f7f\u7528\u8d77\u6765\u4f1a\u7ffb\u8f66..\uff09\u7b97\u6cd5\u5c06\u6b65\u9aa4 1 \u7684\u968f\u673a\u6570\u8f6c\u6210\u6b63\u6001\u5206\u5e03</p>\n<p>3.\u7136\u540e\u68c0\u9a8c\u662f\u5426\u4e3a\u6b63\u6001\u5206\u5e03\uff0c\u7ed3\u679c\u662f\u5bf9\u7684.</p>\n<p>4.\u81f3\u6b64\uff0c\u5df2\u7ecf\u751f\u6210\u4e86\u4e00\u4e2a 10000 \u4e2a\u670d\u4ece\u9ad8\u65af\u5206\u5e03\u7684\u968f\u673a\u6570\u5566\uff0c\u5c06\u5176\u4fdd\u5b58\u5230\u6570\u7ec4 a\u3002</p>\n<p>\u4e8b\u5b9e\u4e0a\u9700\u8981\u4e0d\u65ad\u751f\u6210\u5e76\u4f7f\u7528\u6570\u7ec4 a\u3002</p>\n<p>\u56e0\u6b64\u8003\u8651 GPU</p>\n<p>\u5206\u6790\uff1a\u4e0a\u8ff0\u7684 cpu \u4ee3\u7801\u662f\u5e8f\u5217\u8fdb\u884c\u7684\uff0c\u4e5f\u5c31\u662f\u53ea\u6709\u4e00\u4e2a\u968f\u673a\u79cd\u5b50\uff0c\u7136\u540e\u5728\u4e00\u4e2a\u7ebf\u7a0b\u5185\u5b8c\u6210\u4e86 10000 \u4e2a\u968f\u673a\u6570\u7684\u751f\u6210\u3002</p>\n<p>\u7136\u540e\u5c06\u4ee3\u7801\u6539\u6539\u653e\u5230 GPU \u4e0a\u9762\u6765\u751f\u6210\u3002(\u76ee\u6807\u662f\u5b9e\u73b0\u4e0e cuda \u7684\u51fd\u6570 curandGenerateNormal(cuda::generator, cudaRand, number, 0.0, 1.0); \u4e00\u6478\u4e00\u6837\u7684\u529f\u80fd)\u3002</p>\n<p>\u4e3a\u4e86\u5f97\u5230\u4e0e curandGenerateNormal \u51fd\u6570\u76f8\u540c\u7684\u7ed3\u679c\uff0c\u6211\u5c1d\u8bd5\u6bcf\u4e2a\u5185\u6838\u7ebf\u7a0b\u7ef4\u62a4\u4e00\u4e2a\u79cd\u5b50\uff0c\u4e5f\u5c31\u662f\u6709 10000 \u4e2a\u968f\u673a\u6570\u79cd\u5b50\u3002(\u8c03\u7528\u4e00\u6b21\u5185\u6838\uff0c\u7136\u540e\u6267\u884c\u4e00\u4e07\u4e2a\u7ebf\u7a0b\uff0c\u6bcf\u9694\u7ebf\u7a0b\u4f7f\u7528\u81ea\u5df1\u7684\u79cd\u5b50\u751f\u6210\u4e00\u4e2a\u968f\u673a\u6570\uff0c\u7136\u540e\u7ec4\u5408\u5230\u6570\u7ec4 a \u4e2d)\n\u4f46\u662f\u76ee\u524d\uff0c\u6211\u505a\u4e86\u8bd5\u9a8c\u4e2d\uff0c\u5982\u679c\u6bcf\u4e2a\u5185\u6838\u7ebf\u7a0b\u7ef4\u62a4\u4e00\u4e2a\u79cd\u5b50\uff0c\u6bcf\u4e2a\u7ebf\u7a0b\u7ef4\u62a4 a[i](i \u4e3a\u7ebf\u7a0b id),\u6700\u540e\u7684\u51fa\u6765\u7684\u5e76\u4e0d\u670d\u4ece\u9ad8\u65af\u5206\u5e03\u3002</p>\n<p>\u4e5f\u5c31\u662f\u8bf4\uff0c\u7eb5\u5411\u53bb\u770b\u7684\u8bdd\uff08 cpu \u4e32\u884c\uff09\u662f\u53ef\u4ee5\u5f97\u5230\u9ad8\u65af\u5206\u5e03\u7684\u968f\u673a\u6570\uff0c\u6a2a\u5411\u5e76\u4e0d\u884c\u3002</p>\n<p>\u4e5f\u5c31\u662f\u8bf4\uff0c\u5047\u5982\u6709 a \u6570\u7ec4\uff0cb \u6570\u7ec4....z \u6570\u7ec4\u4e2d\uff0c\u6bcf\u4e2a\u6570\u7ec4\u81ea\u4e2a\u662f\u9ad8\u65af\u5206\u5e03\uff0c\u4f46\u662f a...z \u4e2d\uff0c\u5404\u53d6\u4e00\u4e2a\u51fa\u6765\uff0c\u7ec4\u5408\u5728\u4e00\u8d77\uff0c\u5e76\u4e0d\u670d\u4ece\u9ad8\u65af\u5206\u5e03\u3002</p>\n<p>\u800c\u5982\u679c\u4ece\u76f4\u89c2\u4e0a\u51fa\u53d1\uff0c\u4e0a\u8ff0\u5e94\u8be5\u4e5f\u670d\u4ece\u9ad8\u65af\u5206\u5e03\uff0c\u4f46\u662f\u7531\u4e8e\u968f\u673a\u79cd\u5b50\u7684\u95ee\u9898\uff0c\u53ef\u80fd\u5bfc\u81f4\u5176 a....z \u53ef\u80fd\u6709\u76f8\u5173\u6027\u3002\u5177\u4f53\u539f\u56e0\u6211\u4e5f\u4e0d\u662f\u5f88\u6e05\u695a\u3002</p>\n<p>\u4e0d\u77e5\u9053\u8868\u8fbe\u6e05\u695a\u6ca1\uff0c\u5404\u4f4d\u5144\u53f0\u6709\u6ca1\u6709\u4e86\u89e3\u8fc7\u76f8\u5173\u7684\u4fe1\u606f\uff1f</p>\n<p>\u4e00\u53e5\u8bdd\u6982\u62ec\u5c31\u662f\uff1acurandGenerateNormal \u51fd\u6570\u76f8\u540c\u7684\u529f\u80fd...</p>\n<p>\u6240\u4ee5\u60f3\u95ee\u95ee\u5927\u4f19\u6709\u505a\u8fc7\u76f8\u5173\u7684\u7814\u7a76\u5417\uff1f</p>\n"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/different", 
        "name": "different", 
        "avatar": "https://cdn.v2ex.com/avatar/94f6/7670/374456_large.png?m=1661142546"
      }, 
      "url": "https://www.v2ex.com/t/580600", 
      "date_modified": "2019-07-06T09:05:15+00:00", 
      "content_html": "<p>\u6307\u7684\u662f\u53cc\u7cbe\u5ea6\u3002</p>\n<p>\u4e0d\u77e5\u9053\u662f\u4e0d\u662f\u7f16\u8bd1\u7684\u65f6\u5019\u53cc\u7cbe\u5ea6\u9700\u8981\u6dfb\u52a0\u4e00\u4e9b\u5176\u4ed6\u6307\u4ee4\uff1f</p>\n<p>\u4e0b\u9762\u662f kernel\u3002</p>\n<p>void CSR(int i,unsigned int N,\nunsigned int *xadj,unsigned int *adjncy,\ndouble *dataxx,double *datayy,double *datazz,\ndouble *Cspin,\ndouble *CHDemag,double *CH)</p>\n<p>{</p>\n<pre><code>if(i &lt; N)\n{\n\tdouble dot[3]={0,0,0};\n\tfor(int n = xadj[i] ; n &lt; xadj[i+1]; n++)\n\t{\n\t\tunsigned int neigh=adjncy[n];\n\t\tprintf(\"%d\\n\",n);\n\t\tprintf(\"%f,%f,%f\\n\",dataxx[n],datayy[n],datazz[n]);\n\t\tdouble val[3] = {dataxx[n],datayy[n],datazz[n]};\n\t\tfor(unsigned int co = 0 ; co &lt; 3 ; co++)\n\t\t{\n\t\t\tdot[co]+=(val[co]*Cspin[3*neigh+co]);\n\t\t}\n\t}\n\tdouble a=CHDemag[3*i];\n\tdouble b=CHDemag[3*i+1];\n\tdouble c=CHDemag[3*i+2];\n\tCH[3*i]=a+dot[0];\n\tCH[3*i+1]=b+dot[1];\n\tCH[3*i+2]=c+dot[2];\n}\n</code></pre>\n<p>}</p>\n<p>\u901a\u8fc7\u663e\u5361\u53c2\u6570\u6765\u770b\uff0crtx \u5e94\u8be5\u662f\u6ca1\u6709\u53cc\u7cbe\u5ea6\u8ba1\u7b97\u5355\u5143\u7684\u3002\u800c titan v \u7684\u53cc\u7cbe\u5ea6\u5e94\u8be5\u8fd8\u884c\u3002</p>\n<p>\u800c\u6211\u8dd1\u7684\u65f6\u5019\uff0ctitan v \u6bd4 rtx \u6162\u4e86\u4e09\u5206\u4e4b\u4e00\u3002\u3002</p>\n<p>\u6c42\u89e3</p>\n", 
      "date_published": "2019-07-06T08:59:28+00:00", 
      "title": "cuda \u8ba1\u7b97 titan v \u4e3a\u4f55\u6bd4 rtx2080ti \u66f4\u6162\uff1f", 
      "id": "https://www.v2ex.com/t/580600"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/Livid", 
        "name": "Livid", 
        "avatar": "https://cdn.v2ex.com/avatar/c4ca/4238/1_large.png?m=1775624785"
      }, 
      "url": "https://www.v2ex.com/t/528949", 
      "title": "DeOldify", 
      "id": "https://www.v2ex.com/t/528949", 
      "date_published": "2019-01-21T00:27:22+00:00", 
      "content_html": "<a target=\"_blank\" href=\"https://github.com/jantic/DeOldify/blob/master/README.md\" rel=\"nofollow\">https://github.com/jantic/DeOldify/blob/master/README.md</a><br /><br />\u4e00\u4e2a\u6709\u8da3\u7684 CUDA \u9879\u76ee\uff0c\u53ef\u4ee5\u5c06\u9ed1\u767d\u8001\u7167\u7247\u53d8\u6210\u5f69\u8272\u7684\u3002"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/Antidictator", 
        "name": "Antidictator", 
        "avatar": "https://cdn.v2ex.com/avatar/33e8/6eb3/201659_large.png?m=1485621704"
      }, 
      "url": "https://www.v2ex.com/t/369282", 
      "date_modified": "2018-06-14T05:50:48+00:00", 
      "content_html": "<p><a href=\"http://i.imgur.com/4s1hBfN.png\" rel=\"nofollow\">http://i.imgur.com/4s1hBfN.png</a></p>\n<p>\u4e0d\u6b7b\u5fc3\u95ee\u4e00\u4e0b\u7b14\u8bb0\u672c 1050 \u652f\u6301 cudnn \u5417\uff1f</p>\n<p>\u65e2\u7136\u652f\u6301 cuda\uff0c\u600e\u4e48\u4f1a\u4e0d\u6210\u529f\u5462\uff1f</p>\n<p><a href=\"http://i.imgur.com/mT99ID0.jpg\" rel=\"nofollow\">http://i.imgur.com/mT99ID0.jpg</a></p>\n", 
      "date_published": "2017-06-18T04:02:43+00:00", 
      "title": "\u4e0d\u6b7b\u5fc3\u95ee\u4e00\u4e0b\u7b14\u8bb0\u672c 1050 \u652f\u6301 cudnn \u5417\uff1f", 
      "id": "https://www.v2ex.com/t/369282"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/OldFinder", 
        "name": "OldFinder", 
        "avatar": "https://cdn.v2ex.com/avatar/3136/7816/191399_large.png?m=1473652706"
      }, 
      "url": "https://www.v2ex.com/t/312305", 
      "date_modified": "2016-10-12T11:51:58+00:00", 
      "content_html": "\u76ee\u524d\u9700\u6c42\u662f\u7528\u5230\u56fe\u5f62\u8bc6\u522b\u548c\u6570\u636e\u7684\u6574\u7406\u548c\u7edf\u8ba1\uff0c\u6570\u91cf\u7ea7\u4e5f\u5c31\u662f\u51e0\u5341\u4e07\u6761\u7684\uff0c\u4e0d\u7b97\u5f88\u5927\u3002", 
      "date_published": "2016-10-12T11:07:25+00:00", 
      "title": "Python+CUDA\uff0c\u5927\u5bb6\u6709\u4ec0\u4e48\u63a8\u8350\u7684\u503c\u5f97\u6df1\u5165\u5b66\u4e60\u4e86\u89e3\u7684\u9879\u76ee\u6216\u8005\u8457\u4f5c\u4e48\uff1f", 
      "id": "https://www.v2ex.com/t/312305"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/xiangtianxiao", 
        "name": "xiangtianxiao", 
        "avatar": "https://cdn.v2ex.com/avatar/0220/9919/93403_large.png?m=1422028256"
      }, 
      "url": "https://www.v2ex.com/t/297613", 
      "date_modified": "2018-06-14T05:50:45+00:00", 
      "content_html": "<p>\u6bd4\u5982\u8bf4\u4e09\u5343\u5757\u5de6\u53f3\u7684\u663e\u5361\uff0c GTX 1070 \u62e5\u6709 1920 \u4e2a\u6d41\u5904\u7406\u5355\u5143\uff0c 8G \u663e\u5b58\u3002\nQuadro M2000 \u53ea\u6709 768 \u4e2a\u5355\u5143\uff0c 4G \u663e\u5b58\u3002</p>\n<p>\u6e38\u620f\u5361\u7684\u8bf1\u60d1\u592a\u5927\u4e86\u554a\uff0c\u663e\u5b58\u5927\uff0c\u5355\u5143\u591a...\u6211\u77e5\u9053\u4e13\u4e1a\u5361\u5728 CAD \u65b9\u9762\u53ef\u80fd\u6709\u52a0\u6210\uff0c\u4f46\u662f\u4e0d\u77e5\u9053\u5bf9\u4e8e CUDA \u8fd9\u79cd\u5e76\u884c\u8ba1\u7b97\u6709\u6ca1\u6709\u4f18\u5316\uff0c\u6216\u8005\u8bf4\u53ef\u4ee5\u66f4\u52a0\u7a33\u5b9a\uff1f</p>\n", 
      "date_published": "2016-08-06T13:17:19+00:00", 
      "title": "\u5199 CUDA\uff0c\u4f7f\u7528\u4e13\u4e1a\u5361\u4e0e\u6e38\u620f\u5361\u6709\u4ec0\u4e48\u533a\u522b\uff1f", 
      "id": "https://www.v2ex.com/t/297613"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/Jolly23", 
        "name": "Jolly23", 
        "avatar": "https://cdn.v2ex.com/avatar/8f94/502f/172360_large.png?m=1603016148"
      }, 
      "url": "https://www.v2ex.com/t/294997", 
      "date_modified": "2016-07-26T04:59:07+00:00", 
      "content_html": "<p>\u4e0d\u662f\u7528\u6765\u6253\u6e38\u620f\uff0c\n\u662f\u641e\u6df1\u5ea6\u5b66\u4e60\u7528\u7684\uff08 CUDA \uff0f caffe \uff0f DIGITS \uff09</p>\n<p>\u4e4b\u524d\u82b1 30k \u4e00\u5757\u7684\u4ef7\u683c\uff0c\u4e70\u4e86\u51e0\u5757 tesla K40C \uff0c\u88c5 ubuntu \u8dd1\u6df1\u5ea6\u5b66\u4e60\u4e86\uff0c\u8fd0\u7b97\u80fd\u529b\u771f\u662f\u5f3a\u608d\uff0c\u4f46\u662f\u53e6\u4e00\u4f4d\u5bfc\u5e08\u63a5\u53d7\u4e0d\u4e86\u8fd9\u4e2a\u91c7\u8d2d\u4ef7\u683c\uff0c\u53ea\u80fd\u4e70 5k \u5de6\u53f3\u7684\u5361\uff0c\u6c42\u63a8\u8350\uff01\u80fd\u88c5 ubuntu \u5c31\u884c\uff0c\uff08\u4ed6\u4e4b\u524d\u6709\u5757\u6cf0\u5766\uff0c\u88c5 ubuntu \u663e\u793a unknown chipset maxwell \uff09\uff0c\u5e94\u8be5\u662f\u9ea6\u514b\u65af\u97e6\u67b6\u6784\u7684\u5361\u88c5\u4e0d\u4e86 ubuntu \uff0c\u6211\u4e5f\u4e0d\u592a\u6e05\u695a\u5177\u4f53\u60c5\u51b5\u3002</p>\n<p>\u8981\u6c42\uff1a\u5728\u8fd9\u5f20\u8868\u91cc\u7684\u5361\n<a href=\"https://developer.nvidia.com/cuda-gpus\" rel=\"nofollow\">https://developer.nvidia.com/cuda-gpus</a></p>\n<p>\u4ef7\u683c 5k \u5de6\u53f3\u5c31\u884c\uff0c\u7ed9\u63a8\u8350\u70b9\uff0c\u8c22\u8c22\u5404\u4f4d</p>\n", 
      "date_published": "2016-07-26T04:37:59+00:00", 
      "title": "\u6025\u6c42\u63a8\u8350\u4e2a 5k \u4eba\u6c11\u5e01\u5de6\u53f3\u7684\u8fd0\u7b97 GPU\uff0c\u80fd\u88c5 ubuntu \u5c31\u884c\uff0c\u8dd1\u6df1\u5ea6\u5b66\u4e60\u7528\u7684\uff0c\u5fc5\u987b\u5728 nvidia \u8fd0\u7b97\u80fd\u529b\u8868\u91cc\u9762\u7684\u5361", 
      "id": "https://www.v2ex.com/t/294997"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/andrewzhou", 
        "name": "andrewzhou", 
        "avatar": "https://cdn.v2ex.com/avatar/57e7/710f/42926_large.png?m=1668704021"
      }, 
      "url": "https://www.v2ex.com/t/286358", 
      "date_modified": "2016-10-12T15:40:34+00:00", 
      "content_html": "\u6700\u8fd1\u8003\u8651\u6362\u5de5\u4f5c\uff08\u6362\u884c\u4e1a\uff09\uff0c\u5e0c\u671b\u8f6c\u5230\u4eba\u5de5\u667a\u80fd\u7b49\u65b0\u9886\u57df\uff0c\u4f46\u662f\u5bf9\u8fd9\u4e9b\u9886\u57df\u5546\u4e1a\u5e94\u7528\u4e0a\u63a5\u89e6\u5f88\u5c11,\u76ee\u524d\u6bd4\u8f83\u4e2d\u610f CUDA \u548c EMC \u5b58\u50a8\u76f8\u5173\u7684\u5de5\u4f5c\u3002\u6c42\u6307\u5bfc\uff1f\r<br />1 \uff0c\u7b80\u5355\u4e86\u89e3\u4e86\u4e00\u4e0b CUDA/openCL \uff0c\u611f\u89c9\u786e\u5b9e\u6740\u4f24\u529b\u8fc5\u731b\uff0c\u5bf9 ML \u4e4b\u7c7b\u7684\u5e94\u7528\u786e\u5b9e\u6709\u50ac\u5316\u5242\u7684\u4f5c\u7528\uff0c\u4f46\u662f\u76ee\u524d\u771f\u6b63\u4f7f\u7528 OpenCL/CUDA \u4f5c\u4e3a\u5e95\u5c42\u57fa\u7840\u7684\u6210\u529f\u6848\u4f8b\u591a\u5417\uff1f\u672a\u6765\u5e02\u573a\u662f\u5426\u4f1a\u6301\u7eed\u6269\u5f20\u3002\r<br />2 \uff0c\u5b58\u50a8\u90e8\u5206\u5728\u4e91\u8ba1\u7b97 /\u4eba\u5de5\u667a\u80fd\u672a\u6765\u662f\u4ec0\u4e48\u89d2\u8272\uff1f\u76ee\u524d\u4e3b\u6d41\u5546\u4e1a\u5e02\u573a\u4e0a\u6e38\u6709\u54ea\u4e9b\u4ea7\u4e1a /\u516c\u53f8\uff1f\r<br />3 \uff0c\u524d\u4e9b\u65e5\u5b50\u611f\u89c9 FPGA/GPU/TPU \u6495 B \u4e86\u4e00\u987f\uff0c\u8fd9\u4e9b\u9ad8\u5927\u6df1\u7684\u9886\u57df\u6211\u4eec\u4e5f\u53ea\u662f\u770b\u5ba2\uff1f\r<br />   \u4e2a\u4eba\u5bf9\u8fd9\u4e9b\u8ba1\u7b97\u7ed3\u6784\u4e0d\u662f\u5f88\u4e86\u89e3\uff0c\u7b80\u5355\u8ba4\u4e3a FPGA \u4f5c\u4e3a\u5e95\u5c42\u4e2d\u95f4\u5b9e\u73b0\u4e86\u6807\u51c6\u7684 OpenCL \u4e0d\u77e5\u9053\u6709\u6ca1\u6709\u5927\u89c4\u6a21\u7684\u5546\u7528\u6848\u4f8b\u3002\u770b\u8d77\u6765\u786e\u5b9e\u4f1a\u6bd4 GPU \u7b49\u6548\u7387\u9ad8\u5f88\u591a\u3002\r<br />   GPU \u5e94\u8be5\u662f\u76ee\u524d\u6210\u529f\u7684\u5546\u4e1a\u6848\u4f8b\uff0c\u4f46\u662f CUDA \u81ea\u5df1\u5b98\u7f51\u4e0a\u8bf4\u5168\u7403\u6709 700+\u8ba1\u7b97\u96c6\u7fa4\uff0c\u8fd9\u4e2a 700 \u4e0d\u662f\u5230\u662f\u4ec0\u4e48\u6982\u5ff5\u3002\r<br />  TPU \u4e0d\u8bf4\u4e86\u6ca1\u6709\u516c\u5f00\u8d44\u6599\uff0c\u4e2a\u4eba\u8ba4\u4e3a\u5e94\u8be5\u5c40\u9650\u6027\u6bd4\u8f83\u5927\u3002\r<br />\r<br />\u5c0f\u767d\u6c42\u55b7. :)", 
      "date_published": "2016-06-17T01:29:17+00:00", 
      "title": "OpenCL/CUDA/\u4e91\u5b58\u50a8\u6c42\u725b\u4eba\u6307\u5bfc\uff0c\u5c0f\u4f19\u4f34\u4eec\u6765\u56f4\u89c2 \uff1a\uff09", 
      "id": "https://www.v2ex.com/t/286358"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/dqh3000", 
        "name": "dqh3000", 
        "avatar": "https://cdn.v2ex.com/avatar/1b72/837a/141112_large.png?m=1521240581"
      }, 
      "url": "https://www.v2ex.com/t/225518", 
      "date_modified": "2016-03-09T10:45:37+00:00", 
      "content_html": "<p>\u53d1\u73b0\u662f\u53ef\u4ee5\u7684\uff0c\u7528 canvas \u7684\u6bcf\u4e2a\u50cf\u7d20\u6a21\u62df\u4e00\u4e2a\u77e9\u9635\u7684\u503c\uff0c\u7136\u540e\u5728 fragment shader \u91cc\u9762\u8ba1\u7b97\u5c31\u53ef\u4ee5\u5b9e\u73b0\u77e9\u9635\u4e58\u6cd5\u4e86</p>\n\n<p>\u628a float16 \u7684\u8fd1\u4f3c\u7269\u653e\u5728 rgba \u4e2d\uff08 a \u51e0\u4e4e\u4e0d\u80fd\u7528\uff09\uff0c\u7136\u540e\u5230 fragment \u91cc\u9762\u8fd8\u539f\uff0c\u6700\u540e\u8f93\u51fa\u5230\u5c4f\u5e55\u4e0a\u7684\u56fe\u50cf\u5c31\u662f\u8ba1\u7b97\u7ed3\u679c</p>\n\n<p>\u5927\u6982\u5728 650m \u4e0b\u9762\u53ef\u4ee5\u6709 150Gflops+\u7684\u6210\u7ee9\uff08 float16 \uff09</p>\n\n<p>\u7136\u540e\u6211\u5728\u60f3\u8fd9\u4e1c\u897f\u6709\u4ec0\u4e48\u5375\u7528\u2026\u2026\uff08\u6211\u77e5\u9053\u8fd9\u4e2a\u4e16\u754c\u4e0a\u6709\u4e2a\u65e0\u4eba\u9e1f\u7684 webcl \uff09</p>\n", 
      "date_published": "2015-10-04T02:45:09+00:00", 
      "title": "\u60f3\u4e86\u60f3\u7528 WebGL \u80fd\u4e0d\u80fd\u505a\u79d1\u5b66\u8ba1\u7b97", 
      "id": "https://www.v2ex.com/t/225518"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/hardware", 
        "name": "hardware", 
        "avatar": "https://cdn.v2ex.com/gravatar/cc4fb30c125148f251f7345709336b55?s=73&d=retro"
      }, 
      "url": "https://www.v2ex.com/t/203204", 
      "date_modified": "2016-03-09T10:43:36+00:00", 
      "content_html": "<p>\u81ea\u5df1\u53ea\u6709\u4e24\u4e2amacbook pro retina\uff0c\u60f3\u5b66\u5b66caffe\u4ec0\u4e48\u7684\uff0c\u73b0\u5728\u5728\u7528parallel desktop\u8dd1linux\uff0c\u611f\u89c9\u633a\u6162\u7684\u3002\u6709\u6ca1\u6709\u4eba\u7528\u5916\u63a5\u663e\u5361\u505aGPU\u7f16\u7a0b\u7684\uff1f\u80fd\u5de5\u4f5c\u4e48\uff1f</p>\n", 
      "date_published": "2015-07-03T15:59:31+00:00", 
      "title": "\u6709\u4eba\u7528\u96f7\u7535\u8f6c PCI-e \u8bbe\u5907\u5916\u63a5\u663e\u5361\u8dd1\u8fc7 CUDA \u7684\u4e48\uff1f", 
      "id": "https://www.v2ex.com/t/203204"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/haoji", 
        "name": "haoji", 
        "avatar": "https://cdn.v2ex.com/avatar/ec8b/57b0/2879_large.png?m=1697947304"
      }, 
      "url": "https://www.v2ex.com/t/87690", 
      "date_modified": "2016-03-09T10:42:06+00:00", 
      "content_html": "\u6700\u8fd1\u5728\u7814\u7a76 GPU CUDA \u7f16\u7a0b\uff0c\u4e0d\u77e5\u9053\u6709\u6ca1\u6709 V2EXer \u5bf9\u8fd9\u65b9\u9762\u6bd4\u8f83\u4e86\u89e3\u7684\uff1f<br /><br />\u5728\u628a\u57fa\u7840\u7684 Hello\u3001vecAdd \u90a3\u4e9b\u7a0b\u5e8f\u770b\u4e86\u4e4b\u540e\uff0c\u7740\u624b\u5b9e\u73b0\u4e00\u4e2a\u7c7b grep \u547d\u4ee4\uff1b\u7136\u540e\u5229\u7528\u4e00\u4e2a\u5927\u6587\u4ef6\u6d4b\u8bd5\u6027\u80fd\u3002<br /><br />\u5bf9\u6bd4\u6d4b\u8bd5\u540e\u53d1\u73b0\uff0c\u6211\u7528 CUDA \u5b9e\u73b0\u7684 mygrep \u6bd4 linux \u81ea\u5e26\u7684 grep \u6162\u5f88\u591a\uff1a1m55s v.s. 0m3s<br /><br />\u6c57\u989c\u2026\u2026<br /><br />\u6709\u6ca1\u6709\u719f\u6089\u8fd9\u65b9\u9762\u7684\u670b\u53cb\u53ef\u4ee5\u7559\u4e2a\u8054\u7cfb\u65b9\u5f0f\u6307\u70b9+\u63a2\u8ba8\u4e00\u4e0b\uff1f", 
      "date_published": "2013-11-01T06:30:41+00:00", 
      "title": "\u5173\u4e8e GPU CUDA \u7f16\u7a0b\u7684\u4f18\u5316\u95ee\u9898", 
      "id": "https://www.v2ex.com/t/87690"
    }, 
    {
      "author": {
        "url": "https://www.v2ex.com/member/fanzeyi", 
        "name": "fanzeyi", 
        "avatar": "https://cdn.v2ex.com/avatar/a9a1/d531/585_large.png?m=1491194258"
      }, 
      "url": "https://www.v2ex.com/t/13337", 
      "date_modified": "2016-03-09T10:43:24+00:00", 
      "content_html": "RT..\r\n<br />\u9171\r\n<br />\u6211\u7528 OpenCL \u5c45\u7136\u6ca1\u901f\u5ea6..\r\n<br />\u7528 CUDA \u6700\u5feb  \u7528 CPU \u5f88\u6162..", 
      "date_published": "2011-05-22T04:48:43+00:00", 
      "title": "CPU or GPU? CUDA or OpenCL ?", 
      "id": "https://www.v2ex.com/t/13337"
    }
  ]
}