[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"tag-cuda":3},{"tag":4,"articles":11,"peer_article_count":121},{"id":5,"name":6,"slug":7,"article_count":8,"description_zh":9,"description_en":10},"603dae7f-ab7d-4827-a3cb-4abe85e1f058","CUDA","cuda",15,"CUDA 是 NVIDIA GPU 的平行運算平台與程式模型，核心在 SM、warp、shared memory、HBM 延遲隱藏與資料搬移優化。它直接影響 AI 訓練、推論、科學模擬與高效能計算的效能上限。","CUDA is NVIDIA’s parallel computing platform and programming model, centered on SMs, warps, shared memory, and latency hiding with HBM. It shapes performance in AI training, inference, scientific simulation, and other GPU-heavy workloads.",[12,21,29,36,43,50,57,64,71,78,86,93,100,107,114],{"id":13,"slug":14,"title":15,"summary":16,"category":17,"image_url":18,"cover_image":18,"language":19,"created_at":20},"decd40da-6ddd-45f0-835f-7981d0f45111","cudf-turns-pandas-code-into-gpu-runs-en","cuDF turns pandas code into GPU runs","I break down cuDF’s GPU DataFrame stack and give you a copy-ready starter for pandas, Polars, and Dask on CUDA.","tools","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782058729869-s0tn.png","en","2026-06-21T16:18:27.628499+00:00",{"id":22,"slug":23,"title":24,"summary":25,"category":26,"image_url":27,"cover_image":27,"language":19,"created_at":28},"adf04097-64e9-416a-845e-3a376ed6289e","v100-raw-gguf-vs-prepacked-weight-cache-en","V100 raw GGUF vs prepacked weight cache","This compares raw GGUF Q4_K kernels and prepacked weight caches for V100 decode inference.","industry","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781441282199-hh84.png","2026-06-14T12:47:38.493638+00:00",{"id":30,"slug":31,"title":32,"summary":33,"category":26,"image_url":34,"cover_image":34,"language":19,"created_at":35},"638720e6-a425-485b-a9b9-3ff4e2f15399","rocm-vs-cuda-gpu-computing-comparison-en","ROCm vs CUDA: GPU Computing Comparison","ROCm and CUDA trade lower cost and openness against broader support and faster performance.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781439483900-gcea.png","2026-06-14T12:17:35.961195+00:00",{"id":37,"slug":38,"title":39,"summary":40,"category":17,"image_url":41,"cover_image":41,"language":19,"created_at":42},"908bd6d7-ba5a-40f0-8000-e785fd1372f5","cuda-oxide-rust-ptx-kernels-en","cuda-oxide turns Rust into PTX kernels","I break down cuda-oxide’s Rust-to-CUDA flow and give you a copyable template for writing PTX kernels in Rust.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781110153405-wqt4.png","2026-06-10T16:48:44.105254+00:00",{"id":44,"slug":45,"title":46,"summary":47,"category":17,"image_url":48,"cover_image":48,"language":19,"created_at":49},"741f86d7-bc7c-4ff6-8bc8-fbc0e7d780bd","gpu-programming-core-software-skill-en","GPU programming is becoming a core software skill","GPU programming should move from niche graphics work to a standard software skill.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781109179855-erus.png","2026-06-10T16:32:18.834923+00:00",{"id":51,"slug":52,"title":53,"summary":54,"category":17,"image_url":55,"cover_image":55,"language":19,"created_at":56},"2e2f6903-c431-447c-9bd6-cb6a4e3534a5","nvidia-research-gpu-template-en","NVIDIA research turns GPU docs into a template","I break down NVIDIA’s research page into a practical template for finding GPU tools, projects, and docs fast.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780567411737-vw5o.png","2026-06-04T10:02:58.56908+00:00",{"id":58,"slug":59,"title":60,"summary":61,"category":17,"image_url":62,"cover_image":62,"language":19,"created_at":63},"a7daef63-2e7d-4942-8bc1-7ebbe31ebb52","why-llama-cpp-release-notes-matter-more-than-bragging-en","Why llama.cpp’s release notes matter more than its model bragging","llama.cpp’s latest releases show that backend correctness drives real speed gains.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779769553066-1mx4.png","2026-05-26T04:25:24.65574+00:00",{"id":65,"slug":66,"title":67,"summary":68,"category":26,"image_url":69,"cover_image":69,"language":19,"created_at":70},"a75384ff-223f-4a34-9f86-ae5c2772a2d6","how-to-reduce-ai-model-serving-friction-en","How to Reduce AI Model Serving Friction","Reduce AI model serving friction by tightening exports, inputs, versions, and deployment checks.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1778922838163-oi8d.png","2026-05-16T09:13:32.742904+00:00",{"id":72,"slug":73,"title":74,"summary":75,"category":17,"image_url":76,"cover_image":76,"language":19,"created_at":77},"9f973836-4d14-4435-b3b7-fb180e57b5fc","cuda-architecture-sms-cores-memory-en","CUDA Architecture Explained: SMs, Cores, Memory","CUDA GPUs split work across SMs, thousands of cores, and layered memory. Here’s why that design beats CPUs on parallel tasks.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775197314080-mnf9.png","2026-04-03T06:21:38.505008+00:00",{"id":79,"slug":80,"title":81,"summary":82,"category":83,"image_url":84,"cover_image":84,"language":19,"created_at":85},"a15782d7-4678-4415-9a0b-4c642e46b022","nvidia-mlperf-software-inference-benchmarks-en","Nvidia’s MLPerf Gains Show Software Still Matters","Nvidia posted up to 2.77x MLPerf gains on GB300 NVL72, with software tricks like Dynamo and TensorRT-LLM doing heavy lifting.","research","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775185791842-obyu.png","2026-04-03T03:09:35.154603+00:00",{"id":87,"slug":88,"title":89,"summary":90,"category":17,"image_url":91,"cover_image":91,"language":19,"created_at":92},"a7f6594f-6643-4e71-b5c2-f0a5f44c0549","nvidia-forum-su7-cuda-lattice-engine-en","NVIDIA Forum Debates a SU(7) CUDA Lattice Engine","A CUDA forum thread on Anchor4 SU(7) mixes lattice theory, shared memory tuning, and warp-level tricks for GPU synchronization.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775178407338-4vh2.png","2026-04-03T01:06:28.835722+00:00",{"id":94,"slug":95,"title":96,"summary":97,"category":83,"image_url":98,"cover_image":98,"language":19,"created_at":99},"68bfa04a-94c4-4c8a-921c-61e93ab207aa","cuda-cp-async-ampere-hbm-latency-en","cp.async on Ampere: Hide HBM Latency on A100","Ampere’s cp.async moves data without stalling warps, cutting HBM waits from 450–600 cycles into overlapped compute on A100.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775167612143-4qvu.png","2026-04-02T22:06:36.521272+00:00",{"id":101,"slug":102,"title":103,"summary":104,"category":17,"image_url":105,"cover_image":105,"language":19,"created_at":106},"e05a606a-88b9-45cd-8c3e-7ad0b30b7b5d","cuda-in-2025-why-gpus-still-win-en","CUDA in 2025: Why GPUs Still Win","CUDA powers NVIDIA GPUs across AI, science, and simulation, with up to 10x weather-model speedups and deep learning gains in the thousands.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775149432831-x799.png","2026-04-02T17:03:38.270396+00:00",{"id":108,"slug":109,"title":110,"summary":111,"category":17,"image_url":112,"cover_image":112,"language":19,"created_at":113},"5dda57f2-dfb7-4970-98ec-2e6ad298dd8c","cuda-asinf-accuracy-no-performance-hit-en","CUDA asinf() Gets More Accurate Without Slowing Down","A developer tuned asinf() for CUDA 12.8 and kept the 26-instruction baseline while improving accuracy, a rare win for GPU math.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1775142952141-rcb7.png","2026-04-02T15:15:33.15066+00:00",{"id":115,"slug":116,"title":117,"summary":118,"category":26,"image_url":119,"cover_image":119,"language":19,"created_at":120},"e454a642-f03c-4794-b185-5f651aebbaca","nvidia-gtc-2026-key-highlights-innovations-en","NVIDIA GTC 2026: Key Highlights and Innovations","Explore the latest AI advancements from NVIDIA's GTC 2026, including new platforms, partnerships, and innovative AI applications.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1774496823463-j3oi.png","2026-03-25T16:22:47.882615+00:00",17]