[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-nvidia-hugging-face-ai-pipelines-en":3,"article-related-nvidia-hugging-face-ai-pipelines-en":35,"series-industry-a3dc08d5-311b-4d76-990f-4f3add2133c9":89},{"id":4,"slug":5,"title":6,"content":7,"summary":8,"source":9,"source_url":10,"author":11,"image_url":12,"cover_image":12,"category":13,"language":14,"translated_content":11,"related_article_id":15,"keywords":16,"key_takeaways":27,"views":31,"created_at":32,"published_at":33,"topic_cluster_id":34},"a3dc08d5-311b-4d76-990f-4f3add2133c9","nvidia-hugging-face-ai-pipelines-en","NVIDIA’s Hugging Face hub is built for AI pipelines","\u003Cp data-speakable=\"summary\">\u003Ca href=\"\u002Ftag\u002Fnvidia\">NVIDIA\u003C\u002Fa>’s Hugging Face collection groups models and datasets for reasoning, speech, vision, RAG, and physical AI.\u003C\u002Fp>\u003Cp>NVIDIA’s Hugging Face collection is a practical map of where its open models fit in real systems: RLHF, \u003Ca href=\"\u002Ftag\u002Fllm\">LLM\u003C\u002Fa>-as-a-Judge, speech pipelines, document parsing, and robotics. The catalog includes 74 model entries in one visible segment and spans sizes from 120M to 550B parameters.\u003C\u002Fp>\u003Ctable>\u003Cthead>\u003Ctr>\u003Cth>Item\u003C\u002Fth>\u003Cth>Model size\u003C\u002Fth>\u003Cth>Notable spec\u003C\u002Fth>\u003C\u002Ftr>\u003C\u002Fthead>\u003Ctbody>\u003Ctr>\u003Ctd>Nemotron 3 Nano\u003C\u002Ftd>\u003Ctd>30B total \u002F 3B active\u003C\u002Ftd>\u003Ctd>1M-token context, up to 4× faster inference\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Nemotron 3 Super\u003C\u002Ftd>\u003Ctd>120B total \u002F 12B active\u003C\u002Ftd>\u003Ctd>1M-token context, up to 5× higher throughput\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Nemotron 3 Ultra\u003C\u002Ftd>\u003Ctd>550B total \u002F 55B active\u003C\u002Ftd>\u003Ctd>Frontier-scale reasoning for code, math, science\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Nemotron 3.5 Content Safety\u003C\u002Ftd>\u003Ctd>4B\u003C\u002Ftd>\u003Ctd>Multimodal safety moderation\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Parakeet Realtime EOU\u003C\u002Ftd>\u003Ctd>120M\u003C\u002Ftd>\u003Ctd>80–160ms latency, end-of-utterance detection\u003C\u002Ftd>\u003C\u002Ftr>\u003C\u002Ftbody>\u003C\u002Ftable>\u003Ch2>1. Nemotron 3 for long-context reasoning\u003C\u002Fh2>\u003Cp>The Nemotron 3 family is the clearest sign that NVIDIA is aiming at production reasoning, not just \u003Ca href=\"\u002Ftag\u002Fbenchmark\">benchmark\u003C\u002Fa> demos. The lineup covers on-device agents, heavy multi-step orchestration, and ultra-large reasoning workloads, all with open weights and reproducible recipes.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781337773588-31s6.png\" alt=\"NVIDIA’s Hugging Face hub is built for AI pipelines\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>Pick \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fnvidia\">NVIDIA\u003C\u002Fa>’s Nemotron 3 models when you need a model that can keep state across long sessions and still fit different deployment budgets.\u003C\u002Fp>\u003Cul>\u003Cli>Nemotron 3 Nano: 30B total \u002F 3B active, 1M-token context\u003C\u002Fli>\u003Cli>Nemotron 3 Super: 120B total \u002F 12B active, LatentMoE, MTP layers\u003C\u002Fli>\u003Cli>Nemotron 3 Ultra: 550B total \u002F 55B active, built for code, math, science\u003C\u002Fli>\u003Cli>Served via vLLM and SGLang for deployment flexibility\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>2. Safety models for moderation and policy checks\u003C\u002Fh2>\u003Cp>If your pipeline needs content filtering before generation or evaluation, NVIDIA’s safety models are built for that layer. The 3.5 Content Safety model is multimodal and multilingual, which matters when moderation has to cover text and images together.\u003C\u002Fp>\u003Cp>This is the part of the catalog that fits enterprise review flows, custom policy enforcement, and judge-style guardrails without forcing you to bolt on a separate safety stack.\u003C\u002Fp>\u003Cul>\u003Cli>Nemotron 3.5 Content Safety: 4B parameters\u003C\u002Fli>\u003Cli>Supports text and image inputs\u003C\u002Fli>\u003Cli>Includes reasoning traces for policy decisions\u003C\u002Fli>\u003Cli>Works for taxonomy-based and custom-policy moderation\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>3. Speech models for ASR and voice agents\u003C\u002Fh2>\u003Cp>NVIDIA’s speech section is broader than a single ASR checkpoint. It covers transcription, translation, streaming, diarization, and turn-taking, which makes it useful for voice agents that need both speed and structure.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781337773109-yxrf.png\" alt=\"NVIDIA’s Hugging Face hub is built for AI pipelines\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>For low-latency systems, the standout detail is the streaming setup: chunk sizes can be tuned from 80ms to 1120ms, and the Parakeet Realtime EOU model detects end-of-utterance at 80–160ms latency.\u003C\u002Fp>\u003Cul>\u003Cli>Parakeet: FastConformer-based ASR with low WER\u003C\u002Fli>\u003Cli>Canary: multilingual transcription and translation across 25 languages\u003C\u002Fli>\u003Cli>Nemotron Speech Streaming: cache-aware streaming ASR with punctuation and capitalization\u003C\u002Fli>\u003Cli>Parakeet Realtime EOU: 120M parameters, fast turn-taking support\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>4. Vision and document intelligence for messy inputs\u003C\u002Fh2>\u003Cp>When your source material is not clean text, NVIDIA’s vision models are aimed at extracting structure from PDFs, scans, charts, and images. Nemotron Parse is especially useful because it focuses on layout understanding, not just raw OCR.\u003C\u002Fp>\u003Cp>That makes this section relevant for document AI teams, search indexing, and multimodal Q&A systems that need tables, bounding boxes, and semantic labels instead of plain text dumps.\u003C\u002Fp>\u003Cul>\u003Cli>Nemotron Parse: structured output from unstructured PDFs and images\u003C\u002Fli>\u003Cli>Extract models: charts, tables, scanned documents\u003C\u002Fli>\u003Cli>Embed models: shared vector spaces for text, images, audio\u003C\u002Fli>\u003Cli>Rerank models: cross-encoder rescoring for retrieval pipelines\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>5. Cosmos and physical AI for robotics\u003C\u002Fh2>\u003Cp>Cosmos is NVIDIA’s answer to simulated physical interaction, with generative world models, tokenizers, and data curation tools for robotics and autonomous systems. It is the most specialized part of the collection, but also the most interesting if you are building agents that need to understand motion and environment dynamics.\u003C\u002Fp>\u003Cp>The most concrete numbers here are worth noting: Cosmos Tokenizer claims up to 2048× total compression and up to 12× faster performance than prior SOTA, while Cosmos Predict 2.5 ships in 2B and 14B variants.\u003C\u002Fp>\u003Cul>\u003Cli>Cosmos Tokenizer: continuous and discrete variants\u003C\u002Fli>\u003Cli>Cosmos Predict 2.5: text, image, or video inputs\u003C\u002Fli>\u003Cli>Built for simulation, robotics, and autonomous systems\u003C\u002Fli>\u003Cli>Targets high-fidelity, physics-aware generation\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>How to decide\u003C\u002Fh2>\u003Cp>Choose Nemotron 3 if your priority is long-context reasoning or \u003Ca href=\"\u002Ftag\u002Fagent\">agent\u003C\u002Fa> orchestration. Choose the speech models if your product lives in live audio, transcription, or voice agents. Choose Nemotron Parse and the RAG stack if your work starts with messy documents. Choose Cosmos if you are building robotics or other physical AI systems.\u003C\u002Fp>\u003Cp>If you want one starting point for general \u003Ca href=\"\u002Ftag\u002Fenterprise-ai\">enterprise AI\u003C\u002Fa>, begin with Nemotron 3 Super or the Llama-3.1-Nemotron collaboration models, then branch into safety, speech, or retrieval as your pipeline matures.\u003C\u002Fp>","NVIDIA’s Hugging Face collection groups 5 model families for reasoning, speech, vision, RAG, and physical AI.","huggingface.co","https:\u002F\u002Fhuggingface.co\u002Fnvidia\u002Fcollections",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781337773588-31s6.png","industry","en","ebbd7c3b-23a7-4b31-9bae-1a8fb4dc5eef",[17,18,19,20,21,22,23,24,25,26],"NVIDIA","Hugging Face","Nemotron","Parakeet","Cosmos","RAG","speech models","document intelligence","physical AI","RLHF",[28,29,30],"NVIDIA’s Hugging Face hub spans reasoning, safety, speech, vision, RAG, and robotics.","Nemotron 3 covers long-context work with variants from 3B active parameters to 55B active.","Speech, document parsing, and Cosmos each target a different production pipeline need.",0,"2026-06-13T08:02:23.733668+00:00","2026-06-13T08:02:23.725+00:00","d19fc184-5852-4c4d-9ec0-db0c4841ac17",{"tags":36,"relatedLang":48,"relatedPosts":52},[37,39,42,44,46],{"name":18,"slug":38},"hugging-face",{"name":40,"slug":41},"Nvidia","nvidia",{"name":19,"slug":43},"nemotron",{"name":20,"slug":45},"parakeet",{"name":21,"slug":47},"cosmos",{"id":15,"slug":49,"title":50,"language":51},"nvidia-hugging-face-ai-pipelines-zh","NVIDIA 的 Hugging Face 5 類模型最適合誰","zh",[53,59,65,71,77,83],{"id":54,"slug":55,"title":56,"cover_image":57,"image_url":57,"created_at":58,"category":13},"865212b4-7bd6-4bb3-a1f1-592960b5b7a3","google-gemini-outage-error-1076-june-2026-en","Google Gemini outage hits users with error 1076","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781338673852-kpqi.png","2026-06-13T08:17:27.75214+00:00",{"id":60,"slug":61,"title":62,"cover_image":63,"image_url":63,"created_at":64,"category":13},"d96ff33a-47a4-421f-b7d4-ded157b345b6","anthropic-public-record-ai-anxiety-policy-en","Anthropic’s survey turns AI anxiety into policy","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781327893716-5hv3.png","2026-06-13T05:17:42.92009+00:00",{"id":66,"slug":67,"title":68,"cover_image":69,"image_url":69,"created_at":70,"category":13},"07f6818a-6612-4e79-a0b6-7b5014fadafc","chatgpt-grew-from-chatbot-to-platform-en","ChatGPT grew from chatbot to platform","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781325174493-j6tn.png","2026-06-13T04:32:28.006595+00:00",{"id":72,"slug":73,"title":74,"cover_image":75,"image_url":75,"created_at":76,"category":13},"c750890e-4ddf-4e1c-85d5-a5bd4433620f","openai-files-confidential-ipo-after-122b-round-en","OpenAI Files Confidential IPO After $122B Round","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781323367848-n0ns.png","2026-06-13T04:02:24.359675+00:00",{"id":78,"slug":79,"title":80,"cover_image":81,"image_url":81,"created_at":82,"category":13},"b0cb27e2-ca71-40a2-a012-73627f1c995c","government-access-orders-frontier-model-access-en","Government access orders should govern frontier model access","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781319762267-0x3b.png","2026-06-13T03:02:19.503078+00:00",{"id":84,"slug":85,"title":86,"cover_image":87,"image_url":87,"created_at":88,"category":13},"fac6f2b6-6a69-4fef-83c8-45eb5d323004","claude-code-cursor-copilot-2026-ai-agents-en","Claude Code, Cursor, and Copilot set the 2026 bar","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781317069662-0zc1.png","2026-06-13T02:17:22.342047+00:00",[90,95,100,105,110,115,120,125,130,135],{"id":91,"slug":92,"title":93,"created_at":94},"d35a1bd9-e709-412e-a2df-392df1dc572a","ai-impact-2026-developments-market-en","AI's Impact in 2026: Key Developments and Market Shifts","2026-03-25T16:20:33.205823+00:00",{"id":96,"slug":97,"title":98,"created_at":99},"5ed27921-5fd6-492e-8c59-78393bf37710","trumps-ai-legislative-framework-en","Trump's AI Legislative Framework: What's Inside?","2026-03-25T16:22:20.005325+00:00",{"id":101,"slug":102,"title":103,"created_at":104},"e454a642-f03c-4794-b185-5f651aebbaca","nvidia-gtc-2026-key-highlights-innovations-en","NVIDIA GTC 2026: Key Highlights and Innovations","2026-03-25T16:22:47.882615+00:00",{"id":106,"slug":107,"title":108,"created_at":109},"0ebb5b16-774a-4922-945d-5f2ce1df5a6d","claude-usage-diversifies-learning-curves-en","Claude Usage Diversifies, Learning Curves Emerge","2026-03-25T16:25:50.770376+00:00",{"id":111,"slug":112,"title":113,"created_at":114},"69934e86-2fc5-4280-8223-7b917a48ace8","openclaw-ai-commoditization-concerns-en","OpenClaw's Rise Raises Concerns of AI Model Commoditization","2026-03-25T16:26:30.582047+00:00",{"id":116,"slug":117,"title":118,"created_at":119},"b4b2575b-2ac8-46b2-b90e-ab1d7c060797","google-gemini-ai-rollout-2026-en","Google's Gemini AI Rollout Extended to 2026","2026-03-25T16:28:14.808842+00:00",{"id":121,"slug":122,"title":123,"created_at":124},"6e18bc65-42ae-4ad0-b564-67d7f66b979e","meta-llama4-fabricated-results-scandal-en","Meta's Llama 4 Scandal: Fabricated AI Test Results Unveiled","2026-03-25T16:29:15.482836+00:00",{"id":126,"slug":127,"title":128,"created_at":129},"bf888e9d-08be-4f47-996c-7b24b5ab3500","accenture-mistral-ai-deployment-en","Accenture and Mistral AI Team Up for AI Deployment","2026-03-25T16:31:01.894655+00:00",{"id":131,"slug":132,"title":133,"created_at":134},"5382b536-fad2-49c6-ac85-9eb2bae49f35","mistral-ai-high-stakes-2026-en","Mistral AI: Facing High Stakes in 2026","2026-03-25T16:31:39.941974+00:00",{"id":136,"slug":137,"title":138,"created_at":139},"9da3d2d6-b669-4971-ba1d-17fdb3548ed5","cursors-meteoric-rise-pressures-en","Cursor's Meteoric Rise Faces Industry Pressures","2026-03-25T16:32:21.899217+00:00"]