[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"tag-distillation":3},{"tag":4,"articles":10,"peer_article_count":7},{"id":5,"name":6,"slug":6,"article_count":7,"description_zh":8,"description_en":9},"a6bb7f9c-bf74-4e21-920b-019ddd1e2da3","distillation",3,"蒸餾是把大型模型的推理能力、排序偏好或生成行為，轉移到較小模型的訓練方法。它常用於降低推論成本、縮短延遲，並讓 SLM 在重排、生成與跨架構對齊上更實用。","Distillation transfers a larger model’s behavior—ranking preferences, generation patterns, or reasoning signals—into a smaller student model. It matters because teams use it to cut inference cost and latency while keeping SLMs useful for reranking, generation, and cross-architecture alignment.",[11,20,27,34,42,49,56],{"id":12,"slug":13,"title":14,"summary":15,"category":16,"image_url":17,"cover_image":17,"language":18,"created_at":19},"3efb3e20-b2da-4abd-b442-3babd8b0ed1e","opd-distillation-skills-without-bruteforce-rl-en","OPD lets you distill skills without brute-force RL","I break down On-Policy Distillation and turn the idea into a copy-ready post-training template.","research","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782730111097-6brq.png","en","2026-06-29T10:47:57.980973+00:00",{"id":21,"slug":22,"title":23,"summary":24,"category":16,"image_url":25,"cover_image":25,"language":18,"created_at":26},"696a4c45-6c7b-4a78-a947-2dee1ddc4a58","danceopd-on-policy-generative-field-distillation-en","DanceOPD distills image-editing skills into one model","DanceOPD trains flow-matching image models to combine text-to-image and editing skills without them fighting each other.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782453779169-rakb.png","2026-06-26T06:02:33.604728+00:00",{"id":28,"slug":29,"title":30,"summary":31,"category":16,"image_url":32,"cover_image":32,"language":18,"created_at":33},"6dc0410b-c9ec-4148-974b-0b5f7a14975c","uniego-proxy-teachers-egocentric-video-en","UNIEGO unifies egocentric video with proxy teachers","UNIEGO uses proxy models to distill nine teachers into one egocentric encoder.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781849887430-g735.png","2026-06-19T06:17:32.327109+00:00",{"id":35,"slug":36,"title":37,"summary":38,"category":39,"image_url":40,"cover_image":40,"language":18,"created_at":41},"afcb2a04-144f-4df0-bd66-6cf165e16446","apples-gemini-deal-turns-cloud-ai-into-local-ai-en","Apple’s Gemini deal turns cloud AI into local AI","Apple is using Google Gemini distillation and Nvidia confidential compute to push Siri toward local-first AI with cloud backup.","industry","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1780535909059-vvj6.png","2026-06-04T01:18:03.540145+00:00",{"id":43,"slug":44,"title":45,"summary":46,"category":16,"image_url":47,"cover_image":47,"language":18,"created_at":48},"9e4cc5d5-2a7b-4175-b42c-3f960810da34","carv-cuts-diffusion-teacher-gradient-variance-en","CARV cuts diffusion-teacher gradient variance","CARV reduces Monte Carlo variance in diffusion-teacher pipelines by reusing expensive upstream work and smarter noise sampling.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1779343556363-bary.png","2026-05-21T06:05:31.21684+00:00",{"id":50,"slug":51,"title":52,"summary":53,"category":16,"image_url":54,"cover_image":54,"language":18,"created_at":55},"5abc17e1-200d-4005-90a2-ba5abc1187bb","select-to-think-slms-local-sufficiency-en","Select-to-Think: Let SLMs Re-rank Themselves","A new method lets small language models re-rank their own candidates instead of calling an LLM at inference time.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1777530657379-kuvy.png","2026-04-30T06:30:36.54762+00:00",{"id":57,"slug":58,"title":59,"summary":60,"category":16,"image_url":61,"cover_image":61,"language":18,"created_at":62},"2061a3d3-9d89-4722-ac8b-e359941b4573","tide-cross-architecture-diffusion-llm-distillation-en","TIDE distills diffusion LLMs across architectures","TIDE distills diffusion LLMs across architectures, adding noise-aware weighting and tokenizer-aware objectives to improve a 0.6B student.","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1777529449496-pbon.png","2026-04-30T06:10:34.03377+00:00"]