[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-dexcompose-reuses-dexterous-policies-across-tasks-en":3,"article-related-dexcompose-reuses-dexterous-policies-across-tasks-en":30,"series-research-46714aa0-3c43-4154-a9cf-f961865b6109":73},{"id":4,"slug":5,"title":6,"content":7,"summary":8,"source":9,"source_url":10,"author":11,"image_url":12,"cover_image":12,"category":13,"language":14,"translated_content":11,"related_article_id":15,"keywords":16,"key_takeaways":22,"views":26,"created_at":27,"published_at":28,"topic_cluster_id":29},"46714aa0-3c43-4154-a9cf-f961865b6109","dexcompose-reuses-dexterous-policies-across-tasks-en","DexCompose Reuses Dexterous Policies Across Tasks","\u003Cp data-speakable=\"summary\">DexCompose composes pretrained hand policies into multi-task manipulation by assigning finger-level action ownership.\u003C\u002Fp>\u003Cul>\u003Cli>\u003Cstrong>Research org\u003C\u002Fstrong>: Unspecified in arXiv abstract\u003C\u002Fli>\u003Cli>\u003Cstrong>Core data\u003C\u002Fstrong>: 77.4% average composite success rate\u003C\u002Fli>\u003Cli>\u003Cstrong>Breakthrough\u003C\u002Fstrong>: Role-aware residual composition with finger-level action ownership\u003C\u002Fli>\u003C\u002Ful>\u003Cp>Dexterous manipulation is hard enough when a policy only has to do one thing well. This paper tackles the messier real-world version: keeping one skill working while adding another on top of it, using the same hand, the same fingers, and the same contact constraints.\u003C\u002Fp>\u003Cp>For robotics developers, that matters because policy reuse is usually where the engineering pain starts. 
If you can compose \u003Ca href=\"\u002Ftag\u002Fskills\">skills\u003C\u002Fa> instead of retraining from scratch, you have a better shot at building manipulators that can hold, move, and interact without turning every new task into a full re-learning problem.\u003C\u002Fp>\u003Ch2>What problem this paper is trying to fix\u003C\u002Fh2>\u003Cp>The core issue is interference. A dexterous policy may already know how to preserve an object state or complete a manipulation skill, but adding a second task can conflict with the first one. In hands, those conflicts show up at the level of overlapping fingers and contact modes, where one action can easily ruin another.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782712971902-lykf.png\" alt=\"DexCompose Reuses Dexterous Policies Across Tasks\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>The abstract frames this as a destructive tradeoff between preserving an existing manipulation outcome and executing a new task. That is a familiar systems problem in a new setting: the policy is not just choosing actions, it is also managing which parts of the hand are allowed to change the scene and which parts need to keep it stable.\u003C\u002Fp>\u003Cp>Instead of treating composition as simple policy chaining, DexCompose tries to make the hand’s control space more explicit. The paper’s premise is that if you can separate “who owns which action,” you can reduce interference between the old skill and the new one.\u003C\u002Fp>\u003Ch2>How DexCompose works in plain English\u003C\u002Fh2>\u003Cp>DexCompose is described as a role-aware residual composition framework. 
The key idea is to reuse two pretrained full-hand policies rather than building a single monolithic controller for every composite task.\u003C\u002Fp>\u003Cp>First, the method collects successful post-task states from the first skill. Then it runs release tests over candidate finger masks to figure out which fingers are actually needed to keep that skill state intact. In other words, it probes the hand to learn which fingers are doing essential stabilization work and which ones can be repurposed.\u003C\u002Fp>\u003Cp>That finger-level analysis becomes the basis for explicit action ownership. Some fingers are assigned to preserve the existing outcome, while others are assigned to the new task. This is the “role-aware” part: the model is not just blending policies, it is deciding which parts of the hand should protect the current state and which parts should adapt.\u003C\u002Fp>\u003Cp>The framework then trains two asymmetric residual modules. One is a bounded residual stabilizer for task preservation, which sounds like a safeguard that keeps the original skill from drifting too far. The other is a context-aware residual that adapts the frozen downstream policy, but only inside the action subspace assigned to the new task.\u003C\u002Fp>\u003Cp>That asymmetry matters. The paper is not saying both tasks should be updated equally. It is saying preservation and adaptation are different control problems, so they should get different residual mechanisms.\u003C\u002Fp>\u003Ch2>What the paper actually shows\u003C\u002Fh2>\u003Cp>The evaluation covers 16 composite dexterous manipulation tasks. 
Those tasks span four object-retention skills and four downstream interactions, which gives the method a broader test than a single demo or a narrowly tuned \u003Ca href=\"\u002Ftag\u002Fbenchmark\">benchmark\u003C\u002Fa>.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782712978145-ratu.png\" alt=\"DexCompose Reuses Dexterous Policies Across Tasks\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>The abstract reports a 77.4% average composite success rate. That is the only concrete metric provided in the source, so there are no per-task breakdowns, ablation numbers, or comparison tables to cite here.\u003C\u002Fp>\u003Cp>Even with that limitation, the result is consistent with the paper’s main claim that structural action ownership plus dual residuals can make skill composition work better than conventional policy chaining, though the abstract gives no baseline numbers to quantify the gap. The authors present this as a promising direction for composing dexterous skills beyond the usual “run one policy, then run another” approach.\u003C\u002Fp>\u003Cp>What the abstract does not tell us is just as important for engineers. It does not provide benchmark baselines, hardware details, training cost, latency, or failure modes. So while the success rate is encouraging, you should treat the method as a research direction rather than a drop-in production recipe.\u003C\u002Fp>\u003Ch2>Why developers and robotics engineers should care\u003C\u002Fh2>\u003Cp>If you build manipulation systems, the big promise here is reuse. Pretrained policies are expensive to get right, and every new task usually increases the chance of interference. A framework that can preserve one skill while adding another could reduce retraining effort and make skill libraries more practical.\u003C\u002Fp>\u003Cp>The finger-mask idea is also interesting from a control-design perspective. 
It suggests a way to make composition more interpretable: instead of hoping the network learns not to collide with itself, you explicitly assign control roles to different parts of the hand.\u003C\u002Fp>\u003Cp>That said, the method still depends on having pretrained full-hand policies and on being able to identify useful post-task states and release behavior. So the approach is best viewed as a structured way to compose existing skills, not a replacement for learning those skills in the first place.\u003C\u002Fp>\u003Ch2>Limitations and open questions\u003C\u002Fh2>\u003Cp>The abstract leaves several practical questions unanswered. How sensitive is DexCompose to the quality of the pretrained policies? How well do the finger masks generalize across objects, grasps, or hand morphologies? And what happens when the “stable” and “new” tasks compete more aggressively than the test cases covered here?\u003C\u002Fp>\u003Cp>There is also a broader systems question: how scalable is this idea as the number of skills grows? The paper describes composition for two pretrained policies, but real robot stacks often need longer chains or branching task graphs. The abstract does not say whether the same ownership mechanism extends cleanly beyond pairwise composition.\u003C\u002Fp>\u003Cp>Still, the direction is useful. A lot of robotics work focuses on making a single policy stronger; this paper focuses on making policies composable. For developers, that shift matters because composability is what turns one-off demos into maintainable manipulation systems.\u003C\u002Fp>\u003Ch2>Bottom line\u003C\u002Fh2>\u003Cp>DexCompose argues that dexterous manipulation gets more reliable when the hand’s action space is explicitly divided into preservation and adaptation roles. 
The paper’s main contribution is a finger-aware residual framework that reuses pretrained policies for composite tasks, and its reported 77.4% average success rate suggests the idea is worth watching.\u003C\u002Fp>\u003Cp>For teams working on robot hands, the takeaway is simple: if you want multi-skill manipulation without constant retraining, action ownership may be a better abstraction than policy chaining alone.\u003C\u002Fp>\u003Cul>\u003Cli>DexCompose targets interference when adding a new dexterous task on top of an existing one.\u003C\u002Fli>\u003Cli>It uses release tests and finger masks to assign action ownership at the finger level.\u003C\u002Fli>\u003Cli>It reports a 77.4% average composite success rate across 16 tasks, but no baseline table in the abstract.\u003C\u002Fli>\u003C\u002Ful>","DexCompose composes pretrained hand policies into multi-task manipulation by assigning finger-level action ownership.","arxiv.org","https:\u002F\u002Farxiv.org\u002Fabs\u002F2606.28323",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782712971902-lykf.png","research","en","89159fcf-2fbb-4b72-9e05-7928e609a925",[17,18,19,20,21],"dexterous manipulation","policy composition","robotic hands","residual learning","multi-task control",[23,24,25],"Finger-level action ownership is the paper’s main trick for reducing interference.","DexCompose reuses pretrained full-hand policies instead of retraining from scratch.","The abstract reports 77.4% average composite success, but omits baseline and cost details.",0,"2026-06-29T06:02:28.7043+00:00","2026-06-29T06:02:28.695+00:00","3103988e-c4fe-45e3-98ab-846500c9d507",{"tags":31,"relatedLang":32,"relatedPosts":36},[],{"id":15,"slug":33,"title":34,"language":35},"dexcompose-reuses-dexterous-policies-across-tasks-zh","DexCompose 
讓手部技能可重用","zh",[37,43,49,55,61,67],{"id":38,"slug":39,"title":40,"cover_image":41,"image_url":41,"created_at":42,"category":13},"f3edd37b-2524-4d6d-b411-7ca0cce9eff0","google-deepmind-turns-science-into-tools-en","Google DeepMind turns science into tools","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782721105101-d4rm.png","2026-06-29T08:17:58.280652+00:00",{"id":44,"slug":45,"title":46,"cover_image":47,"image_url":47,"created_at":48,"category":13},"c522f9af-2862-4f1c-bbf9-99bc20c78544","measuring-llm-behavior-portability-en","Measuring when LLM behavior actually transfers","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782717476648-9gjo.png","2026-06-29T07:17:30.115953+00:00",{"id":50,"slug":51,"title":52,"cover_image":53,"image_url":53,"created_at":54,"category":13},"1a5d9d4d-4e21-4860-84b0-9b209ca4d7f5","prompt-injection-ai-security-problem-en","Prompt injection is now an AI security problem","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782716584463-r1ei.png","2026-06-29T07:02:36.642691+00:00",{"id":56,"slug":57,"title":58,"cover_image":59,"image_url":59,"created_at":60,"category":13},"fba917c8-939c-4457-a90e-4012d9a692df","solver-choice-nash-equilibrium-selection-en","Solver choice changes which Nash equilibrium wins","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782714784738-e4dj.png","2026-06-29T06:32:31.603116+00:00",{"id":62,"slug":63,"title":64,"cover_image":65,"image_url":65,"created_at":66,"category":13},"2ae76231-ec2d-48aa-a82f-1d26f1b36882","proper-positive-only-learning-characterization-en","Proper positive-only learning gets a full 
characterization","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782713877346-qgit.png","2026-06-29T06:17:34.38343+00:00",{"id":68,"slug":69,"title":70,"cover_image":71,"image_url":71,"created_at":72,"category":13},"ce859659-0b28-456b-8641-63f6d4c47cf9","hawor-hand-motion-mano-params-en","HaWoR turns hand motion into MANO params","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1782705791981-evnj.png","2026-06-29T04:02:46.964681+00:00",[74,79,84,89,94,99,104,109,114,119],{"id":75,"slug":76,"title":77,"created_at":78},"a2715e72-1fe8-41b3-abb1-d0cf1f710189","ai-predictions-2026-big-changes-en","AI Predictions for 2026: Brace for Big Changes","2026-03-26T01:25:07.788356+00:00",{"id":80,"slug":81,"title":82,"created_at":83},"8404bd7b-4c2f-4109-9ec4-baf29d88af2b","ml-papers-of-the-week-github-research-desk-en","ML Papers of the Week Turns GitHub Into a Research Desk","2026-03-27T01:11:39.480259+00:00",{"id":85,"slug":86,"title":87,"created_at":88},"87897a94-8065-4464-a016-1f23e89e17cc","ai-ml-conferences-to-watch-in-2026-en","AI\u002FML Conferences to Watch in 2026","2026-03-27T01:51:54.184108+00:00",{"id":90,"slug":91,"title":92,"created_at":93},"6f1987cf-25f3-47a4-b3e6-db0997695be8","openclaw-agents-manipulated-self-sabotage-en","OpenClaw Agents Can Be Manipulated Into Failure","2026-03-28T03:03:18.899465+00:00",{"id":95,"slug":96,"title":97,"created_at":98},"a53571ad-735a-4178-9f93-cb09b699d99c","vega-driving-language-instructions-en","Vega: Driving with Natural Language Instructions","2026-03-28T14:54:04.698882+00:00",{"id":100,"slug":101,"title":102,"created_at":103},"a34581d6-f36e-46da-88bb-582fb3e7425c","personalizing-autonomous-driving-styles-en","Drive My Way: Personalizing Autonomous Driving 
Styles","2026-03-28T14:54:26.148181+00:00",{"id":105,"slug":106,"title":107,"created_at":108},"2bc1ad7f-26ce-4f02-9885-803b35fd229d","training-knowledge-bases-writeback-rag-en","Training Knowledge Bases with WriteBack-RAG","2026-03-28T14:54:45.643433+00:00",{"id":110,"slug":111,"title":112,"created_at":113},"71adc507-3c54-4605-bbe2-c966acd6187e","packforcing-long-video-generation-en","PackForcing: Efficient Long-Video Generation Method","2026-03-28T14:55:02.646943+00:00",{"id":115,"slug":116,"title":117,"created_at":118},"675942ef-b9ec-4c5f-a997-381250b6eacb","pixelsmile-facial-expression-editing-en","PixelSmile Framework Enhances Facial Expression Editing","2026-03-28T14:55:20.633463+00:00",{"id":120,"slug":121,"title":122,"created_at":123},"6954fa2b-8b66-4839-884b-e46f89fa1bc3","adaptive-block-scaled-data-types-en","IF4: Smarter 4-Bit Quantization That Adapts to Your Data","2026-03-31T06:00:36.65963+00:00"]