[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-fine-tune-slm-emotion-recognition-en":3,"article-related-fine-tune-slm-emotion-recognition-en":31,"series-tools-4ed8f024-fcf6-493a-ac60-fff51479e92f":84},{"id":4,"slug":5,"title":6,"content":7,"summary":8,"source":9,"source_url":10,"author":11,"image_url":12,"cover_image":12,"category":13,"language":14,"translated_content":11,"related_article_id":15,"keywords":16,"key_takeaways":23,"views":27,"created_at":28,"published_at":29,"topic_cluster_id":30},"4ed8f024-fcf6-493a-ac60-fff51479e92f","fine-tune-slm-emotion-recognition-en","Fine-Tune an SLM for Emotion Recognition","\u003Cp data-speakable=\"summary\">Build an open-weight emotion classifier with ISMOTE, LoRA, and focal loss.\u003C\u002Fp>\u003Cp>This guide is for developers who want to turn an open-weight small language model into a multi-label emotion recognizer for texts like support tickets, social posts, and email threads.\u003C\u002Fp>\u003Cp>By the end, you will have a reproducible training pipeline that balances skewed emotion data, fine-tunes a Mistral Small model with LoRA, and evaluates per-emotion F1 scores on a held-out test set.\u003C\u002Fp>\u003Ch2>Before you start\u003C\u002Fh2>\u003Cul>\u003Cli>Python 3.10+.\u003C\u002Fli>\u003Cli>CUDA-capable GPU with at least 24 GB VRAM for Mistral Small 3.1 24B Instruct.\u003C\u002Fli>\u003Cli>PyTorch 2.4+ with CUDA support.\u003C\u002Fli>\u003Cli>Hugging Face account and a valid access token.\u003C\u002Fli>\u003Cli>Access to the \u003Ca href=\"https:\u002F\u002Fhuggingface.co\u002Fdatasets\u002Fgoogle-research-datasets\u002Fgo_emotions\" target=\"_blank\" rel=\"noopener noreferrer\">GoEmotions dataset\u003C\u002Fa> and the \u003Ca href=\"https:\u002F\u002Fgithub.com\u002Funslothai\u002Funsloth\" target=\"_blank\" rel=\"noopener noreferrer\">Unsloth repository\u003C\u002Fa>.\u003C\u002Fli>\u003Cli>Libraries: transformers, datasets, scikit-learn, numpy, pandas, and 
unsloth.\u003C\u002Fli>\u003Cli>Enough disk space for the model weights, augmented dataset, and checkpoints.\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>Step 1: Prepare the emotion labels\u003C\u002Fh2>\u003Cp>Your first outcome is a clean label map for the 15 emotions you want the model to predict. Start by selecting the target classes from GoEmotions, then define a consistent label order for training, validation, and test splits.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781231572194-ht3h.png\" alt=\"Fine-Tune an SLM for Emotion Recognition\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cpre>\u003Ccode>EMOTION_LABELS = [\n  \"fear\", \"sadness\", \"disgust\", \"disapproval\", \"annoyance\",\n  \"anger\", \"disappointment\", \"optimism\", \"amusement\", \"surprise\",\n  \"admiration\", \"excitement\", \"confusion\", \"joy\", \"love\"\n]\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>Verify that every example uses the same label vector length. You should see a fixed 15-element multi-hot label format across all splits.\u003C\u002Fp>\u003Ch2>Step 2: Balance the training split\u003C\u002Fh2>\u003Cp>Your second outcome is a training set that does not let neutral examples dominate learning. Thin the majority class by randomly filtering neutral rows, then oversample the rare emotions with ISMOTE so the minority classes reach a usable sample count.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781231568864-5vq8.png\" alt=\"Fine-Tune an SLM for Emotion Recognition\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\u003Cp>Keep validation and test sets unchanged so your metrics reflect real-world imbalance. 
The source article’s approach combines undersampling, synthetic expansion, and loss weighting to improve minority-class behavior without inflating evaluation scores.\u003C\u002Fp>\u003Cp>Verify the new distribution by plotting label frequencies before and after augmentation. You should see the neutral class shrink and the target emotion counts move much closer together.\u003C\u002Fp>\u003Ch2>Step 3: Load the base Mistral model\u003C\u002Fh2>\u003Cp>Your third outcome is a local copy of the open-weight backbone ready for parameter-efficient tuning. Use Unsloth to load Mistral Small 3.1 24B Instruct in 4-bit mode so the model fits the available \u003Ca href=\"\u002Ftag\u002Fgpu\">GPU\u003C\u002Fa> memory.\u003C\u002Fp>\u003Cpre>\u003Ccode>from unsloth import FastLanguageModel\nimport torch\n\nMODEL_NAME = \"unsloth\u002FMistral-Small-3.1-24B-Instruct-2503\"\nbase_model, _ = FastLanguageModel.from_pretrained(\n    model_name=MODEL_NAME,\n    max_seq_length=2048,  # token limit per example; adjust to your text lengths\n    load_in_4bit=True,\n    dtype=torch.bfloat16,\n)\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>Verify the load by checking that the model initializes without out-of-memory errors and reports 4-bit weights. You should see the backbone available on your GPU and ready for adapter injection.\u003C\u002Fp>\u003Ch2>Step 4: Add LoRA and focal loss\u003C\u002Fh2>\u003Cp>Your fourth outcome is a lightweight training setup that can learn multi-label emotion patterns without full fine-tuning. 
Attach LoRA adapters to the attention and MLP projection layers, then wrap the backbone with a custom multi-label head and a focal-loss function that emphasizes the harder and rarer labels.\u003C\u002Fp>\u003Cpre>\u003Ccode>base_model = FastLanguageModel.get_peft_model(\n    base_model,\n    r=16,\n    lora_alpha=32,\n    lora_dropout=0,\n    bias=\"none\",\n    target_modules=[\"q_proj\", \"k_proj\", \"v_proj\", \"o_proj\", \"gate_proj\", \"up_proj\", \"down_proj\"],\n    use_gradient_checkpointing=\"unsloth\",\n    random_state=3407,\n    use_rslora=False,\n)\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>Verify that the adapters add only a small fraction of the full model's parameters. You should see the trainable parameter count drop sharply while the model still accepts multi-label targets and returns logits for all 15 emotions.\u003C\u002Fp>\u003Ch2>Step 5: Train and score the classifier\u003C\u002Fh2>\u003Cp>Your fifth outcome is a trained emotion model with measurable performance on held-out data. Evaluate at the end of each epoch, compute exact-match accuracy plus macro and micro F1, and keep the checkpoint with the best validation score. The source reports that this combination produced F1 above 0.7 for most target emotions.\u003C\u002Fp>\u003Cpre>\u003Ccode>import numpy as np\nfrom sklearn.metrics import accuracy_score, f1_score\n\ndef compute_metrics(eval_pred):  # called at each epoch-end evaluation and on the test set\n    logits, labels = eval_pred\n    preds = (np.asarray(logits) &gt;= 0).astype(int)  # sigmoid(x) &gt;= 0.5 exactly when x &gt;= 0\n    f1 = {f\"{avg}_f1\": f1_score(labels, preds, average=avg, zero_division=0) for avg in (\"macro\", \"micro\")}\n    return {\"exact_accuracy\": accuracy_score(labels, preds), **f1}\u003C\u002Fcode>\u003C\u002Fpre>\u003Cp>Verify the run by checking the evaluation logs and test-set report. 
You should see per-class metrics for each emotion and a best checkpoint selected from validation performance.\u003C\u002Fp>\u003Ctable>\u003Cthead>\u003Ctr>\u003Cth>Metric\u003C\u002Fth>\u003Cth>Before\u002FBaseline\u003C\u002Fth>\u003Cth>After\u002FResult\u003C\u002Fth>\u003C\u002Ftr>\u003C\u002Fthead>\u003Ctbody>\u003Ctr>\u003Ctd>Minority-class balance\u003C\u002Ftd>\u003Ctd>Heavily skewed toward neutral\u003C\u002Ftd>\u003Ctd>Neutral reduced and rare emotions expanded with ISMOTE\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Training method\u003C\u002Ftd>\u003Ctd>Full fine-tuning would need more memory\u003C\u002Ftd>\u003Ctd>4-bit base model plus LoRA adapters\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Emotion F1\u003C\u002Ftd>\u003Ctd>Not reported for the raw baseline\u003C\u002Ftd>\u003Ctd>Most target emotions reached F1 &gt; 0.7\u003C\u002Ftd>\u003C\u002Ftr>\u003C\u002Ftbody>\u003C\u002Ftable>\u003Ch2>Common mistakes\u003C\u002Fh2>\u003Cul>\u003Cli>Using the imbalanced dataset as-is. Fix: undersample neutral examples and oversample rare labels before training.\u003C\u002Fli>\u003Cli>Forgetting multi-label thresholds. Fix: use sigmoid outputs and a 0.5 cutoff, not softmax.\u003C\u002Fli>\u003Cli>Running out of GPU memory. 
Fix: keep 4-bit loading, LoRA, and gradient checkpointing enabled.\u003C\u002Fli>\u003C\u002Ful>\u003Ch2>What's next\u003C\u002Fh2>\u003Cp>From here, try exporting the model to the \u003Ca href=\"\u002Fnews\u002Funsloth-kimi-k25-gguf-hugging-face-en\">Hugging Face\u003C\u002Fa> Hub, tuning the threshold per emotion, and comparing ISMOTE with simpler oversampling methods on your own domain data.\u003C\u002Fp>","Build an open-weight emotion classifier with ISMOTE, LoRA, and focal loss.","towardsdatascience.com","https:\u002F\u002Ftowardsdatascience.com\u002Fhow-to-fine-tune-an-slm-for-emotion-recognition\u002F",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781231572194-ht3h.png","tools","en","851e075c-b22f-4425-a5c8-28132574da25",[17,18,19,20,21,22],"Mistral","Unsloth","LoRA","ISMOTE","GoEmotions","focal loss",[24,25,26],"Balance skewed emotion data before fine-tuning or the model will overpredict neutral and common labels.","Use 4-bit loading plus LoRA to adapt a large open-weight SLM on limited GPU memory.","Evaluate with per-class F1, precision, and recall so rare emotions do not disappear in aggregate metrics.",0,"2026-06-12T02:32:24.046317+00:00","2026-06-12T02:32:24.038+00:00","a7343b93-37cc-4634-a2bc-707f6275bdb6",{"tags":32,"relatedLang":43,"relatedPosts":47},[33,35,37,39,41],{"name":18,"slug":34},"unsloth",{"name":19,"slug":36},"lora",{"name":17,"slug":38},"mistral",{"name":21,"slug":40},"goemotions",{"name":20,"slug":42},"ismote",{"id":15,"slug":44,"title":45,"language":46},"fine-tune-slm-emotion-recognition-zh","情緒辨識 SLM 微調實作指南","zh",[48,54,60,66,72,78],{"id":49,"slug":50,"title":51,"cover_image":52,"image_url":52,"created_at":53,"category":13},"c3c6bd31-b523-431e-824c-8895d9a9eed5","vibe-coding-lets-you-ship-a-tiny-app-fast-en","Vibe coding lets you ship a tiny app 
fast","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781254112614-ohkp.png","2026-06-12T08:47:56.790888+00:00",{"id":55,"slug":56,"title":57,"cover_image":58,"image_url":58,"created_at":59,"category":13},"d84c9786-c0ff-4b40-a1f4-9efe5aad08c3","what-vibe-coding-means-for-developers-en","What Vibe Coding Means for Developers","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781253189214-z64z.png","2026-06-12T08:32:32.496531+00:00",{"id":61,"slug":62,"title":63,"cover_image":64,"image_url":64,"created_at":65,"category":13},"0b197e53-381b-4a4d-a398-d854704f3109","product-hunt-vibe-coding-tools-2026-en","Product Hunt’s vibe-coding stack for shipping faster","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781252320207-kh8p.png","2026-06-12T08:18:04.418879+00:00",{"id":67,"slug":68,"title":69,"cover_image":70,"image_url":70,"created_at":71,"category":13},"4b7af584-521a-4d95-a347-f52bad4a53fb","copilot-keeps-old-amd-linux-gpus-alive-en","Copilot keeps old AMD Linux GPUs alive","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781242407331-vze5.png","2026-06-12T05:32:54.597198+00:00",{"id":73,"slug":74,"title":75,"cover_image":76,"image_url":76,"created_at":77,"category":13},"83098f22-962f-45cd-81f1-4e5b15f2d524","midjourney-pricing-guide-2026-plans-costs-en","Midjourney Pricing Guide for 2026 Plans","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781230670512-79dr.png","2026-06-12T02:17:24.821512+00:00",{"id":79,"slug":80,"title":81,"cover_image":82,"image_url":82,"created_at":83,"category":13},"42164bdf-1cae-4f43-ba29-f54d449ae2b9","qvac-turns-consumer-hardware-into-local-ai-en","QVAC turns consumer hardware 
into local AI","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781228001859-nhza.png","2026-06-12T01:32:54.492313+00:00",[85,90,95,100,105,110,115,120,125,130],{"id":86,"slug":87,"title":88,"created_at":89},"8008f1a9-7a00-4bad-88c9-3eedc9c6b4b1","surepath-ai-mcp-policy-controls-en","SurePath AI's New MCP Policy Controls Enhance AI Security","2026-03-26T01:26:52.222015+00:00",{"id":91,"slug":92,"title":93,"created_at":94},"27e39a8f-b65d-4f7b-a875-859e2b210156","mcp-standard-ai-tools-2026-en","MCP Standard in 2026: Integrating AI Tools","2026-03-26T01:27:43.127519+00:00",{"id":96,"slug":97,"title":98,"created_at":99},"165f9a19-c92d-46ba-b3f0-7125f662921d","rag-2026-transforming-enterprise-ai-en","How RAG in 2026 is Transforming Enterprise AI","2026-03-26T01:28:11.485236+00:00",{"id":101,"slug":102,"title":103,"created_at":104},"6a2a8e6e-b956-49d8-be12-cc47bdc132b2","mastering-ai-prompts-2026-guide-en","Mastering AI Prompts: A 2026 Guide for Developers","2026-03-26T01:29:07.835148+00:00",{"id":106,"slug":107,"title":108,"created_at":109},"3ab2c67e-4664-4c67-a013-687a2f605814","garry-tan-open-sources-claude-code-toolkit-en","Garry Tan Open-Sources a Claude Code Toolkit","2026-03-26T08:26:20.245934+00:00",{"id":111,"slug":112,"title":113,"created_at":114},"66a7cbf8-7e76-41d4-9bbf-eaca9761bf69","github-ai-projects-to-watch-in-2026-en","20 GitHub AI Projects to Watch in 2026","2026-03-26T08:28:09.752027+00:00",{"id":116,"slug":117,"title":118,"created_at":119},"9f332fda-eace-448a-a292-2283951eee71","practical-github-guide-learning-ml-2026-en","A Practical GitHub Guide to Learning ML in 2026","2026-03-27T01:16:50.125678+00:00",{"id":121,"slug":122,"title":123,"created_at":124},"1b1f637d-0f4d-42bd-974b-07b53829144d","aiml-2026-student-ai-ml-lab-repo-review-en","AIML-2026 Is a Bare-Bones Student Lab 
Repo","2026-03-27T01:21:51.661231+00:00",{"id":126,"slug":127,"title":128,"created_at":129},"6d1bf3f6-e191-4d30-b55b-8a0722fa6afe","ai-trending-github-repos-and-research-feeds-en","AI Trending Tracks Repos and Research Feeds","2026-03-27T01:31:35.709532+00:00",{"id":131,"slug":132,"title":133,"created_at":134},"010539a1-4c3a-4bd3-937a-26616422ee0d","awesome-ai-for-science-research-tools-map-en","Awesome AI for Science Is Becoming a Real Research Map","2026-03-27T01:46:50.89513+00:00"]