[{"data":1,"prerenderedAt":-1},["ShallowReactive",2],{"article-llms-work-by-predicting-next-token-en":3,"article-related-llms-work-by-predicting-next-token-en":35,"series-industry-d0dd8e84-c799-4d99-b7ef-14f9eac9f7dc":83},{"id":4,"slug":5,"title":6,"content":7,"summary":8,"source":9,"source_url":10,"author":11,"image_url":12,"cover_image":12,"category":13,"language":14,"translated_content":11,"related_article_id":15,"keywords":16,"key_takeaways":27,"views":31,"created_at":32,"published_at":33,"topic_cluster_id":34},"d0dd8e84-c799-4d99-b7ef-14f9eac9f7dc","llms-work-by-predicting-next-token-en","LLMs work by predicting the next token","\u003Cp data-speakable=\"summary\">Large language models predict tokens, learn from huge text corpora, and adapt through fine-tuning.\u003C\u002Fp>\n\u003Cp>Large language models are easier to understand when you break them into five parts: training data, tokenization, transformer attention, parameter learning, and post-training alignment. IBM notes that some popular models now have billions or trillions of parameters, which explains both their power and their cost.\u003C\u002Fp>\n\u003Ctable>\u003Cthead>\u003Ctr>\u003Cth>Item\u003C\u002Fth>\u003Cth>What it does\u003C\u002Fth>\u003Cth>Key detail\u003C\u002Fth>\u003C\u002Ftr>\u003C\u002Fthead>\u003Ctbody>\u003Ctr>\u003Ctd>Tokenization\u003C\u002Ftd>\u003Ctd>Splits text into machine-readable units\u003C\u002Ftd>\u003Ctd>Words, subwords, or characters\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Transformer attention\u003C\u002Ftd>\u003Ctd>Tracks relationships between tokens\u003C\u002Ftd>\u003Ctd>Uses query, key, and value vectors\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Parameters\u003C\u002Ftd>\u003Ctd>Store learned model behavior\u003C\u002Ftd>\u003Ctd>Can reach billions or trillions\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Fine-tuning\u003C\u002Ftd>\u003Ctd>Adapts a base model for a task\u003C\u002Ftd>\u003Ctd>Includes RLHF for 
alignment\u003C\u002Ftd>\u003C\u002Ftr>\u003Ctr>\u003Ctd>Small language models\u003C\u002Ftd>\u003Ctd>Run with fewer resources\u003C\u002Ftd>\u003Ctd>Fit smaller devices and tighter budgets\u003C\u002Ftd>\u003C\u002Ftr>\u003C\u002Ftbody>\u003C\u002Ftable>\n\u003Ch2>1. Training data at massive scale\u003C\u002Fh2>\n\u003Cp>\u003Ca href=\"\u002Ftag\u002Fllms\">LLMs\u003C\u002Fa> start with huge text collections pulled from books, articles, websites, code, and other sources. The model is not memorizing a single document; it is learning patterns across a broad mix of language so it can generalize to new prompts.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781889467439-xtn8.png\" alt=\"LLMs work by predicting the next token\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\n\u003Cp>That training pipeline depends on careful cleanup. Data scientists remove duplication, errors, and unwanted content before the model ever sees the text. This matters because bad inputs can distort what the model learns and can also carry forward bias or noise.\u003C\u002Fp>\n\u003Cul>\n  \u003Cli>Sources: books, articles, websites, code\u003C\u002Fli>\n  \u003Cli>Prep steps: cleaning, deduplication, filtering\u003C\u002Fli>\n  \u003Cli>Goal: expose the model to varied language patterns\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Ch2>2. Tokenization turns text into units\u003C\u002Fh2>\n\u003Cp>Before training, text is broken into tokens, which may be words, subwords, or characters. Tokenization gives the model a consistent way to process language, including rare terms and new words it has never seen in exactly that form.\u003C\u002Fp>\n\u003Cp>This is one of the most practical ideas in LLMs: the system does not read like a person does. 
It reads sequences of tokens, then predicts which \u003Ca href=\"\u002Ftag\u002Ftoken\">token\u003C\u002Fa> should come next based on patterns it learned during training.\u003C\u002Fp>\n\u003Cul>\n  \u003Cli>Token types: word, subword, character\u003C\u002Fli>\n  \u003Cli>Benefit: consistent handling of rare and novel words\u003C\u002Fli>\n  \u003Cli>Result: language becomes machine-readable input\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Ch2>3. Transformers use self-attention\u003C\u002Fh2>\n\u003Cp>LLMs are built on transformer models, and the transformer’s self-attention mechanism is the core reason they work so well with language. Self-attention lets the model weigh relationships between tokens, including tokens that are far apart in a sentence or paragraph.\u003C\u002Fp>\n\u003Cfigure class=\"my-6\">\u003Cimg src=\"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781889460634-1c4i.png\" alt=\"LLMs work by predicting the next token\" class=\"rounded-xl w-full\" loading=\"lazy\" \u002F>\u003C\u002Ffigure>\n\n\u003Cp>IBM describes this as the model calculating which parts of the sequence matter most at each moment. Each token’s embedding is projected into query, key, and value vectors; queries are scored against keys to produce attention weights, and those weights determine how much of each value flows forward.\u003C\u002Fp>\n\u003Ccode>token -> embedding -> query\u002Fkey\u002Fvalue -> attention weights -> output\u003C\u002Fcode>\n\u003Ch2>4. Parameters store what the model learns\u003C\u002Fh2>\n\u003Cp>During training, the model adjusts internal weights called parameters. These are the configuration values that shape how the network processes input and produces output, and modern LLMs can contain billions or even trillions of them.\u003C\u002Fp>\n\u003Cp>That scale is why LLMs can pick up grammar, factual patterns, reasoning structures, and writing styles. 
IBM also notes that smaller language models exist for constrained environments, where fewer parameters make deployment easier on local or resource-limited hardware.\u003C\u002Fp>\n\u003Cul>\n  \u003Cli>Parameters: internal weights learned during training\u003C\u002Fli>\n  \u003Cli>Scale: billions or trillions in large models\u003C\u002Fli>\n  \u003Cli>Small models: better for smaller devices and tighter compute budgets\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Ch2>5. Fine-tuning and RLHF shape behavior\u003C\u002Fh2>\n\u003Cp>After pretraining, an \u003Ca href=\"\u002Ftag\u002Fllm\">LLM\u003C\u002Fa> can be fine-tuned for a specific job, such as customer support, summarization, or code help. This second stage adjusts the base model so it performs better in a target setting instead of just being generally fluent.\u003C\u002Fp>\n\u003Cp>One common method is \u003Ca href=\"\u002Ftag\u002Freinforcement-learning\">reinforcement learning\u003C\u002Fa> from human feedback, or RLHF. Humans rank outputs, and the model learns to prefer responses people rate higher, which helps with alignment, meaning outputs are more useful, safe, and consistent with human values.\u003C\u002Fp>\n\u003Cul>\n  \u003Cli>Fine-tuning: adapts a general model to a task\u003C\u002Fli>\n  \u003Cli>RLHF: uses human rankings to improve outputs\u003C\u002Fli>\n  \u003Cli>Alignment goal: useful, safe, consistent responses\u003C\u002Fli>\n\u003C\u002Ful>\n\u003Ch2>How to decide\u003C\u002Fh2>\n\u003Cp>If you want the shortest mental model, remember this: LLMs are token predictors trained on massive text, powered by transformer attention, and shaped by post-training methods like fine-tuning and RLHF. That is the core path from raw data to useful output.\u003C\u002Fp>\n\u003Cp>If you care most about deployment, focus on parameters and model size. If you care most about behavior, focus on fine-tuning and alignment. 
If you care most about the engine under the hood, focus on tokenization and self-attention.\u003C\u002Fp>","A clear guide to how LLMs are trained, tuned, and used, with 5 practical pieces of the model pipeline.","www.ibm.com","https:\u002F\u002Fwww.ibm.com\u002Fthink\u002Ftopics\u002Flarge-language-models",null,"https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781889467439-xtn8.png","industry","en","c40b20df-d89a-43ae-bb11-11062dcd2cd2",[17,18,19,20,21,22,23,24,25,26],"LLMs","large language models","transformers","self-attention","tokenization","fine-tuning","RLHF","alignment","parameters","machine learning",[28,29,30],"LLMs learn from massive text corpora and predict the next token.","Transformers use self-attention to connect distant words in context.","Fine-tuning and RLHF adapt a base model for safer, more useful outputs.",0,"2026-06-19T17:17:21.316538+00:00","2026-06-19T17:17:21.307+00:00","5fe38f8a-dc8c-44bd-a3dc-82024f24ba0f",{"tags":36,"relatedLang":42,"relatedPosts":46},[37,38,40],{"name":21,"slug":21},{"name":18,"slug":39},"large-language-models",{"name":17,"slug":41},"llms",{"id":15,"slug":43,"title":44,"language":45},"llms-work-by-predicting-next-token-zh","5 個關鍵部件看懂 LLMs","zh",[47,53,59,65,71,77],{"id":48,"slug":49,"title":50,"cover_image":51,"image_url":51,"created_at":52,"category":13},"49323595-91fe-487a-af67-aa2bf8f84e3a","aibox-ax8850-hardware-first-integration-en","AIBOX 不是拼软件，关键在把 AX8850 的硬件吃满","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781900269609-7lpk.png","2026-06-19T20:17:24.024298+00:00",{"id":54,"slug":55,"title":56,"cover_image":57,"image_url":57,"created_at":58,"category":13},"18a6fbe6-aa25-4d9a-92c0-1164c91d3e72","ai-coding-assistant-roi-measured-en","AI coding assistant ROI is real, but only when you measure 
it","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781893068003-yy30.png","2026-06-19T18:17:20.224146+00:00",{"id":60,"slug":61,"title":62,"cover_image":63,"image_url":63,"created_at":64,"category":13},"44a50d6d-bec8-4b1e-a4f8-afab437292c8","red-hat-ai-mavenir-telco-ai-stack-en","Red Hat AI turns telco AI into a stack","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781885892078-r3ek.png","2026-06-19T16:17:38.760812+00:00",{"id":66,"slug":67,"title":68,"cover_image":69,"image_url":69,"created_at":70,"category":13},"7504e3ad-b725-46d1-91cb-78cff05a7d79","manus-ai-github-topics-clone-kits-en","Manus AI on GitHub Is Mostly Clone Kits","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781882276536-fn20.png","2026-06-19T15:17:27.250659+00:00",{"id":72,"slug":73,"title":74,"cover_image":75,"image_url":75,"created_at":76,"category":13},"0df0f0e7-5492-41e8-8b7d-4c72a133bebf","deepmind-gemini-atlas-robotics-update-en","DeepMind把Gemini装进Atlas后，机器人更像会思考了","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781878666774-fc7b.png","2026-06-19T14:17:21.457576+00:00",{"id":78,"slug":79,"title":80,"cover_image":81,"image_url":81,"created_at":82,"category":13},"bae2c76d-53a2-468f-9451-32112f760733","spacex-shou-gou-cursor-bu-hua-suan-ai-bian-cheng-en","SpaceX收购Cursor不划算，AI编程能力应自己做","https:\u002F\u002Fxxdpdyhzhpamafnrdkyq.supabase.co\u002Fstorage\u002Fv1\u002Fobject\u002Fpublic\u002Fcovers\u002Finline-1781877772236-jpfj.png","2026-06-19T14:02:19.96906+00:00",[84,89,94,99,104,109,114,119,124,129],{"id":85,"slug":86,"title":87,"created_at":88},"d35a1bd9-e709-412e-a2df-392df1dc572a","ai-impact-2026-developments-market-en","AI's Impact in 2026: Key Developments and Market 
Shifts","2026-03-25T16:20:33.205823+00:00",{"id":90,"slug":91,"title":92,"created_at":93},"5ed27921-5fd6-492e-8c59-78393bf37710","trumps-ai-legislative-framework-en","Trump's AI Legislative Framework: What's Inside?","2026-03-25T16:22:20.005325+00:00",{"id":95,"slug":96,"title":97,"created_at":98},"e454a642-f03c-4794-b185-5f651aebbaca","nvidia-gtc-2026-key-highlights-innovations-en","NVIDIA GTC 2026: Key Highlights and Innovations","2026-03-25T16:22:47.882615+00:00",{"id":100,"slug":101,"title":102,"created_at":103},"0ebb5b16-774a-4922-945d-5f2ce1df5a6d","claude-usage-diversifies-learning-curves-en","Claude Usage Diversifies, Learning Curves Emerge","2026-03-25T16:25:50.770376+00:00",{"id":105,"slug":106,"title":107,"created_at":108},"69934e86-2fc5-4280-8223-7b917a48ace8","openclaw-ai-commoditization-concerns-en","OpenClaw's Rise Raises Concerns of AI Model Commoditization","2026-03-25T16:26:30.582047+00:00",{"id":110,"slug":111,"title":112,"created_at":113},"b4b2575b-2ac8-46b2-b90e-ab1d7c060797","google-gemini-ai-rollout-2026-en","Google's Gemini AI Rollout Extended to 2026","2026-03-25T16:28:14.808842+00:00",{"id":115,"slug":116,"title":117,"created_at":118},"6e18bc65-42ae-4ad0-b564-67d7f66b979e","meta-llama4-fabricated-results-scandal-en","Meta's Llama 4 Scandal: Fabricated AI Test Results Unveiled","2026-03-25T16:29:15.482836+00:00",{"id":120,"slug":121,"title":122,"created_at":123},"bf888e9d-08be-4f47-996c-7b24b5ab3500","accenture-mistral-ai-deployment-en","Accenture and Mistral AI Team Up for AI Deployment","2026-03-25T16:31:01.894655+00:00",{"id":125,"slug":126,"title":127,"created_at":128},"5382b536-fad2-49c6-ac85-9eb2bae49f35","mistral-ai-high-stakes-2026-en","Mistral AI: Facing High Stakes in 2026","2026-03-25T16:31:39.941974+00:00",{"id":130,"slug":131,"title":132,"created_at":133},"9da3d2d6-b669-4971-ba1d-17fdb3548ed5","cursors-meteoric-rise-pressures-en","Cursor's Meteoric Rise Faces Industry Pressures","2026-03-25T16:32:21.899217+00:00"]