Tokens

The smallest units, or chunks, of text that a model processes

What is a token?

In the context of large language models (LLMs), a token is the smallest unit, or chunk, of text that a model processes. LLMs use tokens to process and generate language. A token can be as short as one character, as long as a word, or an even larger chunk of text such as a phrase, depending on the model and its configuration.

Tokens serve as a bridge between human language and a structure that AI models can understand. Many modern language models, such as GPT models, are trained as token-based models. Each model is designed to handle a specific maximum number of tokens at one time, a limit often called the context window.
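To illustrate the token limit, here is a minimal sketch of trimming an input so it fits within a fixed budget. The function name fit_to_context and the limit of 4 are hypothetical, and whitespace splitting only stands in for a real tokenizer.

```python
def fit_to_context(tokens, max_tokens):
    """Keep only the most recent tokens that fit within the limit."""
    if max_tokens <= 0:
        return []
    # Negative slicing keeps the last max_tokens items.
    return tokens[-max_tokens:]


# Assumption: whitespace splitting stands in for a real tokenizer.
tokens = "the quick brown fox jumps over the lazy dog".split()
print(len(tokens))                # 9
print(fit_to_context(tokens, 4))  # ['over', 'the', 'lazy', 'dog']
```

Dropping the oldest tokens is only one possible strategy; an application might instead summarize earlier text or reject inputs that are too long.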

Each input provided to the model is broken down into tokens and analyzed, and that analysis is used to create a response. The response is built the same way, in tokens: the model generates one token at a time, based on the tokens that precede it.
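The one-token-at-a-time loop can be sketched as follows. This is a toy under stated assumptions: the NEXT_TOKEN lookup table, the "<eos>" end marker, and the generate function are all hypothetical, and the table looks only at the last token, whereas a real LLM scores every entry in its vocabulary using the whole preceding sequence.

```python
# Hypothetical toy "model": maps the most recent token to the next one.
NEXT_TOKEN = {"the": "cat", "cat": "sat", "sat": "down", "down": "<eos>"}


def generate(prompt_tokens, max_new_tokens=10):
    """Append one token at a time until <eos> or the length limit."""
    tokens = list(prompt_tokens)
    for _ in range(max_new_tokens):
        last = tokens[-1] if tokens else None
        # Unknown tokens (and an empty prompt) end generation immediately.
        next_token = NEXT_TOKEN.get(last, "<eos>")
        if next_token == "<eos>":
            break
        tokens.append(next_token)
    return tokens


print(generate(["the"]))  # ['the', 'cat', 'sat', 'down']
```

Each pass through the loop adds exactly one token, and the growing sequence becomes the input for the next step.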

Types of tokens:

Here are some types of tokens used in large language models:

  • Word Tokens: These represent individual words or phrases in the text, like "house."
  • Sub-word Tokens: Words can be divided into smaller sub-word units. For instance, "speaking" can be segmented into "speak" and "ing."
  • Punctuation Tokens: Tokens that signify various punctuation marks, such as commas (","), periods ("."), and others.
  • Special Tokens: Unique symbols like "[CLS]" (classification token), "[SEP]" (separator token), or "[MASK]" (mask token) have specific roles within the model.
  • Number Tokens: Numbers that appear in the text are converted into tokens as well. For example, "10" might be represented as a single token or split into "1" and "0," depending on the tokenizer.
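The token types listed above can be seen together in a toy tokenizer. This is a minimal sketch, not how any particular model tokenizes: the SUFFIXES list and the toy_tokenize function are hypothetical, it handles ASCII text only, and a real tokenizer learns its sub-word units from data (for example with byte-pair encoding) instead of using a hand-written suffix list.

```python
import re

# Hypothetical suffix list, for illustration only.
SUFFIXES = ("ing", "ed", "ly")


def toy_tokenize(text):
    """Split text into word, sub-word, number, and punctuation tokens."""
    tokens = []
    # \d+ matches numbers, [A-Za-z]+ matches words, [^\w\s] matches punctuation.
    for piece in re.findall(r"\d+|[A-Za-z]+|[^\w\s]", text):
        # Split off a known suffix only when a stem of 3+ letters remains.
        suffix = next(
            (s for s in SUFFIXES if piece.endswith(s) and len(piece) > len(s) + 2),
            None,
        )
        if piece.isalpha() and suffix:
            tokens.extend([piece[: -len(suffix)], suffix])
        else:
            tokens.append(piece)
    # Special tokens mark the start and end of the sequence.
    return ["[CLS]"] + tokens + ["[SEP]"]


print(toy_tokenize("Speaking of the house, it has 10 rooms."))
# ['[CLS]', 'Speak', 'ing', 'of', 'the', 'house', ',', 'it', 'has', '10', 'rooms', '.', '[SEP]']
```

In the output, "house" is a word token, "Speak" and "ing" are sub-word tokens, "," and "." are punctuation tokens, "10" is a number token, and "[CLS]" and "[SEP]" are special tokens.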
