TOON vs JSON: How Token-Oriented Object Notation Reduces LLM Token Costs

TOON is a token-efficient data format for LLMs. Learn how it compares to JSON, why it saves tokens, and how to use it with Semantic Kernel in .NET.

When you work with Large Language Models, structure matters. Tokens cost money. Clarity affects accuracy. And most of the time, the data we send to models is repetitive, table-like, and verbose when expressed in JSON. This is exactly the gap Token-Oriented Object Notation, or TOON, tries to fill.

TOON is not trying to replace JSON everywhere. Instead, it acts as a translation layer. You keep using JSON in your applications, and when it is time to send structured data to an LLM, you encode that JSON into a compact, model-friendly format.

What Is TOON

Token-Oriented Object Notation is a compact and human-readable encoding of the JSON data model. Its goal is simple: reduce token usage while keeping structure explicit and easy for language models to follow.

TOON borrows indentation-based nesting from YAML and combines it with a CSV-style layout for uniform arrays. This combination makes it especially strong when you have arrays of objects that share the same shape. Think lists of products, projects, users, or logs. Instead of repeating keys again and again, TOON declares them once and streams the values row by row.

The result is a format that is lossless, deterministic, and friendly to both humans and models.

Why JSON Becomes Expensive for LLMs

LLMs are token-based systems. Every brace, quote, comma, and repeated key adds to the total cost. A JSON structure like this:

{
  "projects": [
    { "name": "Alpha", "status": "active" },
    { "name": "Beta", "status": "pending" }
  ]
}

looks clean to us, but to a model it is filled with syntactic noise. The larger the dataset, the more this overhead grows. Even indentation and whitespace contribute to token count.

TOON reduces this overhead by flattening repetition and keeping syntax minimal, while still preserving structure.
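For comparison, the JSON snippet above collapses in TOON to:

```
projects[2]{name,status}:
  Alpha,active
  Beta,pending
```

The two keys are declared once in the header, along with the array length, and each row carries only values.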

Key Features of TOON

TOON brings a few ideas together that make a noticeable difference when working with LLM prompts.

  • Token efficient and accurate. In the project's mixed-structure benchmarks across multiple models, TOON used roughly 40 percent fewer tokens than JSON while slightly improving retrieval accuracy.
  • Full JSON data model support. Objects, arrays, and primitives map cleanly with deterministic round-trips.
  • LLM-friendly guardrails. Explicit array lengths and field headers give models a clear schema to follow.
  • Minimal syntax. Indentation replaces braces and quoting is kept to a minimum.
  • Tabular arrays. Uniform arrays collapse into tables where fields are declared once.
  • Multi-language ecosystem. Implementations exist in TypeScript, Python, Go, Rust, and .NET, with more in progress.

When TOON Shines and When It Does Not

TOON is excellent when your data looks like a table. It works best when:

  • You send large arrays of objects with the same structure
  • Token cost matters
  • You want to avoid repeating keys for every item

However, it is not a universal solution.

You should be cautious when:

  • Data is deeply nested and highly irregular
  • Arrays contain objects with different shapes
  • You are building APIs or long-term storage formats

In those cases, JSON or even compact JSON may be more practical.

Understanding TOON Syntax with an Example

The following example demonstrates the core syntax rules of TOON and how common data structures are represented.

name: chatgpt
address:
  floor: 2
  street: vashi
tags[3]: foo,bar,baz
items[3]{id,qty,name}:
  1,23,face gel
  2,12,serum
  3,45,vaseline

This example highlights the fundamental TOON syntax concepts.

  • Simple key-value pairs are written directly.
  • Nested objects are represented using indentation.
  • Arrays declare their length explicitly using square brackets.
  • Uniform arrays of objects declare their fields once and then list values in a compact, table-like format.

Together, these rules form the core syntax that makes TOON both token-efficient and easy for LLMs to parse reliably.
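To see how little machinery these rules require of a parser (or a model), here is a minimal Python sketch that decodes a TOON tabular array back into a list of dicts. It handles only the flat tabular case; none of the nesting or quoting rules of the full spec:

```python
def decode_tabular(text):
    """Decode a flat TOON tabular array into (key, list of dicts).

    Toy sketch: expects exactly the 'key[N]{f1,f2}:' header shape with
    unquoted comma-separated values; the real spec covers far more.
    """
    lines = text.strip().splitlines()
    header, rows = lines[0], lines[1:]
    key, rest = header.split("[", 1)          # 'projects', '2]{name,status}:'
    count, rest = rest.split("]", 1)          # '2', '{name,status}:'
    fields = rest.strip("{}:").split(",")     # ['name', 'status']
    items = []
    for row in rows[: int(count)]:            # the declared length acts as a guardrail
        values = row.strip().split(",")
        items.append(dict(zip(fields, values)))
    return key, items

toon = """projects[2]{name,status}:
  Alpha,active
  Beta,pending"""
key, items = decode_tabular(toon)
```

The explicit length marker is what makes this deterministic: the decoder knows exactly how many rows to expect, and a model producing TOON has the same guardrail.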

Using TOON in Real LLM Pipelines

To understand whether TOON actually helps, we tested it in a real setup using Semantic Kernel and Azure OpenAI.

The idea was simple. Take unstructured text describing a platform and its projects. Ask the model to convert it either into TOON or into JSON. Then compare the output token usage.

The same prompt and the same model were used for both formats.

Code Used for the Experiment

Below is the exact C# code used to run the comparison. The prompt remains the same, and only the output format parameter changes between TOON and JSON.

using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using OpenAI.Chat;

IKernelBuilder kernelBuilder = Kernel.CreateBuilder();

kernelBuilder.AddAzureOpenAIChatCompletion(
    deploymentName: "gpt-4.1-mini",
    endpoint: "https://your-resource.openai.azure.com/",
    apiKey: "azure-api-key"
);

Kernel kernel = kernelBuilder.Build();

IChatCompletionService chatCompletionService = kernel.GetRequiredService<IChatCompletionService>();

string prompt = """
     <message role="system">You are an assistant that converts the data below into the 'TOON' format or 'JSON' on user request.
     Rules for converting the unstructured data to TOON:
         <rules>
         - The syntax below shows how to format key-value pairs, arrays, and arrays of objects.
         - Using this syntax, convert the data into TOON format.
         - When returning the data, do not include ``` fences; give the converted output directly.
         SYNTAX:
         name: chatgpt
         address:
            floor: 2
            street: Vashi
         tags[3]: foo,bar,baz
         items[3]{id,qty,name}:
           1,23,face gel
           2,12,serum
           3,45,vaseline
         </rules>
     </message>
     <message role="user">
     ## Here is the Required Unstructured Data: 
     The environment is production, and the platform is currently active. It was generated by an AI content pipeline on January 24, 2026, and is running on version 1.0.0. The platform manages four projects in total. Alpha Project is active with high priority, owned by Rahul, and was created on November 12, 2025. Beta Module is pending with medium priority, owned by Sneha, and was created on November 18, 2025. Gamma Service has been completed with low priority, owned by Amit, and was created on December 1, 2025. Delta API is active with critical priority, owned by Neha, and was created on January 5, 2026. The platform focuses on SEO optimization, internal linking, keyword clustering, SERP analysis, content generation, AI embeddings, blog publishing, and handling rate limits.

     ## Convert in '{{$format}}' Format
     </message>
""";

#region TOON
FunctionResult toonResult = await kernel.InvokePromptAsync(prompt, new KernelArguments()
{
    { "format", "TOON" }
});

Console.WriteLine(toonResult);

ChatTokenUsage? toonMetadata = toonResult.Metadata?.GetValueOrDefault("Usage") as ChatTokenUsage;
if (toonMetadata != null)
{
    Console.WriteLine("TOON Result Output Tokens: {0}", toonMetadata.OutputTokenCount);
}
#endregion

Console.WriteLine("=============================================================");

#region JSON
FunctionResult jsonResult = await kernel.InvokePromptAsync(prompt, new KernelArguments()
{
    { "format", "JSON" }
});

Console.WriteLine(jsonResult);

ChatTokenUsage? jsonMetadata = jsonResult.Metadata?.GetValueOrDefault("Usage") as ChatTokenUsage;
if (jsonMetadata != null)
{
    Console.WriteLine("JSON Result Output Tokens: {0}", jsonMetadata.OutputTokenCount);
}
#endregion

In the above code, we defined rules that instruct the model on how to convert unstructured data into TOON and how the response should be formatted. We also provided the unstructured input data.

TOON Output

environment: production
platform:
  status: active
  generated_by: AI content pipeline
  generation_date: 2026-01-24
  version: 1.0.0
  projects[4]{name,status,priority,owner,created_on}:
    Alpha Project,active,high,Rahul,2025-11-12
    Beta Module,pending,medium,Sneha,2025-11-18
    Gamma Service,completed,low,Amit,2025-12-01
    Delta API,active,critical,Neha,2026-01-05
  focus_areas[8]: SEO optimization,internal linking,keyword clustering,SERP analysis,content generation,AI embeddings,blog publishing,rate limits handling

Output tokens used: 160

The output above is our data converted into structured TOON format.

JSON Output

{
  "environment": "production",
  "platformStatus": "active",
  "generatedBy": "AI content pipeline",
  "generationDate": "2026-01-24",
  "version": "1.0.0",
  "projects": [
    {
      "name": "Alpha Project",
      "status": "active",
      "priority": "high",
      "owner": "Rahul",
      "createdOn": "2025-11-12"
    },
    {
      "name": "Beta Module",
      "status": "pending",
      "priority": "medium",
      "owner": "Sneha",
      "createdOn": "2025-11-18"
    },
    {
      "name": "Gamma Service",
      "status": "completed",
      "priority": "low",
      "owner": "Amit",
      "createdOn": "2025-12-01"
    },
    {
      "name": "Delta API",
      "status": "active",
      "priority": "critical",
      "owner": "Neha",
      "createdOn": "2026-01-05"
    }
  ],
  "focusAreas": [
    "SEO optimization",
    "internal linking",
    "keyword clustering",
    "SERP analysis",
    "content generation",
    "AI embeddings",
    "blog publishing",
    "handling rate limits"
  ]
}

Output tokens used: 297

That is roughly a 46 percent reduction in output tokens (160 versus 297).

Why the Token Difference Exists

The difference comes from repetition and syntax overhead.

In JSON, every object repeats its keys. Quotes, colons, braces, commas, and indentation all add tokens. Nested structures amplify this cost.

TOON removes most of that repetition. Fields are declared once. Values are streamed in rows. Indentation is minimal and there is no need for repeated quoting. The model sees a clear schema with far fewer tokens.
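You can get a feel for this gap without calling a model at all. The sketch below compares raw character counts of the two encodings of the same projects data; characters are only a crude proxy for tokens, but the repetition overhead shows up the same way:

```python
import json

projects = [
    {"name": "Alpha Project", "status": "active"},
    {"name": "Beta Module", "status": "pending"},
    {"name": "Gamma Service", "status": "completed"},
    {"name": "Delta API", "status": "active"},
]

# Standard pretty-printed JSON: keys repeated for every object.
as_json = json.dumps({"projects": projects}, indent=2)

# TOON-style tabular form: fields declared once, values streamed as rows.
toon_rows = "\n".join(f"  {p['name']},{p['status']}" for p in projects)
as_toon = f"projects[{len(projects)}]{{name,status}}:\n{toon_rows}"

print(len(as_json), len(as_toon))  # JSON is noticeably longer
```

The ratio will vary with the tokenizer and the data, but for uniform arrays the TOON form is consistently the shorter of the two.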

TOON Ecosystem and Tooling

TOON already has growing community support across multiple languages.

For .NET projects, community libraries such as ToonSharp can be used today to experiment with TOON and integrate it into LLM pipelines.

These libraries allow you to decode TOON output generated by LLMs back into structured data models, enabling reliable encoding, decoding, and downstream processing.

Conclusion

TOON is a useful format when working with large, structured data in LLM workflows. It helps reduce token usage while keeping the data easy to read and structured, which can lower costs and improve efficiency. As shown in the example, converting the same data to TOON used significantly fewer tokens than JSON without losing any information. While JSON is still better for APIs, storage, and deeply nested data, TOON works well when sending repetitive data to language models. Used in the right scenarios, it can make LLM pipelines more efficient and cost-effective.