Empowering the mass adoption of AI

tokes compare is a pioneering, user-friendly platform that allows users to compare the usage costs, quality and structured outputs of Large Language Models (LLMs), agent and multi-agent systems, and Generative AI (GenAI) platforms. We are passionate about consumers getting the best value, and we are here to educate users on model capabilities, outputs and price points so you can find the best fit for your requirements.

Featured models and platforms

Our top picks for LLMs, agents and GenAI platforms. Unleash the power of AI to transform your business and power your applications, driving success like never before!


Multi-Agent System

crewai is built on LangChain and allows you to create crews of agents that work collaboratively to complete complex tasks, with a focus on agents, tasks and tools. It has an easy-to-follow structure and is a great place to start when learning about multi-agent systems.

  • Open Source


Claude 3.5 Sonnet

Claude 3.5 Sonnet is leading the way in industry benchmarks, and the new Artifacts feature is excellent for coding support.

  • Commercial Models


Multi-Agent System

AutoGen provides a multi-agent conversation framework, where you can define a set of agents with specialised capabilities and roles, and specify the interactions between them.

  • Beta Version


Llama 3

Meta Llama 3 is the next generation of Meta's state-of-the-art open source large language model. The 8B and 70B parameter models can support a broad range of use cases.

  • Open Source


Many Large Language Models operate on tokens - the building blocks of text in LLMs. A token is a basic unit of text representation used for processing and understanding text. Tokens can be words, subwords, or characters, depending on the tokenization method used.

You can use the free OpenAI tool below to understand how a piece of text would be tokenized in GPT-3.5 or GPT-4, and the total count of tokens in that piece of text. 
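The "characters per token" relationship can also be sketched in a few lines of Python. This is only a rough estimator using the common ~4-characters-per-token rule of thumb for English text, not a real tokenizer like the OpenAI tool above:

```python
def estimate_tokens(text: str) -> int:
    """Rough token-count estimate: ~4 characters per token for common English text."""
    return max(1, round(len(text) / 4))

sample = "Large Language Models operate on tokens."
print(len(sample), "characters ~=", estimate_tokens(sample), "tokens")
```

Actual token counts vary by model and tokenizer, which is exactly why the free tokenizer tool above is worth a try.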


GenAI Open Playgrounds

Have fun, search, explore, and engage with a range of GenAI models (text, image, audio, video, code, 3D).

Summary of current models

LLMs and GenAI models today come in a variety of architectures and capabilities.  

This up-to-date interactive tool provides a visual overview of the most important LLMs, including their training data, size, release date, and whether they are openly accessible or not. 


Battle Chatbots!

Chatbot Arena allows users to prompt two large language models simultaneously and identify the one that delivers the best responses. The result is a leaderboard that includes both open source and proprietary models.

Model & API Provider Analysis

Understand the AI landscape and choose the best model and API provider for your use case.

Frequently Asked Questions

Many people are new to LLMs and generative AI platforms, and we aim to address the most common questions customers have at all stages of their journey.
1. What are Large Language Models (LLMs)?

An LLM is a language model consisting of a neural network with many parameters (typically billions of weights or more), trained on large quantities of unlabeled text using self-supervised or semi-supervised learning. LLMs can be used for a variety of tasks such as text generation, text classification, question answering, and machine translation. Over time, these models have continued to improve, delivering better accuracy and greater performance across a variety of tasks. We are now seeing the emergence of LLMs that can understand images, video and audio as well (GPT-4, Gemini, Claude 3).

At tokes compare, we allow you to compare open source and commercial LLMs and understand which one will meet your requirements. 

2. What is the difference between open source and commercial LLMs?
Open source LLMs are AI models released under a license allowing free access, modification, and distribution (e.g., Llama 3). It is important to note that some open source LLMs may prohibit commercial use. Commercial LLMs, such as OpenAI's GPT-4 or Anthropic's Claude 3, are developed by private entities that restrict access and usage, often requiring payment. The main differences between these models lie in their access, usage policies, and the availability of source code and training data.
3. What are Multimodal LLMs?
The concept of a multimodal LLM represents a significant evolution in Natural Language Processing (NLP) and Artificial Intelligence (AI). Traditional language models, such as the initial versions of GPT, were primarily text-based and relied on extensive text corpora for their training. Newer LLMs such as GPT-4 or Gemini can accept multiple modes of input, such as photos, audio, and video; these are known as multimodal models. The goal is to make the AI system more versatile and adaptable, allowing it to understand and generate information from various inputs beyond just text. By combining these different types of data, multimodal language models can achieve a more comprehensive understanding of the world, mirroring the way humans integrate information from different senses.
4. Why is tracking usage important for pay-as-you-go LLMs?
Tracking usage is essential for pay-as-you-go LLMs because it helps you monitor and manage your costs effectively. By understanding your usage patterns, you can optimize resource allocation and control spending on the LLM service. Additionally, monitoring usage can help you identify potential bottlenecks, optimize your application's performance, and ensure that you stay within the rate limits and usage allowances of your chosen pricing plan. Depending on usage, pay-as-you-go API pricing can be quite affordable compared to the subscription/purchasing plans but it is important to understand your usage. Some companies let customers set up an automatic notification to alert them when their credit is running short. You can choose the credit amount that will trigger the notification, and if it is reached at any point, you will be notified via email.
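The spend-tracking and low-credit-alert idea described above can be sketched in a few lines of Python. The class name, prices and thresholds here are all hypothetical, purely for illustration of how pay-as-you-go billing accrues per million tokens:

```python
class UsageTracker:
    """Track pay-as-you-go spend against a credit balance and flag when credit runs low."""

    def __init__(self, credit: float, alert_threshold: float):
        self.credit = credit                    # prepaid credit in dollars (hypothetical)
        self.alert_threshold = alert_threshold  # remaining-credit level that triggers an alert
        self.spent = 0.0

    def record(self, input_tokens: int, output_tokens: int,
               price_in_per_m: float, price_out_per_m: float) -> float:
        """Add the cost of one API call, priced per 1M input/output tokens."""
        cost = (input_tokens / 1_000_000) * price_in_per_m \
             + (output_tokens / 1_000_000) * price_out_per_m
        self.spent += cost
        return cost

    @property
    def remaining(self) -> float:
        return self.credit - self.spent

    def low_credit(self) -> bool:
        """True when remaining credit has fallen to the alert threshold."""
        return self.remaining <= self.alert_threshold

# Hypothetical prices: $3 per 1M input tokens, $15 per 1M output tokens
tracker = UsageTracker(credit=10.0, alert_threshold=2.0)
tracker.record(500_000, 100_000, price_in_per_m=3.0, price_out_per_m=15.0)
print(round(tracker.remaining, 2))  # 10.0 - (1.5 + 1.5) = 7.0
```

A real provider dashboard does this accounting for you; the point is simply that per-million-token rates make each call's cost easy to compute and monitor.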
5. What is a token in the context of a Large Language Model (LLM)?
A token is a basic unit of text representation used in LLMs for processing and understanding text - the smallest unit of text that an LLM can process. Tokens can be words, subwords, or characters, depending on the tokenization method employed.
6. What do people mean when they refer to a Large Language Model and generative AI platforms?

When people refer to a Large Language Model, they are usually talking about a type of artificial intelligence model designed to understand, generate, and manipulate human language. These models are built using deep learning techniques and are trained on massive amounts of text data. As a result, they can comprehend and produce text in a coherent and contextually relevant manner, often resembling human-like language understanding.

Generative AI platforms, on the other hand, are systems that utilize these LLMs or similar generative models to create content. These platforms can generate text, images, music, or even code based on given inputs or prompts. They are capable of completing tasks like generating responses in a conversation, summarizing text, creating human-like articles, and more.

Large Language Models, like OpenAI's GPT series, are often used as the foundation for generative AI platforms, enabling a wide range of applications such as chatbots, content generation, translation services, and virtual assistants - which is why it pays to understand how they can help you transform your applications or business.

7. What is the difference between tokens and characters?
LLM APIs can be billed based on the volume of input they receive and the volume of output they generate (as well as any flat per-call rate). Different providers measure these volumes differently, with some counting tokens and others counting characters (but the concepts remain the same). At tokes compare, we convert them all to a standard unit - price per 1 million (1M) - so you can see where you get the best value side-by-side.

Understanding the difference between a token and a character is essential to compare pricing across providers properly.

• A token is a unit of measure representing characters (see our blog section for more details on tokens and tokenization). A helpful rule of thumb is that one token generally corresponds to ~4 characters of text for common English text. This translates to roughly ¾ of a word (so 100 tokens ~= 75 words).

• A character is a single letter, number, or symbol. For example, the word “hello” has five characters and may be slightly larger than one token.
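Using the ~4-characters-per-token rule of thumb above, a per-character price can be converted into an approximate per-token price so the two billing styles can be compared side-by-side. The price below is hypothetical, purely for illustration:

```python
def char_price_to_token_price(price_per_m_chars: float,
                              chars_per_token: float = 4.0) -> float:
    """Convert a per-1M-characters price into an approximate per-1M-tokens price.

    One token covers roughly `chars_per_token` characters, so 1M tokens cover
    roughly `chars_per_token` million characters - hence the multiplication.
    """
    return price_per_m_chars * chars_per_token

# Hypothetical: a provider charging $0.50 per 1M characters
# is charging roughly $2.00 per 1M tokens.
print(char_price_to_token_price(0.50))
```

This is an approximation: the real ratio depends on the language and the tokenizer, which is why the 4:1 figure is only a rule of thumb.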
8. What is Retrieval-Augmented Generation, aka RAG?
Retrieval-augmented generation (RAG) is a technique for enhancing the accuracy and reliability of generative AI models with facts fetched from external sources. With RAG, users can essentially have conversations with data repositories, opening up new kinds of experiences. Once companies get familiar with RAG, they can combine a variety of off-the-shelf or custom LLMs with internal or external knowledge bases to create a wide range of assistants that help their employees and customers.
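The "fetch facts, then generate" pattern described above can be sketched as a toy pipeline. The keyword-overlap ranking below is a deliberately simple stand-in for the vector similarity search a production RAG system would use, and all names and documents are hypothetical:

```python
import re

def retrieve(query: str, documents: list[str], k: int = 1) -> list[str]:
    """Rank documents by keyword overlap with the query (short words ignored).

    A toy stand-in for the embedding-based similarity search used in real RAG.
    """
    def words(text: str) -> set[str]:
        return {w for w in re.findall(r"\w+", text.lower()) if len(w) > 3}
    q = words(query)
    ranked = sorted(documents, key=lambda d: len(q & words(d)), reverse=True)
    return ranked[:k]

def build_prompt(query: str, documents: list[str]) -> str:
    """Augment the user's question with retrieved facts before generation."""
    context = "\n".join(retrieve(query, documents))
    return f"Answer using only this context:\n{context}\n\nQuestion: {query}"

docs = [
    "Our refund policy allows returns within 30 days of purchase.",
    "The office is open Monday to Friday, 9am to 5pm.",
]
print(build_prompt("What is the refund policy?", docs))
```

The augmented prompt is then sent to the LLM, which answers from the retrieved facts rather than from its training data alone - the essence of RAG.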
9. What is an Agent and what is a Multi-Agent System?
In artificial intelligence, an agent is a computer program or system designed to perceive its environment, make decisions, and take actions to achieve a specific goal or set of goals. Technically, the new GPTs from OpenAI are agents. A Multi-Agent AI System consists of specialised agents interacting to achieve goals. It involves defining the agents' roles and capabilities, as well as their interaction behaviours, including communication and decision-making processes, to efficiently achieve collective or individual objectives. This framework applies to autonomous vehicles, supply chains, smart grids, and customer service. Check out AutoGen, crewai and LangGraph to get started - these really are the future of AI.
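The roles-plus-interactions idea above can be shown with a toy pipeline. In a real framework like crewai or AutoGen, each agent's action would be an LLM call; here the `Agent` class, the lambda actions and `run_pipeline` are all hypothetical names used purely to illustrate the interaction pattern:

```python
class Agent:
    """A minimal agent: a role plus a rule for acting on an incoming message.

    In a real multi-agent system, `act` would call an LLM with a role-specific
    prompt; here it is just a toy function.
    """

    def __init__(self, role, act):
        self.role = role
        self.act = act

def run_pipeline(task, agents):
    """Route each agent's output to the next agent - the simplest interaction pattern."""
    message = task
    for agent in agents:
        message = agent.act(message)
    return message

researcher = Agent("researcher", lambda task: f"notes on {task}")
writer = Agent("writer", lambda notes: f"article based on {notes}")

print(run_pipeline("LLM pricing", [researcher, writer]))
# -> article based on notes on LLM pricing
```

Real frameworks add richer interaction behaviours - back-and-forth conversation, tool use, and shared memory - but the core idea is the same: specialised agents passing work between them.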
10. What is a Graphics Processing Unit and what is a Language Processing Unit?
GPUs, or Graphics Processing Units, are specialised circuits designed to speed up image, video, and animation creation, widely used in gaming and professional graphics. They're great for tasks that can run in parallel, making them ideal for machine learning and AI. A Language Processing Unit (LPU), such as the new Groq LPU, is hardware tailored for natural language tasks like parsing and generating human language, and can process token sequences much faster than GPUs.