Compare Models
-
EleutherAI
GPT-J
FREEEleutherAI is a leading non-profit research institute focused on large-scale artificial intelligence research. EleutherAI has trained and released several LLMs and the codebases used to train them. GPT-J can be used for code generation, making a chat bot, story writing, language translation and searching. GPT-J learns an inner representation of the English language that can be used to extract features useful for downstream tasks. The model is best at what it was pretrained for, which is generating text from a prompt. EleutherAI has a web page where you can test to see how the GPT-J works, or you can run GPT-J on google colab, or use the Hugging Face Transformers library. -
Microsoft, NVIDIA
MT-NLG
OTHERMT-NLG (Megatron-Turing Natural Language Generation) uses the architecture of the transformer-based Megatron to generate coherent and contextually relevant text for a range of tasks, including completion prediction, reading comprehension, commonsense reasoning, natural language inferences, and word sense disambiguation. MT-NLG is the successor to Microsoft Turing NLG 17B and NVIDIA Megatron-LM 8.3B. The MT-NLG model is three times larger than GPT-3 (530B vs 175B). Following the original Megatron work, NVIDIA and Microsoft trained the model on over 4,000 GPUs. NVIDIA has announced an Early Access program for its managed API service to the MT-NLG model for organizations and researchers.