Will OpenAI release a tokenizer with vocab size > 150k by end of 2024?
42% chance
The GPT-2 model used r50k_base: vocab size = 50k
The GPT-3 model used r50k_base: vocab size = 50k
The GPT-3.5 model used cl100k_base: vocab size = 100k
The GPT-4 model used cl100k_base: vocab size = 100k
This question is managed and resolved by Manifold.
Related questions
Will there be an AI language model that strongly surpasses ChatGPT and other OpenAI models before the end of 2024?
3% chance
Will OpenAI release a tokenizer with more than 210000 tokens before 2026?
24% chance
Will OpenAI release o2 (or o3) before 2026?
98% chance
Will OpenAI release an AI product with a cool name by Jan 1, 2025?
40% chance
Will OpenAI release a version of Voice Engine by the end of 2024?
81% chance
Will the next major LLM by OpenAI use a new tokenizer?
77% chance
Will an OpenAI model have over 500k token capacity by the end of 2024?
6% chance
Will a flagship (>60T training bytes) open-weights LLM from Meta which doesn't use a tokenizer be released in 2025?
43% chance
Will OpenAI release a product with stateful AI agents by 2025?
80% chance
Will OpenAI name a year by which they expect to have achieved AGI by 01/01/2025?
14% chance