Andrej Karpathy, an OpenAI alumnus and Tesla’s former AI director, has called his newest endeavor the “best ChatGPT $100 can buy.”
Yesterday, his AI education startup Eureka Labs launched an open-source project called “nanochat,” which shows how anyone with a single GPU server and roughly $100 can train their own miniature ChatGPT capable of answering basic questions and generating stories and poems.
Karpathy, referring to nanochat as a “micro model,” said on X that models like his should be thought of as “very young children” that “don’t have the raw intelligence of their larger cousins.” Raise the budget to $1,000, however, and such a model “quickly becomes a lot more coherent and can solve simple math/code problems and take multiple choice tests.”
The announcement drew millions of views on X, with Shopify CEO Tobi Lütke describing it as a “gift” for developers, researchers, and students. It also highlights a growing trend: the development of smaller, cheaper, and more specialized models with far fewer parameters, the “knobs” inside a model that are adjusted during training so it can interpret language, images, or data. While the largest large language models (LLMs) can have trillions of parameters and require cloud-based GPU access and substantial computing resources, these newer, compact models may contain only a few billion.
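For a rough sense of what those numbers mean in practice, here is a minimal Python sketch, not drawn from the article, that estimates how parameter count translates into memory. The layer/width configurations and the 12·d_model² per-layer rule of thumb are illustrative assumptions, not any named model’s actual architecture.

```python
# Back-of-the-envelope sketch (illustrative, not from the article): how
# parameter count translates into memory footprint, which is why a
# few-billion-parameter model can fit on a laptop while a trillion-parameter
# one cannot. The configs below and the ~12 * d_model^2 per-layer rule of
# thumb are assumptions for illustration, not any specific model's design.

def transformer_params(n_layers: int, d_model: int, vocab_size: int) -> int:
    """Rough parameter estimate for a decoder-only transformer:
    about 12 * d_model^2 per layer (attention + MLP) plus the token embeddings."""
    per_layer = 12 * d_model ** 2
    embeddings = vocab_size * d_model
    return n_layers * per_layer + embeddings

configs = [
    ("small (hobbyist)", 20, 1280),    # hundreds of millions of parameters
    ("compact (3B-class)", 32, 2560),  # a few billion parameters
    ("large dense", 120, 12288),       # hundreds of billions of parameters
]

for name, layers, width in configs:
    params = transformer_params(layers, width, vocab_size=50_000)
    gb_fp16 = 2 * params / 1e9  # fp16 weights take 2 bytes per parameter
    print(f"{name:>20}: {params / 1e9:6.2f}B params, ~{gb_fp16:6.1f} GB in fp16")
```

At half precision, a model with a few billion parameters occupies a handful of gigabytes, within reach of a laptop GPU or a high-end phone, while models hundreds of times larger remain tied to cloud data centers.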
With fewer parameters, these small models don’t try to match the power of frontier models like GPT-5, Claude, and Gemini. But they are good enough for specific tasks, affordable to train, lightweight enough to use on devices like phones and laptops, and easy for startups, researchers, and hobbyists to build and deploy.
The small-model approach was echoed last week by researchers at Samsung AI Lab, who released a paper on their Tiny Recursive Model. The new neural-network architecture shows remarkable efficiency on complex reasoning and puzzle tasks like Sudoku, outperforming popular LLMs while using a minuscule fraction of the computational resources.
There has been a wave of other organizations releasing small AI models, showing that size isn’t everything when it comes to power. Last week, Israel’s AI21 unveiled Jamba Reasoning 3B, a 3-billion-parameter open-source model that can “remember” and reason over massive amounts of text, and run at high speed even on consumer devices. In September, the UAE’s Mohamed bin Zayed University of Artificial Intelligence (MBZUAI) and G42 introduced K2 Think, an open-source reasoning model with only 32 billion parameters that in trials rivaled systems more than 20 times its size. Meanwhile, Big Tech companies like Google, Microsoft, IBM, and OpenAI have all joined the small-but-mighty club, with models that are a fraction of the size of their bigger counterparts.
Much of this momentum traces back to China’s DeepSeek, whose lean, low-cost models upended industry assumptions at the beginning of this year and kicked off a race to make AI smaller, faster, and smarter. But it’s important to note that these models, while impressive, aren’t designed to match the broad capabilities of frontier systems like GPT-5. Instead, they’re built for narrower, specialized tasks—and often shine in specific use cases.
This week, IBM Research, working with NASA and other partners, unveiled open-source, “drastically smaller” versions of its Prithvi and TerraMind Earth-observation models. The models can run on devices ranging from orbiting satellites to smartphones while preserving strong performance. “These models could reshape how we think about doing science in regions far from the lab—whether that’s in the vacuum of space or the savanna,” the company stated in a blog post.
None of this means the era of massive, trillion-parameter models is coming to an end. As companies like OpenAI, Google, and Anthropic chase artificial general intelligence, which demands deeper reasoning capabilities, those giant models will remain the ones pushing the frontier. But the rise of smaller, cheaper, and more efficient models shows that AI’s future won’t be defined by size alone.
