AI2 closes the gap between closed-source and open-source post-training




The Allen Institute for AI (Ai2) claims to have narrowed the gap between closed-source and open-source post-training with the release of its new model training family, Tülu 3, advancing the argument that open-source models will thrive in the enterprise space.

Tülu 3 brings open-source models up to par with OpenAI's GPT models, Claude from Anthropic and Google's Gemini. It allows researchers, developers and enterprises to fine-tune open-source models without degrading their core skills or losing data, bringing them close to the quality of closed-source models.

Ai2 said it released Tülu 3 with all of the data, data mixes, recipes, code, infrastructure and evaluation frameworks. The company needed to create new datasets and training methods to improve Tülu's performance, including "training directly on verifiable problems with reinforcement learning."
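The "verifiable problems" idea can be sketched in a few lines: for tasks with a checkable answer (math, code), the reward is simply whether the model's output matches the ground truth. This is an illustrative sketch of that reward signal, not Ai2's actual training code; the names and the stubbed batch are our own.

```python
# Minimal sketch of reinforcement learning with verifiable rewards.
# Illustrative only -- not Ai2's released code. In practice, rewards
# like these would feed a policy-gradient update of the model.

def verifiable_reward(model_answer: str, ground_truth: str) -> float:
    """Binary reward: 1.0 if the final answer matches the checkable target."""
    return 1.0 if model_answer.strip() == ground_truth.strip() else 0.0

def score_batch(samples):
    """Average reward over (answer, target) pairs -- the training signal."""
    rewards = [verifiable_reward(answer, target) for answer, target in samples]
    return sum(rewards) / len(rewards)

# Hypothetical batch: two of three answers verify against the ground truth.
batch = [("42", "42"), ("7", "8"), ("x = 3", "x = 3")]
print(score_batch(batch))
```

Because the reward comes from an objective check rather than a learned reward model, it cannot be gamed by stylistic tricks, which is why it suits tasks with exact answers.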

"Our best models result from a complex training process that integrates partial details from proprietary methods with novel techniques and established academic research," Ai2 said in a blog post. "Our success is rooted in careful data curation, rigorous experimentation, innovative methodologies and improved training infrastructure."

Tülu 3 will be available in a range of sizes.

Open-source for enterprises

Open-source models have often lagged behind closed-source models in enterprise adoption, although anecdotally more companies report choosing open-source large language models (LLMs) for projects.

Ai2's thesis is that improving fine-tuning with open-source models like Tülu 3 will increase the number of enterprises and researchers picking open-source models, because they can be confident the models will perform as well as a Claude or Gemini.

The company points out that Tülu 3 and Ai2's other models are fully open source, noting that for big model trainers like Anthropic and Meta, which claim to be open source, "none of their training data nor training recipes are transparent to users." The Open Source Initiative recently published the first version of its open-source AI definition, but some organizations and model providers don't fully follow the definition in their licenses.

Enterprises care about the transparency of models, but many choose open-source models not so much for research or data openness as because they are the best fit for their use cases.

Tülu 3 offers enterprises more of a choice when looking for open-source models to bring into their stack and fine-tune with their data.

Ai2's other models, OLMoE and Molmo, are also open source; the company said they have started to outperform other leading models like GPT-4o and Claude.

Other Tülu 3 features

Ai2 said Tülu 3 lets companies mix and match their data during fine-tuning.

"The recipes help you balance the datasets, so if you want to build a model that can code, but also follow instructions precisely and speak in multiple languages, you just select the particular datasets and follow the steps in the recipe," Ai2 said.
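The balancing described above amounts to weighted sampling across task-specific datasets. This is a hypothetical stdlib-only sketch of that idea in the spirit of Tülu 3's recipes, not Ai2's actual tooling; the dataset names and weights are placeholders.

```python
import random

# Hypothetical sketch of mixing fine-tuning datasets by weight -- our
# illustration, not Ai2's released recipe code. Each dataset is a list
# of training examples; weights set the sampling proportions.

def mix_datasets(datasets: dict, weights: dict, n: int, seed: int = 0) -> list:
    """Draw n examples, picking a source dataset per example by its weight."""
    rng = random.Random(seed)
    names = list(datasets)
    probs = [weights[name] for name in names]
    mixed = []
    for _ in range(n):
        source = rng.choices(names, weights=probs, k=1)[0]
        mixed.append(rng.choice(datasets[source]))
    return mixed

# Placeholder datasets standing in for coding, instruction-following
# and multilingual training sets.
data = {
    "code": ["code_ex_1", "code_ex_2"],
    "instructions": ["inst_ex_1", "inst_ex_2"],
    "multilingual": ["ml_ex_1"],
}
mix = mix_datasets(data, {"code": 0.5, "instructions": 0.3, "multilingual": 0.2}, n=10)
print(len(mix))
```

Adjusting the weights is how a recipe trades one capability off against another: upweighting the coding set yields a stronger coder at some cost to the other skills.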

Mixing and matching datasets can make it easier for developers to move from a smaller model to a larger one while keeping its post-training settings. The company said the infrastructure code it released with Tülu 3 allows enterprises to build out that pipeline when moving through model sizes.
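One way to picture carrying post-training settings across model sizes is a single config object in which only the base model changes. This is an assumption about the workflow, not Ai2's released infrastructure code; the model names and hyperparameters are placeholders.

```python
from dataclasses import dataclass, replace

# Hypothetical illustration of reusing a post-training configuration
# when scaling up -- our sketch, not Ai2's infrastructure code.

@dataclass(frozen=True)
class PostTrainConfig:
    model_name: str
    learning_rate: float
    data_mix: tuple
    epochs: int

small = PostTrainConfig(
    model_name="base-8b",                  # placeholder model name
    learning_rate=2e-5,
    data_mix=("code", "instructions"),     # the recipe's dataset mix
    epochs=2,
)

# Only the base model changes; the data mix and hyperparameters carry over.
large = replace(small, model_name="base-70b")
print(large.data_mix == small.data_mix)
```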

The evaluation framework from Ai2 offers a way for developers to specify the settings for what they want to evaluate in the model.

source: https://venturebeat.com/ai/ai2-closes-the-gap-between-closed-source-and-open-source-post-training/
