While vector databases are now increasingly commonplace as a core element of an enterprise AI deployment for Retrieval Augmented Generation (RAG), that's not all that's needed.
Chris Latimer, the CEO and co-founder of startup Vectorize, spent several years working at DataStax where he helped to lead the database vendor's cloud efforts. A recurring issue that he saw time and again was that the vector database wasn't really the hard part of enabling enterprise RAG. The hard part of the problem was taking all the unstructured data and getting it into the vector database, in a way that was optimized and going to work well for generative AI.
That's why Latimer started Vectorize just ten months ago, in a bid to help solve that challenge.
Today the company is announcing that it has raised $3.6 million in a seed round of funding, led by True Ventures. Alongside the funding, the company announced the general availability of its enterprise RAG platform. The Vectorize platform enables an agentic RAG approach with near-real-time data capabilities. Vectorize focuses on the data engineering side of AI: the platform helps companies prepare and maintain their data for use in vector databases and large language models. It also enables enterprises to quickly build a RAG data pipeline through an intuitive interface. Another core capability is a RAG evaluation feature that allows enterprises to test different approaches.
"We kept seeing people get to the end of the development cycle with their Gen AI projects and find out that they didn't work really well," Chris Latimer, co-founder and CEO of Vectorize, told VentureBeat in an exclusive interview. "The context they were getting for their vector database wasn't the most useful to the large language model, it was still hallucinating or it was misinterpreting the data."
How Vectorize fits into the enterprise RAG stack
Vectorize is not a vector database itself. Rather, it's a platform that connects unstructured data sources to existing vector databases like Pinecone, DataStax, Couchbase and Elastic.
Latimer explained that Vectorize ingests and optimizes data from diverse sources for vector databases. The platform will provide a production-ready data pipeline that handles ingestion, synchronization, error handling and other data engineering best practices.
Vectorize itself is not a vector embedding technology either. Vector embedding is the process of converting data, whether text, images or audio, into vectors. Vectorize helps users evaluate different embedding models and data chunking methods to determine the best configuration for the enterprise's specific use case and data.
Latimer explained that Vectorize allows users to choose from any number of different embedding models. The different models could include for example OpenAI's ada, or even Voyage AI embeddings, which are now being adopted by Snowflake.
"We do take into account innovative ways to vectorize the data so that you get the best results," Latimer said. "But ultimately, where we see the value is in giving enterprises and developers a production-ready solution that they just don't have to worry about the data engineering side."
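To make the idea of evaluating chunking methods concrete, here is a minimal sketch in Python. It is not Vectorize's API or method: the function names, the sample document and the questions are all hypothetical, and a toy bag-of-words vector stands in for a real embedding model. It scores each chunking strategy by how often the top-ranked chunk contains the expected answer.

```python
import math
from collections import Counter

def chunk_fixed(text, size):
    """Hypothetical strategy: split text into chunks of `size` words."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def chunk_sentences(text):
    """Hypothetical strategy: one chunk per sentence."""
    return [s.strip() for s in text.split(".") if s.strip()]

def embed(text):
    """Toy stand-in for an embedding model: a bag-of-words count vector."""
    return Counter(w.strip(".,?").lower() for w in text.split())

def cosine(a, b):
    """Cosine similarity between two sparse count vectors."""
    dot = sum(a[k] * b[k] for k in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0

def top_chunk(chunks, question):
    """Return the chunk most similar to the question."""
    q = embed(question)
    return max(chunks, key=lambda c: cosine(embed(c), q))

def hit_rate(chunks, labeled_questions):
    """Fraction of questions whose top-ranked chunk contains the expected answer."""
    if not labeled_questions:
        return 0.0
    hits = sum(
        1 for question, answer in labeled_questions
        if answer.lower() in top_chunk(chunks, question).lower()
    )
    return hits / len(labeled_questions)

# Made-up sample data, for illustration only.
DOC = ("Refunds are processed within five days. Support is open on weekdays. "
       "Passwords can be reset from the login page.")
QUESTIONS = [
    ("How long do refunds take to be processed?", "five days"),
    ("Where can passwords be reset?", "login page"),
]

if __name__ == "__main__":
    for name, chunks in [("fixed-4", chunk_fixed(DOC, 4)),
                         ("sentence", chunk_sentences(DOC))]:
        print(name, hit_rate(chunks, QUESTIONS))
```

In a real evaluation the toy `embed` function would be replaced by calls to the candidate embedding models, and the labeled questions would come from the enterprise's own data.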
Using agentic AI to power enterprise RAG
One of Vectorize's key innovations is its "agentic RAG" approach. It's an approach that combines traditional RAG techniques with AI agent capabilities, allowing for more autonomous problem-solving in applications.
Agentic RAG isn't a hypothetical concept either. It's already being used by one of Vectorize's early users, AI inference silicon startup Groq, which recently raised $640 million. Groq is using Vectorize's agentic RAG capabilities to power an AI support agent. The agent can autonomously solve customer problems using the data and context provided by Vectorize's data pipelines.
"If a customer has a question that's been asked and answered before, you want that agent to be able to solve the customer's problem without a human getting involved," Latimer said. "But if there's something that the agent can't solve, you do want to have a human in the loop where you can escalate, so this idea of being able to have an agent reason its way through solving a problem, is the whole idea behind an AI agent architecture."
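The answer-or-escalate pattern Latimer describes can be sketched roughly as follows. This is an illustrative assumption, not how Groq's support agent or Vectorize's platform is implemented: `handle_ticket`, `Resolution`, the `min_overlap` threshold and the sample knowledge base are all hypothetical, and simple word overlap stands in for retrieval from a vector database.

```python
from dataclasses import dataclass

@dataclass
class Resolution:
    answer: str
    escalated: bool

def handle_ticket(question, knowledge_base, min_overlap=0.5):
    """Answer from previously resolved questions, or escalate to a human.

    `knowledge_base` maps past questions to their answers. Similarity is
    Jaccard word overlap, a toy stand-in for vector search.
    """
    q_words = set(question.lower().split())
    best_answer, best_score = None, 0.0
    for past_question, answer in knowledge_base.items():
        p_words = set(past_question.lower().split())
        union = q_words | p_words
        score = len(q_words & p_words) / len(union) if union else 0.0
        if score > best_score:
            best_answer, best_score = answer, score
    if best_answer is not None and best_score >= min_overlap:
        return Resolution(best_answer, escalated=False)
    # No sufficiently similar past question: hand off to a person.
    return Resolution("Escalated to a human support engineer.", escalated=True)

# Made-up knowledge base, for illustration only.
KB = {"how do i reset my api key": "Regenerate the key from the account settings page."}

if __name__ == "__main__":
    print(handle_ticket("how do i reset my api key", KB))
    print(handle_ticket("why is my invoice wrong", KB))
```

The threshold is the design choice that matters here: set it too low and the agent answers questions it should not, set it too high and every ticket goes to a human.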
Why real-time data pipelines are essential to enterprise RAG
A primary reason why an enterprise will use RAG is to connect to its own sources of data. What's equally important though is making sure that data is up to date.
"Stale data is going to lead to stale decisions," Latimer said. Vectorize provides real-time and near-real-time data update capabilities, with the ability for customers to configure their tolerance for data staleness.
"We've actually let people configure the platform based on their tolerance for stale data and their need for real-time data," he said. "So if all you need is to schedule your pipeline to run once a week, we'll let you do that, and then if you need to run real-time, we'll let you do that as well, and you'll have real-time updates as soon as they're available."
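A configurable staleness tolerance of the kind described above might look something like the sketch below. The `PipelineSchedule` class and `needs_sync` function are hypothetical names chosen for illustration and do not reflect Vectorize's actual configuration options. The assumption is that a tolerance of `None` means "sync on every change" and any other value means "sync only once the data is at least that old".

```python
from dataclasses import dataclass
from datetime import datetime, timedelta
from typing import Optional

@dataclass
class PipelineSchedule:
    """Hypothetical config: max_staleness=None means real-time updates."""
    max_staleness: Optional[timedelta]

def needs_sync(schedule, last_sync, now, source_changed):
    """Decide whether the pipeline should re-ingest the source right now."""
    if not source_changed:
        return False  # nothing new to ingest
    if schedule.max_staleness is None:
        return True   # real-time: propagate every change immediately
    # Scheduled: wait until the tolerated staleness window has elapsed.
    return now - last_sync >= schedule.max_staleness

if __name__ == "__main__":
    weekly = PipelineSchedule(max_staleness=timedelta(days=7))
    realtime = PipelineSchedule(max_staleness=None)
    last = datetime(2024, 1, 1)
    print(needs_sync(weekly, last, last + timedelta(days=3), True))
    print(needs_sync(realtime, last, last + timedelta(seconds=1), True))
```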
source: https://venturebeat.com/ai/vectorize-debuts-agentic-rag-platform-for-real-time-enterprise-data/
