DataStax Introduces RAG Solution Using NVIDIA Microservices and Astra DB


According to a new press release, DataStax, a data company focused on generative AI applications, has announced support for enterprise retrieval-augmented generation (RAG) use cases by integrating NVIDIA’s NIM inference microservices and NeMo Retriever microservices with Astra DB. The integration aims to deliver high-performance RAG data solutions and better customer experiences: according to the company, users can create vector embeddings 20 times faster than with other cloud embedding services, at an 80% reduction in service costs.

Building generative AI applications involves technical complexity, security concerns, and cost barriers, many of them tied to vectorizing unstructured data for use with large language models (LLMs). DataStax is addressing this challenge by collaborating with NVIDIA. The integration pairs NVIDIA NeMo Retriever, which can generate over 800 embeddings per second per GPU, with DataStax Astra DB, which can ingest new embeddings at more than 4000 transactions per second with single-digit millisecond latencies. According to the release, this deployment model scales, significantly reduces total cost of ownership for users, and speeds up embedding generation and indexing.
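To put the two quoted rates side by side, here is a back-of-the-envelope sketch in Python. It treats the figures above as exact, although the release gives them as lower bounds, and it assumes one embedding per ingested transaction and linear scaling across GPUs. The release states none of these assumptions, and the function names are hypothetical.

```python
import math

# Rates quoted in the article, treated here as exact values.
EMBEDDINGS_PER_SEC_PER_GPU = 800   # NeMo Retriever, per GPU
INGEST_TPS = 4000                  # Astra DB ingest rate


def gpus_to_match_ingest(embed_rate=EMBEDDINGS_PER_SEC_PER_GPU,
                         ingest_rate=INGEST_TPS):
    """GPUs needed before the database, not embedding, becomes the bottleneck.

    Assumes one embedding per transaction and linear GPU scaling.
    """
    return math.ceil(ingest_rate / embed_rate)


def pipeline_seconds(num_docs, num_gpus,
                     embed_rate=EMBEDDINGS_PER_SEC_PER_GPU,
                     ingest_rate=INGEST_TPS):
    """Rough time to embed and ingest num_docs, limited by the slower stage."""
    throughput = min(num_gpus * embed_rate, ingest_rate)
    return num_docs / throughput


if __name__ == "__main__":
    print(gpus_to_match_ingest())
    print(pipeline_seconds(1_000_000, num_gpus=1))
    print(pipeline_seconds(1_000_000, num_gpus=10))
```

Under these assumptions, embedding is the limiting stage until roughly five GPUs are in use. Beyond that point the quoted ingest rate caps throughput, so adding GPUs would not shorten the run.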

The collaboration between DataStax and NVIDIA not only speeds up embedding generation but also improves the performance of RAG use cases. Using NVIDIA NeMo and Triton Inference Server software, Astra DB on NVIDIA H100 Tensor Core GPUs achieves a 20x improvement in latency for embedding and indexing documents, according to the release. Additionally, DataStax is introducing Vectorize, a feature that generates embeddings at the database tier and passes the resulting cost savings directly to customers. Together, these give enterprises an efficient, scalable, and cost-effective way to build generative AI applications that draw on unstructured data for real-time insights and improved user experiences.