Release v0.6.0 · NVIDIA/GenerativeAIExamples

This release adds the ability to switch between API Catalog models and on-prem models using NIM-LLM, and adds documentation on how to build a RAG application from scratch. It also releases a containerized end-to-end RAG evaluation application integrated with the RAG chain-server APIs.

Added

Ability to switch between API Catalog models and on-prem models using NIM-LLM.
New API endpoint
- /health - Provides a health check for the chain server.
Containerized evaluation application for RAG pipeline accuracy measurement.
Observability support for LangChain-based examples.
New Notebooks
- Added Chat with NVIDIA financial data notebook.
- Added notebook showcasing langgraph agent handling.
A simple RAG example template showcasing how to build an example from scratch.
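
The new /health endpoint listed above can be polled before sending queries to the chain server. The following is a minimal sketch using only the Python standard library. The function name `chain_server_is_healthy` is hypothetical, the base URL is whatever address your chain server is deployed at, and the sketch assumes only that a healthy server answers GET /health with a 2xx status (the response body is not specified in these notes, so it is ignored).

```python
import urllib.error
import urllib.request


def chain_server_is_healthy(base_url: str, timeout: float = 5.0) -> bool:
    """Return True if GET {base_url}/health answers with an HTTP 2xx status.

    Hypothetical helper: only the /health path comes from the release notes.
    """
    url = base_url.rstrip("/") + "/health"
    try:
        with urllib.request.urlopen(url, timeout=timeout) as resp:
            return 200 <= resp.status < 300
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or an HTTP error status
        # (urllib raises HTTPError, a URLError subclass, for 4xx/5xx).
        return False
```

A caller might run `chain_server_is_healthy("http://localhost:8081")` in a retry loop during startup; the host and port here are assumptions and should match your deployment.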

Changed

Renamed example csv_rag to structured_data_rag
Model Engine name update
- nv-ai-foundation and nv-api-catalog LLM engines are renamed to nvidia-ai-endpoints
- nv-ai-foundation embedding engine is renamed to nvidia-ai-endpoints
Embedding model update
- developer_rag example now uses the UAE-Large-V1 embedding model.
- API Catalog examples now use ai-embed-qa-4 instead of nvolveqa_40k as the embedding model.
Ingested data now persists across multiple sessions.
Updated langchain-nvidia-endpoints to version 0.0.11, enabling support for models like llama3.
Added file-extension-based validation that throws an error for unsupported files.
The default output token length in the UI has been increased from 250 to 1024 for more comprehensive responses.
Stricter chain-server API validation support to enhance API security
Updated versions of llama-index and pymilvus.
Updated pgvector container to pgvector/pgvector:pg16
LLM Model Updates
- Multiturn Chatbot now uses ai-mixtral-8x7b-instruct model for response generation.
- Structured data rag now uses ai-llama3-70b for response and code generation.
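
Existing configurations that still use the old engine names need to be updated after the rename described above. The following is a small sketch of that mapping. The helper `migrate_engine_name` and the two dictionaries are hypothetical and are not part of the repository; only the engine names themselves come from these notes. How and where the engine name is stored in your deployment is not covered here.

```python
# Old engine names from the release notes, mapped to their replacement.
LLM_ENGINE_RENAMES = {
    "nv-ai-foundation": "nvidia-ai-endpoints",
    "nv-api-catalog": "nvidia-ai-endpoints",
}
EMBEDDING_ENGINE_RENAMES = {
    "nv-ai-foundation": "nvidia-ai-endpoints",
}


def migrate_engine_name(name: str, kind: str = "llm") -> str:
    """Return the v0.6.0 engine name for an old one (hypothetical helper).

    Names that were not renamed are returned unchanged.
    """
    if kind == "llm":
        renames = LLM_ENGINE_RENAMES
    elif kind == "embedding":
        renames = EMBEDDING_ENGINE_RENAMES
    else:
        raise ValueError(f"unknown engine kind: {kind!r}")
    return renames.get(name, name)
```

Keeping the LLM and embedding tables separate mirrors the notes, which list nv-api-catalog only as an LLM engine name.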
