v0.6.0

@sumitkbh released this 10 May 17:19
e711143

This release adds the ability to switch between API Catalog models and on-prem models using NIM-LLM, and adds documentation on how to build a RAG application from scratch. It also releases a containerized end-to-end RAG evaluation application integrated with the RAG chain-server APIs.

Added

Changed

  • Renamed example csv_rag to structured_data_rag
  • Model Engine name update
    • The nv-ai-foundation and nv-api-catalog LLM engines are renamed to nvidia-ai-endpoints
    • The nv-ai-foundation embedding engine is renamed to nvidia-ai-endpoints
  • Embedding model update
    • The developer_rag example uses the UAE-Large-V1 embedding model.
    • API Catalog examples use ai-embed-qa-4 instead of nvolveqa_40k as the embedding model.
  • Ingested data now persists across multiple sessions.
  • Updated langchain-nvidia-endpoints to version 0.0.11, enabling support for models like llama3.
  • Added file-extension-based validation that throws an error for unsupported files.
  • The default output token length in the UI has been increased from 250 to 1024 for more comprehensive responses.
  • Stricter chain-server API validation to enhance API security.
  • Updated versions of llama-index and pymilvus.
  • Updated pgvector container to pgvector/pgvector:pg16
  • LLM Model Updates