Describe the bug
I wanted to use LlamaParse to parse a set of documents (PDF/Doc/Docx) and index them so that I can ask custom questions about those documents. I ran a number of tests where I simply parse the documents with LlamaParse and perform the indexing step with LlamaIndex, and I encountered duplicated token counts (a duplicated event log for each LLM generation).
To reproduce
```python
from pathlib import Path
import os

from llama_parse import LlamaParse
from langfuse import Langfuse
from langfuse.decorators import observe, langfuse_context
from llama_index.core import Settings, VectorStoreIndex
from llama_index.core.callbacks import CallbackManager
from llama_index.core.node_parser import MarkdownElementNodeParser
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core.postprocessor import SimilarityPostprocessor

os.environ["LANGFUSE_PUBLIC_KEY"] = ""
os.environ["LANGFUSE_SECRET_KEY"] = ""
os.environ["LANGFUSE_HOST"] = ""
os.environ["OPENAI_API_KEY"] = ""
os.environ["LLAMA_CLOUD_API_KEY"] = ""

langfuse = Langfuse()

folder_path = Path("path to local documents")
num_workers = len([path for path in folder_path.iterdir()])

parser = LlamaParse(
    result_type="markdown",
    verbose=True,
    language="en",
    num_workers=num_workers,  # should be the number of documents, limit 10
)


@observe()
def index_docs(docs):
    Settings.llm = OpenAI(model="gpt-3.5-turbo")
    Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
    node_parser = MarkdownElementNodeParser(
        llm=OpenAI(model="gpt-3.5-turbo"),
        num_workers=num_workers,  # should be the number of documents, limit 10
    )
    langfuse_context.update_current_observation(name="Indexing documents")
    langfuse_callback_handler = langfuse_context.get_current_llama_index_handler()
    Settings.callback_manager = CallbackManager([langfuse_callback_handler])
    nodes = node_parser.get_nodes_from_documents(documents=docs)
    base_nodes, objects = node_parser.get_nodes_and_objects(nodes=nodes)
    index = VectorStoreIndex(nodes=base_nodes + objects)
    engine = index.as_query_engine(
        similarity_top_k=15,
        node_postprocessors=[SimilarityPostprocessor(similarity_cutoff=0.4)],
        verbose=True,
    )
    return engine


def run():
    documents = parser.load_data([str(path) for path in folder_path.iterdir()])
    engine = index_docs(docs=documents)


run()
```
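To make the failure mode concrete outside of any framework: if two instrumentation layers (for example, a decorator-based observer and a callback handler) each record the same LLM call, the trace ends up reporting twice the real token usage. A self-contained sketch (no Langfuse or LlamaIndex involved; all names here are illustrative, not real APIs):

```python
# Illustrative only: simulates two tracing layers observing ONE LLM call.
events = []


def record(name: str, tokens: int) -> None:
    """Append one trace event (a stand-in for a logged generation)."""
    events.append({"name": name, "tokens": tokens})


def traced_llm_call(tokens_used: int) -> None:
    # Layer 1: a decorator-style observer logs the generation...
    record("openai_llm", tokens_used)
    # Layer 2: ...and a callback handler logs the same generation again.
    record("OpenAI-generation", tokens_used)


traced_llm_call(100)
total = sum(e["tokens"] for e in events)
print(total)  # 200 reported, even though the provider only billed 100 tokens
```

This is the same shape as the trace in the screenshot below: one real generation, two logged events, doubled token counts.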
Additional information
Here is a screenshot of a trace from the Langfuse UI. As you can see, the openai_llm and OpenAI-generation events are duplicated (marked with red rectangles). This is problematic for token usage estimation with Langfuse: on the OpenAI side the token usage is only logged once, yet Langfuse logs an additional event with the same query and token usage.
Hi @marcklingen, reporting in that I do see the same duplication of traces when using the observe decorator with the openai sdk wrapper. Is this expected behaviour?
Anyhow, it's simple enough to use OpenAI's own SDK and set up a function to log the token usage and cost.
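A minimal sketch of the kind of helper mentioned above: computing the cost of a single generation from the token counts in an OpenAI-style usage payload, so each generation is counted exactly once. The function name and the pricing table are my own inventions for illustration (check OpenAI's pricing page for real, current numbers):

```python
# Hypothetical helper, not part of any SDK: derive USD cost from the
# `prompt_tokens` / `completion_tokens` counts of one generation.

# Illustrative per-1K-token prices; NOT authoritative.
PRICES = {"gpt-3.5-turbo": {"prompt": 0.0005, "completion": 0.0015}}


def usage_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the cost of one generation in USD for a known model."""
    p = PRICES[model]
    return (prompt_tokens / 1000) * p["prompt"] + (
        completion_tokens / 1000
    ) * p["completion"]


# Example: a generation with 1,000 prompt and 1,000 completion tokens.
print(usage_cost("gpt-3.5-turbo", 1000, 1000))
```

Logging this value yourself (e.g. attaching it to a single trace event) sidesteps the double-counting entirely, since you control exactly when the usage is recorded.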