
bug: Duplicate event log to Langfuse trace #1872

Open
michalwelna0 opened this issue Apr 26, 2024 · 4 comments
@michalwelna0
Describe the bug

I wanted to use LlamaParse to parse a set of documents (PDF/DOC/DOCX) and index them so that I can ask custom questions against those documents. While running a series of tests in which I simply parse the documents with LlamaParse and index them with LlamaIndex, I encountered a duplicated token count: the LLM generation event is logged twice.

To reproduce

from llama_parse import LlamaParse
from langfuse.decorators import observe, langfuse_context
from llama_index.core import Settings, VectorStoreIndex
from llama_index.core.callbacks import CallbackManager
from llama_index.core.node_parser import MarkdownElementNodeParser
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core.postprocessor import SimilarityPostprocessor

import os
from langfuse import Langfuse
from pathlib import Path

os.environ["LANGFUSE_PUBLIC_KEY"] = ""
os.environ["LANGFUSE_SECRET_KEY"] = ""
os.environ["LANGFUSE_HOST"] = ""
os.environ["OPENAI_API_KEY"] = ""
os.environ["LLAMA_CLOUD_API_KEY"] = ""

langfuse = Langfuse()

folder_path = Path("path to local documents")
num_workers = len(list(folder_path.iterdir()))
parser = LlamaParse(
    result_type="markdown",
    verbose=True,
    language="en",
    num_workers=num_workers,  # should be number of documents, limit 10
)


@observe()
def index_docs(docs):
    Settings.llm = OpenAI(model="gpt-3.5-turbo")
    Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
    node_parser = MarkdownElementNodeParser(
        llm=OpenAI(model="gpt-3.5-turbo"),
        num_workers=num_workers,  # should be number of documents, limit 10
    )
    langfuse_context.update_current_observation(name="Indexing documents")
    langfuse_callback_handler = langfuse_context.get_current_llama_index_handler()
    Settings.callback_manager = CallbackManager([langfuse_callback_handler])
 
    nodes = node_parser.get_nodes_from_documents(documents=docs)
    base_nodes, objects = node_parser.get_nodes_and_objects(nodes=nodes)
 
    index = VectorStoreIndex(nodes=base_nodes + objects)
 
    engine = index.as_query_engine(
        similarity_top_k=15,
        node_postprocessors=[
            SimilarityPostprocessor(similarity_cutoff=0.4)
        ],
        verbose=True,
    )
    return engine

def run():

    documents = parser.load_data([str(path) for path in folder_path.iterdir()])
    engine = index_docs(docs=documents)

run()

Additional information

(Screenshot: Generations view.)

Focus on the red rectangles, which mark the duplicated events openai_llm and OpenAI-generation. This is problematic for token-usage estimation in Langfuse: OpenAI reports the token usage only once, but Langfuse logs an additional event with the same query and token usage.

(Screenshot: trace view.)
Here is a screenshot of the trace in the Langfuse UI. As you can see, the openai_llm and OpenAI-generation events are duplicated.

@marcklingen
Member

Thanks for reporting this. Do you also use the OpenAI SDK wrapper within your application, i.e. from langfuse.openai import openai?

@marcklingen
Member

Note: this might be addressed when switching to instrumentation: #1931

> Thanks for reporting this. Do you also use the openai sdk wrapper within your application? i.e. from langfuse.openai import openai

This'd be super helpful to know here

@jtha

jtha commented May 6, 2024

Hi @marcklingen, reporting in that I see the same duplication of traces when using the observe decorator together with the OpenAI SDK wrapper. Is this expected behaviour?

Anyhow, it is simple enough to use OpenAI's own SDK and set up a function to log the token usage and cost.
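A minimal sketch of such a helper (hypothetical; the function name and the per-1K-token prices are assumptions for illustration, not current OpenAI pricing):

```python
# Hypothetical helper: estimate the USD cost of one completion call from
# the prompt/completion token counts in an OpenAI response's `usage` block.
# The prices below are assumed placeholder values; check OpenAI's pricing page.
PRICES_PER_1K = {
    "gpt-3.5-turbo": {"prompt": 0.0005, "completion": 0.0015},
}

def token_cost(model: str, prompt_tokens: int, completion_tokens: int) -> float:
    """Return the estimated USD cost of a single completion call."""
    p = PRICES_PER_1K[model]
    return (prompt_tokens / 1000) * p["prompt"] + (completion_tokens / 1000) * p["completion"]
```

Logging this value alongside the response avoids relying on a second tracing integration for usage numbers.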

@michalwelna0
Author

Hi @marcklingen, thank you for your reply. To answer your question: yes, I do use the SDK wrapper
from langfuse.openai import openai
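If the duplication comes from running both integrations at once (the LlamaIndex callback handler and the OpenAI SDK wrapper each reporting the same generation), one hedged workaround is to keep only a single integration. A sketch, assuming the rest of the repro script above stays unchanged:

```python
# Use the plain OpenAI SDK instead of the Langfuse wrapper, so that only the
# LlamaIndex callback handler reports generations to Langfuse.
import openai  # instead of: from langfuse.openai import openai
```

This is an assumption about the root cause, not a confirmed fix; the maintainers' question above points in this direction.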
