bug: An error occurred in _handle_span_events: not enough values to unpack (expected 2, got 1) - error in indexing step #1795

michalwelna0 opened this issue Apr 22, 2024 · 2 comments

@michalwelna0

Describe the bug

I wanted to integrate Langfuse into my simple RAG setup built with LlamaIndex and LlamaParse. Let me describe the flow:

  1. I use LlamaParse to parse a few PDF/Doc/Docx files and return a list of Document objects.
  2. Then I pass those documents to MarkdownElementNodeParser to compute TextNodes and IndexNodes.
  3. Next, I build an index and a query engine using LlamaIndex's VectorStoreIndex and its as_query_engine() method.
  4. Then, in a loop, I ask the engine a series of questions.

Each step lives in a dedicated method, and each of those methods is decorated with Langfuse's observe() decorator.

However, once I run the main method I get a bunch of errors:

0%| | 0/2 [00:00<?, ?it/s]An error occurred in _handle_span_events: not enough values to unpack (expected 2, got 1)
Traceback (most recent call last):
File "/.../venv/lib/python3.10/site-packages/pydantic/v1/main.py", line 539, in parse_raw
obj = load_str_bytes(
File "/.../venv/lib/python3.10/site-packages/pydantic/v1/parse.py", line 37, in load_str_bytes
return json_loads(b)
File "/../.pyenv/versions/3.10.12/lib/python3.10/json/init.py", line 346, in loads
return _default_decoder.decode(s)
File "/../.pyenv/versions/3.10.12/lib/python3.10/json/decoder.py", line 337, in decode
obj, end = self.raw_decode(s, idx=_w(s, 0).end())
File "/.../.pyenv/versions/3.10.12/lib/python3.10/json/decoder.py", line 355, in raw_decode
raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
File "/../venv/lib/python3.10/site-packages/llama_index/core/callbacks/base.py", line 189, in as_trace
yield
File "/.../venv/lib/python3.10/site-packages/llama_index/core/base/base_query_engine.py", line 65, in aquery
query_result = await self._aquery(str_or_query_bundle)
File "/.../venv/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 233, in async_wrapper
result = await func(*args, **kwargs)
File "/../venv/lib/python3.10/site-packages/llama_index/core/query_engine/retriever_query_engine.py", line 206, in _aquery
response = await self._response_synthesizer.asynthesize(
File "/.../tescar/venv/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 233, in async_wrapper
result = await func(*args, **kwargs)
File "/...venv/lib/python3.10/site-packages/llama_index/core/response_synthesizers/base.py", line 308, in asynthesize
response_str = await self.aget_response(
File "/.../venv/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 233, in async_wrapper
result = await func(*args, **kwargs)
File "/.../tescar/venv/lib/python3.10/site-packages/llama_index/core/response_synthesizers/compact_and_refine.py", line 23, in aget_response
return await super().aget_response(
File "/.../venv/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 233, in async_wrapper
result = await func(*args, **kwargs)
File "/..../lib/python3.10/site-packages/llama_index/core/response_synthesizers/refine.py", line 375, in aget_response
response = self._output_cls.parse_raw(response)
File "/..../lib/python3.10/site-packages/pydantic/v1/main.py", line 548, in parse_raw
raise ValidationError([ErrorWrapper(e, loc=ROOT_KEY)], cls)
pydantic.v1.error_wrappers.ValidationError: 1 validation error for TableOutput
__root__
Expecting value: line 1 column 1 (char 0) (type=value_error.jsondecode; msg=Expecting value; doc=Empty Response; pos=0; lineno=1; colno=1)
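
For context, the final ValidationError can be reproduced on its own: pydantic v1's parse_raw raises it whenever the string it receives is not valid JSON, and the traceback above shows it receiving the literal text "Empty Response" (doc=Empty Response). A minimal sketch of that failure mode, using a simplified stand-in for the real TableOutput model from llama_index:

from pydantic.v1 import BaseModel

class TableOutput(BaseModel):
    # Stand-in model for illustration only; the real TableOutput is defined inside llama_index
    summary: str

try:
    # "Empty Response" is the non-JSON string shown in the traceback above
    TableOutput.parse_raw("Empty Response")
except Exception as exc:
    print(exc)  # 1 validation error for TableOutput ... Expecting value: line 1 column 1 (char 0)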

It is important to note that these errors do not stop the execution flow, and some information is still logged in the trace on the Langfuse platform, but I guess this is not the expected behaviour. Below I provide the code snippet that I'm running so you can reproduce it in your environment.

To reproduce

from pathlib import Path

from llama_parse import LlamaParse
from langfuse.decorators import observe, langfuse_context
from llama_index.core import Settings, VectorStoreIndex
from llama_index.core.callbacks import CallbackManager
from llama_index.core.node_parser import MarkdownElementNodeParser
from llama_index.llms.openai import OpenAI
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.core.postprocessor import SimilarityPostprocessor

from config import QUESTIONS

tender_path = Path("<path to folder with PDF/Word files>")

num_workers = len(list(tender_path.iterdir()))
parser = LlamaParse(
    result_type="markdown",
    verbose=True,
    language="en",
    num_workers=num_workers,  # should be the number of documents, limit 10
)

@observe()
def index_docs(docs):
    Settings.llm = OpenAI(model="gpt-3.5-turbo")
    Settings.embed_model = OpenAIEmbedding(model="text-embedding-ada-002")
    node_parser = MarkdownElementNodeParser(
        llm=OpenAI(model="gpt-3.5-turbo"),
        num_workers=num_workers,  # should be the number of documents, limit 10
    )
    langfuse_context.update_current_observation(name="Indexing documents")
    langfuse_callback_handler = langfuse_context.get_current_llama_index_handler()
    Settings.callback_manager = CallbackManager([langfuse_callback_handler])

    nodes = node_parser.get_nodes_from_documents(documents=docs)
    base_nodes, objects = node_parser.get_nodes_and_objects(nodes=nodes)

    index = VectorStoreIndex(nodes=base_nodes + objects)

    engine = index.as_query_engine(
        similarity_top_k=15,
        node_postprocessors=[
            SimilarityPostprocessor(similarity_cutoff=0.4)
        ],
        verbose=True,
    )
    return engine


@observe()
def ask_question(question: str, engine):
    
    langfuse_context.update_current_observation(name="Ask question")
    langfuse_callback_handler = langfuse_context.get_current_llama_index_handler()
    Settings.callback_manager = CallbackManager([langfuse_callback_handler])

    return engine.query(question)

@observe()
def generate_answers(engine) -> dict[int, str]:
    langfuse_callback_handler = langfuse_context.get_current_llama_index_handler()
    Settings.callback_manager = CallbackManager([langfuse_callback_handler])
    langfuse_context.update_current_observation(name="Generate answers")
    dict_res = {}
    for i, question in enumerate(QUESTIONS):
        response = ask_question(question, engine)
        dict_res[i] = response.response

    return dict_res

@observe()
def run():
    documents = parser.load_data([str(path) for path in tender_path.iterdir()])
    engine = index_docs(docs=documents)
    return generate_answers(engine)


run()
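
As a side note for anyone running this as a short-lived script: the Langfuse SDK sends events in the background, so flushing before the process exits helps ensure nothing is dropped. A minimal sketch, assuming the decorator SDK exposes langfuse_context.flush():

# After run() returns, block until all pending Langfuse events have been sent
langfuse_context.flush()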

Additional information

[Screenshot of the resulting Langfuse trace]

Surprisingly, during the "Indexing documents" observation some information is still being logged, but I can't tell why, and I can't tell whether any other information is missing.

Member

Thanks for reporting, and sorry for the slow response here due to Launch Week. Can you check whether disabling output capture for index_docs stops this error from being logged? That would help pinpoint the issue.

@observe(capture_output=False)
def index_docs(docs):
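
For reference, capture_output=False only stops the decorator from serializing the function's return value into the observation; an output can still be attached manually. A minimal sketch, assuming update_current_observation accepts an output argument as in the current decorator SDK:

@observe(capture_output=False)
def index_docs(docs):
    engine = ...  # build the index and query engine as in the snippet above
    # Optionally attach a serializable summary, since the engine itself is not captured
    langfuse_context.update_current_observation(output="query engine created")
    return engine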

@michalwelna0
Author

No worries. I did as you suggested and added the capture_output=False argument to the index_docs() decorator, but I still got the same bunch of errors. For testing I also added that argument to all observed methods, and the result was the same.

An error occurred in _handle_span_events: not enough values to unpack (expected 2, got 1)
Traceback (most recent call last):
  File "/.../venv/lib/python3.10/site-packages/pydantic/v1/main.py", line 539, in parse_raw
    obj = load_str_bytes(
  File "/.../venv/lib/python3.10/site-packages/pydantic/v1/parse.py", line 37, in load_str_bytes
    return json_loads(b)
  File "/.../.pyenv/versions/3.10.12/lib/python3.10/json/__init__.py", line 346, in loads
    return _default_decoder.decode(s)
  File "/../.pyenv/versions/3.10.12/lib/python3.10/json/decoder.py", line 337, in decode
    obj, end = self.raw_decode(s, idx=_w(s, 0).end())
  File "/.../.pyenv/versions/3.10.12/lib/python3.10/json/decoder.py", line 355, in raw_decode
    raise JSONDecodeError("Expecting value", s, err.value) from None
json.decoder.JSONDecodeError: Expecting value: line 1 column 1 (char 0)

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/.../venv/lib/python3.10/site-packages/llama_index/core/callbacks/base.py", line 189, in as_trace
    yield
  File "/.../venv/lib/python3.10/site-packages/llama_index/core/base/base_query_engine.py", line 65, in aquery
    query_result = await self._aquery(str_or_query_bundle)
  File "/.../venv/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 233, in async_wrapper
    result = await func(*args, **kwargs)
  File "/.../venv/lib/python3.10/site-packages/llama_index/core/query_engine/retriever_query_engine.py", line 206, in _aquery
    response = await self._response_synthesizer.asynthesize(
  File "/.../venv/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 233, in async_wrapper
    result = await func(*args, **kwargs)
  File "/.../venv/lib/python3.10/site-packages/llama_index/core/response_synthesizers/base.py", line 308, in asynthesize
    response_str = await self.aget_response(
  File "/.../venv/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 233, in async_wrapper
    result = await func(*args, **kwargs)
  File "/.../venv/lib/python3.10/site-packages/llama_index/core/response_synthesizers/compact_and_refine.py", line 23, in aget_response
    return await super().aget_response(
  File "/.../venv/lib/python3.10/site-packages/llama_index/core/instrumentation/dispatcher.py", line 233, in async_wrapper
    result = await func(*args, **kwargs)
  File "/.../venv/lib/python3.10/site-packages/llama_index/core/response_synthesizers/refine.py", line 375, in aget_response
    response = self._output_cls.parse_raw(response)
  File "/.../venv/lib/python3.10/site-packages/pydantic/v1/main.py", line 548, in parse_raw
    raise ValidationError([ErrorWrapper(e, loc=ROOT_KEY)], cls)
pydantic.v1.error_wrappers.ValidationError: 1 validation error for TableOutput
__root__
  Expecting value: line 1 column 1 (char 0) (type=value_error.jsondecode; msg=Expecting value; doc=Empty Response; pos=0; lineno=1; colno=1)
