
FEATURE REQUEST: Local LLM endpoint integration (like Ollama and OpenAI API compatible) and Docker Container #2

Open
mjtechguy opened this issue Apr 11, 2024 · 21 comments

@mjtechguy

This project seems awesome. Thanks for building this.

Would it be possible to:

  1. Expose a variable for the LLM endpoint address, so systems like Ollama could be used as OpenAI-API-compatible endpoints?
  2. Offer the tool / UI as a Docker container or Docker Compose stack?

Thanks!

@shaoyijia
Collaborator

For 1, you can change the LLM configuration according to https://github.com/stanford-oval/storm?tab=readme-ov-file#customize-the-storm-configurations.

For 2, having a Docker release sounds like a good idea. We will consider this.

@shuther

shuther commented Apr 17, 2024

For point 1, it doesn't seem straightforward: STORM calls the /v1/completions endpoint rather than /v1/chat/completions, so pointing directly at Ollama is out of the question. Maybe litellm could help.
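
For reference, a rough, untested sketch of the litellm route (the model name and address are illustrative):

import litellm

# Bridge a text-completion style request (what STORM issues) to a local Ollama
# server via litellm; assumes `ollama pull mistral` has already been run.
response = litellm.text_completion(
    model="ollama/mistral",
    prompt="Write one sentence about Wikipedia.",
    api_base="http://localhost:11434",
)
print(response.choices[0].text)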

@shaoyijia
Collaborator

Hi @shuther , thanks for the info! Would you be willing to help with the Ollama integration to see whether it's feasible? We develop STORM using dspy, and I just found they support Ollama in their codebase. New LM integrations can be added in src/lm.py.

@shuther

shuther commented Apr 23, 2024

Yes, I can help. Today I am using litellm (through Docker) in front of Ollama, or the Ollama OpenAI endpoint directly.
What is important is being able to change the base_url and, ideally, the model name (but with litellm we can work around it).

@shaoyijia
Collaborator

Hi @shuther , thank you so much!

I feel we can leverage the effort in https://github.com/stanfordnlp/dspy/blob/main/dsp/modules/ollama.py, though some testing is needed and we may need to create a wrapper instead of using it directly (you can check out examples in src/lm.py).
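
As a rough, untested sketch (the class name OllamaClient is just illustrative, and I'm assuming dspy exposes its Ollama module as dspy.OllamaLocal):

import dspy

class OllamaClient(dspy.OllamaLocal):
    """Hypothetical wrapper around dspy's OllamaLocal, following the pattern of the wrappers in src/lm.py."""

    def __init__(self, model, url="http://localhost", port=11434, **kwargs):
        # dspy's OllamaLocal takes the server address via base_url.
        super().__init__(model=model, base_url=f"{url}:{port}", **kwargs)
        # Token usage bookkeeping, mirroring the other clients in src/lm.py.
        self.prompt_tokens = 0
        self.completion_tokens = 0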

When testing Mistral 7B, I found that although it's worse than GPT (as expected), it can do a pretty good job in the knowledge curation part (i.e., finding and organizing information). That's why I think the Ollama integration is interesting.

@shuther

shuther commented Apr 24, 2024

Ollama provides an OpenAI-compatible endpoint. It could make sense to focus on this one, or do you have a concern? I was running some tests with Mistral and WizardLM 2; Llama 3 may be worth a try.
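
For example, this is roughly how I reach it today (untested sketch; the model name and address are from my setup):

from openai import OpenAI

# Ollama's OpenAI-compatible endpoint lives under /v1; the api_key is required
# by the client but ignored by Ollama.
client = OpenAI(base_url="http://localhost:11434/v1", api_key="ollama")
response = client.chat.completions.create(
    model="mistral",
    messages=[{"role": "user", "content": "Say hello."}],
)
print(response.choices[0].message.content)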

@shaoyijia
Collaborator

It could make sense to focus on this one, or do you have a concern?

No concern, this is good. I think https://github.com/stanfordnlp/dspy/blob/main/dsp/modules/ollama.py is also implemented based on it. You may check whether their implementation works directly or not. If not, we hope the new class inherits dspy.LM. That way, we can pass the model object into STORMWikiLMConfigs without changing anything else. For example, people could use Ollama to run STORM like this:

lm_configs = STORMWikiLMConfigs()

conv_simulator_lm = OllamaModel(...)
question_asker_lm = OllamaModel(...)
outline_gen_lm = OllamaModel(...)
article_gen_lm = OllamaModel(...)
article_polish_lm = OllamaModel(...)

# LM setup.
lm_configs.set_conv_simulator_lm(conv_simulator_lm)
lm_configs.set_question_asker_lm(question_asker_lm)
lm_configs.set_outline_gen_lm(outline_gen_lm)
lm_configs.set_article_gen_lm(article_gen_lm)
lm_configs.set_article_polish_lm(article_polish_lm)

# STORM pipeline setup (independent of the LM setup).
engine_args = STORMWikiRunnerArguments(...)

# Users / other functions only need to call the runner; they do not need to think about Ollama after setup.
runner = STORMWikiRunner(engine_args, lm_configs)

Llama 3 may be worth a try.

Totally. I only provided an example with Mistral because I haven't had time to configure Llama 3 yet, but it's really worth a try.

@shuther

shuther commented Apr 25, 2024

To run python examples/run_storm_wiki_mistral.py --url http://192.168.0.120 --port 4000 --do-generate-outline --remove-duplicate --do-research, I had to add anthropic and streamlit to requirements.txt (tracebacks below).
Also, ideally the model name would be an argument, as I had to hard-code it to mistral.

Traceback (most recent call last):
  File "/home/shuther/devProjects/storm/examples/run_storm_wiki_mistral.py", line 25, in <module>
    from lm import VLLMClient
  File "/home/shuther/devProjects/storm/./src/lm.py", line 7, in <module>
    import anthropic
ModuleNotFoundError: No module named 'anthropic'

Traceback (most recent call last):
  File "/home/shuther/devProjects/storm/examples/run_storm_wiki_mistral.py", line 26, in <module>
    from storm_wiki.engine import STORMWikiRunnerArguments, STORMWikiRunner, STORMWikiLMConfigs
  File "/home/shuther/devProjects/storm/./src/storm_wiki/__init__.py", line 1, in <module>
    from .engine import *
  File "/home/shuther/devProjects/storm/./src/storm_wiki/engine.py", line 10, in <module>
    from storm_wiki.modules.article_generation import StormArticleGenerationModule
  File "/home/shuther/devProjects/storm/./src/storm_wiki/modules/__init__.py", line 1, in <module>
    from .knowledge_curation import *
  File "/home/shuther/devProjects/storm/./src/storm_wiki/modules/knowledge_curation.py", line 12, in <module>
    from streamlit.runtime.scriptrunner import add_script_run_ctx
ModuleNotFoundError: No module named 'streamlit'

Would it be possible to have a starting example that runs run_prewriting.py with Ollama/Mistral/a generic OpenAI endpoint, so we can test it further?

@shuther

shuther commented Apr 25, 2024

Following my experience above, running python examples/run_storm_wiki_mistral.py --url http://192.168.0.120 --port 4000 --do-generate-outline --remove-duplicate --do-research
I am getting the error message below, and I am not sure how to debug it further:
root : ERROR : Error occurs when searching query : 'hits'

Then, I received this:

Failed to parse JSON response: {"error":{"message":"","type":null,"param":null,"code":500}}
Traceback (most recent call last):
  File "/home/shuther/devProjects/storm/./src/lm.py", line 262, in _generate
    completions = json_response["choices"]
                  ~~~~~~~~~~~~~^^^^^^^^^^^
KeyError: 'choices'

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/shuther/devProjects/storm/examples/run_storm_wiki_mistral.py", line 165, in <module>
    main(parser.parse_args())
  File "/home/shuther/devProjects/storm/examples/run_storm_wiki_mistral.py", line 118, in main
    runner.run(
  File "/home/shuther/devProjects/storm/./src/storm_wiki/engine.py", line 283, in run
    information_table = self.run_knowledge_curation_module(ground_truth_url=ground_truth_url,
                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/devProjects/storm/./src/interface.py", line 376, in wrapper
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/devProjects/storm/./src/storm_wiki/engine.py", line 178, in run_knowledge_curation_module
    information_table, conversation_log = self.storm_knowledge_curation_module.research(
                                          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/devProjects/storm/./src/storm_wiki/modules/knowledge_curation.py", line 302, in research
    conversations = self._run_conversation(conv_simulator=self.conv_simulator,
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/devProjects/storm/./src/storm_wiki/modules/knowledge_curation.py", line 269, in _run_conversation
    conv = future.result()
           ^^^^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/concurrent/futures/_base.py", line 449, in result
    return self.__get_result()
           ^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/concurrent/futures/_base.py", line 401, in __get_result
    raise self._exception
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/concurrent/futures/thread.py", line 58, in run
    result = self.fn(*self.args, **self.kwargs)
             ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/devProjects/storm/./src/storm_wiki/modules/knowledge_curation.py", line 252, in run_conv
    return conv_simulator(
           ^^^^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/dspy/primitives/program.py", line 29, in __call__
    return self.forward(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/devProjects/storm/./src/storm_wiki/modules/knowledge_curation.py", line 42, in forward
    user_utterance = self.wiki_writer(topic=topic, persona=persona, dialogue_turns=dlg_history).question
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/dspy/primitives/program.py", line 29, in __call__
    return self.forward(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/devProjects/storm/./src/storm_wiki/modules/knowledge_curation.py", line 85, in forward
    question = self.ask_question_with_persona(topic=topic, persona=persona, conv=conv).question
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/dspy/predict/predict.py", line 60, in __call__
    return self.forward(**kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/dspy/predict/chain_of_thought.py", line 66, in forward
    return super().forward(signature=signature, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/dspy/predict/predict.py", line 87, in forward
    x, C = dsp.generate(signature, **config)(x, stage=self.stage)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/dsp/primitives/predict.py", line 78, in do_generate
    completions: list[dict[str, Any]] = generator(prompt, **kwargs)
                                        ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/dsp/modules/hf.py", line 137, in __call__
    response = self.request(prompt, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/dsp/modules/lm.py", line 26, in request
    return self.basic_request(prompt, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/dsp/modules/hf.py", line 91, in basic_request
    response = self._generate(prompt, **kwargs)
               ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/devProjects/storm/./src/lm.py", line 271, in _generate
    raise Exception("Received invalid JSON response from server")
Exception: Received invalid JSON response from server

@Yucheng-Jiang
Collaborator

@shuther The error root : ERROR : Error occurs when searching query : 'hits' you see is most likely due to exceeding the rate limit of the You.com search engine API (check whether you have an error log that looks like #28).

@Yucheng-Jiang
Collaborator

@shuther Also, note that the argument --remove-duplicate should go with --do-polish-article, as it controls whether duplicate content is removed after the final article generation. It does nothing when you don't specify --do-polish-article.
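
For example, something like this (the other flags are from your earlier command):

python examples/run_storm_wiki_mistral.py --url http://192.168.0.120 --port 4000 --do-research --do-generate-outline --do-generate-article --do-polish-article --remove-duplicate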

@shuther

shuther commented Apr 26, 2024

@shuther The error root : ERROR : Error occurs when searching query : 'hits' you see is most likely due to exceeding the rate limit of the You.com search engine API (check whether you have an error log that looks like #28).

I checked the You.com logs, and it doesn't seem to be the case. Is there a way to print more debug statements/logs?

@Yucheng-Jiang
Collaborator

@shuther Would you mind providing the full log of running the command? Also, please confirm that you are at the head of the main branch.

@shuther

shuther commented Apr 26, 2024

I did git pull, then I have this (http://192.168.0.120 is an OpenAI-compatible endpoint via litellm):
python examples/run_storm_wiki_mistral.py --url http://192.168.0.120 --port 4000 --max-thread-num 1 --do-generate-outline --do-research

Traceback (most recent call last):
  File "/home/shuther/devProjects/storm/examples/run_storm_wiki_mistral.py", line 25, in <module>
    from lm import VLLMClient
  File "/home/shuther/devProjects/storm/./src/lm.py", line 106, in <module>
    class ClaudeModel(dspy.dsp.modules.lm.LM):
  File "/home/shuther/devProjects/storm/./src/lm.py", line 191, in ClaudeModel
    (RateLimitError,),
    ^^^^^^^^^^^^^^
NameError: name 'RateLimitError' is not defined

@Yucheng-Jiang
Collaborator

It seems like storm/./src/lm.py, line 191, should be self.history.append(json_serializable_history) instead of (RateLimitError,),. Could you confirm you are at the head of the main branch?

@shuther

shuther commented Apr 30, 2024

I did a git pull again; now it runs, but I guess it is not complete:
python examples/run_storm_wiki_mistral.py --url http://192.168.0.120 --port 4000 --max-thread-num 1 --do-generate-outline --do-research

root : ERROR    : Error occurs when searching query : 'hits'
root : ERROR    : Error occurs when searching query : 'hits'
root : ERROR    : Simulated Wikipedia writer utterance is empty.
interface : INFO     : run_knowledge_curation_module executed in 99.2199 seconds
interface : INFO     : run_outline_generation_module executed in 3.6078 seconds
***** Execution time *****
run_knowledge_curation_module: 99.2199 seconds
run_outline_generation_module: 3.6078 seconds
***** Token usage of language models: *****
run_knowledge_curation_module
run_outline_generation_module
***** Number of queries of retrieval models: *****
run_knowledge_curation_module: {'YouRM': 3}
run_outline_generation_module: {'YouRM': 0}

Then I tried python examples/run_storm_wiki_mistral.py --url http://192.168.0.120 --port 4000 --max-thread-num 1 --do-generate-article --do-research and I got more questions:

  1. Is it possible to change the model from paraphrase-MiniLM-L6-v2 to ollama/all-minilm (using an OpenAI/litellm-compatible endpoint)?
  2. I received the error:
root : ERROR    : No outline for xxx. Will directly search with the topic.
Traceback (most recent call last):
  File "/home/shuther/devProjects/storm/examples/run_storm_wiki_mistral.py", line 166, in <module>
    main(parser.parse_args())
  File "/home/shuther/devProjects/storm/examples/run_storm_wiki_mistral.py", line 119, in main
    runner.run(
  File "/home/shuther/devProjects/storm/./src/storm_wiki/engine.py", line 319, in run
    draft_article = self.run_article_generation_module(outline=outline,
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/devProjects/storm/./src/interface.py", line 376, in wrapper
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/devProjects/storm/./src/storm_wiki/engine.py", line 210, in run_article_generation_module
    draft_article = self.storm_article_generation.generate_article(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/devProjects/storm/./src/storm_wiki/modules/article_generation.py", line 66, in generate_article
    section_output_dict = self.generate_section(
                          ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/devProjects/storm/./src/storm_wiki/modules/article_generation.py", line 33, in generate_section
    collected_info = information_table.retrieve_information(queries=section_query,
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/devProjects/storm/./src/storm_wiki/modules/storm_dataclass.py", line 170, in retrieve_information
    sim = cosine_similarity([encoded_query], self.encoded_snippets)[0]
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/sklearn/utils/_param_validation.py", line 214, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/sklearn/metrics/pairwise.py", line 1578, in cosine_similarity
    X, Y = check_pairwise_arrays(X, Y)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/sklearn/metrics/pairwise.py", line 173, in check_pairwise_arrays
    Y = check_array(
        ^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/sklearn/utils/validation.py", line 938, in check_array
    raise ValueError(
ValueError: Expected 2D array, got 1D array instead:
array=[].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

@shaoyijia
Collaborator

Hi @shuther , I haven't seen this error before. Looking into your error log, I notice a few things:

  • What's your input topic? I cannot infer it from "root : ERROR : No outline for xxx. Will directly search with the topic."
  • run_knowledge_curation_module: {'YouRM': 3}. The number of calls to the search engine is abnormally low, since you haven't changed our default configuration in your command (see the sketch below).
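
For reference, a rough sketch of how the engine arguments control the number of search calls (the parameter values here are illustrative, not necessarily the shipped defaults):

engine_args = STORMWikiRunnerArguments(
    output_dir="./results",
    max_conv_turn=3,    # questions asked per simulated conversation
    max_perspective=3,  # number of writer perspectives, each spawning its own conversation
    search_top_k=3,     # search results kept per query
    max_thread_num=1,   # matches your --max-thread-num 1
)

With several perspectives each asking several questions, we would expect far more than 3 calls to YouRM during knowledge curation.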

@shaoyijia
Collaborator

Is it possible to change the model from paraphrase-MiniLM-L6-v2 to ollama/all-minilm (using an OpenAI/litellm-compatible endpoint)?

This is possible, though you need to check its performance. This model is used to retrieve collected information when writing each section. Here is an example of how to use ollama/all-minilm:

  1. Create OllamaStormInformationTable, inheriting StormInformationTable but using Ollama for retrieving information from the information table.
import copy
from typing import List, Union

import numpy as np
import ollama  # Assume ollama is already installed on the PC and you have already run `ollama pull all-minilm`
from sklearn.metrics.pairwise import cosine_similarity

class OllamaStormInformationTable(StormInformationTable):
    """Manage StormInformation with an Ollama embedding model."""

    def prepare_table_for_retrieval(self):
        # No SentenceTransformer encoder here; Ollama produces the embeddings instead.
        self.collected_urls = []
        self.collected_snippets = []
        self.encoded_snippets = []
        for url, information in self.url_to_info.items():
            for snippet in information.snippets:
                self.collected_urls.append(url)
                self.collected_snippets.append(snippet)
                # This part is new (note the response key is 'embedding').
                self.encoded_snippets.append(
                    ollama.embeddings(model='all-minilm', prompt=snippet)['embedding'])
        self.encoded_snippets = np.array(self.encoded_snippets)

    def retrieve_information(self, queries: Union[List[str], str], search_top_k) -> List[StormInformation]:
        selected_urls = []
        selected_snippets = []
        if type(queries) is str:
            queries = [queries]
        for query in queries:
            # This line is new: embed the query with Ollama as well.
            encoded_query = np.array(ollama.embeddings(model='all-minilm', prompt=query)['embedding'])
            sim = cosine_similarity([encoded_query], self.encoded_snippets)[0]
            sorted_indices = np.argsort(sim)
            for i in sorted_indices[-search_top_k:][::-1]:
                selected_urls.append(self.collected_urls[i])
                selected_snippets.append(self.collected_snippets[i])

        url_to_snippets = {}
        for url, snippet in zip(selected_urls, selected_snippets):
            if url not in url_to_snippets:
                url_to_snippets[url] = set()
            url_to_snippets[url].add(snippet)

        selected_url_to_info = {}
        for url in url_to_snippets:
            selected_url_to_info[url] = copy.deepcopy(self.url_to_info[url])
            selected_url_to_info[url].snippets = list(url_to_snippets[url])

        return list(selected_url_to_info.values())
  2. Customize STORMWikiRunner to use OllamaStormInformationTable.
class MySTORMWikiRunner(STORMWikiRunner):
    """Customize STORMWikiRunner to use OllamaStormInformationTable."""

    def run(self,
            topic: str,
            ground_truth_url: str = '',
            do_research: bool = True,
            do_generate_outline: bool = True,
            do_generate_article: bool = True,
            do_polish_article: bool = True,
            remove_duplicate: bool = False,
            callback_handler: BaseCallbackHandler = BaseCallbackHandler()):
        self.topic = topic
        self.article_dir_name = topic.replace(' ', '_').replace('/', '_')
        self.article_output_dir = os.path.join(self.args.output_dir, self.article_dir_name)
        os.makedirs(self.article_output_dir, exist_ok=True)

        # research module
        information_table: OllamaStormInformationTable = None  # StormInformationTable -> OllamaStormInformationTable
        if do_research:
            information_table = self.run_knowledge_curation_module(ground_truth_url=ground_truth_url,
                                                                   callback_handler=callback_handler)
        else:
            information_table = OllamaStormInformationTable.from_conversation_log_file(
                os.path.join(self.article_output_dir, 'conversation_log.json'))  # StormInformationTable -> OllamaStormInformationTable

        # outline generation module
        outline: StormArticle = None
        if do_generate_outline:
            outline = self.run_outline_generation_module(information_table=information_table,
                                                         callback_handler=callback_handler)
        else:
            outline = StormArticle.from_outline_file(topic=topic,
                                                     file_path=os.path.join(self.article_output_dir,
                                                                            'storm_gen_outline.txt'))

        # article generation module
        draft_article: StormArticle = None
        if do_generate_article:
            draft_article = self.run_article_generation_module(outline=outline,
                                                               information_table=information_table,
                                                               callback_handler=callback_handler)

        # article polishing module
        if do_polish_article:
            polished_article = self.run_article_polishing_module(draft_article=draft_article,
                                                                 remove_duplicate=remove_duplicate)

If you want to use https://docs.litellm.ai/docs/providers/ollama instead of https://github.com/ollama/ollama-python, you just need to change the ollama.embeddings(...) calls to the corresponding litellm API calls.
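
For instance, a rough, untested sketch of the litellm equivalent (the address is illustrative and assumes a local Ollama server):

import litellm

snippet = "some collected snippet"
# litellm counterpart of ollama.embeddings(model='all-minilm', prompt=snippet).
response = litellm.embedding(
    model="ollama/all-minilm",
    input=[snippet],
    api_base="http://localhost:11434",
)
vector = response["data"][0]["embedding"]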

@shuther

shuther commented May 2, 2024

So after a git pull again and running
python examples/run_storm_wiki_mistral.py --url http://192.168.0.120 --port 4000 --max-thread-num 1 --do-generate-article --do-research --do-generate-outline
I have this:

Topic: Differences between capitalism and liberalism
root : ERROR    : Error occurs when searching query : 'hits'
root : ERROR    : Error occurs when searching query : 'hits'
root : ERROR    : Error occurs when searching query : 'hits'
interface : INFO     : run_knowledge_curation_module executed in 1.4256 seconds
interface : INFO     : run_outline_generation_module executed in 3.9669 seconds
sentence_transformers.SentenceTransformer : INFO     : Load pretrained SentenceTransformer: paraphrase-MiniLM-L6-v2
/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/transformers/utils/generic.py:311: UserWarning: torch.utils._pytree._register_pytree_node is deprecated. Please use torch.utils._pytree.register_pytree_node instead.
  torch.utils._pytree._register_pytree_node(
sentence_transformers.SentenceTransformer : INFO     : Use pytorch device: cuda
root : ERROR    : No outline for Differences between capitalism and liberalism. Will directly search with the topic.
Traceback (most recent call last):
  File "/home/shuther/devProjects/storm/examples/run_storm_wiki_mistral.py", line 166, in <module>
    main(parser.parse_args())
  File "/home/shuther/devProjects/storm/examples/run_storm_wiki_mistral.py", line 119, in main
    runner.run(
  File "/home/shuther/devProjects/storm/./src/storm_wiki/engine.py", line 319, in run
    draft_article = self.run_article_generation_module(outline=outline,
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/devProjects/storm/./src/interface.py", line 376, in wrapper
    result = func(*args, **kwargs)
             ^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/devProjects/storm/./src/storm_wiki/engine.py", line 210, in run_article_generation_module
    draft_article = self.storm_article_generation.generate_article(
                    ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/devProjects/storm/./src/storm_wiki/modules/article_generation.py", line 66, in generate_article
    section_output_dict = self.generate_section(
                          ^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/devProjects/storm/./src/storm_wiki/modules/article_generation.py", line 33, in generate_section
    collected_info = information_table.retrieve_information(queries=section_query,
                     ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/devProjects/storm/./src/storm_wiki/modules/storm_dataclass.py", line 170, in retrieve_information
    sim = cosine_similarity([encoded_query], self.encoded_snippets)[0]
          ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/sklearn/utils/_param_validation.py", line 214, in wrapper
    return func(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/sklearn/metrics/pairwise.py", line 1578, in cosine_similarity
    X, Y = check_pairwise_arrays(X, Y)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/sklearn/metrics/pairwise.py", line 173, in check_pairwise_arrays
    Y = check_array(
        ^^^^^^^^^^^^
  File "/home/shuther/mambaforge/envs/storm/lib/python3.11/site-packages/sklearn/utils/validation.py", line 938, in check_array
    raise ValueError(
ValueError: Expected 2D array, got 1D array instead:
array=[].
Reshape your data either using array.reshape(-1, 1) if your data has a single feature or array.reshape(1, -1) if it contains a single sample.

Maybe I am missing something about how to use it?

@shaoyijia
Collaborator

root : ERROR : Error occurs when searching query : 'hits'
root : ERROR : Error occurs when searching query : 'hits'
root : ERROR : Error occurs when searching query : 'hits'
interface : INFO : run_knowledge_curation_module executed in 1.4256 seconds
interface : INFO : run_outline_generation_module executed in 3.9669 seconds

Based on the log, it seems like the do_research part is not correctly executed, i.e., no information is collected. Could you check whether YouRM is working? It may be exceeding the search API limit, so no information is returned when calling the retriever. To check this, you can run the following:

from rm import YouRM

you_rm = YouRM(ydc_api_key="you_com_api_key")
output = you_rm(query_or_queries="Differences between capitalism and liberalism", exclude_urls=[])
print(output)

@shuther

shuther commented May 7, 2024

The first 'description' value seems strange:
[{'description': 'Something went wrong. Wait a moment and try again.Try again', 'snippets': ['Answer (1 of 40): Liberalism and Capitalism aren’t really on the same continuum and can’t be compared in meaningful ways. It’s a bit like asking ‘what’s the difference between north and green?’. (one is a compass direction, the other a color) Liberalism (in the classical sense) is a political ph...', 'Something went wrong. Wait a moment and try again.Try again'], 'title': 'What is the difference between liberalism and capitalism? - Quora', 'url': 'https://www.quora.com/What-is-the-difference-between-liberalism-and-capitalism'}, {'description': 'It’s because of this that we’ve rejected it and replaced the word with ‘liberalism’. Today, however, when even a country like China is welcoming some form of capitalism, the landscape is better suited to making a clear distinction between liberal thought and capitalist thought, so that ...', 'snippets': ['This is important because it avoids misunderstandings about what the market economy is. If balance is at the heart of liberal thinking, this is not the case with capitalism. If we differentiate liberalism and capitalism properly, we could better protect ourselves against the overwhelming power of finance, and its resulting crises, in the name of balance and competition in a liberal market economy.', 'Valerie Charolles is a researcher at the Institut Mines-Telecom Business School and associate researcher at the Interdisciplinary Institute of Anthropology of the Contemporary (CNRS-EHESS) She has published four papers, including ‘Le libéralisme contre le capitalisme’ in which she describes the synonymy between the two concepts as ideologies, taking the form of a ‘soft totalitarianism’. As part of its series of lectures on the ecological and social transition, the Society & Organizations Center was delighted to invite her for a talk.', 'It’s because of this that we’ve rejected it and replaced the word with ‘liberalism’. Today, however, when even a country like China is welcoming some form of capitalism, the landscape is better suited to making a clear distinction between liberal thought and capitalist thought, so that we can then see what consequences this might have.', "Explaining that it is capitalist is not offensive, it’s an observation. One could also say that it was neo-liberal or ultra-liberal, but it can’t be described it as liberal if we are to take Adam Smith's philosophy as a reference point, which is the one I rely on."], 'title': 'Making the Distinction between Liberalism and Capitalism ...', 'url': 'https://www.hec.edu/en/news-room/making-distinction-between-liberalism-and-capitalism-21st-century'}, {'description': 'As is clear from the above, capitalism is an economic practice and neo-liberalism is a philosophy that fanatically formulates how societies practising capitalism should be managed. ... Sharing is caring! ... Email This Post : If you like this article or our site.', 'snippets': ['Both capitalism and neo-liberalism basically advocate free market economy without state control. The dividing line between capitalism and neo-liberalism is so thin that many consider the two concepts as synonymous with each other. Yet there are differences that give each of them a separate identity.', 'As is clear from the above, capitalism is an economic practice and neo-liberalism is a philosophy that fanatically formulates how societies practising capitalism should be managed. ... Sharing is caring! ... 
Email This Post : If you like this article or our site. Please spread the word. Share it with your friends/family. ... Cite APA 7 Chakraborty, P. (2016, January 11). Difference Between Capitalism And Neo-liberalism.', 'Difference Between Similar Terms and Objects, 11 January, 2016, http://www.differencebetween.net/miscellaneous/politics/ideology-politics/difference-between-capitalism-and-neo-liberalism/. ... This gives an extremely narrowly and mostly incorrect explanation of capitalism. It is not nearly the same as classic liberalism/neoliberalism. State capitalism or Keynesian economics is vastly different from the laissez-faire approach associated with classic liberalism.', 'State capitalism or Keynesian economics is vastly different from the laissez-faire approach associated with classic liberalism. Detailing these differences is vital in explaining and understanding the difference between a Social Democratic and Third Way ideology.'], 'title': 'Difference Between Capitalism And Neo-liberalism | Difference Between', 'url': 'http://www.differencebetween.net/miscellaneous/politics/ideology-politics/difference-between-capitalism-and-neo-liberalism/'}]
