RFC: Support for bidirectional communication (prompting the user) #44

Open
Pwuts opened this issue Aug 23, 2023 · 3 comments

@Pwuts
Contributor

Pwuts commented Aug 23, 2023

The protocol currently supports making requests to an agent service. However, some agents may need to be able to communicate with the user in order to function optimally. For example:

  1. User: please buy me a new set of cutting boards
  2. AI: would you like wooden or plastic cutting boards?
  3. User: I like wood
  4. AI: searches for wooden cutting boards and places an order on Amazon

Adding a way for agents to prompt the user would greatly increase the versatility of the protocol imo.

Proposal

Two primary options:

  1. Extension of the protocol with a status awaiting_input, and a way to resolve this status with additional input for an existing task or step

  2. Extension of the task endpoint with a callback_url (or similar) attribute through which a client can specify a callback URL. The agent posts prompts for the user to this URL and then polls it until each prompt is resolved or rejected.
    Example:

    1. Giving the agent a task
    POST /agent/tasks
    {
      "input": "Please find a nice olive wood cutting board on Amazon and order it for me.",
      "callback_url": "https://my-service.url/agents/203820/callbacks"
    }
    2. The agent wants more info
    POST https://my-service.url/agents/203820/callbacks
    {
      "prompt": "What is your budget for this purchase?"
    }

    Response:

    {
      "prompt_id": 123,
      "status": "pending",
      "created": "2023-08-23T13:49:51.141Z",
      "last_updated": "2023-08-23T13:49:51.141Z"
    }
    3. The agent polls the client until the prompt is resolved
    GET https://my-service.url/agents/203820/callbacks/123

    Responses:

    {
      "prompt_id": 123,
      "status": "pending",
      "created": "2023-08-23T13:49:51.141Z",
      "last_updated": "2023-08-23T13:49:51.141Z"
    }
    {
      "prompt_id": 123,
      "status": "resolved",
      "answer": "I don't want to spend more than €40 on this purchase",
      "created": "2023-08-23T13:49:51.141Z",
      "last_updated": "2023-08-23T13:53:12.634Z"
    }
    {
      "prompt_id": 123,
      "status": "rejected",
      "created": "2023-08-23T13:49:51.141Z",
      "last_updated": "2023-08-23T13:53:12.634Z"
    }
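
To make the polling step in option 2 more concrete, here is a rough sketch of what the agent side could look like. It is only an illustration under assumptions, not part of the protocol: the names `fetch_json` and `wait_for_answer`, the `interval` and `timeout` parameters, and the choice to return `None` for a rejected prompt are all hypothetical. Only the `status` values (`pending`, `resolved`, `rejected`) and the `answer` field come from the example above.

```python
import json
import time
import urllib.request
from typing import Callable, Optional


def fetch_json(url: str) -> dict:
    """Hypothetical helper: GET a prompt resource and decode its JSON body."""
    with urllib.request.urlopen(url) as response:
        return json.load(response)


def wait_for_answer(
    prompt_url: str,
    fetch: Callable[[str], dict] = fetch_json,
    interval: float = 2.0,
    timeout: float = 300.0,
    sleep: Callable[[float], None] = time.sleep,
) -> Optional[str]:
    """Poll prompt_url until the prompt is no longer "pending".

    Returns the user's answer when the prompt is resolved and None when
    it is rejected. Raises TimeoutError if the prompt is still pending
    after roughly `timeout` seconds of waiting (assumed behavior).
    """
    waited = 0.0
    while True:
        prompt = fetch(prompt_url)
        status = prompt["status"]
        if status == "resolved":
            return prompt["answer"]
        if status == "rejected":
            return None
        if status != "pending":
            raise ValueError(f"unexpected prompt status: {status!r}")
        if waited >= timeout:
            raise TimeoutError(f"prompt still pending after {timeout} s")
        sleep(interval)
        waited += interval
```

An agent would call something like `wait_for_answer("https://my-service.url/agents/203820/callbacks/123")` after posting the prompt. `fetch` and `sleep` are injectable so the loop can be exercised without a network or real delays.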

Alternatives

  • Extending the protocol with full chatting capabilities:
    • GET /agent/tasks/<task_id>/chats
      List chats regarding task <task_id>

    • POST /agent/tasks/<task_id>/chats
      Start a new chat regarding task <task_id>

    • POST /agent/tasks/<task_id>/chats/<chat_id>/messages
      Post a new message in an existing chat

    • GET /agent/tasks/<task_id>/chats/<chat_id>/messages
      Get all messages in a chat

    • POST /agent/tasks/<task_id>/chats/<chat_id>/close
      Close/resolve a chat
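
For comparison, the state an agent would need to keep behind these chat endpoints could be sketched roughly as follows. This is an in-memory illustration only: `Chat`, `ChatStore`, the `sender`/`text` message fields, and the rule that closed chats reject new messages are assumptions made for the sketch, not something the proposal specifies.

```python
from dataclasses import dataclass, field
from itertools import count


@dataclass
class Chat:
    chat_id: int
    task_id: str
    closed: bool = False
    messages: list = field(default_factory=list)


class ChatStore:
    """Hypothetical in-memory state behind the proposed chat endpoints."""

    def __init__(self) -> None:
        self._chats = {}
        self._ids = count(1)

    def list_chats(self, task_id: str) -> list:
        # GET /agent/tasks/<task_id>/chats
        return [c for c in self._chats.values() if c.task_id == task_id]

    def start_chat(self, task_id: str) -> Chat:
        # POST /agent/tasks/<task_id>/chats
        chat = Chat(chat_id=next(self._ids), task_id=task_id)
        self._chats[chat.chat_id] = chat
        return chat

    def post_message(self, chat_id: int, sender: str, text: str) -> dict:
        # POST /agent/tasks/<task_id>/chats/<chat_id>/messages
        chat = self._chats[chat_id]
        if chat.closed:
            # Assumption: a closed chat no longer accepts messages.
            raise ValueError("chat is closed")
        message = {"sender": sender, "text": text}
        chat.messages.append(message)
        return message

    def get_messages(self, chat_id: int) -> list:
        # GET /agent/tasks/<task_id>/chats/<chat_id>/messages
        return list(self._chats[chat_id].messages)

    def close_chat(self, chat_id: int) -> None:
        # POST /agent/tasks/<task_id>/chats/<chat_id>/close
        self._chats[chat_id].closed = True
```

Compared with the single-prompt callback flow, this keeps a full message history per task, which is more flexible but also more state for every agent to manage.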

@hackgoofer
Contributor

Interesting, I feel like chats are better implemented with websockets - What do you think? @jzanecook

@jzanecook
Collaborator

This might be a perfect example for a plugin (see #71), since it's something that not all agents will need.

@ntindle

ntindle commented Dec 5, 2023

Can you attend the Agent Protocol meeting on the 12th to discuss this?


4 participants