RFC: Support for bidirectional communication (prompting the user) #44

Pwuts · 2023-08-23T09:49:53Z

The protocol currently supports making requests to an agent service. However, some agents may need to be able to communicate with the user in order to function optimally. For example:

User: please buy me a new set of cutting boards
AI: would you like wooden or plastic cutting boards?
User: I like wood
AI: searches for wooden cutting boards and places an order on Amazon

Adding a way for agents to prompt the user would greatly increase the versatility of the protocol imo.

Proposal

Two primary options:

1. Extension of the protocol with a status awaiting_input, and a way to resolve this status with additional input for an existing task or step.

2. Extension of the task endpoint with a callback (or similar) attribute through which a client can specify a callback URL. The agent posts prompts for the user to this URL and polls it until they are resolved.

Example (option 2):

Giving the agent a task

POST /agent/tasks
{
  "input": "Please find a nice olive wood cutting board on Amazon and order it for me.",
  "callback_url": "https://my-service.url/agents/203820/callbacks"
}

The agent wants more info

POST https://my-service.url/agents/203820/callbacks
{
  "prompt": "What is your budget for this purchase?"
}

Response:

{
  "prompt_id": 123,
  "status": "pending",
  "created": "2023-08-23T13:49:51.141Z",
  "last_updated": "2023-08-23T13:49:51.141Z"
}
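
To make the client side of this exchange concrete, here is a minimal in-memory sketch of what the callback service might keep per prompt. The class and method names (PromptStore, create, resolve, reject) are hypothetical and not part of the proposal; only the field names and status values come from the example payloads.

```python
from dataclasses import dataclass, field
from datetime import datetime, timezone
from itertools import count
from typing import Dict, Optional


def _now() -> str:
    """UTC timestamp in the style of the examples, e.g. 2023-08-23T13:49:51.141Z."""
    stamp = datetime.now(timezone.utc).isoformat(timespec="milliseconds")
    return stamp.replace("+00:00", "Z")


@dataclass
class Prompt:
    prompt_id: int
    prompt: str
    status: str = "pending"  # "pending" | "resolved" | "rejected"
    answer: Optional[str] = None
    created: str = field(default_factory=_now)
    last_updated: str = field(default_factory=_now)

    def to_response(self) -> dict:
        """Build the JSON body returned to the agent."""
        body = {"prompt_id": self.prompt_id, "status": self.status}
        if self.answer is not None:
            body["answer"] = self.answer
        body["created"] = self.created
        body["last_updated"] = self.last_updated
        return body


class PromptStore:
    """Hypothetical storage behind the callback URL (assumption, not in the RFC)."""

    def __init__(self) -> None:
        self._ids = count(1)
        self._prompts: Dict[int, Prompt] = {}

    def create(self, prompt: str) -> dict:
        """Handle the agent's POST: register a new pending prompt."""
        record = Prompt(prompt_id=next(self._ids), prompt=prompt)
        self._prompts[record.prompt_id] = record
        return record.to_response()

    def get(self, prompt_id: int) -> dict:
        """Handle the agent's GET: report the current state of a prompt."""
        return self._prompts[prompt_id].to_response()

    def resolve(self, prompt_id: int, answer: str) -> dict:
        """The user answered the prompt."""
        return self._settle(prompt_id, "resolved", answer)

    def reject(self, prompt_id: int) -> dict:
        """The user declined to answer."""
        return self._settle(prompt_id, "rejected", None)

    def _settle(self, prompt_id: int, status: str, answer: Optional[str]) -> dict:
        record = self._prompts[prompt_id]
        if record.status != "pending":
            raise ValueError(f"prompt {prompt_id} is already {record.status}")
        record.status = status
        record.answer = answer
        record.last_updated = _now()
        return record.to_response()
```

A web framework handler would call create() for the POST and get() for the poll; resolve() and reject() would be driven by whatever UI the client shows the user.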

The agent polls the client until the prompt is resolved

GET https://my-service.url/agents/203820/callbacks/123

Responses:

{
  "prompt_id": 123,
  "status": "pending",
  "created": "2023-08-23T13:49:51.141Z",
  "last_updated": "2023-08-23T13:49:51.141Z"
}

{
  "prompt_id": 123,
  "status": "resolved",
  "answer": "I don't want to spend more than €40 on this purchase",
  "created": "2023-08-23T13:49:51.141Z",
  "last_updated": "2023-08-23T13:53:12.634Z"
}

{
  "prompt_id": 123,
  "status": "rejected",
  "created": "2023-08-23T13:49:51.141Z",
  "last_updated": "2023-08-23T13:53:12.634Z"
}
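
On the agent side, the polling step could look roughly like the sketch below. The function name wait_for_answer, its parameters, and the default interval and timeout are all assumptions for illustration; the RFC does not specify polling behaviour. The HTTP call is passed in as fetch_prompt so the loop itself stays independent of any particular client library.

```python
import time
from typing import Callable, Optional


def wait_for_answer(
    fetch_prompt: Callable[[], dict],
    interval: float = 2.0,    # seconds between polls (assumed value)
    timeout: float = 300.0,   # give up after this many seconds (assumed value)
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> Optional[str]:
    """Poll until the prompt leaves the "pending" state.

    Returns the user's answer if the prompt was resolved, or None if it
    was rejected. Raises TimeoutError if it stays pending too long.
    """
    deadline = clock() + timeout
    while True:
        state = fetch_prompt()  # e.g. a GET on <callback_url>/<prompt_id>
        status = state["status"]
        if status == "resolved":
            return state["answer"]
        if status == "rejected":
            return None
        if status != "pending":
            raise ValueError(f"unexpected prompt status: {status!r}")
        if clock() >= deadline:
            raise TimeoutError("prompt was not resolved in time")
        sleep(interval)
```

With the standard library, fetch_prompt could be something like `lambda: json.load(urllib.request.urlopen(url))`, where url is the callback URL followed by the prompt ID. A rejected prompt is returned as None here so the agent can decide whether to continue with defaults or abort the task.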

Alternatives

Extending the protocol with full chatting capabilities:
- GET /agent/tasks/<task_id>/chats
  List chats regarding task <task_id>
- POST /agent/tasks/<task_id>/chats
  Start a new chat regarding task <task_id>
- POST /agent/tasks/<task_id>/chats/<chat_id>/messages
  Post a new message in an existing chat
- GET /agent/tasks/<task_id>/chats/<chat_id>/messages
  Get all messages in a chat
- POST /agent/tasks/<task_id>/chats/<chat_id>/close
  Close/resolve a chat
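
For comparison, the chat alternative implies roughly the following state on the agent service. This is only an in-memory sketch: ChatService, its method names, and the sender/text message shape are hypothetical, and each method is annotated with the endpoint from the list above that it would back.

```python
from dataclasses import dataclass, field
from itertools import count
from typing import Dict, List


@dataclass
class Chat:
    chat_id: int
    task_id: str
    messages: List[dict] = field(default_factory=list)
    closed: bool = False


class ChatService:
    """In-memory stand-in for the proposed chat endpoints (assumption)."""

    def __init__(self) -> None:
        self._ids = count(1)
        self._chats: Dict[int, Chat] = {}

    def list_chats(self, task_id: str) -> List[Chat]:
        # GET /agent/tasks/<task_id>/chats
        return [c for c in self._chats.values() if c.task_id == task_id]

    def start_chat(self, task_id: str) -> Chat:
        # POST /agent/tasks/<task_id>/chats
        chat = Chat(chat_id=next(self._ids), task_id=task_id)
        self._chats[chat.chat_id] = chat
        return chat

    def post_message(self, task_id: str, chat_id: int, sender: str, text: str) -> dict:
        # POST /agent/tasks/<task_id>/chats/<chat_id>/messages
        chat = self._find(task_id, chat_id)
        if chat.closed:
            raise ValueError(f"chat {chat_id} is closed")
        message = {"sender": sender, "text": text}
        chat.messages.append(message)
        return message

    def get_messages(self, task_id: str, chat_id: int) -> List[dict]:
        # GET /agent/tasks/<task_id>/chats/<chat_id>/messages
        return list(self._find(task_id, chat_id).messages)

    def close_chat(self, task_id: str, chat_id: int) -> Chat:
        # POST /agent/tasks/<task_id>/chats/<chat_id>/close
        chat = self._find(task_id, chat_id)
        chat.closed = True
        return chat

    def _find(self, task_id: str, chat_id: int) -> Chat:
        chat = self._chats.get(chat_id)
        if chat is None or chat.task_id != task_id:
            raise KeyError(f"no chat {chat_id} for task {task_id}")
        return chat
```

Compared with the callback approach, the conversation state lives on the agent service here, so the client only needs to poll or post against the task it already knows about.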


hackgoofer · 2023-09-21T01:46:43Z

Interesting, I feel like chats are better implemented with websockets - What do you think? @jzanecook

jzanecook · 2023-09-21T02:24:02Z

This might be one of the perfect examples for a plugin (see #71), since it's something that not all agents might have.

ntindle · 2023-12-05T17:37:05Z

Can you attend the Agent Protocol meeting on the 12th to discuss this?

Pwuts mentioned this issue Nov 29, 2023

RFC: Topic Endpoint #77

Open
