ResourceBusy errors with curio #118

Open
Zaharid opened this issue Apr 10, 2019 · 4 comments

Zaharid commented Apr 10, 2019

As far as I can tell, the following should work (sorry, it requires BeautifulSoup). I want to crawl over each email in a mailing list and do something with it:

from urllib.parse import urljoin, urlparse

import asks
import curio
from bs4 import BeautifulSoup

ARCHIVES_URL = 'https://mail.python.org/pipermail/python-ideas/'


def make_soup(data):
    return BeautifulSoup(data)
    #return BeautifulSoup(data, features="html5lib")


async def get_archive_index(session):
    resp = await session.post(ARCHIVES_URL)
    resp.raise_for_status()
    return resp.text


def parse_threads(archive_index):
    soup = make_soup(archive_index)
    return [
        urljoin(ARCHIVES_URL, th.attrs['href'])
        for th in soup.find_all('a', string='[ Thread ]')
    ]


def parse_emails(thread_index, month_url):
    soup = make_soup(thread_index)
    return [
        urljoin(month_url, em.attrs['href'])
        for em in soup.find_all('a', attrs={'name': True, 'href': True})
    ]


async def process_email(email_url, session, sync):
    async with sync:
        resp = await session.get(email_url)
    resp.raise_for_status()
    print(resp.text)


async def process_month(month_url, session, sync):
    async with sync:
        resp = await session.get(month_url)
    resp.raise_for_status()
    thread_index = resp.text
    emails = await curio.run_in_process(parse_emails, thread_index, month_url)
    async with curio.TaskGroup() as tg:
        for email_url in emails:
            await tg.spawn(process_email, email_url, session, sync)


async def main():
    s = asks.Session(connections=100, persist_cookies=True)
    archive_data = await get_archive_index(s)
    archive_index = await curio.run_in_process(parse_threads, archive_data)
    sync = curio.Semaphore(value=1)
    async with curio.TaskGroup() as tg, s:
        for month_url in archive_index:
            await tg.spawn(process_month, month_url, s, sync)


if __name__ == '__main__':
    curio.run(main)

I am getting the following error, which I do not understand:

Task Crash: Task(id=104, name='process_month', state='TERMINATED')
Traceback (most recent call last):
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 744, in _run_coro
    trap = current._send(current.next_value)
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/task.py", line 166, in _task_runner
    self.next_value = await coro
  File "index-email2.py", line 56, in process_month
    resp = await session.get(month_url)
  File "/home/zah/sourcecode/asks/asks/sessions.py", line 214, in request
    await self._handle_exception(e, sock)
  File "/home/zah/sourcecode/asks/asks/sessions.py", line 255, in _handle_exception
    await sock.close()
  File "/home/zah/anaconda3/lib/python3.7/site-packages/anyio/_networking.py", line 215, in close
    await self._socket.unwrap_tls()
  File "/home/zah/anaconda3/lib/python3.7/site-packages/anyio/_networking.py", line 187, in unwrap_tls
    await self._wait_readable()
  File "/home/zah/anaconda3/lib/python3.7/site-packages/anyio/_backends/curio.py", line 333, in wait_socket_readable
    return await curio.traps._read_wait(sock)
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/traps.py", line 54, in _read_wait
    yield (_trap_io, fileobj, EVENT_READ, 'READ_WAIT')
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 406, in _trap_io
    _register_event(fileobj, event, current)
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 359, in _register_event
    raise ReadResourceBusy("Multiple tasks can't wait to read on the same file descriptor %r" % fileobj)
curio.errors.ReadResourceBusy: Multiple tasks can't wait to read on the same file descriptor <ssl.SSLSocket fd=7, family=AddressFamily.AF_INET, type=SocketKind.SOCK_STREAM, proto=0, laddr=('10.248.111.146', 41418), raddr=('188.166.95.178', 443)>
Task Crash: Task(id=105, name='process_month', state='TERMINATED')
Traceback (most recent call last):
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 744, in _run_coro
    trap = current._send(current.next_value)
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/task.py", line 166, in _task_runner
    self.next_value = await coro
  File "index-email2.py", line 56, in process_month
    resp = await session.get(month_url)
  File "/home/zah/sourcecode/asks/asks/sessions.py", line 167, in request
    connection_timeout, self._grab_connection, url)
  File "/home/zah/sourcecode/asks/asks/utils.py", line 15, in timeout_manager
    return await coro(*args)
  File "/home/zah/sourcecode/asks/asks/sessions.py", line 364, in _grab_connection
    sock = await self._make_connection(host_loc)
  File "/home/zah/sourcecode/asks/asks/sessions.py", line 338, in _make_connection
    sock, port = await self._connect(host_loc)
  File "/home/zah/sourcecode/asks/asks/sessions.py", line 100, in _connect
    (host, int(port))), port
  File "/home/zah/sourcecode/asks/asks/sessions.py", line 78, in _open_connection_https
    autostart_tls=True)
  File "/home/zah/anaconda3/lib/python3.7/site-packages/anyio/__init__.py", line 344, in connect_tcp
    await sock.connect((address, port))
  File "/home/zah/anaconda3/lib/python3.7/site-packages/anyio/_networking.py", line 82, in connect
    await self._wait_writable()
  File "/home/zah/anaconda3/lib/python3.7/site-packages/anyio/_backends/curio.py", line 350, in wait_socket_writable
    return await curio.traps._write_wait(sock)
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/traps.py", line 63, in _write_wait
    yield (_trap_io, fileobj, EVENT_WRITE, 'WRITE_WAIT')
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 753, in _run_coro
    traps[trap[0]](*trap[1:])
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 406, in _trap_io
    _register_event(fileobj, event, current)
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 364, in _register_event
    (task, wtask) if event == EVENT_READ else (rtask, task))
  File "/home/zah/anaconda3/lib/python3.7/selectors.py", line 389, in modify
    self._selector.modify(key.fd, selector_events)
FileNotFoundError: [Errno 2] No such file or directory
Traceback (most recent call last):
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 744, in _run_coro
    trap = current._send(current.next_value)
RuntimeError: cannot reuse already awaited coroutine

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 766, in _run_coro
    del tasks[active.id]
KeyError: 104

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "index-email2.py", line 77, in <module>
    curio.run(main)
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 826, in run
    return kernel.run(corofunc, *args)
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 150, in run
    ret_val, ret_exc = self._runner.send(coro)
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 790, in _run_coro
    _unregister_event(*active._last_io)
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 370, in _unregister_event
    key = selector_getkey(fileobj)
  File "/home/zah/anaconda3/lib/python3.7/selectors.py", line 190, in get_key
    return mapping[fileobj]
  File "/home/zah/anaconda3/lib/python3.7/selectors.py", line 71, in __getitem__
    fd = self._selector._fileobj_lookup(fileobj)
  File "/home/zah/anaconda3/lib/python3.7/selectors.py", line 225, in _fileobj_lookup
    return _fileobj_to_fd(fileobj)
  File "/home/zah/anaconda3/lib/python3.7/selectors.py", line 42, in _fileobj_to_fd
    raise ValueError("Invalid file descriptor: {}".format(fd))
ValueError: Invalid file descriptor: -1
Exception ignored in: <function Kernel.__del__ at 0x7f520784e840>
Traceback (most recent call last):
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 123, in __del__
RuntimeError: Curio kernel not properly terminated.  Please use Kernel.run(shutdown=True)
Task(id=2, name='main', state='TASK_JOIN') never joined
Exception ignored in: <function Task.__del__ at 0x7f5207b5a1e0>
Traceback (most recent call last):
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/task.py", line 159, in __del__
RuntimeError: coroutine ignored GeneratorExit
Task(id=110, name='process_month', state='READY') never joined
Exception ignored in: <function Task.__del__ at 0x7f5207b5a1e0>
Traceback (most recent call last):
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/task.py", line 159, in __del__
  File "index-email2.py", line 59, in process_month
RuntimeError: coroutine ignored GeneratorExit
Exception ignored in: <function Task.__del__ at 0x7f5207b5a1e0>
Traceback (most recent call last):
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/task.py", line 159, in __del__
  File "index-email2.py", line 59, in process_month
RuntimeError: coroutine ignored GeneratorExit
Exception ignored in: <function Task.__del__ at 0x7f5207b5a1e0>
Traceback (most recent call last):
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/task.py", line 159, in __del__
  File "index-email2.py", line 59, in process_month
RuntimeError: coroutine ignored GeneratorExit
Exception ignored in: <function Task.__del__ at 0x7f5207b5a1e0>
Traceback (most recent call last):
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/task.py", line 159, in __del__
  File "index-email2.py", line 59, in process_month
RuntimeError: coroutine ignored GeneratorExit
Task(id=25, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=26, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=27, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=28, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=29, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=30, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=31, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=32, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=33, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=34, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=35, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=36, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=37, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=38, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=39, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=40, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=41, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=42, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=43, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=44, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=45, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=46, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=47, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=48, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=49, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=50, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=51, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=52, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=53, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=54, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=55, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=56, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=57, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=58, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=59, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=60, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=61, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=62, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=63, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=64, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=65, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=66, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=67, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=68, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=69, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=70, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=71, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=72, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=73, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=74, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=76, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=77, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=78, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=79, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=80, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=81, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=82, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=83, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=84, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=85, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=86, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=87, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=88, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=89, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=90, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=91, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=92, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=93, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=94, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=95, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=96, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=97, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=98, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=99, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=100, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=101, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=102, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=103, name='process_month', state='SEMA_ACQUIRE') never joined
Exception ignored in: <function Task.__del__ at 0x7f5207b5a1e0>
Traceback (most recent call last):
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/task.py", line 159, in __del__
RuntimeError: coroutine ignored GeneratorExit
Exception ignored in: <function Task.__del__ at 0x7f5207b5a1e0>
Traceback (most recent call last):
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/task.py", line 159, in __del__
RuntimeError: coroutine ignored GeneratorExit
Exception ignored in: <function Task.__del__ at 0x7f5207b5a1e0>
Traceback (most recent call last):
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/task.py", line 159, in __del__
RuntimeError: coroutine ignored GeneratorExit
Task(id=111, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=112, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=113, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=114, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=115, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=116, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=117, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=118, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=119, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=120, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=121, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=122, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=123, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=124, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=125, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=126, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=127, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=128, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=129, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=130, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=131, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=132, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=133, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=134, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=135, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=136, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=137, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=138, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=139, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=140, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=141, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=142, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=143, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=144, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=145, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=146, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=147, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=148, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=149, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=150, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=151, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=152, name='process_month', state='SEMA_ACQUIRE') never joined
sys:1: RuntimeWarning: coroutine 'Task._task_runner' was never awaited

Playing with the value of the semaphore or the number of connections doesn't seem to change the result for me.
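
For anyone reading the traceback: ReadResourceBusy is curio's own check, but the selector underneath has similar rules, and the final "Invalid file descriptor: -1" is the message the standard library's selectors module produces when asked about a socket that is already closed and no longer registered. Below is a stdlib-only sketch of those two behaviours. It does not use curio, anyio, or asks, so it only illustrates the selector rules and is not a reproduction of this bug.

```python
import selectors
import socket

sel = selectors.DefaultSelector()
a, b = socket.socketpair()

# A selector keeps one registration per file object, so a second
# register() call for the same socket is rejected with KeyError.
sel.register(a, selectors.EVENT_READ)
try:
    sel.register(a, selectors.EVENT_READ)
    second_register = "accepted"
except KeyError:
    second_register = "rejected"

# Unregister and close the socket, then ask the selector about it.
# A closed socket reports fileno() == -1, which the selector refuses.
sel.unregister(a)
a.close()
try:
    sel.get_key(a)
    lookup_error = None
except ValueError as exc:
    lookup_error = str(exc)

b.close()
sel.close()
print(second_register, lookup_error)
```

If that reading is right, the ValueError at the bottom of my traceback would mean something closed the socket while the kernel still held a reference to it, but that is a guess on my part.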

theelous3 (Owner) commented

So I smashed some prints and stuff in haphazardly, like so:

from urllib.parse import urljoin, urlparse

import asks
import curio
from bs4 import BeautifulSoup

ARCHIVES_URL = 'https://mail.python.org/pipermail/python-ideas/'


def make_soup(data):
    return BeautifulSoup(data, features="html.parser")
    # return BeautifulSoup(data, features="html5lib")


async def get_archive_index(session):
    resp = await session.post(ARCHIVES_URL)
    resp.raise_for_status()
    return resp.text


def parse_threads(archive_index):
    soup = make_soup(archive_index)
    return [
        urljoin(ARCHIVES_URL, th.attrs['href'])
        for th in soup.find_all('a', string='[ Thread ]')
    ]


def parse_emails(thread_index, month_url):
    soup = make_soup(thread_index)
    return [
        urljoin(month_url, em.attrs['href'])
        for em in soup.find_all('a', attrs={'name': True, 'href': True})
    ]

processed_email = False

async def process_email(email_url, session, sync):
    print('starting to process email')
    async with sync:
        print('email acquired sema')
        resp = await session.get(email_url)
        print('email got response')
    print("Email resp", resp.status, resp.content)
    global processed_email
    if not processed_email:
        print("process_email -> resp.text", resp.text[:10])
        processed_email = True
    resp.raise_for_status()
    print(resp.text)


processed_month = False

async def process_month(month_url, session, sync):
    async with sync:
        resp = await session.get(month_url)
    resp.raise_for_status()
    thread_index = resp.text
    global processed_month
    if not processed_month:
        print("process_month -> resp.text", resp.text[:10])
        processed_month = True
    emails = await curio.run_in_process(parse_emails, thread_index, month_url)
    if emails:
        print(emails)
    async with curio.TaskGroup() as tg:
        for email_url in emails:
            await tg.spawn(process_email, email_url, session, sync)


async def main():
    s = asks.Session(connections=100, persist_cookies=True)
    archive_data = await get_archive_index(s)
    print("main -> archive_data", archive_data[:10])
    archive_index = await curio.run_in_process(parse_threads, archive_data)
    sync = curio.Semaphore(value=1)
    async with curio.TaskGroup() as tg, s:
        for month_url in archive_index:
            await tg.spawn(process_month, month_url, s, sync)


if __name__ == '__main__':
    curio.run(main)

and as output we get:

(env) VM-dev :: zoo/python/asks ꗝ  python test.py                             
main -> archive_data <!DOCTYPE 
process_month -> resp.text <!DOCTYPE 
(env) VM-dev :: zoo/python/asks ꗝ                                             

No errors. Note, the email parsing doesn't seem to be working.
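
One possible reason the email parsing comes back empty, sketched with the stdlib html.parser instead of BeautifulSoup: parse_emails only keeps `<a>` tags that carry both `name` and `href`. If the thread index puts the link and the named anchor on separate tags, nothing matches. The HTML fragment and the month URL below are made up for illustration; I have not checked them against the real page.

```python
from html.parser import HTMLParser
from urllib.parse import urljoin

# Hypothetical thread-index fragment: the link and the named anchor are
# separate <a> elements. This layout is an assumption, not copied from the site.
SAMPLE = """
<ul>
<li><a href="000001.html">Example subject</a><a name="1">&nbsp;</a></li>
<li><a href="000002.html">Another subject</a><a name="2">&nbsp;</a></li>
</ul>
"""

# Hypothetical month URL, used only to resolve the relative links.
MONTH_URL = "https://mail.python.org/pipermail/python-ideas/2019-April/thread.html"


class AnchorCollector(HTMLParser):
    """Collect the attribute dict of every <a> start tag."""

    def __init__(self):
        super().__init__()
        self.anchors = []

    def handle_starttag(self, tag, attrs):
        if tag == "a":
            self.anchors.append(dict(attrs))


def collect_anchors(html):
    parser = AnchorCollector()
    parser.feed(html)
    return parser.anchors


anchors = collect_anchors(SAMPLE)

# Same condition as parse_emails: 'name' and 'href' on the same tag.
both = [a for a in anchors if "name" in a and "href" in a]

# Looser condition: any anchor that has an 'href'.
href_only = [urljoin(MONTH_URL, a["href"]) for a in anchors if "href" in a]

print(len(both), href_only)
```

On this made-up fragment the strict filter finds nothing and the looser one finds both links, so it may be worth checking what the real anchors look like before blaming the network side.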

Off the bat, try just using a regular old virtualenv and Python rather than Anaconda. I constantly see people getting totally random errors when they combine Anaconda with things that normally work fine.

Zaharid (Author) commented Apr 10, 2019

I am getting some errors in addition to the prints, in the current environment:

main -> archive_data <!DOCTYPE 
process_month -> resp.text <!DOCTYPE 
Traceback (most recent call last):
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 744, in _run_coro
    trap = current._send(current.next_value)
RuntimeError: cannot reuse already awaited coroutine

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "test.py", line 84, in <module>
    curio.run(main)
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 826, in run
    return kernel.run(corofunc, *args)
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 150, in run
    ret_val, ret_exc = self._runner.send(coro)
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 766, in _run_coro
    del tasks[active.id]
KeyError: 13
Exception ignored in: <function Kernel.__del__ at 0x7f1d6bd80a60>
Traceback (most recent call last):
  File "/home/zah/anaconda3/lib/python3.7/site-packages/curio/kernel.py", line 123, in __del__
RuntimeError: Curio kernel not properly terminated.  Please use Kernel.run(shutdown=True)

I'll try to set up a reproducible environment and see if this fixes itself.
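
For reference, the "cannot reuse already awaited coroutine" message itself is plain Python behaviour: it is raised when something resumes a coroutine object that has already run to completion. A minimal sketch without curio, driving a coroutine by hand (the `noop` name is just for illustration):

```python
async def noop():
    # Finishes immediately without awaiting anything.
    return 42


coro = noop()

# First resume: the coroutine runs to completion and signals its
# return value through StopIteration.
try:
    coro.send(None)
except StopIteration as stop:
    result = stop.value

# Second resume: the coroutine is already finished, so Python refuses.
try:
    coro.send(None)
except RuntimeError as exc:
    message = str(exc)

print(result, message)
```

So in my traceback the kernel seems to be resuming a task whose coroutine had already finished or crashed, which would fit with the KeyError on `del tasks[active.id]` right after it. That is my reading of the output, not something I have confirmed in the curio source.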

Zaharid (Author) commented Apr 11, 2019

FWIW I get the same error on an Ubuntu Docker image, with the system Python and the default pip versions of the packages. I also tried the git versions of both curio and asks.

$ docker run -it ubuntu:latest
root@6f38b0c5e248:/#  apt update
root@6f38b0c5e248:/#  apt install python3-pip
root@6f38b0c5e248:/# pip3 install asks
Collecting asks
  Using cached https://files.pythonhosted.org/packages/ae/3d/c4aee6c889f525b52c52e0d17edac569e8c81ad2905b4bb5e022598d26b2/asks-2.3.2.tar.gz
Collecting anyio (from asks)
  Downloading https://files.pythonhosted.org/packages/40/1b/614827dc317cce9ba53818d5a2f77710496ae672a63b504b64de36f7d466/anyio-1.0.0rc1-py3-none-any.whl
Collecting async_generator (from asks)
  Downloading https://files.pythonhosted.org/packages/71/52/39d20e03abd0ac9159c162ec24b93fbcaa111e8400308f2465432495ca2b/async_generator-1.10-py3-none-any.whl
Collecting h11 (from asks)
  Downloading https://files.pythonhosted.org/packages/f9/f3/8e4cf5fa1a3d8bda942a0b1cf92f87815494216fd439f82eb99073141ba0/h11-0.8.1-py2.py3-none-any.whl (55kB)
    100% |################################| 61kB 2.6MB/s 
Collecting sniffio (from anyio->asks)
  Downloading https://files.pythonhosted.org/packages/ca/08/58f3b857b8bba832983e8c5dce5e3f8c677a5527e41cf61ff45effc78cae/sniffio-1.0.0-py3-none-any.whl
Collecting contextvars>=2.1; python_version < "3.7" (from sniffio->anyio->asks)
  Downloading https://files.pythonhosted.org/packages/83/96/55b82d9f13763be9d672622e1b8106c85acb83edd7cc2fa5bc67cd9877e9/contextvars-2.4.tar.gz
Collecting immutables>=0.9 (from contextvars>=2.1; python_version < "3.7"->sniffio->anyio->asks)
  Downloading https://files.pythonhosted.org/packages/e3/91/bc4b34993ef77aabfd1546a657563576bdd437205fa24d4acaf232707452/immutables-0.9-cp36-cp36m-manylinux1_x86_64.whl (91kB)
    100% |################################| 92kB 2.9MB/s 
Building wheels for collected packages: asks, contextvars
  Running setup.py bdist_wheel for asks ... done
  Stored in directory: /root/.cache/pip/wheels/49/97/6e/5e5226c795e2e186ff0ac7839e91eb3b5468cf76aeef09a746
  Running setup.py bdist_wheel for contextvars ... done
  Stored in directory: /root/.cache/pip/wheels/a5/7d/68/1ebae2668bda2228686e3c1cf16f2c2384cea6e9334ad5f6de
Successfully built asks contextvars
Installing collected packages: async-generator, immutables, contextvars, sniffio, anyio, h11, asks
Successfully installed anyio-1.0.0rc1 asks-2.3.2 async-generator-1.10 contextvars-2.4 h11-0.8.1 immutables-0.9 sniffio-1.0.0
root@6f38b0c5e248:/# pip3 install curio
Collecting curio
  Downloading https://files.pythonhosted.org/packages/e4/61/6e7daab81d17c47296c63c346d794c29e95218a7bceba88bf4a57cf7bb27/curio-0.9.tar.gz (482kB)
    100% |################################| 491kB 1.7MB/s 
Building wheels for collected packages: curio
  Running setup.py bdist_wheel for curio ... done
  Stored in directory: /root/.cache/pip/wheels/c3/be/fd/adc5cecb8264d9a87edd721057f208b36749b22b3990447302
Successfully built curio
Installing collected packages: curio
Successfully installed curio-0.9
root@6f38b0c5e248:/# pip3 install beautifulsoup4
Collecting beautifulsoup4
  Downloading https://files.pythonhosted.org/packages/1d/5d/3260694a59df0ec52f8b4883f5d23b130bc237602a1411fa670eae12351e/beautifulsoup4-4.7.1-py3-none-any.whl (94kB)
    100% |################################| 102kB 629kB/s 
Collecting soupsieve>=1.2 (from beautifulsoup4)
  Downloading https://files.pythonhosted.org/packages/c9/f8/e54b1d771ed4fab66b3fa1c178e137a3c73d84fb6f64329bddf0da5a371c/soupsieve-1.9-py2.py3-none-any.whl
Installing collected packages: soupsieve, beautifulsoup4
Successfully installed beautifulsoup4-4.7.1 soupsieve-1.9

root@6f38b0c5e248:/#  python3 example.py 
main -> archive_data <!DOCTYPE 
process_month -> resp.text <!DOCTYPE 
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/curio/kernel.py", line 736, in _run_coro
    trap = current._send(current.next_value)
RuntimeError: cannot reuse already awaited coroutine

During handling of the above exception, another exception occurred:

Traceback (most recent call last):
  File "example.py", line 84, in <module>
    curio.run(main)
  File "/usr/local/lib/python3.6/dist-packages/curio/kernel.py", line 818, in run
    return kernel.run(corofunc, *args)
  File "/usr/local/lib/python3.6/dist-packages/curio/kernel.py", line 149, in run
    ret_val, ret_exc = self._runner.send(coro)
  File "/usr/local/lib/python3.6/dist-packages/curio/kernel.py", line 758, in _run_coro
    del tasks[active.id]
KeyError: 5
Exception ignored in: <bound method Task.__del__ of Task(id=17, name='process_month', state='READ_WAIT')>
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/curio/task.py", line 160, in __del__
RuntimeError: coroutine ignored GeneratorExit
Exception ignored in: <bound method Kernel.__del__ of <curio.kernel.Kernel object at 0x7f31a45fcfd0>>
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/curio/kernel.py", line 122, in __del__
RuntimeError: Curio kernel not properly terminated.  Please use Kernel.run(shutdown=True)
Task(id=119, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=57, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=104, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=127, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=58, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=126, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=59, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=110, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=125, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=60, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=118, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=61, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=62, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=63, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=107, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=64, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=120, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=65, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=66, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=67, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=20, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=117, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=68, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=116, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=69, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=106, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=70, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=123, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=71, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=93, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=105, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=103, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=72, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=73, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=102, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=74, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=134, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=75, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=76, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=101, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=133, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=77, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=132, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=100, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=78, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=98, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=79, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=80, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=130, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=81, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=94, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=129, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=82, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=112, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=128, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=83, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=111, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=131, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=84, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=115, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=99, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=85, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=124, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=97, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=122, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=86, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=87, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=92, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=121, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=88, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=96, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=114, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=89, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=113, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=90, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=95, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=109, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=144, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=108, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=145, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=21, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=137, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=146, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=22, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=143, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=147, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=23, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=148, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=24, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=142, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=149, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=25, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=150, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=26, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=151, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=27, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=152, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=28, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=29, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=30, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=31, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=32, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=33, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=34, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=35, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=141, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=36, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=140, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=37, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=38, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=39, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=40, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=41, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=136, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=42, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=135, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=43, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=44, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=139, name='process_month', state='SEMA_ACQUIRE') never joined
Exception ignored in: <bound method Task.__del__ of Task(id=9, name='process_month', state='READ_WAIT')>
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/curio/task.py", line 160, in __del__
  File "example.py", line 64, in process_month
RuntimeError: coroutine ignored GeneratorExit
Task(id=45, name='process_month', state='SEMA_ACQUIRE') never joined
Exception ignored in: <bound method Task.__del__ of Task(id=10, name='process_month', state='READ_WAIT')>
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/curio/task.py", line 160, in __del__
  File "example.py", line 64, in process_month
RuntimeError: coroutine ignored GeneratorExit
Task(id=46, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=91, name='process_month', state='SEMA_ACQUIRE') never joined
Exception ignored in: <bound method Task.__del__ of Task(id=11, name='process_month', state='READ_WAIT')>
Traceback (most recent call last):
  File "/usr/local/lib/python3.6/dist-packages/curio/task.py", line 160, in __del__
  File "example.py", line 64, in process_month
RuntimeError: coroutine ignored GeneratorExit
Task(id=47, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=12, name='process_month', state='READY') never joined
Task(id=48, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=138, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=13, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=49, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=14, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=50, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=15, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=51, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=16, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=52, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=53, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=18, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=54, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=19, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=55, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=56, name='process_month', state='SEMA_ACQUIRE') never joined
Task(id=2, name='main', state='READY') never joined
sys:1: RuntimeWarning: coroutine 'Task._task_runner' was never awaited


Zaharid added a commit to NNPDF/nnpdf that referenced this issue Apr 15, 2019
Add a script to parse the emails, and find the mentions of validphys
reports and associate report id with email url and title. Because there
is no way to get an email URL from the email as received, we scan the
HTML of the archives, by crawling over each message in each month.

The script tries to remove links that are in quoted sections but that
only works if these have already been parsed as a `blockquote` HTML
element in the email archives.

We use this information to create a link to the email, in the index
page, by adding an email emoji link to each email. It could be used for
other things such as displaying the email in the template.

One annoying aspect is that this is an embarrassingly parallel task (we
could be processing the emails while we are waiting for other emails to
download), but I am hitting some bug I don't understand when trying to
do this with curio and asks
(theelous3/asks#118), so it will stay
sequential for the moment. Because it is slow, we add a cache to
remember already seen emails. At the moment index-emails needs to be run
independently from index-reports (I run it once a day), but that may not
be optimal.
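
The cache of already seen emails mentioned above could look roughly like the following. This is only a minimal sketch, not the code from the referenced commit: the `SeenEmailCache` class, its JSON file layout, and the `fetch` callback are all hypothetical names chosen for illustration, and it assumes that whatever is stored per email is JSON-serializable.

```python
import json
from pathlib import Path


class SeenEmailCache:
    """Hypothetical on-disk cache mapping email URLs to already parsed results."""

    def __init__(self, path):
        self.path = Path(path)
        # Reuse results from an earlier run if the cache file exists.
        if self.path.exists():
            self.entries = json.loads(self.path.read_text(encoding="utf-8"))
        else:
            self.entries = {}

    def get_or_fetch(self, url, fetch):
        """Return the cached result for url; call fetch(url) only on a miss."""
        if url not in self.entries:
            self.entries[url] = fetch(url)
        return self.entries[url]

    def save(self):
        """Write the cache back to disk so the next run can skip seen emails."""
        self.path.write_text(json.dumps(self.entries, indent=2), encoding="utf-8")
```

With something like this, a sequential crawl would call `cache.get_or_fetch(email_url, download_and_parse)` for each email (where `download_and_parse` stands in for the slow request plus parsing step) and `cache.save()` at the end, so only emails not seen in a previous run trigger a download.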
@agronholm
Contributor

Does this happen with other backends as well? Or only curio?
