Downloading data using async_download is failing
Code to reproduce the error in a Jupyter notebook environment:
from nomad.client.archive import ArchiveQuery
import nest_asyncio
nest_asyncio.apply()
max_entries = 9000
query = {
    'results.method.simulation.program_name': 'VASP',
    'results.material.elements': ['O'],
    "results.material.symmetry.crystal_system:any": [
        "cubic"
    ],
    'results.material.n_elements': {
        "lte": 3,
        "gte": 3
    },
    "results.method.simulation.dft.xc_functional_type:any": [
        "GGA"
    ],
    "results.properties.geometry_optimization": {
        "final_energy_difference": {
            "lte": 2e-22,
            "gte": 1e-30
        }
    },
}
required = {
    'results': {
        'material': {
            'elements': '*',
            'symmetry': {
                'space_group_number': '*'
            }
        }
    },
    'workflow': {
        'geometry_optimization': {
            'final_energy_difference': '*',
        },
        'calculation_result_ref': {
            'system_ref': {
                'chemical_composition_reduced': '*',
                'chemical_composition': '*',
                'atoms': {
                    'labels': '*',
                    'species': '*',
                    'lattice_vectors': '*',
                    'positions': '*',
                }
            }
        }
    }
}
query = ArchiveQuery(query=query, required=required, page_size=100, results_max=max_entries)
number_of_entries = await query.async_fetch()
results = await query.async_download()
Output:
Fetching remote uploads...
8862 entries are qualified and added to the download list.
Downloading required data...
Request with upload id 8iZ-Q4gySLKyIfMSGjsoLg returns 502, will retry in the next download call...
Request with upload id 87KqS7FoSjqwavM97n1gKQ returns 502, will retry in the next download call...
Request with upload id 1n9HhIu4RxCqQ-_c2HicsA returns 502, will retry in the next download call...
Request with upload id AU3KetYTRqaBDwu287YdMw returns 502, will retry in the next download call...
Request with upload id AZPHk9AIQTWhR26duHtm4w returns 502, will retry in the next download call...
Request with upload id BNtvotKFT-yKzBSQiuN6Ew returns 502, will retry in the next download call...
Request with upload id 6BJYr9T3SmSm7EXyAS4neQ returns 502, will retry in the next download call...
Request with upload id AUlQEgYjTw-0ZmKplDVXAg returns 502, will retry in the next download call...
Request with upload id 2vgGNtDiR-K2eYWoggE8cA returns 502, will retry in the next download call...
Request with upload id 7-qU8aU8T-6HcsqeJye_Xg returns 502, will retry in the next download call...
Request with upload id -zC5tNujT3ivFkyl3FtjoQ returns 502, will retry in the next download call...
Request with upload id BFAbJBuOTWyRMPdbJRXgEA returns 502, will retry in the next download call...
Request with upload id -TG77dGiSTyrDAFNqTKa6Q returns 502, will retry in the next download call...
Request with upload id 3pKsCWHHRy-kggiJMJTwRw returns 502, will retry in the next download call...
Request with upload id 1X-hKSMDTRKjok2eI7Qckg returns 502, will retry in the next download call...
Request with upload id -nosdRAgSR2tdIaZ0cnzUA returns 502, will retry in the next download call...
Request with upload id B5Iv8N-DRy-9s725TJxpog returns 502, will retry in the next download call...
Request with upload id B0ygTlpCR7yU5ORzv8i94g returns 502, will retry in the next download call...
Request with upload id 6R5k9ky3Rqar2LK-lCwz5g returns 502, will retry in the next download call...
Request with upload id ARF2JaIWQ2ySTZpIqEj6hA returns 502, will retry in the next download call...
Request with upload id 1vKLCgZ-Q3aZBwvKNbd_2Q returns 502, will retry in the next download call...
Request with upload id 1CSoHDpnQIa_yeRY8MQP7w returns 502, will retry in the next download call...
Request with upload id 8zYl6tWPRZSjn6CcgDMf6w returns 502, will retry in the next download call...
Request with upload id Anrp3fR4QfiTJ-N41jgP9Q returns 502, will retry in the next download call...
Request with upload id 8iG1bncCRn-UcyqLfNq56Q returns 502, will retry in the next download call...
Request with upload id 1BxTcDYxRlC7c7n08c24dA returns 502, will retry in the next download call...
Request with upload id 8jrkMecDSKqDyN_aQsV-oQ returns 502, will retry in the next download call...
Request with upload id 9WRGuN5ESuugjSNCnSdiSw returns 502, will retry in the next download call...
Request with upload id 4KqYbTjfSDablV4i5QR14w returns 502, will retry in the next download call...
Request with upload id 5kv9AynVT6yNIW-7xfNg8w returns 502, will retry in the next download call...
Request with upload id CHJ-fV71RpWHd-mtWOr5yw returns 502, will retry in the next download call...
Request with upload id CV_pC515Roq5KSwfHfz67A returns 502, will retry in the next download call...
Request with upload id CrvgmZOjS32LcfI-soISgw returns 502, will retry in the next download call...
Request with upload id D4IvyaGHTJqXSpQ0h6d8tg returns 502, will retry in the next download call...
---------------------------------------------------------------------------
RemoteProtocolError Traceback (most recent call last)
File ~/Work/ai-toolkit/tutorial-query-nomad-archive/.venv/lib/python3.9/site-packages/httpx/_transports/default.py:66, in map_httpcore_exceptions()
65 try:
---> 66 yield
67 except Exception as exc: # noqa: PIE-786
File ~/Work/ai-toolkit/tutorial-query-nomad-archive/.venv/lib/python3.9/site-packages/httpx/_transports/default.py:366, in AsyncHTTPTransport.handle_async_request(self, request)
365 with map_httpcore_exceptions():
--> 366 resp = await self._pool.handle_async_request(req)
368 assert isinstance(resp.stream, typing.AsyncIterable)
File ~/Work/ai-toolkit/tutorial-query-nomad-archive/.venv/lib/python3.9/site-packages/httpcore/_async/connection_pool.py:262, in AsyncConnectionPool.handle_async_request(self, request)
261 await self.response_closed(status)
--> 262 raise exc
263 else:
File ~/Work/ai-toolkit/tutorial-query-nomad-archive/.venv/lib/python3.9/site-packages/httpcore/_async/connection_pool.py:245, in AsyncConnectionPool.handle_async_request(self, request)
244 try:
--> 245 response = await connection.handle_async_request(request)
246 except ConnectionNotAvailable:
247 # The ConnectionNotAvailable exception is a special case, that
248 # indicates we need to retry the request on a new connection.
(...)
252 # might end up as an HTTP/2 connection, but which actually ends
253 # up as HTTP/1.1.
File ~/Work/ai-toolkit/tutorial-query-nomad-archive/.venv/lib/python3.9/site-packages/httpcore/_async/connection.py:103, in AsyncHTTPConnection.handle_async_request(self, request)
101 raise ConnectionNotAvailable()
--> 103 return await self._connection.handle_async_request(request)
File ~/Work/ai-toolkit/tutorial-query-nomad-archive/.venv/lib/python3.9/site-packages/httpcore/_async/http11.py:133, in AsyncHTTP11Connection.handle_async_request(self, request)
132 await self._response_closed()
--> 133 raise exc
File ~/Work/ai-toolkit/tutorial-query-nomad-archive/.venv/lib/python3.9/site-packages/httpcore/_async/http11.py:111, in AsyncHTTP11Connection.handle_async_request(self, request)
103 async with Trace(
104 "receive_response_headers", logger, request, kwargs
105 ) as trace:
106 (
107 http_version,
108 status,
109 reason_phrase,
110 headers,
--> 111 ) = await self._receive_response_headers(**kwargs)
112 trace.return_value = (
113 http_version,
114 status,
115 reason_phrase,
116 headers,
117 )
File ~/Work/ai-toolkit/tutorial-query-nomad-archive/.venv/lib/python3.9/site-packages/httpcore/_async/http11.py:176, in AsyncHTTP11Connection._receive_response_headers(self, request)
175 while True:
--> 176 event = await self._receive_event(timeout=timeout)
177 if isinstance(event, h11.Response):
File ~/Work/ai-toolkit/tutorial-query-nomad-archive/.venv/lib/python3.9/site-packages/httpcore/_async/http11.py:226, in AsyncHTTP11Connection._receive_event(self, timeout)
225 msg = "Server disconnected without sending a response."
--> 226 raise RemoteProtocolError(msg)
228 self._h11_state.receive_data(data)
RemoteProtocolError: Server disconnected without sending a response.
The above exception was the direct cause of the following exception:
RemoteProtocolError Traceback (most recent call last)
Cell In[1], line 56
54 query = ArchiveQuery(query=query, required=required, page_size=100, results_max=max_entries)
55 number_of_entries = await query.async_fetch() # indicative number n applies: async_fetch(n)
---> 56 results = await query.async_download() # indicative number n applies: async_download(n)
File ~/Work/ai-toolkit/tutorial-query-nomad-archive/.venv/lib/python3.9/site-packages/nomad/client/archive.py:461, in ArchiveQuery.async_download(self, number)
457 await self.async_fetch(number - pending_size)
459 print('Downloading required data...')
--> 461 return await self._download_async(number)
File ~/Work/ai-toolkit/tutorial-query-nomad-archive/.venv/lib/python3.9/site-packages/nomad/client/archive.py:343, in ArchiveQuery._download_async(self, number)
339 async with httpx.AsyncClient(timeout=Timeout(timeout=300)) as session:
340 tasks = [asyncio.create_task(
341 self._acquire(
342 upload, session, semaphore)) for upload in self._uploads[:num_upload]]
--> 343 results = await asyncio.gather(*tasks)
345 # flatten 2D list
346 return [result for sub_results in results for result in sub_results]
File ~/.pyenv/versions/3.9.16/lib/python3.9/asyncio/tasks.py:256, in Task.__step(***failed resolving arguments***)
252 try:
253 if exc is None:
254 # We use the `send` method directly, because coroutines
255 # don't have `__iter__` and `__next__` methods.
--> 256 result = coro.send(None)
257 else:
258 result = coro.throw(exc)
...
80 raise
82 message = str(exc)
---> 83 raise mapped_exc(message) from exc
RemoteProtocolError: Server disconnected without sending a response.
Request with upload id DOuenAdHSayyWlp4a8tOMQ returns 502, will retry in the next download call...
Request with upload id E-YjvtW_T-u1KsvXY-0Yvg returns 502, will retry in the next download call...
Request with upload id EaGeF-WGR9264hBA20WHIg returns 502, will retry in the next download call...
Request with upload id Eq3Kk4idS56aBFSwwlb50Q returns 502, will retry in the next download call...
Request with upload id FguF5BPPQZGcaptuyvBv4A returns 502, will retry in the next download call...
Request with upload id CTYV3haLQAep01cNP_EBgA returns 502, will retry in the next download call...
Request with upload id FNcMBezKTFOCFS7riRSitQ returns 502, will retry in the next download call...
Request with upload id D9lafl_eRzii97IjM1quAQ returns 502, will retry in the next download call...
Request with upload id G9U-fJl7TDie_KLj5gG36Q returns 502, will retry in the next download call...
Request with upload id F7ab6NaBS5uN_suh82LKOQ returns 502, will retry in the next download call...
Request with upload id GFwcEYgBR2SkvxgWt4-3mw returns 502, will retry in the next download call...
Request with upload id GFznz5XRSc6pYgPUrBhamg returns 502, will retry in the next download call...
Request with upload id Czf5OvsYR6idvpve2Wa03Q returns 502, will retry in the next download call...
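A possible workaround while the server keeps dropping connections is to wrap the download call in a retry loop with backoff. The sketch below is only an illustration: `retry_async` and its parameters (`make_call`, `retry_on`, `attempts`, `base_delay`) are hypothetical names and not part of the NOMAD client. It also assumes that a repeated `async_download()` call picks up the uploads that failed earlier, which is suggested by the "will retry in the next download call" log message but not verified here.

```python
import asyncio


async def retry_async(make_call, retry_on=(Exception,), attempts=5, base_delay=1.0):
    """Await make_call() until it succeeds or the attempts run out.

    make_call: zero-argument callable returning a fresh awaitable each time.
    retry_on:  tuple of exception types that should trigger another attempt.
    """
    for attempt in range(1, attempts + 1):
        try:
            return await make_call()
        except retry_on as exc:
            if attempt == attempts:
                raise  # give up and surface the last error
            delay = base_delay * 2 ** (attempt - 1)  # exponential backoff
            print(f'attempt {attempt} failed ({exc!r}), retrying in {delay:.1f}s')
            await asyncio.sleep(delay)


# Hypothetical usage in the notebook cell above (assumption, not tested
# against the NOMAD server):
#
#   import httpx
#   results = await retry_async(
#       lambda: query.async_download(),
#       retry_on=(httpx.RemoteProtocolError,),
#   )
```

Restricting `retry_on` to the exception seen in the traceback keeps unrelated errors visible instead of silently retrying them. Passing a smaller number to `async_download(n)` per call may also reduce the load on the server, though that is a guess.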