
Improved app and helm chart with respect to timeouts and rate limiting.

Markus Scheidgen requested to merge archive-query-server-fixes into develop

This partially addresses #1914 and includes the changes from !1701 (closed).

Changes:

  • Timeouts are now consistently applied to ingress and proxy rules based on shared values
  • Separate ingress for api and others (gui, docs) for tighter rate limits at the api
  • Added a concurrent connections limit in addition to the connections per second limit
  • ArchiveQuery defaults fit the timeout and rate limiting settings
  • Increased the HPC cloud loadbalancer timeouts to be slightly longer than the nomad timeouts (not this MR)
  • Removed the joblib-based threading for multi entry archive apis. It was a no-op because of the GIL.
  • Added an await call to the multi entry archive loop, so that other requests (e.g. probes) can be served while a multi entry archive call is running.
  • Multi entry archive apis stop computing the requested archive list after a client disconnect.
  • Refactored the main app, because HTTP middlewares prevent the app from recognising client disconnects (https://github.com/encode/starlette/discussions/2094). The api no longer uses any HTTP middleware.
  • More consistent use of parameter-free events in api logging
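
The await call and the disconnect check in the multi entry archive loop could look roughly like the following. This is a minimal sketch and not the code from this MR: `collect_archives`, `load_archive`, and `is_disconnected` are hypothetical names chosen for illustration, and the sketch assumes the disconnect state is available as an awaitable callable.

```python
import asyncio
from typing import Any, Awaitable, Callable, Iterable, List


async def collect_archives(
    entry_ids: Iterable[str],
    load_archive: Callable[[str], Any],             # hypothetical loader
    is_disconnected: Callable[[], Awaitable[bool]],  # hypothetical check
) -> List[Any]:
    """Load archives one by one, yielding to the event loop between entries."""
    results: List[Any] = []
    for entry_id in entry_ids:
        # Stop early once the client is gone; nobody will read the rest.
        if await is_disconnected():
            break
        results.append(load_archive(entry_id))
        # Hand control back to the event loop so that other requests
        # (e.g. health probes) can be served during a long running call.
        await asyncio.sleep(0)
    return results
```

In a Starlette or FastAPI handler, `request.is_disconnected` could be passed as the `is_disconnected` argument. As the linked discussion describes, this check may not report disconnects reliably while HTTP middlewares wrap the app.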

Solutions:

  • The app now stops processing when a request is canceled (e.g. via timeout).
  • Timeouts are a consistent 60s and the rate limit is set to 10 concurrent api requests and 32 requests per second.
  • The long running multi entry archive api calls now allow concurrent requests. This already worked for all downloads, because they use StreamingResponses.
Edited by Markus Scheidgen
