Improved app and helm chart with respect to timeouts and rate limiting.
This partially addresses #1914 and includes the changes of !1701 (closed).
Changes:
- Timeouts are now consistently applied to ingress and proxy rules based on shared values
- Separate ingress for api and others (gui, docs) for tighter rate limits at the api
- A limit on concurrent connections in addition to the connections-per-second limit
- ArchiveQuery defaults fit the timeout and rate limiting settings
- Increased the HPC cloud loadbalancer timeouts to be slightly longer than the nomad timeouts (not this MR)
- Removed the joblib-based threading for multi entry archive apis. This was a no-op due to the GIL.
- Added an await call into the multi entry archive loop, allowing requests (e.g. probes) during a running multi entry archive call.
- Multi entry archive apis stop computing the requested archive list after a client disconnect.
- Refactored the main app, because HTTP middlewares prevent the app from recognising client disconnects (https://github.com/encode/starlette/discussions/2094). The api no longer uses any HTTP middleware.
- More consistent use of parameter-free events in api logging
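
The await call and the disconnect check in the multi entry archive loop could look roughly like the sketch below. This is not the code from this MR: `collect_archives`, `load_archive` and `is_disconnected` are hypothetical names, and only the standard library is used so the example runs on its own. In a Starlette app, `request.is_disconnected()` would play the role of `is_disconnected`.

```python
import asyncio
from typing import Awaitable, Callable, List


async def collect_archives(
    entry_ids: List[str],
    load_archive: Callable[[str], dict],            # hypothetical blocking loader
    is_disconnected: Callable[[], Awaitable[bool]],  # hypothetical disconnect check
) -> List[dict]:
    """Load one archive per entry, cooperatively and abortably."""
    results = []
    for entry_id in entry_ids:
        # Stop computing further archives once the client is gone.
        if await is_disconnected():
            break
        results.append(load_archive(entry_id))
        # Hand control back to the event loop so that other requests
        # (e.g. probes) can be served between two entries.
        await asyncio.sleep(0)
    return results
```

Without the `await asyncio.sleep(0)`, a loop that only calls blocking code never yields, so the event loop cannot answer anything else until the whole list is done.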
Solutions:
- The app now stops processing when a request is canceled (e.g. via timeout).
- Timeouts are a consistent 60s and the rate limit is set to 10 concurrent api requests and 32 requests per second.
- The long-running multi entry archive api calls allow concurrent requests. This already worked for all downloads, because they use StreamingResponses.
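
To make the difference between the two limits concrete, here is a toy model in Python. In this MR the limits are enforced at the ingress, not in application code, so the class below is only an illustration of what "10 concurrent" and "32 per second" mean. `Limiter` is a hypothetical name, and the sliding one-second window is a simplifying assumption, not necessarily the algorithm the ingress uses.

```python
from collections import deque

MAX_CONCURRENT = 10  # from the description: 10 concurrent api requests
MAX_PER_SECOND = 32  # from the description: 32 requests per second


class Limiter:
    """Toy model of the two limits; not how the ingress implements them."""

    def __init__(self) -> None:
        self.in_flight = 0     # requests admitted and not yet finished
        self.starts = deque()  # admission times within the last second

    def try_start(self, now: float) -> bool:
        # Forget admissions that are one second old or older.
        while self.starts and now - self.starts[0] >= 1.0:
            self.starts.popleft()
        if self.in_flight >= MAX_CONCURRENT:
            return False  # too many requests are still running
        if len(self.starts) >= MAX_PER_SECOND:
            return False  # too many requests started within the last second
        self.in_flight += 1
        self.starts.append(now)
        return True

    def finish(self) -> None:
        self.in_flight -= 1
```

A slow client with ten long-running calls hits the concurrency limit long before the per-second limit, while a burst of many short calls hits the per-second limit first.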
Edited by Markus Scheidgen