820 - Oasislink from local to central as TCP server

Merged Daniel Lehmberg requested to merge 820_separate_logstash_tcpserver into develop

@mscheidg

  • logtransfer and federation FastAPI backend mostly finished and tested (see open points below)
  • rudimentary class for additional statistics (queries to mongodb) that can be transferred
    • currently this is located in logtransfer.py (it shouldn't be within the main Nomad API as this can run with multiple workers, which turned out to be problematic in the past)
    • we agreed that we use reasonable cases from the KPI as well as statistics required for the homepage
  • Check documentation for data sharing in oasis.md
  • The test case that sometimes failed in the remote CI pipeline is (hopefully) fixed

Relates to issues #820 and #886. This merge request follows up the closed merge requests !697 (closed), !727 (closed) and !784 (closed).

Here the logtransfer is implemented as a service separate from the Nomad app: a TCP server (Python native classes). In particular, the service runs on a local Nomad Oasis and receives logs on the same address and port to which logstash logs are sent on the central Oasis.

For a Nomad Oasis, the server running in logtransfer can therefore be interpreted as a logstash proxy. Unless the Oasis runs a logstash instance itself, all logs are submitted to the central Oasis, where they are eventually stored in logstash.
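To illustrate the proxy idea, here is a minimal sketch of a TCP server built on Python's standard `socketserver` module that accepts newline-delimited log records and buffers them for later transfer. It is not the code of this merge request: the names `LogLineHandler`, `LogProxyServer`, `start_proxy` and `LOG_BUFFER` are hypothetical, and the threaded server and the idle timeout are assumptions made for the example.

```python
import queue
import socketserver
import threading

# Hypothetical in-memory buffer; a real service may persist or rotate logs instead.
LOG_BUFFER: "queue.Queue[bytes]" = queue.Queue()


class LogLineHandler(socketserver.StreamRequestHandler):
    # Assumed idle timeout (seconds): a client that never closes its
    # connection is dropped instead of occupying a handler forever.
    timeout = 5.0

    def handle(self):
        try:
            # Each log record is expected to arrive as one newline-terminated line.
            for line in self.rfile:
                line = line.strip()
                if line:
                    LOG_BUFFER.put(line)
        except OSError:
            # Covers socket timeouts and connection resets.
            pass


class LogProxyServer(socketserver.ThreadingTCPServer):
    allow_reuse_address = True
    daemon_threads = True  # handler threads do not block interpreter exit


def start_proxy(host="127.0.0.1", port=0):
    """Start the proxy in a background thread; port=0 picks a free port."""
    server = LogProxyServer((host, port), LogLineHandler)
    threading.Thread(target=server.serve_forever, daemon=True).start()
    return server
```

A separate worker could then drain `LOG_BUFFER` periodically and submit the collected records to the central endpoint.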

Open questions:

  • To identify a local Oasis we read the IP address on the FastAPI receiving end. Do we still need to adapt the field deployment_id per local instance (we would make sure that this is unique -- maybe taking a hash from some config parameters)?
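One possible way to derive such an identifier, sketched under the assumption that a few stable config values are available as a dictionary. The function name `derive_deployment_id` and the example keys are hypothetical and not part of the Nomad config.

```python
import hashlib
import json


def derive_deployment_id(config: dict, length: int = 16) -> str:
    """Return a short, stable identifier derived from config values.

    The config is serialized canonically (sorted keys, fixed separators)
    so that the same values always produce the same hash.
    """
    canonical = json.dumps(config, sort_keys=True, separators=(",", ":"))
    digest = hashlib.sha256(canonical.encode("utf-8")).hexdigest()
    return digest[:length]


# Hypothetical parameters; which fields are suitable is an open question.
example_id = derive_deployment_id({"api_host": "oasis.example.org", "api_port": 443})
```

The identifier is only as unique as the chosen parameters: two instances with identical values would collide, so at least one field has to differ per installation.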

To clarify:

  • A current problem for testing: when the app is started with nomad app run app, the logs are configured differently than with the log config in docker. Should this be fixed? (If Nomad is started with nomad app run app, no API calls are processed in logstash format.)
  • Use a threading TCP server? If a client does not close the connection to the logstash proxy server, it can block the whole server. Note that socketserver.ThreadingTCPServer is available on all platforms; only the forking variant (ForkingTCPServer) is restricted to POSIX systems.
  • Validate IP address that was received in federation/logs?
  • Protect against malicious (gzipped) content?
    • against too large data (load into main memory)
    • against gzip bombs (uncompress too much data into main memory)
    • too many newlines in the data (long running loop)
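The three content checks above could be combined roughly as follows. This is a sketch under assumptions, not the implemented behaviour: the limits, the exception `PayloadRejected` and the function names are hypothetical. It relies on the `max_length` argument of `zlib`'s decompression object, so that a gzip bomb is never fully expanded in memory.

```python
import gzip
import zlib

# Hypothetical limits; sensible values depend on the expected log volume.
MAX_COMPRESSED = 1_000_000
MAX_UNCOMPRESSED = 10_000_000
MAX_LINES = 50_000


class PayloadRejected(ValueError):
    """Raised when a payload violates one of the limits."""


def pack_lines(lines):
    """Sender side: gzip newline-separated log records."""
    return gzip.compress(b"\n".join(lines) + b"\n")


def safe_gunzip_lines(payload, max_compressed=MAX_COMPRESSED,
                      max_uncompressed=MAX_UNCOMPRESSED, max_lines=MAX_LINES):
    """Decompress a single-member gzip payload with bounded memory use."""
    if len(payload) > max_compressed:
        raise PayloadRejected("compressed payload too large")
    # wbits = 16 + MAX_WBITS selects gzip framing.
    decomp = zlib.decompressobj(16 + zlib.MAX_WBITS)
    try:
        # Produce at most one byte more than allowed, then stop.
        data = decomp.decompress(payload, max_uncompressed + 1)
    except zlib.error as exc:
        raise PayloadRejected("not valid gzip data") from exc
    if len(data) > max_uncompressed:
        raise PayloadRejected("uncompressed payload too large")
    if not decomp.eof:
        raise PayloadRejected("truncated gzip stream")
    if data.count(b"\n") > max_lines:
        raise PayloadRejected("too many lines")
    return data.splitlines()
```

The compressed-size check should ideally also be enforced before the request body is read in full, for example through a request size limit in the web server in front of the API.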
Edited by Daniel Lehmberg
