nomad-lab / nomad-FAIR, issue #704
Created Dec 20, 2021 by Markus Scheidgen (@mscheidg, Owner). 3 of 6 tasks completed.

Improved "ArchiveQuery"

The existing ArchiveQuery has some obvious flaws.

  • #682 describes a failure caused by a 502 response. This might be unavoidable when the API is under high load. ArchiveQuery should handle it instead of raising an error.
  • #679 (closed) describes a JSON decode error. This should not happen, but evidently can. ArchiveQuery should handle it instead of raising an error. Proper logging should also help to identify the cause (e.g. the specific calculation).
  • #680 (closed) reports that some required data is missing. This should be fixed in v1, which adds all required data to the search.
  • The last point is implemented poorly: references are treated like sub-sections and are not followed.
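A minimal sketch of how the first two failure modes (502 responses and JSON decode errors) could be handled with retries and per-entry logging instead of aborting. All names here are hypothetical and not part of the existing ArchiveQuery: `send` stands for any coroutine that returns a status code and response text, `BadStatus` is an illustrative exception, and the retry policy (three attempts, exponential back-off) is an assumption.

```python
import asyncio
import json
import logging

logger = logging.getLogger("archive_query_sketch")

# Assumption: these gateway-type statuses are worth retrying.
RETRYABLE_STATUS = {502, 503, 504}


class BadStatus(Exception):
    """Hypothetical error carrying the HTTP status of a failed request."""

    def __init__(self, status):
        super().__init__(f"unexpected HTTP status {status}")
        self.status = status


async def fetch_json(send, entry_id, retries=3, base_delay=0.5):
    """Call `send(entry_id)` until it yields decodable JSON or retries run out.

    `send` is any coroutine function returning `(status, body_text)`.
    Returns the decoded payload, or None if the entry had to be skipped.
    """
    for attempt in range(1, retries + 1):
        try:
            status, text = await send(entry_id)
            if status != 200:
                raise BadStatus(status)
            return json.loads(text)
        except BadStatus as error:
            if error.status not in RETRYABLE_STATUS:
                # Not a transient failure: log it and skip this entry.
                logger.error("entry %s: status %s, skipping", entry_id, error.status)
                return None
            logger.warning("entry %s: status %s (attempt %d)", entry_id, error.status, attempt)
        except json.JSONDecodeError as error:
            # Log which entry produced the broken payload, then retry.
            logger.warning("entry %s: invalid JSON: %s (attempt %d)", entry_id, error, attempt)
        if attempt < retries:
            # Exponential back-off between attempts.
            await asyncio.sleep(base_delay * 2 ** (attempt - 1))
    logger.error("entry %s: giving up after %d attempts", entry_id, retries)
    return None
```

Returning `None` for a skipped entry keeps a long-running query alive; the log then records which entry caused the problem.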

Long-running queries will probably always run into problems. The ArchiveQuery should be reimplemented on the explicit premise that API requests can fail. As a consequence:

  • results should be cached explicitly, locally, and somewhat permanently
  • errors should actually be handled instead of aborting the query
  • the implementation should be more modern, e.g. based on asyncio
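The first consequence, an explicit, local, and somewhat permanent cache, could look roughly like the sketch below. It is one possible design under stated assumptions, not the project's implementation: the class name `LocalArchiveCache`, the one-JSON-file-per-entry layout, and hashing the entry id into a file name are all hypothetical choices.

```python
import hashlib
import json
from pathlib import Path


class LocalArchiveCache:
    """Hypothetical on-disk cache: one JSON file per entry id."""

    def __init__(self, directory):
        self.directory = Path(directory)
        self.directory.mkdir(parents=True, exist_ok=True)

    def _path(self, entry_id):
        # Hash the id so that any string yields a safe file name.
        name = hashlib.sha256(entry_id.encode("utf-8")).hexdigest()
        return self.directory / f"{name}.json"

    def get(self, entry_id):
        """Return the cached archive, or None if absent or unreadable."""
        path = self._path(entry_id)
        if not path.exists():
            return None
        try:
            return json.loads(path.read_text(encoding="utf-8"))
        except json.JSONDecodeError:
            # A corrupt file is treated as a cache miss and removed.
            path.unlink()
            return None

    def put(self, entry_id, archive):
        """Store the archive; write to a temp file first, then rename."""
        path = self._path(entry_id)
        tmp = path.with_suffix(".tmp")
        tmp.write_text(json.dumps(archive), encoding="utf-8")
        tmp.replace(path)
```

Because results live in files rather than in memory, a query that is interrupted can be restarted and only needs to download the entries that are not yet on disk.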

Steps to take:

  • get familiar with asyncio
  • rework the ArchiveQuery implementation based on httpx + asyncio
  • evaluate how much parallelism (asyncio again) we can use in the archive query API
  • rework the API accordingly
  • discuss the documentation examples with Luca/Luigi and Martin/Simon to make them more meaningful (this should finally also address the bugs above)
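For the parallelism question in the steps above, one common asyncio pattern is to cap the number of in-flight requests with a semaphore and to collect failures per entry rather than letting one exception cancel the whole query. The sketch below assumes a hypothetical coroutine `fetch_one(entry_id)`; the function name `fetch_all` and the default limit of 5 are illustrative only, and a suitable limit would have to be measured against the real API.

```python
import asyncio


async def fetch_all(fetch_one, entry_ids, max_parallel=5):
    """Run `fetch_one` for every id with at most `max_parallel` in flight.

    Returns `(results, failures)`: two dicts keyed by entry id.
    """
    entry_ids = list(entry_ids)
    semaphore = asyncio.Semaphore(max_parallel)

    async def guarded(entry_id):
        # The semaphore limits how many requests run concurrently.
        async with semaphore:
            return await fetch_one(entry_id)

    # return_exceptions=True keeps one failure from cancelling the rest.
    outcomes = await asyncio.gather(
        *(guarded(entry_id) for entry_id in entry_ids),
        return_exceptions=True,
    )

    results, failures = {}, {}
    for entry_id, outcome in zip(entry_ids, outcomes):
        if isinstance(outcome, Exception):
            failures[entry_id] = outcome
        else:
            results[entry_id] = outcome
    return results, failures
```

With httpx, `fetch_one` would typically wrap a request made through a shared `httpx.AsyncClient`; the entries in `failures` can then be logged or retried in a later pass.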
Edited Feb 13, 2022 by Theodore Chang