nomad-lab / nomad-FAIR, issue #1097
Issue created Sep 29, 2022 by Markus Scheidgen (@mscheidg, Owner)

Arbitrary quantities in ES

There will always be demand for putting as much stuff into ES as possible. But we cannot index everything the same way. There are three possibilities in ES:

  • fixed mapping: what we currently generate from our elasticsearch metainfo annotations
  • dynamic mapping: we currently have this disabled. It carries the risk of index failures if a key changes type between documents, and it also creates a mapping mess: a mapping, once created, can't be removed, even if no values for its key exist anymore.
  • use nested objects like this {'key': 'myschema.chamber_pressure', 'number_value': 0.23} or {'key': 'myschema.inventory_tag', 'string_value': '192-19283-382.1'}
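
A minimal sketch of what the nested-objects option could look like, assuming a hypothetical nested field called `search_quantities` (the name and the choice of `keyword`/`double` sub-field types are illustrative assumptions, not something decided in this issue). It only builds the mapping and query bodies as plain dicts and does not talk to ES:

```python
from typing import Any, Dict, Optional

# Hypothetical name for the nested field; not an existing NOMAD field.
NESTED_FIELD = "search_quantities"

# Fixed mapping for the generic key-value documents. Only these sub-fields
# are mapped, no matter how many different keys end up being indexed.
mapping: Dict[str, Any] = {
    "properties": {
        NESTED_FIELD: {
            "type": "nested",
            "properties": {
                "key": {"type": "keyword"},
                "number_value": {"type": "double"},
                "string_value": {"type": "keyword"},
            },
        }
    }
}


def number_filter(
    key: str, gte: Optional[float] = None, lte: Optional[float] = None
) -> Dict[str, Any]:
    """Build a nested query matching one key with a numeric range."""
    bounds = {name: v for name, v in (("gte", gte), ("lte", lte)) if v is not None}
    return {
        "nested": {
            "path": NESTED_FIELD,
            "query": {
                "bool": {
                    "must": [
                        # key and value must match within the same nested object
                        {"term": {f"{NESTED_FIELD}.key": key}},
                        {"range": {f"{NESTED_FIELD}.number_value": bounds}},
                    ]
                }
            },
        }
    }


query = number_filter("myschema.chamber_pressure", gte=0.1, lte=0.5)
```

Because the mapping only ever contains `key`, `number_value`, and `string_value`, a key that changes type between documents simply lands in a different value field instead of causing an index failure.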

We have not explored the last option yet. To do this, we could:

  • devise a generic metainfo section for these nested documents: find a good, unambiguous format for key; think about other fields, like unit, schema_version, etc.
  • create a special flavor of the elasticsearch metainfo annotation to denote indexed quantities. This could be applied across the whole archive and even in custom schemas.
  • we put a copy of the collected list of all these key-value pairs into section metadata or results and then use our normal fixed mapping es functionality
  • create a search filter menu that lets you create filters for arbitrary key-value pairs.
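
The collection step could be sketched roughly as follows. This works on a plain nested dict rather than on the real archive or metainfo API, and the function name `collect_pairs`, the dotted key format, and the decision to skip booleans and lists are all assumptions made for illustration:

```python
from typing import Any, Dict, List


def collect_pairs(data: Dict[str, Any], prefix: str = "") -> List[Dict[str, Any]]:
    """Flatten a nested dict into a list of generic key-value documents."""
    pairs: List[Dict[str, Any]] = []
    for name, value in data.items():
        key = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            # descend into sub-sections, extending the dotted key
            pairs.extend(collect_pairs(value, key))
        elif isinstance(value, bool):
            # bool is a subclass of int in Python; skipped in this sketch
            continue
        elif isinstance(value, (int, float)):
            pairs.append({"key": key, "number_value": float(value)})
        elif isinstance(value, str):
            pairs.append({"key": key, "string_value": value})
        # other types (lists, None, ...) are ignored in this sketch
    return pairs


example = {
    "myschema": {
        "chamber_pressure": 0.23,
        "inventory_tag": "192-19283-382.1",
    }
}
pairs = collect_pairs(example)
```

The resulting list has the same shape as the nested objects described above and could be stored as a copy in section metadata or results, where the normal fixed-mapping functionality would pick it up.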

This is very similar to what we are doing currently with references: #999

See this blog post: https://smnh.me/indexing-and-searching-arbitrary-json-data-using-elasticsearch

If this works, the question remains what to put in a "fixed mapping" and what to put into "nested objects".
