Arbitrary quantities in ES

There will always be demand for putting as much stuff into ES as possible. But we cannot index everything the same way. There are three possibilities in ES:

  • fixed mapping: what we currently generate from our elasticsearch metainfo annotations
  • dynamic mapping: we currently have this disabled. It carries the risk of index failures if a key changes type between documents, and it also creates a mapping mess: a mapping, once created, can't be removed, even if no values for a key exist anymore.
  • use nested objects like this {'key': 'myschema.chamber_pressure', 'number_value': 0.23} or {'key': 'myschema.inventory_tag', 'string_value': '192-19283-382.1'}
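
As a rough sketch of the third option, the mapping and entry format could look like the following. The field name `searchable_quantities` and the helper `to_nested_entry` are hypothetical and not part of our code; only the `key`, `number_value`, and `string_value` names come from the examples above. The mapping types (`nested`, `keyword`, `double`) are standard ES field types.

```python
# Hypothetical fixed mapping for one nested field that holds arbitrary
# key-value pairs. "searchable_quantities" is an assumed name.
NESTED_MAPPING = {
    "properties": {
        "searchable_quantities": {
            "type": "nested",
            "properties": {
                "key": {"type": "keyword"},
                "number_value": {"type": "double"},
                "string_value": {"type": "keyword"},
            },
        }
    }
}


def to_nested_entry(key, value):
    """Turn one key and value into a nested object with a typed value field."""
    # bool is a subclass of int in Python, so reject it explicitly.
    if isinstance(value, bool):
        raise TypeError(f"unsupported value type for {key!r}: bool")
    if isinstance(value, (int, float)):
        return {"key": key, "number_value": value}
    if isinstance(value, str):
        return {"key": key, "string_value": value}
    raise TypeError(f"unsupported value type for {key!r}: {type(value).__name__}")


entries = [
    to_nested_entry("myschema.chamber_pressure", 0.23),
    to_nested_entry("myschema.inventory_tag", "192-19283-382.1"),
]
```

Because each value type gets its own field, a key that changes type between documents ends up in a different field instead of causing a mapping conflict, and the mapping itself never grows with the number of keys.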

We have not explored the last bit. To do this, we could:

  • devise a generic metainfo section for these nested documents: find a good, unambiguous format for key; think about other fields, like unit, schema_version, etc.
  • create a special flavor of the elasticsearch metainfo annotation to denote quantities that should be indexed this way. It could be used anywhere in the archive, including custom schemas.
  • put a copy of the collected list of all these key-value pairs into section metadata or results, and then index that list with our normal fixed mapping ES functionality
  • create a search filter menu that lets you create filters for arbitrary key-value pairs.
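
To make the collection and filter steps more concrete, here is a minimal sketch. It assumes the archive data is available as plain nested dicts and that the pairs are indexed under a nested field called `searchable_quantities`; both the field name and the functions `collect_pairs` and `number_range_filter` are hypothetical. The query body uses the standard ES `nested`, `bool`, `term`, and `range` queries.

```python
# Assumed name of the nested field that holds the key-value pairs.
PATH = "searchable_quantities"


def collect_pairs(data, prefix=""):
    """Flatten nested dicts into a list of typed key-value pairs with dotted keys."""
    pairs = []
    for name, value in data.items():
        key = f"{prefix}.{name}" if prefix else name
        if isinstance(value, dict):
            pairs.extend(collect_pairs(value, key))
        elif isinstance(value, bool):
            continue  # booleans are skipped in this sketch
        elif isinstance(value, (int, float)):
            pairs.append({"key": key, "number_value": value})
        elif isinstance(value, str):
            pairs.append({"key": key, "string_value": value})
    return pairs


def number_range_filter(key, gte=None, lte=None):
    """Build a nested query matching one key whose number lies in a range."""
    bounds = {}
    if gte is not None:
        bounds["gte"] = gte
    if lte is not None:
        bounds["lte"] = lte
    return {
        "nested": {
            "path": PATH,
            "query": {
                "bool": {
                    "filter": [
                        {"term": {f"{PATH}.key": key}},
                        {"range": {f"{PATH}.number_value": bounds}},
                    ]
                }
            },
        }
    }


pairs = collect_pairs(
    {"myschema": {"chamber_pressure": 0.23, "inventory_tag": "192-19283-382.1"}}
)
query = number_range_filter("myschema.chamber_pressure", gte=0.1, lte=0.5)
```

The `nested` query matters here: it ensures the key and the value condition match within the same key-value pair, rather than a key from one pair and a value from another.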

This is very similar to what we are doing currently with references: #999 (closed)

See also this blog post: https://smnh.me/indexing-and-searching-arbitrary-json-data-using-elasticsearch

If this works, the question remains what to put in a "fixed mapping" and what to put into "nested objects".
