Some tabular parser modifications to make it more flexible
This issue is meant as a consolidation step after the latest features added to the tabular parser.
There is a change in the logic I would like to discuss with @amgo: the possibility of reading quantities from different sheets and parsing them into the same section of the schema.
The initial implementation fills one branch of the schema from a single Excel sheet only, so whenever a quantity from another sheet is requested, it is not filled.
This became a limitation because we started producing a large number of references in our schema, and it may happen that some lab_id has to be read from different sheets across the Excel file.
Also, together with Amir, we identified some warnings that should be raised.
I will list all these points schematically below. All of them refer to this Excel file: test.xlsx
- There is a bug in the following variant of example number 4: more_nested_tabular-parser_4_column_single-new-entry_to-path.archive.yaml
- Several quantities read from different sheets do not end up in the archive: tabular-parser_5_row_single-new-entry_to-path_multi_sheet_2.archive.yaml
  I revisited this point after discussing with Amir. Indeed, I don't want to read from different sheets to fill the same section. I just found a bug: when I write the following mapping targeting the same entry, only the first item is processed and the second is ignored. This point is closely related to bullet point no. 4 below. Here is an MWE: test_movpe_cnr.schema.archive.yaml, 013_example_dataset.xlsx
mapping_options:
- mapping_mode: row
  file_mode: single_new_entry
  sections:
  - growth_run/steps
- mapping_mode: row
  file_mode: single_new_entry
  sections:
  - growth_run/grown_samples
- We must throw a warning when a schema like this is produced:
"mapping_mode": "row",
"file_mode": "multiple_new_entries",
"sections": [
  "growth_run/steps",
  "growth_run/grown_samples"
]
Only one subsection at a time may be specified in "multiple_new_entries" mode.
- A different behaviour should be planned for these two schemas: tabular-parser_5_row_single-new-entry_to-path_MULTI.archive.yaml and tabular-parser_5_row_single-new-entry_to-path_multi_sheet.archive_SINGLE.yaml
- Let's throw a warning when the EntryData inheritance is forgotten in the schema for the single_new_entry and multiple_new_entries modes.
- N + 1 subsections are created with the following annotation, instead of N:
{
  "mapping_mode": "row",
  "file_mode": "multiple_new_entries",
  "sections": [
    "#root"
  ]
}
- Naming of entries is not set properly by "more: quantity_label".
- In row mode with multiple_new_entries and root: the data_file quantity turns out to be filled only in the original entry; it should be filled in every one of them.
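The warning requested above for "multiple_new_entries" with more than one section could look roughly like the following sketch. It assumes the mapping options are available as plain dictionaries shaped like the annotations in this issue; the function name `check_mapping_options` and the warning text are hypothetical and not part of the existing parser.

```python
import warnings


def check_mapping_options(mapping_options):
    """Warn for every 'multiple_new_entries' option listing more than one section.

    Hypothetical helper: `mapping_options` is assumed to be a list of dicts
    with the keys 'mapping_mode', 'file_mode' and 'sections'.
    Returns the list of offending options.
    """
    offending = []
    for option in mapping_options:
        sections = option.get('sections', [])
        if option.get('file_mode') == 'multiple_new_entries' and len(sections) > 1:
            warnings.warn(
                'Only one subsection at a time can be specified in '
                f'"multiple_new_entries" mode, got: {sections}'
            )
            offending.append(option)
    return offending


# The annotation from the bullet point above, written as a Python dict.
options = [
    {
        'mapping_mode': 'row',
        'file_mode': 'multiple_new_entries',
        'sections': ['growth_run/steps', 'growth_run/grown_samples'],
    }
]
offending = check_mapping_options(options)
```

In the real parser the message would probably go through the entry logger rather than the `warnings` module, so that it shows up in the processing log of the upload.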
Thanks