
Draft: Resolve "Example upload on how to create a curated database in NOMAD"

This is a WIP draft of a potential notebook tutorial on creating a database of thermally activated delayed fluorescence (TADF) molecules published in the literature and curated from papers with ChemDataExtractor. It could serve as material for the upcoming user meeting.

The notebook is divided into the following sections:

  1. Load, inspect, and clean the data
  2. Define a NOMAD schema
  3. Populate the NOMAD schema
  4. Upload files via the NOMAD API

The first section is tedious and takes more than 30 cells. It is important to have, but it is probably less interesting for the tutorial and could be moved to a separate notebook.

The notebook is similar to the other tutorial but has more of a materials science context. Ideally, the properties in the schema would be taken from the properties pool that we are preparing, but I do not think it will be ready in time.

It could also be a nice lead-in to preparing an app based on the resulting dataset.

@himanel1 we need to evaluate whether this is interesting at all. My impression is that this kind of content has been useful for onboarding people. If so, we still need to improve it and apply some fixes. For example, some of the entries fail to generate a valid conformer, and I am unsure whether I should wrap the conformer generation in a try/except later on or make sure in the data cleaning section that every entry forms a conformer. Give it a go and let's have a chat about it.
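To make the data-cleaning option concrete, here is a minimal sketch of filtering entries up front instead of catching failures later. It is not taken from the notebook: `split_by_conformer` and the `embed` callable are hypothetical names, and `embed` stands in for whatever conformer generation the notebook actually uses. If that is RDKit, `embed` could wrap `Chem.MolFromSmiles` (returns `None` for unparsable input) and `AllChem.EmbedMolecule` (returns `-1` when embedding fails).

```python
from typing import Callable, Iterable


def split_by_conformer(
    smiles_list: Iterable[str],
    embed: Callable[[str], bool],
) -> tuple[list[str], list[tuple[str, str]]]:
    """Partition SMILES strings into those that yield a conformer and those that do not.

    `embed` is a hypothetical callable (an assumption of this sketch): it
    returns True when a conformer was generated, False when generation
    failed, and may raise for inputs it cannot parse.
    """
    valid: list[str] = []
    failed: list[tuple[str, str]] = []
    for smiles in smiles_list:
        try:
            ok = embed(smiles)
        except Exception as exc:
            # Keep the reason so the failures can be reviewed in the cleaning section.
            failed.append((smiles, f"error: {exc}"))
            continue
        if ok:
            valid.append(smiles)
        else:
            failed.append((smiles, "no conformer"))
    return valid, failed
```

Running this once in the cleaning section would mean that the schema-population cells only see entries known to form a conformer, while the `failed` list records what was dropped and why.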

Closes #1973
