Refactor the tasks functionality
The aim of this branch is to simplify the process/task related code and use more consistent terminology, by:
- Using only one status (process_status) on the Proc object instead of two as before.
- Changing to only use the "task" terminology when referring to celery tasks, to avoid confusion.
- Getting rid of our own @task decorator and the related validation logic (as this logic was not designed for a case where there are many different types of processes on the same Proc object, as is the case now).
process_status is changed to take on the following values:
- READY: The process is ready to start
- PENDING: The process has been called, but is still waiting for a celery worker to start running it.
- RUNNING: Currently running the main process function.
- WAITING_FOR_RESULT: Waiting for the result from some other process (used when the upload waits for the entries to finish processing)
- SUCCESS: The last process completed successfully.
- FAILURE: The last process completed with a fatal failure.
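As an illustration, the status values listed above could be represented as plain string constants. This is a minimal sketch, not the actual implementation: the class name ProcessStatus, the two groupings and the helper is_processing are assumptions made for this example only.

```python
class ProcessStatus:
    """Hypothetical container for the process_status values listed above."""
    READY = 'READY'
    PENDING = 'PENDING'
    RUNNING = 'RUNNING'
    WAITING_FOR_RESULT = 'WAITING_FOR_RESULT'
    SUCCESS = 'SUCCESS'
    FAILURE = 'FAILURE'

    # Assumed grouping: statuses in which a process is still underway.
    STATUSES_PROCESSING = (PENDING, RUNNING, WAITING_FOR_RESULT)
    # Assumed grouping: statuses in which no process is underway.
    STATUSES_NOT_PROCESSING = (READY, SUCCESS, FAILURE)


def is_processing(process_status: str) -> bool:
    """Return True if the given status means a process is still underway."""
    return process_status in ProcessStatus.STATUSES_PROCESSING
```

A caller would then only need to look at the single status, for example is_processing(proc.process_status), instead of combining two status fields.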
These are almost exactly the same values as used by tasks_status previously. The only difference is the new statuses WAITING_FOR_RESULT and READY, which are usually not used in logical checks. Thus, in most of the code we only need to read process_status where we previously read tasks_status to adapt to the new philosophy.
Additional information about what the process is doing is stored in a "free" text field, current_process_step, roughly replacing the old current_task field (the main differences being a better name and that we don't have any validation logic on it).
Error handling is done at the "top" level, i.e. the process level.
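The idea of handling errors at the process level can be sketched roughly as follows. This is only an illustration under stated assumptions: the Proc class here is a bare stand-in, and run_process, failing_process and the errors list are hypothetical names that do not come from the actual code.

```python
class Proc:
    """Stand-in for the real Proc object; only the fields used in this sketch."""

    def __init__(self):
        self.process_status = 'READY'
        self.current_process_step = None  # free text, no validation
        self.errors = []  # hypothetical field for collected error messages


def run_process(proc, func):
    """Run func as a process on proc, handling any error in one place."""
    proc.process_status = 'RUNNING'
    try:
        func(proc)
    except Exception as e:
        # Any exception raised inside the process is caught here, at the
        # process level, rather than inside the individual steps.
        proc.process_status = 'FAILURE'
        proc.errors.append(str(e))
    else:
        proc.process_status = 'SUCCESS'


def failing_process(proc):
    """Hypothetical process function that fails during one of its steps."""
    proc.current_process_step = 'parsing'
    raise ValueError('could not parse')


proc = Proc()
run_process(proc, failing_process)
```

After the call, process_status is FAILURE, while current_process_step still shows the step in which the failure happened.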
Mongo fields removed from the Proc object:
- tasks: List[str]
- current_task = StringField(default=None)
- tasks_status = StringField(default=CREATED)
Mongo fields added to the Proc object:
- current_process_step = StringField(default=None)
Note: because of these changes, a datafix is needed to migrate existing data!
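The field-level part of such a datafix might look roughly like the sketch below. It works on plain dictionaries rather than on the database, and the names REMOVED_FIELDS and migrate_proc_document are hypothetical. How process_status should be derived from the old status values is not specified here, so the sketch leaves that field untouched.

```python
# Fields removed by this change, as listed above.
REMOVED_FIELDS = ('tasks', 'current_task', 'tasks_status')


def migrate_proc_document(doc: dict) -> dict:
    """Return a copy of doc without the removed fields and with the new one.

    Assumption: doc is a plain dict representation of a stored Proc document.
    """
    migrated = {key: value for key, value in doc.items() if key not in REMOVED_FIELDS}
    # The new field defaults to None, matching the field definition above.
    migrated.setdefault('current_process_step', None)
    return migrated
```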