A more flexible and more celery-tonic processing module
This is a pre-requisite for #251 (closed)
@himanel1 already started the implementation. I think this looks good and retains all features we had before. Here are some further todos that I see:
- add a redis to helm
- run some large scale processing
@himanel1 We should do the refactoring all the way. I think you should organize the submodules based on the processed entities `Calc` and `Upload` rather than trying to separate mongo from celery. A typical submodule structure that we also use in other modules would be:
- `processing/__init__.py` - only docs and imports to expose to other modules
- `processing/processing.py` - all celery setup stuff
- `processing/common.py` - common stuff: the mongoengine base, our custom celery tasks/request, shared constants, etc., `Pipeline`, `PipelineContext`, `Stage`, `empty_task`
- `processing/calc.py` - including the "celery task" `comp_process` (don't like the name, btw.)
- `processing/upload.py` - including `upload_cleanup`, `pipelines`, `get_pipeline`, `run_pipeline`

I think we can move the tests into a single module, or rename `test_base` -> `test_common` and `test_data` -> `test_upload`.
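To make the `common.py` split concrete, here is a minimal sketch of what the shared pieces could look like. Only the names `Pipeline`, `PipelineContext`, `Stage`, and `empty_task` come from the branch; all fields, signatures, and behavior here are my assumptions, not what's actually implemented:

```python
from dataclasses import dataclass, field
from typing import Any, Callable, Dict, List


@dataclass
class PipelineContext:
    """Shared state passed between stages (fields are illustrative)."""
    upload_id: str
    results: Dict[str, Any] = field(default_factory=dict)


@dataclass
class Stage:
    """A named processing step; `function` receives the shared context."""
    name: str
    function: Callable[[PipelineContext], Any]


@dataclass
class Pipeline:
    """Runs its stages in order against a single context."""
    context: PipelineContext
    stages: List[Stage] = field(default_factory=list)

    def run(self) -> PipelineContext:
        for stage in self.stages:
            # each stage's return value is recorded under its name
            self.context.results[stage.name] = stage.function(self.context)
        return self.context


def empty_task(context: PipelineContext) -> None:
    """No-op stage for pipelines that need a placeholder step."""
    return None
```

With this layout, `calc.py` and `upload.py` would only import these building blocks from `common`, which keeps the dependency direction clean.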
`upload` can depend on `calc`; `upload` and `calc` can depend on `common`; all can depend on `processing`; no other dependencies between submodules should be necessary.
In the future we could think about replacing `@process`, `current_process`, and `process_status` with celery. But at the moment it's very convenient to use a mongodb query to check on the processing status of all entities. I feel celery wasn't really designed with persistent tasks in mind. Also, we would need to be far more rigid with the celery infrastructure and add persistence to rabbitmq and redis.
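For context, the convenience we'd give up is roughly this pattern: the process decorator records which process ran and its status on the entity itself, so the status survives in the (mongo-persisted) document and can be queried across all entities. This is a simplified, self-contained stand-in using plain attributes instead of mongoengine fields; the real `@process` implementation surely differs:

```python
import functools


class Proc:
    """Simplified stand-in for the mongo-backed process base class."""

    def __init__(self):
        self.process_status = None
        self.current_process = None

    @staticmethod
    def process(func):
        """Track the running process and its outcome on the entity."""
        @functools.wraps(func)
        def wrapper(self, *args, **kwargs):
            self.current_process = func.__name__
            self.process_status = 'RUNNING'
            try:
                result = func(self, *args, **kwargs)
                self.process_status = 'SUCCESS'
                return result
            except Exception:
                self.process_status = 'FAILURE'
                raise
        return wrapper


class Upload(Proc):
    @Proc.process
    def cleanup(self):
        return 'done'
```

Because the status lives on the document rather than in celery's result backend, a single mongo query answers "what is currently processing?" without making rabbitmq/redis durable.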