README.md

conda create --name imaging_heterogeneity_study
conda activate imaging_heterogeneity_study
pip install scipy==1.10.1 numpy==1.23.5 squidpy==1.3.0 pandas==1.5.3 scikit-learn==1.2.2
conda create --name robust python=3.7
conda activate robust
conda install numpy matplotlib pandas networkx pip jupyter
pip install pcst_fast
python3 robust.py data/data-example1-prec-puberty/BioGRID.txt data/data-example1-prec-puberty/prec-pub-seeds.txt prec_puberty.graphml

[1] file with a list of seed genes (delimiter: newline-separated)
[2] path to output file (supported output file types: .graphml, .csv, others) [read more below]


The suffix of the path to the output file you specify, determine the format of the output.
You can either choose
- .graphml: A .graphml file is written that contains the following vertex properties: isSeed, significance, nrOfOccurrences, connected_components_id, trees
- .csv: A .csv file which contains a vertex table with #occurrences, %occurrences, terminal (isSeed)
- everything else: An edge list


[1] --network NETWORK					Description: Specify path to graph or identifier of networks shipped with ROBUST ('BioGRID', 'APID', 'STRING'), type=str or file (allowed types: .graphml, .txt, .csv, .tsv), default: 'BioGRID' [read more below]

Network input options:
	- A two-column edgelist. File types and corresponding delimiters are as follows: 1. '.txt' file should be space-separated 2. '.tsv' file should be tab-separated 3. '.csv' file should be comma-separated. No other file  formats except '.txt', '.csv' and '.tsv' are accepted at the moment.
	- A valid .graphml file
	- In-built network name {'BioGRID', 'APID', 'STRING'}


[2] --alpha ALPHA					Description: initial fraction for ROBUST, type=float, expected range=[0,1], default: 0.25

[3] --beta BETA						Description: reduction factor for ROBUST, type=float, expected range=[0,1], default: 0.90

[4] --n N						Description: # of steiner trees for ROBUST, type=int, expected range=(0,+inf], default: 30

[5] --tau TAU						Description: threshold value for ROBUST, type=float, expected range=(0,+inf], default: 0.1

[6] --namespace {'ENTREZ', 'GENE_SYMBOL', 'UNIPROT', 'ENSEMBL'}	Description: gene/ protein identifier options for study bias data, type=str, default: 'GENE_SYMBOL'

[7] --study-bias-scores					Description: specify edge weight function used by ROBUST, type=str, default: 'BAIT_USAGE' [read more below]

Study bias score input options:
	- A two-column file (delimiter: comma), where the first column is the gene or protein name (column datatype: string) and the second column is the study bias score (column datatype: int).
	- In-built study-bias-score options {'NONE' or 'None', 'BAIT_USAGE', 'STUDY_ATTENTION'} ('NONE' or 'None' leads to running ROBUST with uniform edge costs.)


--gamma							Description: Hyper-parameter gamma used by bias-aware edge weights. This hyperparameter regulates to what extent the study bias data is being leveraged when running ROBUST., type=float, expected range=[0,1], default: 1.00
python3 ./data/networks/update_inbuilt_ppi_networks.py
python3 ./data/study_bias_scores/update_inbuilt_study_bias_scores.py
python3 czi_to_tiff_disassembly.py ../data/Stacks_single_sample_test ../data/Individual_single_sample_test --process_mode sum --vox_dim_1 0.4 --deconvolve False --deconvolution_iterations 5 --psf_kernel_size 10 --psf_sigma 0.75
python3 czi_to_tiff_disassembly.py ../data/Stacks_single_sample_test ../data/Sum_single_sample_test --process_mode sum --vox_dim_1 0.4
python3 stack_and_composite_handling.py ../data/Individual_single_sample_test ../data/Composites_single_sample_test
python3 stack_and_composite_handling.py ../data/Sum_test_copy ../data/Composites_single_sample_test
python3 stack_and_composite_handling.py ../data/Sum_test ../data/Composites_single_sample_test