Useful Resources¶
The snakemake tutorial: https://snakemake.readthedocs.io/en/stable/tutorial/tutorial.html.
Nextflow¶
Nextflow is also a great workflow manager, with a lot of examples and users in bioinformatics.
One key difference is that Nextflow does not compute the workflow DAG before running: thus it does not know what the final target is (Snakemake’s all rule). For large workflows that means it starts running faster.
To link processes and pass data through it relies on channels instead of files. Channels can contain filenames, but also any other data which can be manipulated in different ways (for example, filtered) during workflow execution.
In one sentence Snakemake is ‘output-oriented’ while Nextflow is ‘process-oriented’.
Here is my take on some differences. The more stars the better.
What I want |
snakemake |
nextflow |
---|---|---|
Easy scripting |
[**] Python <3 |
[*] Java scripting derivative: Groovy |
Getting going fast |
[**] Rules and files |
[*] Processes and channels |
Visualise DAG |
[**] Get before the run |
[*] Get only after the run |
Good Documentation |
[**] |
[**] |
Start workflow quickly |
[*] Needs to compute DAG |
[**] No need for DAG |
Clean working directory 1 |
[*] Execution where files are |
[**] Execution in isolated directories |
Getting run statistics |
[*] html reports, benchmark rule |
[**] html reports, |
Data communication |
[*] always files |
[**] Manipulate channels with |
Rerun where I stopped |
[**] |
[**] |
Environment control 2 |
[**] |
[**] |
Execution abstraction 3 |
[**] |
[**] |
Ultimately both are great and actively developed.
- 1
nextflow does not rely on file names to link processes and runs each process in its own working directory: thus you can names files whatever you like in the workflow script. snakemake can use
shadow
directive to execute a rule in this way.- 2
run in conda environment, in docker/singularity image…
- 3
run on clusters, AWS…
Glossary¶
- conda
A tool for building virtual software environments. https://conda.readthedocs.io/en/latest/
- singularity
A tool for building containers, which are mini-operating systems with isolated software environments. https://sylabs.io/guides