6.1. Nextflow

6.1.1. Install Nextflow

  • cd $HOME

  • mkdir bin

  • cd bin

  • java -version (Check that Java is installed)

  • curl -s https://get.nextflow.io | bash

  • echo 'export PATH=$HOME/bin:$PATH' >> $HOME/.bashrc (May not be necessary on the VM)

  • nextflow -version (Check that it works)
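The echo command in the list above appends a new line to .bashrc every time it is run. As a minimal sketch, the guard below adds the line only if it is not already present. The name add_path_line is a hypothetical helper, not part of Nextflow.

```shell
# add_path_line is a hypothetical helper (an assumption, not a Nextflow command).
# It appends the PATH export to a startup file only when the exact line is missing.
add_path_line() {
    rc_file="$1"
    line='export PATH=$HOME/bin:$PATH'
    touch "$rc_file"                          # make sure the file exists
    if ! grep -qxF "$line" "$rc_file"; then   # -x: whole line, -F: fixed string
        printf '%s\n' "$line" >> "$rc_file"
    fi
}

# Usage: add_path_line "$HOME/.bashrc"
```

Running the helper several times leaves a single export line in the file.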

6.1.2. Example of a Nextflow workflow (see the project)

Get the reference file covid.fa:

wget http://dl.pasteur.fr/fop/BID9tf52/nCov_ampliseq_up.fasta
mv nCov_ampliseq_up.fasta covid.fa

Starting from the reference, bwa can build its index with this Nextflow script, saved as main.nf. The script uses the DSL1 syntax (from/into), which requires Nextflow 22.10 or earlier:

referenceFile = Channel.fromPath("covid.fa")

process indexRef {
    input:
    file ref from referenceFile

    output:
    file "covid.fa.bwt" into index

    script:
    """
    bwa index ${ref}
    """
}

Write this nextflow.config file:

singularity {
    enabled = true
    autoMounts = true
    runOptions = '--home $HOME:/home/$USER --bind /pasteur'
}

process {
    executor='local'
    withName: 'indexRef' {
        container="evolbioinfo/bwa:v0.7.17"
    }
}

Then execute with:

nextflow run main.nf
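The indexRef process declares only covid.fa.bwt as output, but bwa index writes five files next to the reference, with the suffixes .amb, .ann, .bwt, .pac and .sa. As a minimal sketch, the helper below lists which of these files are missing for a given reference, for instance inside the task's directory under work/. The name missing_bwa_index is a hypothetical helper, not part of bwa or Nextflow.

```shell
# missing_bwa_index is a hypothetical helper (an assumption, not a bwa command).
# It prints the name of every expected bwa index file that does not exist.
missing_bwa_index() {
    ref="$1"
    for ext in amb ann bwt pac sa; do
        [ -e "$ref.$ext" ] || echo "$ref.$ext"
    done
}

# Usage: missing_bwa_index covid.fa
```

An empty output means the index is complete for that reference.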

6.1.3. Example of a Nextflow workflow with several input files

Here is another example, which starts from several FASTA files and runs one task per file:

fastaFiles = Channel.fromPath("/home/user/fasta/*.fasta")

process reformatFasta {
  input:
  file sequences from fastaFiles

  output:
  file "*.phylip" into phylipFiles

  script:
  """
  goalign reformat phylip -i ${sequences} -o ${sequences.baseName}.phylip
  """
}
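In the script above, ${sequences.baseName} is the input file name without its directory and without its last extension, so each task writes one .phylip file named after its input. As a rough shell equivalent, the sketch below previews the output name for a given input path. The name phylip_name is a hypothetical helper used only for illustration.

```shell
# phylip_name is a hypothetical helper (an assumption, for illustration only).
# It mimics ${sequences.baseName}.phylip from the Nextflow script.
phylip_name() {
    name=$(basename "$1")       # drop the directory part
    echo "${name%.*}.phylip"    # drop the last extension, then add .phylip
}

# Usage: phylip_name /home/user/fasta/sample1.fasta
```

Only the last extension is removed, so a file called seq.v2.fasta gives seq.v2.phylip.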

nextflow.config:

singularity {
    enabled = true
    autoMounts = true
    runOptions = '--home $HOME:/home/$USER --bind /pasteur'
}

report {
    enabled = true
    file = 'reports/report.html'
}

trace {
    enabled = true
    file = 'reports/trace.txt'
}

timeline {
    enabled = true
    file = 'reports/timeline.html'
}

dag {
    enabled = true
    file = 'reports/dag.dot'
}

process {
    executor='local'
    scratch=false
    maxRetries=30
    errorStrategy='retry'

    withName: reformatFasta {
        cpus=1
        memory="1G"
        time="30s"
        container="evolbioinfo/goalign:v0.3.2"
    }
}
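With errorStrategy='retry' and maxRetries=30, a failed task is submitted again until it succeeds or the retry limit is reached. As a loose shell analogy only (this is not how Nextflow is implemented), the sketch below reruns a command up to a given number of retries after its first failure. The name retry is a hypothetical helper.

```shell
# retry is a hypothetical helper (an assumption, not a Nextflow feature).
# It runs a command once, then reruns it up to max_retries times on failure.
retry() {
    max_retries="$1"; shift
    attempt=0
    until "$@"; do
        attempt=$((attempt + 1))
        if [ "$attempt" -gt "$max_retries" ]; then
            return 1    # still failing after the allowed retries
        fi
    done
}

# Usage: retry 3 some_command arg1 arg2
```

In this sketch, a retry limit of 2 means the command runs at most three times in total.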

6.1.4. Run the workflow

nextflow run main.nf

Nextflow options:

  • -resume : Restarts an analysis without re-executing tasks that completed successfully and are not affected by changes to the workflow

  • -w <dir> : Changes the work directory (default: work/)

  • -C <file> : Uses only the given configuration file instead of the default nextflow.config. This is a top-level option and goes before run: nextflow -C <file> run main.nf

6.1.6. Exercise 1

Create a Nextflow workflow with the following steps:

  1. Download the phylogenetic tree: wget https://booster.pasteur.fr/static/files/primates/ref.nw.gz

  2. List the names of the tips: gotree stats tips -i <file>

Use the following Docker containers:

  1. evolbioinfo/ubuntu:v16.04

  2. evolbioinfo/gotree:v0.4.1a

6.1.7. Exercise 2

Write a Nextflow workflow corresponding to the project (SARS-CoV-2 analysis) and run it locally. This can be finished tomorrow.