nextflow — quality + safety report
In the Skillier index (kdense-scientific__nextflow) · scanned 2026-06-03 · engine: builtin+triage
2 heuristic flags to review
Heuristic flags from the builtin scanner, which is known to over-flag (it trips on legitimate env-reading integrations, security skills, and library .eval calls). This is NOT an authoritative malicious verdict; re-scan with SkillSpector for the authoritative result.
Quality notes
About this skill
Build, run, and debug Nextflow data pipelines and nf-core workflows end to end. Use whenever the user mentions Nextflow, nf-core, .nf files, nextflow.config, DSL2, processes/channels/operators, samplesheets, or wants to run a community pipeline (e.g. nf-core/rnaseq, nf-core/sarek), write or test a…
📄 Read the SKILL.md
---
name: nextflow
description: Build, run, and debug Nextflow data pipelines and nf-core workflows end to end. Use whenever the user mentions Nextflow, nf-core, .nf files, nextflow.config, DSL2, processes/channels/operators, samplesheets, or wants to run a community pipeline (e.g. nf-core/rnaseq, nf-core/sarek), write or test a module/subworkflow with nf-test, configure executors/containers (Docker, Singularity/Apptainer, Conda, Wave), scale a workflow to HPC/SLURM or cloud (AWS Batch, Google Batch, Azure, Kubernetes), or debug a failed/-resume run. Make sure to use this skill for any reproducible scientific/bioinformatics workflow work even if the user does not say the word "Nextflow", and for authoring nf-core-compliant pipelines, modules, configs, and linting.
license: Apache-2.0
metadata:
version: "1.0"
skill-author: K-Dense Inc.
---
# Nextflow
## Overview
Nextflow is a workflow language and runtime for building **reproducible, portable, scalable** data pipelines. It is dominant in bioinformatics but works for any data-heavy computation. nf-core is a community curating production-grade Nextflow pipelines, reusable modules, and the `nf-core` tooling on top of Nextflow.
Key ideas:
- **Dataflow programming**: pipelines are `process` tasks connected by **channels**. Nextflow infers execution order and parallelism from data dependencies — there is no explicit scheduler to write.
- **Write once, run anywhere**: the same pipeline runs locally, on HPC (SLURM, SGE, LSF, PBS), and on cloud (AWS Batch, Google Batch, Azure Batch, Kubernetes) by changing config/profiles, not code.
- **Reproducibility**: per-task containers (Docker/Singularity/Apptainer/Conda/Wave) + `-resume` caching + pinned pipeline revisions.
- **DSL2** is the modern, required syntax: modular `process`/`workflow`/`include` definitions.
This skill covers both **running** existing pipelines and **developing** your own (Nextflow language + nf-core conventions, testing with nf-test, configuration, and deployment).
## When to Use This Skill
Use this skill when the user wants to:
- Run an nf-core or custom Nextflow pipeline, or debug a failing/resuming run.
- Write or modify `.nf` scripts, `nextflow.config`, profiles, or `nextflow_schema.json`.
- Author or test nf-core-style modules/subworkflows (`main.nf`, `meta.yml`, `tests/`, nf-test).
- Configure executors, containers, or resources; scale to HPC or cloud.
- Build a reproducible scientific/bioinformatics workflow (even if "Nextflow" is not named).
- Understand processes, channels, operators, `take`/`emit`, `publishDir`, `ext.args`, meta maps.
## Setup
Nextflow needs **Bash** and **Java 17 or newer** (17–25 supported). Verify with `java -version`.
```bash
# Install Nextflow (self-contained launcher)
curl -s https://get.nextflow.io | bash   # creates ./nextflow
sudo mv nextflow /usr/local/bin/         # put on PATH
nextflow info                            # verify

# Or via conda/bioconda (also gets a managed Java)
conda create -n nf -c bioconda -c conda-forge nextflow nf-core
```
```bash
# nf-core tools (Python) for creating/linting/running nf-core assets
pip install nf-core # or: conda install -c bioconda nf-core
nf-core --version
```
Pin the engine for reproducibility: `export NXF_VER=24.10.0` (use an edge release only if needed). For air-gapped/HPC setups, see `references/running-pipelines.md` (offline mode) and `references/configuration.md`.
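As a complement to `NXF_VER`, a pipeline can declare the engine version it expects in its own `nextflow.config`. The fragment below is a minimal sketch; the version string simply mirrors the example above and is not a recommendation.

```nextflow
// nextflow.config (sketch): declare the Nextflow version this pipeline expects.
// The leading '!' makes Nextflow stop with an error instead of only warning
// when the running version does not satisfy the constraint.
manifest {
    nextflowVersion = '!>=24.10.0'
}
```

`NXF_VER` chooses which engine the launcher runs; `manifest.nextflowVersion` only checks that the running engine is acceptable, so the two can be used together.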
## Two Modes of Work
Decide which path the user is on — it changes everything:
| Goal | Start here |
|------|-----------|
| **Run** an existing pipeline (nf-core or a `.nf` you were given) | `references/running-pipelines.md` |
| **Develop** a new pipeline / module / subworkflow | `references/language.md` + `references/developing.md` |
| **Configure / scale** (HPC, cloud, containers, resources) | `references/configuration.md` + `references/containers.md` |
| **Test** modules/pipelines | `references/testing.md` |
## Quick Start
### Run an nf-core pipeline
Always smoke-test with the bundled `test` profile first; it uses tiny data and proves your environment works.
```bash
# 1. Confirm setup works (downloads pipeline + tiny test data)
nextflow run nf-core/rnaseq -profile test,docker --outdir results

# 2. Real run: pin a revision (-r), pick a container engine, pass inputs
nextflow run nf-core/rnaseq -r 3.14.0 \
    -profile docker \
    --input samplesheet.csv \
    --genome GRCh38 \
    --outdir results \
    -resume
```
- `-profile` (single dash) selects bundled config profiles; **combine** them comma-separated, e.g. `test,docker`. Container/infra profiles (`docker`, `singularity`, `conda`) are mutually exclusive — pick one.
- `--input`, `--genome`, `--outdir` (double dash) are **pipeline** parameters. nf-core pipelines take a **samplesheet CSV**, not loose files.
- `-resume` reuses cached results from the last run. `-r <version>` pins a release for reproducibility.
Use `nf-core pipelines launch <name>` for an interactive, schema-validated way to build the command and a `-params-file`. See `references/running-pipelines.md`.
### Write a minimal pipeline
```nextflow
#!/usr/bin/env nextflow

process SAYHELLO {
    tag "$greeting"
    publishDir "results", mode: 'copy'

    input:
    val greeting

    output:
    path "${greeting}.txt"

    script:
    """
    echo '$greeting world' > ${greeting}.txt
    """
}

workflow {
    channel.of('hello', 'bonjour', 'hola') | SAYHELLO
}
```
```bash
nextflow run main.nf # add -resume on reruns
```
The full language (processes, channels, operators, DSL2 workflows with `take`/`main`/`emit`, modules) is in `references/language.md`.
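To make the `take`/`main`/`emit` structure concrete, here is a minimal sketch that chains two processes inside a named workflow. `TO_UPPER` and `GREET` are hypothetical names introduced for illustration; the execution order is not written anywhere, it follows from `TO_UPPER` consuming the output of `SAYHELLO`.

```nextflow
#!/usr/bin/env nextflow

// Writes one greeting file per input value.
process SAYHELLO {
    tag "$greeting"

    input:
    val greeting

    output:
    path "${greeting}.txt"

    script:
    """
    echo '$greeting world' > ${greeting}.txt
    """
}

// Hypothetical second step: upper-cases the content of each greeting file.
process TO_UPPER {
    input:
    path txt

    output:
    path "upper_${txt}"

    script:
    """
    tr '[:lower:]' '[:upper:]' < ${txt} > upper_${txt}
    """
}

// Named workflow: take = inputs, main = logic, emit = named outputs.
workflow GREET {
    take:
    greetings

    main:
    SAYHELLO(greetings)
    TO_UPPER(SAYHELLO.out)

    emit:
    upper = TO_UPPER.out
}

// Entry point: build the input channel and call the named workflow.
workflow {
    GREET(channel.of('hello', 'bonjour', 'hola'))
    GREET.out.upper.view { f -> "Result: ${f.name}" }
}
```

In a larger project, `GREET` would typically live in its own file and be pulled in with `include { GREET } from './path'`; it is kept in one file here so the sketch runs as-is with `nextflow run main.nf`.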
## Core Concepts at a Glance
- **Process**: a unit of work that runs a script (Bash by default). Declares `input:`, `output:`, optional `directives` (resources, container, `publishDir`, `tag`, `errorStrategy`), and a `script:`/`shell:`/`exec:` block. Each task runs in its own isolated work directory (`work/xx/yy…`).
- **Channel**: an asynchronous queue that connects processes. **Queue channels** are consumable streams; **value channels** hold a single reusable value. Created with factories like `channel.of`, `channel.fromPath`, `channel.fromFilePairs`, `channel.value`.
- **Operator**: transforms/combines channels — `map`, `filter`, `collect`, `groupTuple`, `join`, `combine`, `mix`, `flatten`, `branch`, `multiMap`, `splitCsv`, `view`, `set`.
- **Workflow**: composes processes. DSL2 workflows can declare `take:` (inputs), `main:` (logic), `emit:` (named outputs) and be `include`d as subworkflows. The unnamed `workflow {}` is the entry point.
- **Module**: a `.nf` file exposing processes/workflows via `include { NAME } from './path'` (supports `as` aliasing).
- **Configuration**: `nextflow.config` sets `params`, `process` directives, `executor`, container engines, and named `profiles`. Selectors `withName:`/`withLabel:` target specific processes. See `references/configuration.md`.
- **meta map** (nf-core): the convention of carrying a metadata map (`[ id:'sample1', single_end:false ]`) alongside files in input/output tuples so samples stay labeled through the pipeline. See `references/developing.md`.
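The concepts above can be sketched together in one small script: a queue channel built from a samplesheet, operators that shape each row into a `[meta, file]` tuple, a value channel reused by every task, and a process that keeps the meta map attached to its output. The column names `sample` and `fastq_1`, the `COUNT_LINES` process, and the default path are assumptions made for illustration, not taken from any particular pipeline.

```nextflow
#!/usr/bin/env nextflow

// Hypothetical default; override with --input on the command line.
params.input = 'samplesheet.csv'

// Hypothetical process: counts lines per file and keeps the meta map attached.
process COUNT_LINES {
    tag "${meta.id}"

    input:
    tuple val(meta), path(reads)
    val label

    output:
    tuple val(meta), path("${meta.id}.count.txt")

    script:
    """
    echo "${label}: \$(cat ${reads} | wc -l)" > ${meta.id}.count.txt
    """
}

workflow {
    // Queue channel: one [meta, file] tuple per samplesheet row.
    samples = channel.fromPath(params.input)
        .splitCsv(header: true)
        .map { row ->
            def meta = [id: row.sample, single_end: true]
            [meta, file(row.fastq_1)]
        }

    // Value channel: a single value that every task can read.
    label = channel.value('lines')

    COUNT_LINES(samples, label)
    COUNT_LINES.out.view { meta, counts -> "${meta.id} -> ${counts.name}" }
}
```

Because `label` is a value channel, it is not consumed after the first task; each tuple in `samples` is paired with it. Had `label` been a queue channel with one item, only one task would run.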
## nf-core tools CLI
nf-core tools (v3+) group subcommands under `pipelines`, `modules`, and `subworkflows`. (Bare forms like `nf-core lint` still work but warn — prefer the grouped form.)
| Command | Purpose |
|---------|---------|
| `nf-core pipelines list` | List/search nf-core pipelines (`--json`, keywords) |
| `nf-core pipelines create` | Scaffold a new pipeline from the nf-core template |
| `nf-core pipelines launch <name>` | Interactive, schema-driven run command + params file |
| `nf-core pipelines download <name>` | Download pipeline + containers for offline/HPC use |
| `nf-core pipelines lint` | Lint a pipeline against nf-core standards (run in repo root) |
| `nf-core pipelines schema build` | Build/edit `nextflow_schema.json` via web GUI |
| `nf-core pipelines create-params-file <name>` | Generate a documented YAML params file |
| `nf-core pipelines bump-version` / `sync` | Bump version / sync with template updates |
| `nf-core modules list/info/install/update/remove` | Manage modules from nf-core/modules |
| `nf-core modules create` / `lint` / `test` | Author, lint, and nf-test a module |
| `nf-core modules patch` / `bump-versions` | Patch an installed module / bump tool versions |
| `nf-core subworkflows install/create/lint/test` | Same lifecycle for subworkflows |
Full command reference, flags, and examples: `references/nf-core-tools.md`.
## Essential `nextflow` CLI
| Command | Purpose |
|---------|---------|
| `nextflow run <pipeline> -profile <p> --outdir <dir>` | Run a pipeline (path, `.nf`, or `user/repo`) |
| `-resume` | Reuse cached results from prior run |
| `-r <rev>` | Run a specific git revision/tag/branch |
| `-params-file params.yml` | Supply parameters from YAML/JSON |
| `-c custom.config` | Layer in an extra config file |
| `-with-report -with-trace -with-timeline -with-dag flow.html` | Execution report, trace, timeline, DAG |
| `-stub-run` | Run `stub:` blocks only (dry-run plumbing) |
| `nextflow log` | Inspect past runs |
| `nextflow clean -f -before <run>` | Delete old `work/` data |
| `nextflow pull / drop / list / info <repo>` | Manage cached remote pipelines |
Config, executors, caching internals, and tracing details: `references/configuration.md`.
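For orientation, the `-profile` and `-c` options in the table act on configuration like the fragment below. This is a sketch of a `nextflow.config` with placeholder values; the queue name, resource numbers, and the `SAYHELLO` selector target are assumptions chosen for illustration.

```nextflow
// nextflow.config (sketch; all values are placeholders)
params {
    input  = null
    outdir = 'results'
}

process {
    // Defaults applied to every process.
    cpus   = 1
    memory = 2.GB

    // Applies to any process declaring: label 'process_high'
    withLabel: 'process_high' {
        cpus   = 8
        memory = 32.GB
    }

    // Applies only to the process named SAYHELLO.
    withName: 'SAYHELLO' {
        publishDir = [path: "${params.outdir}/greetings", mode: 'copy']
    }
}

profiles {
    docker {
        docker.enabled = true
    }
    singularity {
        singularity.enabled = true
    }
    slurm {
        process.executor = 'slurm'
        process.queue    = 'general'   // hypothetical queue name
    }
}
```

With this in place, `nextflow run main.nf -profile singularity,slurm` would combine a container profile with an infrastructure profile, and `-c custom.config` could layer site-specific overrides on top.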
## Best Practices (high-value habits)
- **Always `test` first**: `-profile test,docker` (or `singularity`/`conda`) before real data — fast and catches environment problems.
- **Pin everything**: pipeline revision (`-r`), `NXF_VER`, and tool versions (containers). Don't run `latest` for science you'll publish.
- **Use `-resume`** and understand caching: a task re-runs if its inputs, script, or container change. See cache-debugging in `references/configuration.md`.
- **Parameterize via config/params-file**, not hardcoded paths. Keep `params` and profiles in `nextflow.config`.
- **One container/conda env per process**; never rely on tools installed on the host.
- **For nf-core dev**: reuse existing modules (`nf-core modules install`) before writing new ones; pass tool flags through `ext.args` (not hardcoded in the script); always include a `stub:` block and nf-test tests; run `nf-core pipelines lint` and `prettier` before committing.
- **Right-size resources** with `process_low/medium/high` labels and `errorStrategy 'retry'` with dynamic `task.attempt` scaling instead of one giant request.
- **Write forward-compatible syntax**: the strict-syntax parser becomes the default in Nextflow 26.04. Prefer lowercase `channel.of(...)`, explicit closure params (`{ v -> ... }`), `def` for all variables, and `emit:`-named outputs. Check with `nextflow lint`.
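Several of these habits can be shown in a single process definition. The sketch below is not an nf-core module; `COUNT_READS`, the `data/*.txt` default, and the resource numbers are hypothetical, and `wc` stands in for a real tool so the script runs without a container.

```nextflow
#!/usr/bin/env nextflow

// Hypothetical default; override with --reads on the command line.
params.reads = 'data/*.txt'

process COUNT_READS {
    tag "${meta.id}"
    // Resource defaults for this label would normally come from config,
    // where they take precedence over the inline directives below.
    label 'process_low'

    // Retry failed tasks, requesting more memory and time on each attempt.
    errorStrategy 'retry'
    maxRetries 2
    memory { 2.GB * task.attempt }
    time   { 1.h * task.attempt }

    input:
    tuple val(meta), path(reads)

    output:
    tuple val(meta), path("*.report.txt"), emit: report

    script:
    // Tool flags come from config (ext.args) instead of being hardcoded.
    def args   = task.ext.args ?: ''
    def prefix = task.ext.prefix ?: "${meta.id}"
    """
    wc ${args} ${reads} > ${prefix}.report.txt
    """

    stub:
    // Used with -stub-run: create the expected outputs without running the tool.
    def prefix = task.ext.prefix ?: "${meta.id}"
    """
    touch ${prefix}.report.txt
    """
}

workflow {
    reads = channel.fromPath(params.reads)
        .map { f -> [[id: f.baseName], f] }

    COUNT_READS(reads)
    COUNT_READS.out.report.view { meta, report -> "${meta.id}: ${report.name}" }
}
```

Flags are then supplied from configuration, for example `process { withName: 'COUNT_READS' { ext.args = '-l' } }`, and `nextflow run main.nf -stub-run` exercises the channel plumbing without running `wc`. The script also follows the forward-compatible style: lowercase `channel`, explicit closure parameters, `def` for local variables, and a named `emit:` output.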
## Reference Files
Read the relevant file when you need depth — each is self-contained:
- `references/language.md` — DSL2 language: processes, directives, channels, operators, workflows (`take`/`emit`), modules, dynamic resources, error handling.
- `references/configuration.md` — `nextflow.config`, scopes, `profiles`, `withName`/`withLabel` selectors, executors (local/SLURM/cloud), caching/`-resume` internals, tracing/reports, the `nextflow` CLI.
- `references/containers.md` — Docker, Singularity/Apptainer, Podman, Conda, Wave containers; choosing and enabling engines; common gotchas.
- `references/running-pipelines.md` — finding/running nf-core pipelines, samplesheets, params files, reference genomes (iGenomes), offline runs, institutional configs, Seqera Platform.
- `references/nf-core-tools.md` — complete `nf-core` CLI reference (pipelines/modules/subworkflows), flags, and workflows.
- `references/developing.md` — authoring nf-core pipelines & modules: template layout, module `main.nf`/`meta.yml`, meta maps, `ext.args`/`modules.config`, subworkflows, resource labels, linting & Harshil alignment style.
- `references/testing.md` — nf-test for modules/subworkflows/pipelines: test structure, asserti
… (truncated)
Graded independently by Skillproof — nothing to sell the author. Quality is mechanical + corpus-grounded; safety flags are heuristic (builtin+triage), not a malicious verdict.