pytorch-lightning — quality + safety report
In the Skillier index (kdense-scientific__pytorch-lightning) · scanned 2026-06-03 · engine: builtin+triage
1 heuristic flag to review
Heuristic flags from the builtin scanner, which is known to over-flag (it trips on legitimate env-reading integrations, security skills, and library .eval calls). This is NOT an authoritative malicious verdict — re-scan with SkillSpector for the authoritative result. Run the authoritative scan →
📇 This skill is in the Skillier index (curated · deduped · quality-filtered). Install Skillier to route & load it into your AI client.
Quality notes
About this skill
Deep learning framework PyTorch Lightning / lightning package . Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging W&B, TensorBoard, MLflow , distributed training DDP, FSDP, DeepSpeed , for scalable neural network training.
📄 Read the SKILL.md
---
name: pytorch-lightning
description: Deep learning framework (PyTorch Lightning / lightning package). Organize PyTorch code into LightningModules, configure Trainers for multi-GPU/TPU, implement data pipelines, callbacks, logging (W&B, TensorBoard, MLflow), distributed training (DDP, FSDP, DeepSpeed), for scalable neural network training.
allowed-tools: Read Write Edit Bash
license: Apache-2.0 license
compatibility: Requires Python 3.10+ and lightning 2.6+ (or pytorch-lightning 2.6+). GPU training needs CUDA-capable PyTorch. Optional loggers (wandb, mlflow, comet-ml) and DeepSpeed require separate installs.
metadata:
version: "1.0"
skill-author: K-Dense Inc.
---
# PyTorch Lightning
## Overview
PyTorch Lightning is a deep learning framework that organizes PyTorch code to eliminate boilerplate while maintaining full flexibility. Automate training workflows, multi-device orchestration, and implement best practices for neural network training and scaling across multiple GPUs/TPUs.
**Current upstream:** lightning 2.6.4 (PyPI, May 2026). Docs: [lightning.ai/docs/pytorch/stable](https://lightning.ai/docs/pytorch/stable/). Use `import lightning as L` (the `pytorch-lightning` package name still installs the same library).
## Installation
```bash
uv pip install lightning
```
Optional extras:
```bash
uv pip install lightning[extra] # loggers, strategies, etc.
uv pip install wandb mlflow # specific loggers as needed
```
## When to Use This Skill
This skill should be used when:
- Building, training, or deploying neural networks using PyTorch Lightning
- Organizing PyTorch code into LightningModules
- Configuring Trainers for multi-GPU/TPU training
- Implementing data pipelines with LightningDataModules
- Working with callbacks, logging, and distributed training strategies (DDP, FSDP, DeepSpeed)
- Structuring deep learning projects professionally
## Core Capabilities
### 1. LightningModule - Model Definition
Organize PyTorch models into six logical sections:
1. **Initialization** - `__init__()` and `setup()`
2. **Training Loop** - `training_step(batch, batch_idx)`
3. **Validation Loop** - `validation_step(batch, batch_idx)`
4. **Test Loop** - `test_step(batch, batch_idx)`
5. **Prediction** - `predict_step(batch, batch_idx)`
6. **Optimizer Configuration** - `configure_optimizers()`
**Quick template reference:** See `scripts/template_lightning_module.py` for a complete boilerplate.
**Detailed documentation:** Read `references/lightning_module.md` for comprehensive method documentation, hooks, properties, and best practices.
### 2. Trainer - Training Automation
The Trainer automates the training loop, device management, gradient operations, and callbacks. Key features:
- Multi-GPU/TPU support with strategy selection (DDP, FSDP, DeepSpeed)
- Automatic mixed precision training
- Gradient accumulation and clipping
- Checkpointing and early stopping
- Progress bars and logging
**Quick setup reference:** See `scripts/quick_trainer_setup.py` for common Trainer configurations.
**Detailed documentation:** Read `references/trainer.md` for all parameters, methods, and configuration options.
### 3. LightningDataModule - Data Pipeline Organization
Encapsulate all data processing steps in a reusable class:
1. `prepare_data()` - Download and process data (single-process)
2. `setup()` - Create datasets and apply transforms (per-GPU)
3. `train_dataloader()` - Return training DataLoader
4. `val_dataloader()` - Return validation DataLoader
5. `test_dataloader()` - Return test DataLoader
**Quick template reference:** See `scripts/template_datamodule.py` for a complete boilerplate.
**Detailed documentation:** Read `references/data_module.md` for method details and usage patterns.
### 4. Callbacks - Extensible Training Logic
Add custom functionality at specific training hooks without modifying your LightningModule. Built-in callbacks include:
- **ModelCheckpoint** - Save best/latest models
- **EarlyStopping** - Stop when metrics plateau
- **LearningRateMonitor** - Track LR scheduler changes
- **BatchSizeFinder** - Auto-determine optimal batch size
**Detailed documentation:** Read `references/callbacks.md` for built-in callbacks and custom callback creation.
### 5. Logging - Experiment Tracking
Integrate with multiple logging platforms:
- TensorBoard (default)
- Weights & Biases (WandbLogger)
- MLflow (MLFlowLogger)
- Comet (CometLogger)
- CSV (CSVLogger)
Note: `NeptuneLogger` was removed in lightning 2.6.4. Use W&B, MLflow, or TensorBoard instead.
Log metrics using `self.log("metric_name", value)` in any LightningModule method.
**Detailed documentation:** Read `references/logging.md` for logger setup and configuration.
### 6. Distributed Training - Scale to Multiple Devices
Choose the right strategy based on model size:
- **DDP** - For models <500M parameters (ResNet, smaller transformers)
- **FSDP** - For models 500M+ parameters (large transformers, recommended for Lightning users)
- **DeepSpeed** - For cutting-edge features and fine-grained control
Configure with: `Trainer(strategy="ddp", accelerator="gpu", devices=4)`
**Detailed documentation:** Read `references/distributed_training.md` for strategy comparison and configuration.
### 7. Best Practices
- Device agnostic code - Use `self.device` instead of `.cuda()`
- Hyperparameter saving - Use `self.save_hyperparameters()` in `__init__()`
- Metric logging - Use `self.log()` for automatic aggregation across devices
- Reproducibility - Use `seed_everything()` and `Trainer(deterministic=True)`
- Debugging - Use `Trainer(fast_dev_run=True)` to test with 1 batch
**Detailed documentation:** Read `references/best_practices.md` for common patterns and pitfalls.
## Quick Workflow
1. **Define model:**
```python
class MyModel(L.LightningModule):
def __init__(self):
super().__init__()
self.save_hyperparameters()
self.model = YourNetwork()
def training_step(self, batch, batch_idx):
x, y = batch
loss = F.cross_entropy(self.model(x), y)
self.log("train_loss", loss)
return loss
def configure_optimizers(self):
return torch.optim.Adam(self.parameters())
```
2. **Prepare data:**
```python
# Option 1: Direct DataLoaders
train_loader = DataLoader(train_dataset, batch_size=32)
# Option 2: LightningDataModule (recommended for reusability)
dm = MyDataModule(batch_size=32)
```
3. **Train:**
```python
trainer = L.Trainer(max_epochs=10, accelerator="gpu", devices=2)
trainer.fit(model, train_loader) # or trainer.fit(model, datamodule=dm)
```
## Resources
### scripts/
Executable Python templates for common PyTorch Lightning patterns:
- `template_lightning_module.py` - Complete LightningModule boilerplate
- `template_datamodule.py` - Complete LightningDataModule boilerplate
- `quick_trainer_setup.py` - Common Trainer configuration examples
### references/
Detailed documentation for each PyTorch Lightning component:
- `lightning_module.md` - Comprehensive LightningModule guide (methods, hooks, properties)
- `trainer.md` - Trainer configuration and parameters
- `data_module.md` - LightningDataModule patterns and methods
- `callbacks.md` - Built-in and custom callbacks
- `logging.md` - Logger integrations and usage
- `distributed_training.md` - DDP, FSDP, DeepSpeed comparison and setup
- `best_practices.md` - Common patterns, tips, and pitfallsWant a live grade + an embeddable README badge? Run your skill through the free scanner.
Graded independently by Skillproof — nothing to sell the author. Quality is mechanical + corpus-grounded; safety flags are heuristic (builtin+triage), not a malicious verdict.