import ComparisonTable from '../../components/ComparisonTable.astro';
PyTorch and TensorFlow are the two dominant deep learning frameworks. The competitive landscape has shifted dramatically — PyTorch went from research-only to production-ready, while TensorFlow retains strong enterprise adoption and the Keras ecosystem.
Quick Verdict
Choose PyTorch if: You’re doing research, building LLM applications, using Hugging Face, or starting a new ML project. PyTorch has won the research community and is increasingly strong in production.
Choose TensorFlow if: You’re maintaining existing TF codebases, targeting TensorFlow Lite for mobile deployment, or integrating deeply with Google Cloud ML services.
Adoption Reality (2026)
| Metric | PyTorch | TensorFlow |
|---|---|---|
| Research papers (arXiv) | ~80% | ~20% |
| Hugging Face models | Dominant | Available |
| Job postings | Growing | Larger base |
| Kaggle usage | Majority | Declining |
| Enterprise legacy | Growing | Established |
| Google internal | No | Yes (TPU native) |
Feature Comparison
<ComparisonTable
  headers={["Feature", "PyTorch", "TensorFlow 2.x"]}
  rows={[
    ["Execution mode", "Eager by default, optional torch.compile", "Eager by default (TF2)"],
    ["Debugging", "Native Python debugger", "Harder inside tf.function graphs"],
    ["Dynamic graphs", "Native (Pythonic)", "Yes (TF2 eager)"],
    ["Model deployment", "TorchServe, ONNX, TorchScript", "TF Serving, TFLite, TF.js"],
    ["Mobile", "ExecuTorch", "TensorFlow Lite"],
    ["Browser", "ONNX Runtime Web (successor to ONNX.js)", "TensorFlow.js"],
    ["Distributed training", "torch.distributed, FSDP", "tf.distribute"],
    ["Hugging Face", "Native (primary)", "Supported"],
    ["Google TPU", "Limited", "Native"],
    ["Learning curve", "Gentler", "Steeper"],
  ]}
/>
Neural Network Definition
PyTorch — Pythonic and explicit:
```python
import torch
import torch.nn as nn
import torch.nn.functional as F


class ConvNet(nn.Module):
    def __init__(self, num_classes: int = 10):
        super().__init__()
        # Explicit layer definitions
        self.conv1 = nn.Conv2d(1, 32, kernel_size=3, padding=1)
        self.conv2 = nn.Conv2d(32, 64, kernel_size=3, padding=1)
        self.pool = nn.MaxPool2d(2, 2)
        self.dropout = nn.Dropout(0.25)
        self.fc1 = nn.Linear(64 * 7 * 7, 128)
        self.fc2 = nn.Linear(128, num_classes)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # Explicit forward pass, easy to step through in a debugger
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = self.dropout(x)
        x = x.view(x.size(0), -1)  # Flatten
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        return self.fc2(x)


# Instantiate and inspect
model = ConvNet(num_classes=10)
print(model)  # Full architecture visible

# Count parameters
total_params = sum(p.numel() for p in model.parameters())
print(f"Parameters: {total_params:,}")
```
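The `64 * 7 * 7` input size of `fc1` follows from the layer arithmetic. As a minimal sketch, assuming the 28×28 single-channel input used elsewhere in this article, the helper names below are hypothetical and only trace the spatial size through each layer:

```python
def conv2d_out(size: int, kernel: int, stride: int = 1, padding: int = 0) -> int:
    """Spatial output size of a convolution or pooling layer along one axis."""
    return (size + 2 * padding - kernel) // stride + 1


def flattened_features(input_size: int = 28) -> int:
    """Trace a square input through the ConvNet layers defined above."""
    size = conv2d_out(input_size, kernel=3, padding=1)  # conv1 keeps 28
    size = conv2d_out(size, kernel=2, stride=2)         # pool halves to 14
    size = conv2d_out(size, kernel=3, padding=1)        # conv2 keeps 14
    size = conv2d_out(size, kernel=2, stride=2)         # pool halves to 7
    return 64 * size * size                             # conv2 emits 64 channels


print(flattened_features(28))  # 3136, i.e. 64 * 7 * 7
```

If the input resolution changes, `fc1` has to change with it, which is one place where the explicit PyTorch style asks for more bookkeeping than Keras.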
TensorFlow/Keras — declarative:
```python
import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers


def build_convnet(num_classes: int = 10) -> keras.Model:
    model = keras.Sequential([
        # Declare the input explicitly; Keras 3 warns when input_shape is passed to a layer
        keras.Input(shape=(28, 28, 1)),
        layers.Conv2D(32, (3, 3), activation='relu', padding='same'),
        layers.MaxPooling2D((2, 2)),
        layers.Conv2D(64, (3, 3), activation='relu', padding='same'),
        layers.MaxPooling2D((2, 2)),
        layers.Dropout(0.25),
        layers.Flatten(),
        layers.Dense(128, activation='relu'),
        layers.Dropout(0.25),
        layers.Dense(num_classes, activation='softmax'),
    ])
    return model


model = build_convnet(num_classes=10)
model.summary()  # Keras prints a formatted layer-by-layer summary
```
Keras is more concise for standard architectures. PyTorch is more explicit and flexible for custom architectures.
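One difference between the two definitions is easy to miss when porting: the PyTorch `forward` returns raw logits (`nn.CrossEntropyLoss` applies log-softmax internally), while the Keras model ends in a softmax layer and its loss expects probabilities by default. The following framework-free sketch, with hypothetical helper names and made-up logits, suggests why both routes yield the same loss as long as softmax is applied exactly once:

```python
import math


def softmax(logits):
    """Convert raw scores to probabilities (shifted by the max for stability)."""
    peak = max(logits)
    exps = [math.exp(z - peak) for z in logits]
    total = sum(exps)
    return [e / total for e in exps]


def cross_entropy_from_logits(logits, target):
    """Negative log-softmax at the target index, computed directly from logits."""
    peak = max(logits)
    log_sum_exp = peak + math.log(sum(math.exp(z - peak) for z in logits))
    return log_sum_exp - logits[target]


def cross_entropy_from_probs(probs, target):
    """Negative log-probability at the target index."""
    return -math.log(probs[target])


logits = [2.0, 0.5, -1.0]  # illustrative values only
a = cross_entropy_from_logits(logits, target=0)
b = cross_entropy_from_probs(softmax(logits), target=0)
print(abs(a - b) < 1e-9)  # True: both routes give the same loss
```

Feeding softmax outputs into a logits-based loss, or logits into a probability-based one, silently produces a different number, so it is worth checking which convention a model follows before swapping frameworks.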
Training Loop
PyTorch — explicit training loop:
```python
import torch
import torch.nn as nn
from torch.utils.data import DataLoader
from torch.optim import AdamW
from torch.optim.lr_scheduler import CosineAnnealingLR


def train(
    model: nn.Module,
    train_loader: DataLoader,
    val_loader: DataLoader,
    num_epochs: int = 10,
    device: str = "cuda" if torch.cuda.is_available() else "cpu",
) -> dict:
    model = model.to(device)
    optimizer = AdamW(model.parameters(), lr=1e-3, weight_decay=1e-4)
    scheduler = CosineAnnealingLR(optimizer, T_max=num_epochs)
    criterion = nn.CrossEntropyLoss()
    history = {"train_loss": [], "val_accuracy": []}

    for epoch in range(num_epochs):
        # Training phase
        model.train()
        running_loss = 0.0
        for batch_idx, (data, targets) in enumerate(train_loader):
            data, targets = data.to(device), targets.to(device)
            optimizer.zero_grad()
            outputs = model(data)
            loss = criterion(outputs, targets)
            loss.backward()
            # Gradient clipping guards against exploding gradients
            torch.nn.utils.clip_grad_norm_(model.parameters(), max_norm=1.0)
            optimizer.step()
            running_loss += loss.item()
            if batch_idx % 100 == 0:
                print(f"Epoch {epoch+1}, Batch {batch_idx}: Loss = {loss.item():.4f}")
        # Step the schedule once per epoch, matching T_max=num_epochs
        scheduler.step()

        # Validation phase
        model.eval()
        correct = total = 0
        with torch.no_grad():
            for data, targets in val_loader:
                data, targets = data.to(device), targets.to(device)
                outputs = model(data)
                _, predicted = outputs.max(1)
                correct += predicted.eq(targets).sum().item()
                total += targets.size(0)
        val_acc = correct / total
        history["train_loss"].append(running_loss / len(train_loader))
        history["val_accuracy"].append(val_acc)
        print(f"Epoch {epoch+1}: Val Accuracy = {val_acc:.4f}")

    return history
```
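To make the `CosineAnnealingLR` schedule in the loop above less opaque, here is a small sketch of its closed-form curve. It assumes the same `lr=1e-3` and `T_max=10` as the loop and the scheduler's default minimum of zero; the function name is hypothetical:

```python
import math


def cosine_annealing_lr(epoch, base_lr=1e-3, t_max=10, eta_min=0.0):
    """Closed-form cosine annealing: base_lr at epoch 0, eta_min at epoch t_max."""
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * epoch / t_max))


for epoch in (0, 5, 10):
    print(epoch, f"{cosine_annealing_lr(epoch):.6f}")
# 0 0.001000
# 5 0.000500
# 10 0.000000
```

The learning rate falls slowly at first, fastest around the midpoint, and flattens out again near the end of training.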
TensorFlow/Keras — high-level API:
```python
import tensorflow as tf

# train_dataset and val_dataset are assumed to be batched tf.data.Dataset
# objects yielding (images, integer_labels) pairs.
model = build_convnet(num_classes=10)
model.compile(
    optimizer=tf.keras.optimizers.AdamW(learning_rate=1e-3, weight_decay=1e-4),
    loss='sparse_categorical_crossentropy',
    metrics=['accuracy'],
)

callbacks = [
    tf.keras.callbacks.EarlyStopping(patience=5, restore_best_weights=True),
    tf.keras.callbacks.ReduceLROnPlateau(factor=0.5, patience=3),
    tf.keras.callbacks.ModelCheckpoint('best_model.keras', save_best_only=True),
]

history = model.fit(
    train_dataset,
    validation_data=val_dataset,
    epochs=50,
    callbacks=callbacks,
    verbose=1,
)
```
Keras’s .fit() is faster to write. PyTorch’s explicit loop is easier to customize (custom loss, gradient accumulation, mixed-precision, etc.).
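As an illustration of that flexibility, gradient accumulation takes only a few extra lines in an explicit loop. The sketch below is one possible way to write it, not a canonical recipe: the function name, the toy model, and the random data are all hypothetical, and any batches left over after the last full group are simply not applied.

```python
import torch
import torch.nn as nn


def train_one_epoch(model, loader, optimizer, criterion, accum_steps=4):
    """Step the optimizer once every `accum_steps` batches; return the update count."""
    model.train()
    optimizer.zero_grad()
    num_updates = 0
    for step, (data, targets) in enumerate(loader, start=1):
        loss = criterion(model(data), targets)
        # Divide so the summed gradients equal the average over the larger batch
        (loss / accum_steps).backward()
        if step % accum_steps == 0:
            optimizer.step()
            optimizer.zero_grad()
            num_updates += 1
    return num_updates


# Toy data: 8 batches of 2 samples, 4 features, 3 classes (all hypothetical)
torch.manual_seed(0)
toy_model = nn.Linear(4, 3)
toy_batches = [(torch.randn(2, 4), torch.randint(0, 3, (2,))) for _ in range(8)]
toy_optimizer = torch.optim.SGD(toy_model.parameters(), lr=0.1)
updates = train_one_epoch(toy_model, toy_batches, toy_optimizer, nn.CrossEntropyLoss())
print(updates)  # 2 optimizer steps for 8 batches with accum_steps=4
```

The effective batch size becomes `accum_steps` times the loader's batch size, which helps when the larger batch would not fit in GPU memory.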
Transformer / LLM Work
For modern LLM and transformer work, PyTorch is the practical standard:
```python
# Hugging Face Transformers with the PyTorch backend
from transformers import AutoTokenizer, AutoModelForSequenceClassification
from transformers import Trainer, TrainingArguments

model_name = "bert-base-uncased"
# The tokenizer is what turns raw text into the tokenized datasets used below
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name, num_labels=2)

# Fine-tune for sentiment classification
training_args = TrainingArguments(
    output_dir="./results",
    num_train_epochs=3,
    per_device_train_batch_size=16,
    per_device_eval_batch_size=64,
    warmup_steps=500,
    weight_decay=0.01,
    eval_strategy="epoch",  # Named evaluation_strategy in older transformers releases
    fp16=True,  # Mixed precision training; requires a CUDA GPU
)

# train_dataset and eval_dataset are assumed to be tokenized datasets prepared elsewhere
trainer = Trainer(
    model=model,
    args=training_args,
    train_dataset=train_dataset,
    eval_dataset=eval_dataset,
)
trainer.train()
```
The Hugging Face Hub, which passed one million hosted models in 2024, is predominantly PyTorch. TensorFlow weights exist for some models but are secondary.
Production Deployment
PyTorch production options:
```python
import torch

# Option 1: TorchScript (compile to portable format)
# ConvNet is the class defined earlier in this article
model = ConvNet()
# weights_only=True restricts unpickling to tensors, which is safer for untrusted files
model.load_state_dict(torch.load("model.pth", map_location="cpu", weights_only=True))
model.eval()

# Trace the model
example_input = torch.rand(1, 1, 28, 28)
traced_model = torch.jit.trace(model, example_input)
traced_model.save("model_traced.pt")

# Option 2: Export to ONNX (framework-agnostic)
torch.onnx.export(
    model,
    example_input,
    "model.onnx",
    input_names=["input"],
    output_names=["output"],
    dynamic_axes={"input": {0: "batch_size"}, "output": {0: "batch_size"}},
    opset_version=17,
)

# Option 3: TorchServe REST API
# mar-file packaging → TorchServe → HTTP endpoint
```
TensorFlow production:
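```python
import tensorflow as tf

# SavedModel format, the standard TF deployment artifact.
# With Keras 3 (TF 2.16+), model.export() writes a SavedModel;
# model.save() expects a .keras or .h5 path instead.
model.export("saved_model/")

# TF Serving: production REST API.
# Serving looks for a numeric version subdirectory (e.g. /models/mymodel/1/),
# so place the exported files under such a directory before mounting.
# docker run -p 8501:8501 \
#   --mount type=bind,source=/path/to/saved_model,target=/models/mymodel \
#   -e MODEL_NAME=mymodel tensorflow/serving

# TFLite: mobile/edge deployment
converter = tf.lite.TFLiteConverter.from_saved_model("saved_model/")
converter.optimizations = [tf.lite.Optimize.DEFAULT]
tflite_model = converter.convert()
with open("model.tflite", "wb") as f:
    f.write(tflite_model)

# TensorFlow.js: browser deployment
# tensorflowjs_converter --input_format=tf_saved_model saved_model/ tfjs_model/
```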
TensorFlow’s deployment ecosystem for mobile (TFLite) and browser (TF.js) remains strong.
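The `converter.optimizations` line in the TFLite snippet enables post-training quantization, which is a large part of why converted models are small enough for phones. The sketch below illustrates the general idea of storing weights as 8-bit integers plus a scale factor. It is a simplified, framework-free illustration with hypothetical function names and made-up weights, not TFLite's actual implementation:

```python
def quantize_int8(weights):
    """Symmetric per-tensor quantization: map floats to integers in [-127, 127]."""
    max_abs = max(abs(w) for w in weights)
    scale = max_abs / 127 if max_abs > 0 else 1.0
    quantized = [max(-127, min(127, round(w / scale))) for w in weights]
    return quantized, scale


def dequantize(quantized, scale):
    """Recover approximate float weights from the integer values."""
    return [q * scale for q in quantized]


weights = [0.42, -1.27, 0.003, 0.9]  # illustrative values only
q, scale = quantize_int8(weights)
restored = dequantize(q, scale)
max_error = max(abs(w - r) for w, r in zip(weights, restored))
print(q, max_error <= scale / 2 + 1e-12)
```

Each weight shrinks from four bytes to one, at the cost of a rounding error of at most half the scale factor per weight.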
Distributed Training
PyTorch FSDP (Fully Sharded Data Parallel) — for large models:
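```python
import functools
import os
import torch.distributed as dist
import torch.nn as nn
from torch.distributed.fsdp import CPUOffload, FullyShardedDataParallel as FSDP
from torch.distributed.fsdp.wrap import transformer_auto_wrap_policy

dist.init_process_group(backend="nccl")  # torchrun supplies rank and world size
# FSDP enables training models that don't fit on a single GPU.
# Assumes `model` is built from nn.TransformerEncoderLayer blocks; substitute your own layer class.
wrap_policy = functools.partial(
    transformer_auto_wrap_policy, transformer_layer_cls={nn.TransformerEncoderLayer}
)
model = FSDP(
    model,
    cpu_offload=CPUOffload(offload_params=True),
    auto_wrap_policy=wrap_policy,
    device_id=int(os.environ["LOCAL_RANK"]),  # one GPU per process
)
# Launch with: torchrun --nproc_per_node=8 train.py
```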
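A rough back-of-the-envelope estimate shows why sharding matters. The sketch below assumes mixed-precision Adam at about 16 bytes per parameter, a common rule of thumb, and ignores activations and temporary buffers. The function name and the 7-billion-parameter model are hypothetical:

```python
def training_state_gb(num_params: float, world_size: int = 1, bytes_per_param: int = 16) -> float:
    """Rough per-GPU memory for parameters, gradients, and optimizer state.

    bytes_per_param=16 assumes mixed-precision Adam: fp16 weights (2) + fp16
    gradients (2) + fp32 master weights, momentum, and variance (4 + 4 + 4).
    Activations and temporary buffers are not counted.
    """
    return num_params * bytes_per_param / world_size / 1e9


print(training_state_gb(7e9))                # 112.0 GB on a single GPU
print(training_state_gb(7e9, world_size=8))  # 14.0 GB per GPU when fully sharded
```

Under these assumptions the training state alone exceeds the memory of any single current GPU, while the sharded share fits comfortably on each of eight devices.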
TensorFlow distributed:
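```python
import tensorflow as tf

# Multi-GPU training on one machine
strategy = tf.distribute.MirroredStrategy()
with strategy.scope():
    # Build and compile inside the scope so variables are mirrored across GPUs
    model = build_convnet()  # defined earlier in this article
    model.compile(optimizer='adam', loss='sparse_categorical_crossentropy')
model.fit(train_dataset, epochs=10)  # train_dataset: a batched tf.data.Dataset, assumed

# Multi-machine training: same pattern, with TF_CONFIG set on each worker
strategy = tf.distribute.MultiWorkerMirroredStrategy()
```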
When to Choose Each
Choose PyTorch:
- Research and experimentation (dominant in academia)
- Working with Hugging Face models (most native)
- LLM fine-tuning and inference
- Custom architectures requiring granular control
- New projects starting from scratch
- Teams hiring ML engineers (PyTorch experience is more common on résumés)
Choose TensorFlow:
- Mobile deployment (TFLite is excellent)
- Browser-based ML (TF.js)
- Google Cloud/TPU workloads
- Maintaining existing TF codebases
- Teams already on TF with working pipelines
Bottom Line
PyTorch has won the research community and is rapidly closing the production deployment gap. For new projects, the choice is increasingly PyTorch by default — the ecosystem, Hugging Face integration, and developer experience are superior. TensorFlow retains real advantages for mobile (TFLite) and Google Cloud (TPU), making it the right choice for those specific deployment targets.