Capability filter · capability · intermediate

Fine-tuning

Also known as: fine tune, custom training

See matching models with benchmark scores and pricing.

model.fine_tuning
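As a rough illustration of how a capability filter on this flag could work, the sketch below keeps only records whose `fine_tuning` field is true. The `Model` record, the `with_fine_tuning` helper, and the non-matching entry are hypothetical; only the flag name and the two real model names come from this page.

```python
from dataclasses import dataclass


@dataclass
class Model:
    """Hypothetical catalog record; only the fine_tuning flag mirrors model.fine_tuning."""
    name: str
    fine_tuning: bool


def with_fine_tuning(models):
    """Return the records that have the fine-tuning capability flag set."""
    return [m for m in models if m.fine_tuning]


catalog = [
    Model("GPT-4o-mini", fine_tuning=True),
    Model("Nemotron-Labs-Diffusion 14B", fine_tuning=True),
    Model("example-model-without-flag", fine_tuning=False),  # hypothetical entry
]

matches = with_fine_tuning(catalog)
print([m.name for m in matches])
```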

Definition

Fine-tuning is the process of further training a pretrained model on a targeted dataset to specialize it for specific tasks or behaviors. It relies on transfer learning: the pretrained weights serve as the starting point and are adjusted rather than relearned from scratch, so a model can be customized without full retraining. Fine-tuning is also how instruct and chat variants are created from base models.
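The idea of starting from existing weights and continuing training on task data can be sketched with a toy one-feature linear model. This is a minimal illustration, not how any listed model is actually fine-tuned: the "pretrained" weights, the dataset, the learning rate, and the step count are all made-up values.

```python
# Toy illustration only: a linear model y = w * x + b trained with plain gradient descent.

def mse(w, b, data):
    """Mean squared error of the linear model on (x, y) pairs."""
    return sum((w * x + b - y) ** 2 for x, y in data) / len(data)


def fine_tune(w, b, data, lr=0.01, steps=200):
    """Continue gradient descent from the given weights instead of a fresh initialization."""
    n = len(data)
    for _ in range(steps):
        grad_w = sum(2 * (w * x + b - y) * x for x, y in data) / n
        grad_b = sum(2 * (w * x + b - y) for x, y in data) / n
        w -= lr * grad_w
        b -= lr * grad_b
    return w, b


# Hypothetical pretrained weights (fit to y = 2x) and hypothetical task data (y = 2x + 0.5).
pretrained_w, pretrained_b = 2.0, 0.0
target_data = [(1.0, 2.5), (2.0, 4.5), (3.0, 6.5)]

loss_before = mse(pretrained_w, pretrained_b, target_data)
tuned_w, tuned_b = fine_tune(pretrained_w, pretrained_b, target_data)
loss_after = mse(tuned_w, tuned_b, target_data)

print(f"loss before: {loss_before:.4f}, loss after: {loss_after:.4f}")
```

Because training starts from weights that already fit a related task, a small dataset and a modest number of steps are enough to reduce the task loss, which is the practical appeal of transfer learning.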

Models With Fine-tuning

Sorted by decision relevance, with tracked capability and provider-route evidence.

6 matches

Model | Release | Context | Capabilities | Provider route

Cosmos 3 Nano Policy DROID

Cosmos 3 Nano Policy DROID is a 16B-parameter robotics policy model fine-tuned from Cosmos 3 Nano on the DROID dataset. Given natural language instructions and visual observations from a robot camera (image or video), it generates robot action trajectories (JSON 1D list) for manipulation and control tasks. Compatible with multiple robot embodiments including Franka Panda (single/dual), UR, Google robot, WidowX 250, UMI, and Agibot. Supports 16-400 frame action sequences in various DoF configurations (9D-57D). Intended as a reference implementation for post-training Cosmos 3 Nano on specific robot platforms. The action output modality is represented in prose because the current model schema only has text, vision, video, audio, and related capability flags.

2026-05-31

Researched 24d ago

4,000 tokens

Vision · Multimodal · JSON · Fine-tune

No tracked provider route

Cosmos 3 Super Image2Video

Cosmos 3 Super Image2Video is a 64B-parameter fine-tuned variant of Cosmos 3 Super specialized for temporally coherent image-to-video generation. Takes a single image (jpg/png/webp at 256p-720p) plus an optional text prompt (up to 4096 tokens) and outputs MP4 video with 5-400 frames (default 189) at up to 720p, with optional muxed AAC stereo audio at 48kHz. Ranked #1 on Artificial Analysis image-to-video leaderboard (open models). Available via Hugging Face Diffusers and vLLM-Omni.

2026-05-31

Researched 24d ago

4,000 tokens

Vision · Multimodal · Audio · Fine-tune

No tracked provider route

Cosmos 3 Super Text2Image

Cosmos 3 Super Text2Image is a 64B-parameter fine-tuned variant of Cosmos 3 Super specialized for high-fidelity text-to-image generation. Takes text prompts up to 4096 tokens and outputs JPEG images at 256p, 480p, or 720p in aspect ratios 16:9, 4:3, 1:1, 3:4, or 9:16. Ranked #1 on Artificial Analysis text-to-image leaderboard (open models). Available via Hugging Face Diffusers (DiffusionPipeline) and vLLM-Omni.

2026-05-31

Researched 24d ago

4,000 tokens

Multimodal · Fine-tune

No tracked provider route

GPT-4o

OpenAI GPT-4o: Flagship multimodal model with vision, function calling, and broad capability. $2.50/M input, $10/M output.

2024-05-13

Researched 46d ago

128k

128,000 tokens

128k context · Vision · Multimodal · Tool use · Functions · JSON

OpenAI API

$2.50 in / $10.00 out / 1M tokens

5 routes · 1 batch · 2 cache

Provider docs

GPT-4o-mini

OpenAI GPT-4o-mini: Available via OpenRouter. $0.15/M input, $0.60/M output.

2024-07-18

Researched 46d ago

128k

128,000 tokens

128k context · JSON · Prompt cache · Batch · Fine-tune

OpenAI API

$0.150 in / $0.600 out / 1M tokens

4 routes · 2 cache

Provider docs
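The per-million-token prices listed for GPT-4o and GPT-4o-mini can be turned into a per-request estimate with simple arithmetic. The sketch below assumes the listed prices apply linearly to every token and uses a hypothetical workload; the `request_cost` helper is illustrative, and any separate charges for fine-tuning training or for hosting a tuned model are not covered by this listing.

```python
# Prices in USD per 1M tokens, copied from the listing above.
PRICES = {
    "GPT-4o": {"input": 2.50, "output": 10.00},
    "GPT-4o-mini": {"input": 0.15, "output": 0.60},
}


def request_cost(model, input_tokens, output_tokens):
    """Estimate the USD cost of one request from per-1M-token prices."""
    price = PRICES[model]
    total = input_tokens * price["input"] + output_tokens * price["output"]
    return total / 1_000_000


# Hypothetical workload: 10,000 input tokens and 2,000 output tokens per request.
for name in PRICES:
    print(f"{name}: ${request_cost(name, 10_000, 2_000):.4f}")
```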

Nemotron-Labs-Diffusion 14B

NVIDIA Nemotron-Labs-Diffusion 14B is the largest text model in NVIDIA Research's diffusion language model family, released May 23, 2026. Uses diffusion-based parallel decoding enabling up to 6× higher throughput versus autoregressive baselines, with three decoding modes: autoregressive, diffusion, and self-speculation. Training code released through NVIDIA Megatron Bridge framework alongside the weights, enabling fine-tuning. Released under NVIDIA Nemotron Open Model License (commercially usable open weights). Available on Hugging Face at nvidia/Nemotron-Labs-Diffusion-14B.

2026-05-23

Researched 5d ago

131k

131,072 tokens

131k context · Fine-tune

No tracked provider route