NuExtract Models by NuMind
About
NuExtract is a family of lightweight, open-source text-to-JSON large language models (LLMs) developed by NuMind for structured information extraction. The models convert unstructured text into structured JSON, which makes them suitable for a wide range of data extraction tasks. The family spans three sizes: NuExtract-tiny with 0.5 billion parameters, NuExtract with 3.8 billion parameters, and NuExtract-large with 7 billion parameters. The latest iteration, NuExtract 1.5, adds multilingual support, handles documents of any length, and is reported to outperform larger models such as GPT-4o on certain benchmarks. The models are trained on a proprietary, high-quality synthetic dataset, are available under the MIT license, and can be used zero-shot or fine-tuned for a specific application.
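To make the text-to-JSON idea concrete, here is a minimal sketch of how an extraction request might be assembled and how the reply might be parsed. The prompt layout (`<|input|>`, `### Template:`, `### Text:`, `<|output|>`, `<|end-output|>`) is an assumption based on the format commonly shown for the original NuExtract models; it should be checked against the model card, and NuExtract 1.5 may differ. The helper names `build_prompt` and `parse_output` are hypothetical, and no model is loaded here: the generation step is simulated with a hand-written reply.

```python
import json


def build_prompt(text: str, template: dict) -> str:
    """Assemble an extraction prompt from an empty JSON template and input text.

    The marker layout is an assumption, not a confirmed specification.
    """
    schema = json.dumps(template, indent=4)
    return f"<|input|>\n### Template:\n{schema}\n### Text:\n{text}\n<|output|>\n"


def parse_output(generated: str) -> dict:
    """Extract and decode the JSON that follows the (assumed) output marker."""
    answer = generated.split("<|output|>")[-1]      # keep only the model's reply
    answer = answer.split("<|end-output|>")[0]      # drop the assumed end marker
    return json.loads(answer)


# The template lists the fields to fill; empty strings mark values to extract.
template = {"name": "", "city": ""}
prompt = build_prompt("Ada lives in London.", template)

# Simulated model reply, standing in for a real generation call.
simulated = prompt + '{"name": "Ada", "city": "London"}<|end-output|>'
result = parse_output(simulated)
```

In a real setup the `simulated` string would be replaced by the decoded output of whichever inference library is used to run the model.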
Current Variants
Use-when guidance is derived from seed capabilities, context, release, and replacement fields.
| Model | Use when | Released | Signals | Status |
|---|---|---|---|---|
| NuExtract Large | Use when the workload needs 7B parameters. | 2023-11 | 7B parameters | Current |
| NuExtract | Use when the workload needs 3.8B parameters. | 2023-11 | 3.8B parameters | Current |
| NuExtract Tiny | Use when the workload needs 500M parameters. | 2023-11 | 500M parameters | Current |
Release Timeline
1 release group
Specifications (3 models)
| Model | Released | Parameters |
|---|---|---|
| NuExtract Large | 2023-11 | 7B |
| NuExtract | 2023-11 | 3.8B |
| NuExtract Tiny | 2023-11 | 500M |
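When choosing between the three sizes, a rough estimate of weight memory can help. The sketch below multiplies the listed parameter counts by an assumed 2 bytes per parameter (16-bit weights). It ignores activations, the KV cache, and runtime overhead, so actual memory use will be higher; the function name `weight_memory_gb` is hypothetical.

```python
# Parameter counts taken from the specifications table.
PARAMS = {
    "NuExtract Tiny": 0.5e9,
    "NuExtract": 3.8e9,
    "NuExtract Large": 7e9,
}


def weight_memory_gb(n_params: float, bytes_per_param: int = 2) -> float:
    """Approximate memory for the weights alone, in gigabytes (10^9 bytes).

    bytes_per_param=2 assumes 16-bit weights; use 4 for 32-bit or 1 for 8-bit.
    """
    return n_params * bytes_per_param / 1e9


for name, n in PARAMS.items():
    print(f"{name}: ~{weight_memory_gb(n):.1f} GB of weights at 16-bit precision")
```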
Frequently Asked Questions
- What is NuExtract used for?
- NuExtract is used for extraction and structured outputs. The family description and listed model capabilities point to those workloads as the best fit.
- How does NuExtract compare to NuExtract 1.5?
- Both families come from NuMind and target extraction, so NuExtract 1.5 is the closest related family to compare against. NuExtract has 3 listed variants, while NuExtract 1.5 reaches up to 128k context, so compare the specs and pricing tables before choosing a production model.
- Which NuExtract model should I use?
- If price is the main constraint, check the pricing table first, because the local data does not include complete provider pricing for NuExtract. For the most capable or latest local choice, evaluate NuExtract Large.

