LLM Reference

NuExtract Models by NuMind

NuMind · Extraction
3 models · 2023

About

NuExtract is a family of lightweight, open-source text-to-JSON large language models (LLMs) built by NuMind for structured information extraction. The models convert unstructured text into structured JSON, which makes them well suited to data extraction tasks. The family spans several sizes, from NuExtract-tiny with 0.5 billion parameters to NuExtract-large with 7 billion parameters. The latest iteration, NuExtract 1.5, adds multilingual support, processes documents of any length, and outperforms larger models such as GPT-4o on certain benchmarks. The models are trained on a proprietary, high-quality synthetic dataset, are available under the MIT license, and can be used zero-shot or fine-tuned.
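To make the text-to-JSON workflow concrete, here is a minimal sketch of how a JSON template and a passage might be combined into a prompt, and how the generated JSON might be recovered. The marker strings (`<|input|>`, `### Template:`, `### Text:`, `<|output|>`, `<|end-output|>`) are assumed from the original NuExtract model card and should be verified there. The helper names `build_prompt` and `parse_output` are hypothetical, and the model call is replaced by a hard-coded stand-in so the example runs without downloading weights.

```python
import json


def build_prompt(template: dict, text: str) -> str:
    """Combine an empty JSON template and the source text into one prompt.

    The marker layout is an assumption; confirm it against the model card.
    """
    schema = json.dumps(template, indent=4)
    return f"<|input|>\n### Template:\n{schema}\n### Text:\n{text}\n<|output|>\n"


def parse_output(generated: str) -> dict:
    """Extract the JSON object that follows the output marker."""
    body = generated.split("<|output|>", 1)[-1]
    body = body.split("<|end-output|>", 1)[0]  # drop the assumed end marker
    return json.loads(body)


# Empty strings mark the fields the model is asked to fill in.
template = {"name": "", "organization": "", "parameters": ""}
text = "NuExtract-tiny is a 0.5 billion parameter model from NuMind."
prompt = build_prompt(template, text)

# Stand-in for a real generation call: prompt echoed back plus filled JSON.
fake_completion = (
    prompt
    + '{"name": "NuExtract-tiny", "organization": "NuMind", '
    + '"parameters": "0.5 billion"}<|end-output|>'
)
result = parse_output(fake_completion)
print(result["name"])
```

In a real deployment the stand-in string would be replaced by the decoded output of whichever inference stack hosts the model. The parsing step stays the same as long as the markers match.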

Current Variants

Use-when guidance is derived from seed capabilities, context, release, and replacement fields.

NuExtract Large
2023-11 · 7B parameters
Use when the workload needs 7B parameters.

NuExtract (Current)
2023-11 · 3.8B parameters
Use when the workload needs 3.8B parameters.

NuExtract Tiny
2023-11 · 500M parameters
Use when the workload needs 500M parameters.

Release Timeline

1 release group

2023-11 (3 current)
NuExtract: 3.8B parameters, Current
NuExtract Large: 7B parameters, Current
NuExtract Tiny: 500M parameters, Current

Specifications (3 models)

NuExtract model specifications comparison

Model            Released   Parameters
NuExtract Large  2023-11    7B
NuExtract        2023-11    3.8B
NuExtract Tiny   2023-11    500M

Frequently Asked Questions

What is NuExtract used for?
NuExtract is used for extraction and structured outputs. The family description and listed model capabilities point to those workloads as the best fit.
How does NuExtract compare to NuExtract 1.5?
NuExtract and NuExtract 1.5 are both NuMind families built for extraction, and NuExtract 1.5 is the closest related family to compare against. NuExtract lists 3 variants, while NuExtract 1.5 reaches up to 128k context, so compare the specifications and pricing before choosing a production model.
Which NuExtract model should I use?
If price is the main constraint, check provider pricing first, because the local data does not include complete provider pricing for NuExtract. For the most capable local choice, evaluate NuExtract Large.
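When memory rather than price is the constraint, a rough weight-size estimate can narrow the choice. The sketch below multiplies each listed parameter count by an assumed 2 bytes per parameter (16-bit weights) and picks the largest variant that fits a given budget. It covers weights only and ignores activations, cache, and runtime overhead, so treat the numbers as a lower bound. The function names and the `VARIANTS` table are illustrative, with parameter counts taken from the specifications above.

```python
# Parameter counts as listed in the specifications table.
VARIANTS = {
    "NuExtract Large": 7_000_000_000,
    "NuExtract": 3_800_000_000,
    "NuExtract Tiny": 500_000_000,
}


def weight_memory_gb(params: int, bytes_per_param: int = 2) -> float:
    """Approximate weight storage in GB (10^9 bytes), weights only."""
    return params * bytes_per_param / 1e9


def largest_fitting(budget_gb: float, bytes_per_param: int = 2):
    """Return the largest variant whose weights fit the budget, or None."""
    fitting = [
        (params, name)
        for name, params in VARIANTS.items()
        if weight_memory_gb(params, bytes_per_param) <= budget_gb
    ]
    return max(fitting)[1] if fitting else None


for name, params in VARIANTS.items():
    print(f"{name}: about {weight_memory_gb(params):.1f} GB at 16-bit")

print(largest_fitting(8))
```

Under these assumptions the 7B variant needs about 14 GB for weights alone, so an 8 GB budget selects the 3.8B NuExtract model. Quantized weights would change `bytes_per_param` and shift the result.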
