LLM Reference

NuExtract

About

NuExtract is a family of lightweight text-to-JSON large language models developed by NuMind for extracting structured data from unstructured text. It comes in three sizes (NuExtract-tiny, NuExtract, and NuExtract-large) and can be used zero-shot or fine-tuned for a specific task. Its defining characteristic is a purely extractive approach: output text is copied directly from the input, which is intended to prevent hallucinated values. The information to extract is specified with a JSON template, so the output structure can be customized per task. The models are trained on a synthetic dataset and, on some extraction tasks, perform better than larger LLMs. Later versions, such as NuExtract 1.5, add multilingual support and can process long documents.
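To make the template and the extractive property concrete, here is a minimal sketch in Python. It does not call the model: the template fields, the sample text, and the `candidate_output` string are hand-written stand-ins, and `leaf_strings` and `non_extractive_values` are hypothetical helper names, not part of any NuMind API. The sketch only shows what "copied directly from the input" means as a check you could run on any returned JSON.

```python
import json

# Hypothetical template: keys name the fields to extract; empty strings and
# empty lists mark the values a model would be expected to fill in.
template = {"model": {"name": "", "creator": "", "sizes": []}}

text = (
    "NuExtract is a collection of text-to-JSON models crafted by NuMind. "
    "It is offered in various sizes: NuExtract-tiny, NuExtract, and NuExtract-large."
)

# Hand-written stand-in for a model response (assumption: not real model output).
candidate_output = json.dumps({
    "model": {
        "name": "NuExtract",
        "creator": "NuMind",
        "sizes": ["NuExtract-tiny", "NuExtract", "NuExtract-large"],
    }
})


def leaf_strings(node):
    """Yield every non-empty string leaf in a nested JSON-like structure."""
    if isinstance(node, str):
        if node:
            yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from leaf_strings(value)
    elif isinstance(node, list):
        for item in node:
            yield from leaf_strings(item)


def non_extractive_values(output_json, source_text):
    """Return the leaf values that do not appear verbatim in source_text."""
    parsed = json.loads(output_json)
    return [value for value in leaf_strings(parsed) if value not in source_text]


# An empty list means every extracted value is a verbatim span of the input.
violations = non_extractive_values(candidate_output, text)
```

A check like this can serve as a cheap guard in a pipeline: any value it returns was not copied from the source text and deserves a second look.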

Capabilities

Vision, Multimodal, Reasoning, Function Calling, Tool Use, Structured Outputs, Code Execution

Specifications

Family: NuExtract
Released: 2023-11-30
Parameters: 3.8B
Architecture: Decoder Only
Specialization: general
Training: finetuning

Created by

Innovative AI-driven NLP model creator

Paris, France
Founded 2022