LLM Reference

NuExtract Large

About

NuExtract Large is an information extraction model fine-tuned from Phi-3-small. It converts unstructured text into JSON, using a JSON template to define the structure of the output. The model is purely extractive: every value it returns must be text that appears in the input, and it handles input texts of up to 2000 tokens. This makes it suited to tasks such as automated data entry, text summarization, and enhancing search systems. Despite being fine-tuned from a small-scale model, NuExtract Large outperforms some larger models on specific tasks. Its companion models are NuExtract and NuExtract-tiny, and a multilingual version, NuExtract 1.5, was developed to overcome limitations of its predecessor.
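To make the template idea and the "purely extractive" property concrete, here is a minimal Python sketch. It does not call the model. The template layout (empty strings as slots to fill), the sample text, and the sample output are all assumptions written by hand for illustration, and the helper names `string_leaves` and `non_extractive_values` are hypothetical. The check simply confirms that every non-empty string in an output occurs verbatim in the input text, which is what a purely extractive result should satisfy.

```python
import json

# Hypothetical template: keys name the fields to extract,
# empty strings mark the slots the model is expected to fill.
TEMPLATE = {
    "product": {"name": "", "price": ""},
    "features": [""],
}


def string_leaves(node):
    """Yield every string value found in a nested JSON-like structure."""
    if isinstance(node, str):
        yield node
    elif isinstance(node, dict):
        for value in node.values():
            yield from string_leaves(value)
    elif isinstance(node, list):
        for item in node:
            yield from string_leaves(item)


def non_extractive_values(output, source_text):
    """Return non-empty string values that do not occur verbatim in source_text."""
    return [v for v in string_leaves(output) if v and v not in source_text]


# Made-up input text for this sketch.
text = "The Aurora X2 laptop costs $999 and ships with a backlit keyboard."

# Illustrative output, written by hand (not produced by the model).
output = json.loads(
    '{"product": {"name": "Aurora X2", "price": "$999"},'
    ' "features": ["backlit keyboard"]}'
)

# An empty list means every extracted value was found verbatim in the text.
print(non_extractive_values(output, text))
```

A check like this can serve as a cheap post-processing guard: any value it returns was not copied from the input and may deserve a second look.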

Capabilities

Multimodal
Function Calling
Tool Use
JSON Mode

Specifications

Family: NuExtract
Parameters: 7B
Architecture: Decoder Only
Specialization: general