NSQL 2B
About
NSQL 2B is an open-source large language model specialized for generating SQL queries. It is built on the Salesforce CodeGen-Multi 2B model and was trained on SQL data drawn from "The Stack" and more than 20 other public datasets. Training proceeded in two phases: pre-training on general SQL queries, followed by fine-tuning on text-to-SQL pairs. Given a table schema as context, the model translates a natural language prompt into a SQL query, and it is best suited to SELECT queries. It may struggle with more complex SQL structures, and it performs best when the input resembles its training data. Generated SQL can be syntactically or semantically incorrect, so users should check its output rather than assume it is right.
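
To make "using a table schema as context" concrete, here is a minimal Python sketch of how such a prompt might be assembled. The layout (schema first, then an instruction comment, then the question, then a trailing SELECT cue) is an assumption made for illustration, and the helper name build_prompt and the stadium table are hypothetical. Check the model's published prompt format before relying on it.

```python
def build_prompt(schema_ddl: str, question: str, dialect: str = "SQLite") -> str:
    """Assemble a text-to-SQL prompt (assumed layout, not confirmed by the source).

    The schema comes first as context, followed by an instruction comment,
    the natural language question, and a trailing SELECT for the model to complete.
    """
    parts = [
        schema_ddl.strip(),
        f"-- Using valid {dialect}, answer the following questions "
        "for the tables provided above.",
        f"-- {question.strip()}",
        "SELECT",
    ]
    # Blank lines between the parts keep schema, instruction, and question distinct.
    return "\n\n".join(parts)


# Hypothetical schema, used only to illustrate the prompt shape.
schema = """
CREATE TABLE stadium (
    stadium_id INTEGER PRIMARY KEY,
    name TEXT,
    capacity INTEGER
)
"""

prompt = build_prompt(schema, "How many stadiums are there in total?")
print(prompt)
```

The resulting string would be passed to the model as its input. Ending the prompt with SELECT nudges the completion toward the query type the model handles best, and whatever the model returns should still be checked before it is executed.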