patrickNLP
/

Graphix-3B

	---
	language:
	- en
	tags:
	- text2sql
	- spider
	- Transformer
	- Pytorch
	license: mit
	---
	## Model Description

	Graphix-T5 is a graph-aware, semi-pretrained text-to-text pre-trained language model (PLM) designed to improve multi-hop reasoning on complex text-to-SQL tasks.
	The architecture strengthens the structural encoding capabilities of the T5 model while preserving its powerful contextual encoding ability.
	The experimental results demonstrate the effectiveness of Graphix-T5 and underscore the importance of incorporating structural information in text-to-text PLMs for tackling intricate text-to-SQL challenges.
	The smaller gap in performance between the dev and test sets indicates the stronger generalization capability of Graphix-T5.

	## Training Data
	Graphix-3B is trained on SPIDER, a cross-domain text-to-SQL benchmark. It is evaluated, without additional training, on the standard SPIDER dev and test sets
	and on the variants SPIDER-SYN, SPIDER-DK, and SPIDER-REALISTIC. The model will continue to be fine-tuned on more complex text-to-SQL data,
	i.e. BIRD, to handle harder but more realistic applications.

	## To Begin With

	The snippet below loads the tokenizer and the model weights from the Hugging Face Hub with the `transformers` library:
	```py
	from transformers import AutoTokenizer, AutoModel

	tokenizer = AutoTokenizer.from_pretrained("patrickNLP/Graphix-3B")

	model = AutoModel.from_pretrained("patrickNLP/Graphix-3B")
	```
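
The loading code above does not show what a text-to-SQL input looks like. As a rough sketch, T5-style text-to-SQL parsers usually read the question and the database schema flattened into a single string. The `serialize_example` helper, the separator layout, and the example database below are assumptions made for illustration; this card does not state the exact input format the checkpoint expects.

```python
def serialize_example(question, db_id, schema):
    """Flatten a question and a database schema into one input string.

    `schema` maps each table name to a list of its column names.
    The " | " and " : " separators are an assumed layout, not a format
    confirmed by this model card.
    """
    tables = [
        f"{table} : {' , '.join(columns)}" for table, columns in schema.items()
    ]
    return " | ".join([question.strip(), db_id] + tables)


# Hypothetical database, made up for this example.
schema = {
    "singer": ["singer_id", "name", "age"],
    "concert": ["concert_id", "singer_id", "year"],
}
text = serialize_example("How many singers are there?", "concert_db", schema)
print(text)
```

A string built this way would then be passed to the tokenizer. Check the official implementation for the actual preprocessing, since it may differ from this layout.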

	## Performance
	Graphix-3B with PICARD maintains state-of-the-art (SOTA) semantic parsing capabilities, as demonstrated by its performance on the [`SPIDER`](https://yale-lily.github.io/spider) leaderboard. Its only submission achieves 74.0% exact match (EM) and 77.6% execution accuracy (EX) on the test set.
	Please see [`Graphix Official Implementation`]() for details.
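
The EM and EX figures above measure different things: EM checks whether the predicted query matches the gold query, while EX checks whether both queries return the same result on the database. The sketch below illustrates the difference on a toy SQLite table. It is a simplification: the string-based `exact_match`, the `execution_match` helper, and the table contents are assumptions for illustration, not the official SPIDER evaluation, which compares parsed SQL components rather than raw strings.

```python
import sqlite3


def normalize(sql):
    """Lowercase, drop a trailing semicolon, and collapse whitespace."""
    return " ".join(sql.strip().rstrip(";").lower().split())


def exact_match(predicted, gold):
    """Simplified EM: compare the normalized query strings."""
    return normalize(predicted) == normalize(gold)


def execution_match(conn, predicted, gold):
    """Simplified EX: compare the rows both queries return, ignoring order."""
    try:
        predicted_rows = conn.execute(predicted).fetchall()
    except sqlite3.Error:
        return False  # a predicted query that fails to run counts as wrong
    gold_rows = conn.execute(gold).fetchall()
    return sorted(predicted_rows, key=repr) == sorted(gold_rows, key=repr)


# Toy database, made up for this example.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE singer (name TEXT, age INTEGER)")
conn.executemany(
    "INSERT INTO singer VALUES (?, ?)",
    [("Ann", 31), ("Bo", 45), ("Cy", 28)],
)

gold = "SELECT name FROM singer WHERE age > 30"
predicted = "SELECT name FROM singer WHERE age >= 31"

em = exact_match(predicted, gold)            # the query texts differ
ex = execution_match(conn, predicted, gold)  # both return the same rows here
print(em, ex)
```

On this toy table the two queries are written differently but return the same rows, so a prediction can count under EX without counting under EM.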

	## Reference
	1. [`Graphix-T5: Mixing Pre-Trained Transformers with Graph-Aware Layers for Text-to-SQL Parsing`](https://arxiv.org/abs/2301.07507)
	2. [`Can LLM Already Serve as A Database Interface? A BIg Bench for Large-Scale Database Grounded Text-to-SQLs`](https://arxiv.org/abs/2305.03111)
	3. [`Spider: A Large-Scale Human-Labeled Dataset for Complex and Cross-Domain Semantic Parsing and Text-to-SQL Task`](https://arxiv.org/abs/1809.08887)
	4. [`PICARD: Parsing Incrementally for Constrained Auto-Regressive Decoding from Language Models`](https://arxiv.org/abs/2109.05093)


	## Citation
	```
	@misc{li2023graphixt5,
	title={Graphix-T5: Mixing Pre-Trained Transformers with Graph-Aware Layers for Text-to-SQL Parsing},
	author={Jinyang Li and Binyuan Hui and Reynold Cheng and Bowen Qin and Chenhao Ma and Nan Huo and Fei Huang and Wenyu Du and Luo Si and Yongbin Li},
	year={2023},
	eprint={2301.07507},
	archivePrefix={arXiv},
	primaryClass={cs.CL}
	}
	```