---
base_model:
- meta-llama/Llama-3.2-3B-Instruct
language:
- en
license: apache-2.0
pipeline_tag: text-generation
library_name: transformers
---

<div align="center">
<b style="font-size: 40px;">Gen-8B-R2</b>
</div>

Note: We are still working on this model.

Are you looking for a more robust and reliable generation model for your RAG system?

Gen-8B-R2 is a generation model that effectively mitigates hallucinations caused by retrieval noise and information overload.

See the details in our paper: [Link](https://arxiv.org/pdf/2503.04789)

### What is Gen-8B-R2?

This model is a variant of Ext2Gen-8B-R2 that disables the step of extracting relevant sentences from the chunk list.

See the details of Ext2Gen-8B-R2 at https://huggingface.co/DISLab/Ext2Gen-8B-R2.

### Recommended Prompt

- `query`: the query to answer
- `chunk_list`: the list of retrieved chunks, e.g., ["chunk 1", "chunk 2", "chunk 3"]

```python
from transformers import AutoTokenizer

# Load the tokenizer for this model (repo id assumed from this model card).
tokenizer = AutoTokenizer.from_pretrained("DISLab/Gen-8B-R2")

def prepare_sample_text(prompt):
    # Wrap the prompt in the chat template expected by the model.
    row_json = [{"role": "user", "content": prompt}]
    return tokenizer.apply_chat_template(row_json, tokenize=False)

def format_prompt_template(query, chunk_list):
    # Prefix each chunk with a 1-based chunk ID, then join the chunks with blank lines.
    chunk_list = ['[Chunk ID: ' + str(idx + 1) + '] ' + chunk_text for idx, chunk_text in enumerate(chunk_list)]
    chunk_list = '\n\n'.join(chunk_list)

    prompt = '''
You are an expert assistant trained to generate answers based on document chunks.


### Generation Instruction:
- Answer to the Query based on the given Chunk List.


### Query:
%s


### Chunk List:
%s


### Output:
''' % (query, chunk_list)

    return prompt.strip()

# Example inputs (illustrative only).
query = "How many people died at Chelmno?"
noisy_chunks = ["chunk 1", "chunk 2", "chunk 3"]

prompt = format_prompt_template(query, noisy_chunks)
prompt = prepare_sample_text(prompt)
```

Note that, unlike Ext2Gen-8B-R2, this prompt outputs only the answer to the query, without the extracted relevant sentences.

The output follows a consistent format, as shown in the example below.
| | |
| | ``` |
| | The estimated number of deaths at Chelmno is 150-300,000, mainly Jews. |
| | ``` |

### Recommended Generation Parameters

```python
max_new_tokens=1024,  # or 2048
do_sample=True,
temperature=0.8,
top_p=0.9,
```
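
For a complete picture, here is a minimal end-to-end sketch that plugs these parameters into `model.generate`, assuming the prompt helpers defined above and the repo id `DISLab/Gen-8B-R2` (inferred from this card by analogy to DISLab/Ext2Gen-8B-R2):

```python
import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

model_id = "DISLab/Gen-8B-R2"  # assumed repo id; adjust if the model is hosted elsewhere

tokenizer = AutoTokenizer.from_pretrained(model_id)
model = AutoModelForCausalLM.from_pretrained(
    model_id,
    torch_dtype=torch.bfloat16,
    device_map="auto",
)

# Build the chat-formatted prompt with the helpers defined above.
prompt = format_prompt_template(query, noisy_chunks)
prompt = prepare_sample_text(prompt)

# The chat template already inserts special tokens, so skip adding them again.
inputs = tokenizer(prompt, return_tensors="pt", add_special_tokens=False).to(model.device)

output_ids = model.generate(
    **inputs,
    max_new_tokens=1024,  # or 2048
    do_sample=True,
    temperature=0.8,
    top_p=0.9,
)

# Decode only the newly generated tokens, i.e., the answer that follows the prompt.
answer = tokenizer.decode(output_ids[0][inputs["input_ids"].shape[1]:], skip_special_tokens=True)
print(answer)
```

Sampling with temperature 0.8 and top-p 0.9 trades some determinism for more fluent answers; lower the temperature or set `do_sample=False` if you need reproducible outputs.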