# LlamaIndex RAG Setup Guide

## Overview

RewardPilot uses LlamaIndex to build a semantic search system over 50+ credit card benefit documents. This enables the agent to answer complex questions like "Which card has the best travel insurance?" or "Does Amex Gold work at Costco?"

## Why LlamaIndex + RAG?

| Problem | Traditional Approach | RAG Solution |
|---------|---------------------|--------------|
| **Card benefits change** | Hardcode rules → outdated | Dynamic document retrieval |
| **Complex questions** | Manual lookup | Semantic search |
| **50+ cards** | Impossible to memorize | Vector similarity |
| **Nuanced rules** | Prone to errors | Context-aware answers |

**Example:**
- **Question:** "Can I use Chase Sapphire Reserve for airport lounge access when flying domestic?"
- **Traditional:** Check 10+ pages of terms
- **RAG:** Semantic search → "Yes, Priority Pass includes domestic lounges"

---

## Architecture

```
┌─────────────────────────────────────────────────────────┐
│                    User Question                        │
│   "Which card has best grocery rewards?"                │
└────────────────────┬────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────┐
│              Query Transformation                       │
│         (Expand, rephrase, extract keywords)            │
└────────────────────┬────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────┐
│                 Embedding Model                         │
│            OpenAI text-embedding-3-small                │
│              (1536 dimensions)                          │
└────────────────────┬────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────┐
│                  Vector Store                           │
│                   ChromaDB                              │
│              (50+ card documents)                       │
│              (10,000+ chunks)                           │
└────────────────────┬────────────────────────────────────┘
                     │
                     │ Retrieve top-k (k=5)
                     ▼
┌─────────────────────────────────────────────────────────┐
│              Retrieved Context                          │
│   1. Amex Gold: 4x points on U.S. supermarkets...       │
│   2. Citi Custom Cash: 5% on top category...            │
│   3. Chase Freedom Flex: 5% rotating categories...      │
└────────────────────┬────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────┐
│                 Reranking                               │
│         (Cohere Rerank or Cross-Encoder)                │
└────────────────────┬────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────┐
│                   LLM Synthesis                         │
│              Gemini 2.0 Flash Exp                       │
│         (Generate answer from context)                  │
└────────────────────┬────────────────────────────────────┘
                     │
                     ▼
┌─────────────────────────────────────────────────────────┐
│                 Final Answer                            │
│   "Amex Gold offers 4x points (best rate) but has       │
│    $25k annual cap. Citi Custom Cash gives 5% but       │
│    only $500/month. For high spenders, use Amex."       │
└─────────────────────────────────────────────────────────┘
```
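
The retrieval and prompt-assembly stages of this pipeline can be illustrated without any external service. The sketch below is a toy stand-in, not the production path: `embed` is a bag-of-words counter rather than an OpenAI embedding, `CHUNKS` is an in-memory list rather than ChromaDB, and all function names are hypothetical.

```python
import math
from collections import Counter

# Toy corpus standing in for the chunked card documents (text taken from the diagram).
CHUNKS = [
    "Amex Gold: 4x points on U.S. supermarkets",
    "Citi Custom Cash: 5% on top category",
    "Chase Freedom Flex: 5% rotating categories",
    "Chase Sapphire Reserve: Priority Pass lounge access",
]


def embed(text):
    """Bag-of-words 'embedding'; a stand-in for a real embedding model."""
    return Counter(text.lower().replace(":", " ").split())


def cosine(a, b):
    """Cosine similarity between two sparse term-count vectors."""
    dot = sum(a[t] * b[t] for t in a)
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(query, k=2):
    """Return the k chunks most similar to the query, best first."""
    q = embed(query)
    return sorted(CHUNKS, key=lambda c: cosine(q, embed(c)), reverse=True)[:k]


def build_prompt(query, contexts):
    """Assemble the text an LLM would receive in the synthesis step."""
    numbered = "\n".join(f"{i}. {c}" for i, c in enumerate(contexts, start=1))
    return f"Context:\n{numbered}\n\nQuestion: {query}"


top = retrieve("points on supermarkets", k=2)
prompt = build_prompt("points on supermarkets", top)
print(prompt)
```

A real deployment swaps each stand-in for the component named in the diagram; the control flow (embed, rank by similarity, keep top-k, build a prompt) stays the same.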

---

## Setup

### 1. Install Dependencies

```bash
pip install llama-index==0.12.5 \
  llama-index-vector-stores-chroma==0.4.1 \
  llama-index-embeddings-openai==0.3.1 \
  llama-index-llms-gemini==0.4.2 \
  chromadb==0.5.23 \
  pypdf==5.1.0 \
  beautifulsoup4==4.12.3
```

### 2. Prepare Card Documents

Create directory structure:
```
data/
├── cards/
│   ├── amex_gold.pdf
│   ├── chase_sapphire_reserve.pdf
│   ├── citi_custom_cash.pdf
│   └── ... (50+ cards)
├── terms/
│   ├── amex_terms.pdf
│   ├── chase_terms.pdf
│   └── ...
└── guides/
    ├── maximizing_rewards.md
    ├── category_codes.md
    └── ...
```
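
One way to create this layout and check which files would be picked up for indexing is sketched below. The helper names `ensure_layout` and `list_indexable` are hypothetical, and the extension set assumes the `.pdf`, `.md`, and `.txt` filter used when loading documents in this guide.

```python
import tempfile
from pathlib import Path

# Extensions assumed to be indexable (mirrors the loader filter used in this guide).
INDEXED_EXTS = {".pdf", ".md", ".txt"}


def ensure_layout(root):
    """Create cards/, terms/ and guides/ under root if they are missing."""
    root = Path(root)
    for sub in ("cards", "terms", "guides"):
        (root / sub).mkdir(parents=True, exist_ok=True)
    return root


def list_indexable(root):
    """Return indexable files under root, sorted, as paths relative to root."""
    root = Path(root)
    return sorted(
        p.relative_to(root).as_posix()
        for p in root.rglob("*")
        if p.is_file() and p.suffix.lower() in INDEXED_EXTS
    )


# Demo in a temporary directory so nothing is written to the project tree.
with tempfile.TemporaryDirectory() as tmp:
    data = ensure_layout(Path(tmp) / "data")
    (data / "cards" / "amex_gold.md").write_text("# American Express Gold Card\n")
    (data / "cards" / "notes.docx").write_text("ignored by the extension filter")
    subdirs = sorted(p.name for p in data.iterdir() if p.is_dir())
    found = list_indexable(data)

print(subdirs, found)
```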

### 3. Document Sources

#### Option A: Scrape from Issuer Websites

```python
# scrape_card_docs.py
import os

import requests
from bs4 import BeautifulSoup

CARD_URLS = {
    "amex_gold": "https://www.americanexpress.com/us/credit-cards/card/gold-card/",
    "chase_sapphire_reserve": "https://creditcards.chase.com/rewards-credit-cards/sapphire/reserve",
    # ... more cards
}

def scrape_card_benefits(card_name, url, output_file):
    """Scrape card benefits from an issuer website and save them as markdown"""
    response = requests.get(url, timeout=30)
    response.raise_for_status()
    soup = BeautifulSoup(response.text, 'html.parser')

    # Extract benefits section. The class name is a placeholder: each issuer
    # uses different markup, so adjust the selector per site.
    benefits = soup.find('div', class_='benefits-section')
    if benefits is None:
        raise ValueError(f"No benefits section found at {url}")

    # Save to markdown
    os.makedirs(os.path.dirname(output_file), exist_ok=True)
    with open(output_file, 'w', encoding='utf-8') as f:
        f.write(f"# {card_name}\n\n")
        f.write(benefits.get_text(separator="\n", strip=True))

# Scrape all cards
for card_name, url in CARD_URLS.items():
    scrape_card_benefits(card_name, url, f"data/cards/{card_name}.md")
```

#### Option B: Manual Documentation

Create markdown files:

**File:** `data/cards/amex_gold.md`
```markdown
# American Express Gold Card

## Overview
- **Annual Fee:** $325
- **Rewards Rate:** 4x points on dining & U.S. supermarkets (up to $25k/year)
- **Welcome Bonus:** 90,000 points after $6k spend in 6 months

## Earning Structure

### 4x Points
- Restaurants worldwide (including takeout & delivery)
- U.S. supermarkets (up to $25,000 per year, then 1x)

### 3x Points
- Flights booked directly with airlines or on amextravel.com

### 1x Points
- All other purchases

## Monthly Credits
- $10 Uber Cash (Uber Eats eligible)
- $10 Grubhub/Seamless/The Cheesecake Factory/select Shake Shack

## Travel Benefits
- No foreign transaction fees
- Trip delay insurance
- Lost luggage insurance
- Car rental loss and damage insurance

## Merchant Acceptance
- **Accepted:** Most merchants worldwide
- **Not Accepted:** Costco warehouses (Costco.com works)
- **Not Accepted:** Some small businesses

## Redemption Options
- Transfer to 20+ airline/hotel partners (1:1 ratio)
- Pay with Points at Amazon (0.7 cents per point)
- Statement credits (0.6 cents per point)
- Book travel through Amex Travel (1 cent per point)

## Best For
- High grocery spending (up to $25k/year)
- Frequent dining out
- Travelers who value transfer partners

## Limitations
- $25,000 annual cap on 4x supermarket category
- Amex not accepted everywhere
- Annual fee not waived first year
```
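
Because these files follow a regular structure, simple metadata can be pulled out of them with a few regular expressions. The sketch below is one possible shape for the `extract_annual_fee` and `extract_issuer` helpers that the Metadata Filtering section calls but does not define. The issuer prefix table and the `extract_headings` helper are assumptions made for illustration.

```python
import re

# Hypothetical prefix-to-issuer table; extend it to cover the cards in data/cards/.
ISSUER_PREFIXES = {"amex": "American Express", "chase": "Chase", "citi": "Citi"}


def extract_annual_fee(text):
    """Return the annual fee in dollars from an '**Annual Fee:**' line, or None."""
    match = re.search(r"\*\*Annual Fee:\*\*\s*\$([\d,]+)", text)
    return int(match.group(1).replace(",", "")) if match else None


def extract_issuer(card_name):
    """Map a file stem such as 'amex_gold' to an issuer name, or 'Unknown'."""
    prefix = card_name.lower().split("_")[0]
    return ISSUER_PREFIXES.get(prefix, "Unknown")


def extract_headings(text):
    """List the '## ' section headings of a card document."""
    return [line[3:].strip() for line in text.splitlines() if line.startswith("## ")]


sample = """# American Express Gold Card

## Overview
- **Annual Fee:** $325
- **Rewards Rate:** 4x points on dining & U.S. supermarkets (up to $25k/year)

## Earning Structure
"""

print(extract_annual_fee(sample), extract_issuer("amex_gold"), extract_headings(sample))
```

Storing the fee as an integer rather than a string is what makes range filters such as "annual fee at most 500" possible later.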

---

## Implementation

### File: `rewards_rag_server.py`

```python
"""
LlamaIndex RAG server for credit card benefits
"""

# ServiceContext is not imported: Settings replaces it in current LlamaIndex releases
from llama_index.core import (
    VectorStoreIndex,
    SimpleDirectoryReader,
    StorageContext,
    Settings
)
from llama_index.vector_stores.chroma import ChromaVectorStore
from llama_index.embeddings.openai import OpenAIEmbedding
from llama_index.llms.gemini import Gemini
from llama_index.core.node_parser import SentenceSplitter
import chromadb
from fastapi import FastAPI, HTTPException
from pydantic import BaseModel
import os

# Initialize FastAPI
app = FastAPI(title="Rewards RAG MCP Server")

# Configure LlamaIndex
Settings.embed_model = OpenAIEmbedding(
    model="text-embedding-3-small",
    api_key=os.getenv("OPENAI_API_KEY")
)
Settings.llm = Gemini(
    model="models/gemini-2.0-flash-exp",
    api_key=os.getenv("GEMINI_API_KEY")
)
Settings.chunk_size = 512
Settings.chunk_overlap = 50

# Initialize ChromaDB
chroma_client = chromadb.PersistentClient(path="./chroma_db")
chroma_collection = chroma_client.get_or_create_collection("credit_cards")

# Create vector store
vector_store = ChromaVectorStore(chroma_collection=chroma_collection)
storage_context = StorageContext.from_defaults(vector_store=vector_store)

```

---

## Document Loading & Indexing

```python
def load_and_index_documents():
    """Load card documents and create vector index"""
    
    # Load documents from directory
    documents = SimpleDirectoryReader(
        input_dir="./data",
        recursive=True,
        required_exts=[".pdf", ".md", ".txt"]
    ).load_data()
    
    print(f"Loaded {len(documents)} documents")
    
    # Parse into nodes (chunks)
    node_parser = SentenceSplitter(
        chunk_size=512,
        chunk_overlap=50
    )
    nodes = node_parser.get_nodes_from_documents(documents)
    
    print(f"Created {len(nodes)} nodes")
    
    # Create index
    index = VectorStoreIndex(
        nodes=nodes,
        storage_context=storage_context
    )
    
    # Persist to disk
    index.storage_context.persist(persist_dir="./storage")
    
    return index

# Load index on startup
if chroma_collection.count() > 0:
    # Reuse the embeddings already stored in ChromaDB
    index = VectorStoreIndex.from_vector_store(vector_store)
    print("Loaded existing index")
else:
    # Create new index
    print("Creating new index...")
    index = load_and_index_documents()

# Create query engine
query_engine = index.as_query_engine(
    similarity_top_k=5,
    response_mode="compact"
)

```

---

## API Endpoints

```python
class QueryRequest(BaseModel):
    query: str
    card_name: str | None = None  # optional card to focus the question on
    top_k: int = 5

class QueryResponse(BaseModel):
    answer: str
    sources: list[dict]
    confidence: float

@app.post("/query", response_model=QueryResponse)
async def query_benefits(request: QueryRequest):
    """
    Query credit card benefits
    
    Example:
    POST /query
    {
        "query": "Which card has best grocery rewards?",
        "top_k": 5
    }
    """
    try:
        # Steer the question toward one card if specified. This only rewrites
        # the prompt; it does not filter the retrieved documents.
        if request.card_name:
            query = f"For {request.card_name}: {request.query}"
        else:
            query = request.query

        # Query the index, honoring the requested top_k
        engine = index.as_query_engine(
            similarity_top_k=request.top_k,
            response_mode="compact"
        )
        response = engine.query(query)

        # Extract sources
        sources = []
        for node in response.source_nodes:
            sources.append({
                "card_name": node.metadata.get("file_name", "Unknown"),
                "content": node.text[:200] + "...",
                # score can be None for some retrievers
                "relevance_score": float(node.score) if node.score is not None else 0.0
            })

        # Use the top similarity score as a rough confidence proxy
        confidence = sources[0]["relevance_score"] if sources else 0.0

        return QueryResponse(
            answer=str(response),
            sources=sources,
            confidence=confidence
        )

    except Exception as e:
        raise HTTPException(status_code=500, detail=str(e))

```

---

## Advanced Query Techniques

```python
@app.post("/compare")
async def compare_cards(request: dict):
    """
    Compare multiple cards on specific criteria
    
    Example:
    POST /compare
    {
        "cards": ["Amex Gold", "Chase Sapphire Reserve"],
        "criteria": "travel benefits"
    }
    """
    cards = request["cards"]
    criteria = request["criteria"]
    
    # Query each card
    comparisons = []
    for card in cards:
        query = f"What are the {criteria} for {card}?"
        response = query_engine.query(query)
        
        comparisons.append({
            "card": card,
            "benefits": str(response)
        })
    
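    # Synthesize comparison (format the per-card answers as readable text
    # instead of dumping the list repr into the prompt)
    details = "\n\n".join(f"{c['card']}:\n{c['benefits']}" for c in comparisons)
    synthesis_prompt = (
        f"Compare these cards on {criteria}:\n\n"
        f"{details}\n\n"
        "Provide a clear winner and reasoning."
    )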
    
    final_response = Settings.llm.complete(synthesis_prompt)
    
    return {
        "comparison": str(final_response),
        "details": comparisons
    }

```

---

## Metadata Filtering

```python
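def add_metadata_to_documents():
    """Add rich metadata for filtering"""

    # recursive=True is needed to reach files under data/cards/, data/terms/, ...
    documents = SimpleDirectoryReader("./data", recursive=True).load_data()

    for doc in documents:
        # Extract card name from filename (strip any extension, not just .md)
        card_name = os.path.splitext(doc.metadata["file_name"])[0]

        # Add metadata. extract_issuer, extract_annual_fee and extract_category
        # are project-specific helpers that this server file does not define.
        doc.metadata.update({
            "card_name": card_name,
            "issuer": extract_issuer(card_name),
            "annual_fee": extract_annual_fee(doc.text),
            "category": extract_category(doc.text)
        })

    return documents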

# Query with filters
@app.post("/query_filtered")
async def query_with_filters(request: dict):
    """
    Query with metadata filters
    
    Example:
    POST /query_filtered
    {
        "query": "best travel card",
        "filters": {
            "issuer": "Chase",
            "annual_fee": {"$lte": 500}
        }
    }
    """
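    from llama_index.core.vector_stores import (
        FilterOperator,
        MetadataFilter,
        MetadataFilters,
    )

    requested = request.get("filters", {})
    filter_list = []

    # Exact match on issuer
    if "issuer" in requested:
        filter_list.append(MetadataFilter(key="issuer", value=requested["issuer"]))

    # Upper bound on annual fee, e.g. {"$lte": 500}
    if "$lte" in requested.get("annual_fee", {}):
        filter_list.append(
            MetadataFilter(
                key="annual_fee",
                value=requested["annual_fee"]["$lte"],
                operator=FilterOperator.LTE,
            )
        )

    # Query with filters (skip the filter object entirely when none were given)
    filtered_engine = index.as_query_engine(
        similarity_top_k=5,
        filters=MetadataFilters(filters=filter_list) if filter_list else None,
    )

    response = filtered_engine.query(request["query"])

    return {"answer": str(response)}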

```

---

## Hybrid Search (Keyword + Semantic)

```python
# BM25Retriever ships separately: pip install llama-index-retrievers-bm25
from llama_index.core.query_engine import RetrieverQueryEngine
from llama_index.core.retrievers import QueryFusionRetriever, VectorIndexRetriever
from llama_index.retrievers.bm25 import BM25Retriever

def create_hybrid_retriever():
    """Combine vector search + keyword search"""

    # Vector retriever
    vector_retriever = VectorIndexRetriever(
        index=index,
        similarity_top_k=10
    )

    # BM25 keyword retriever. It reads nodes from the docstore; when the index
    # is backed by an external vector store such as ChromaDB the docstore may
    # be empty, in which case pass the parsed nodes via nodes=... instead.
    bm25_retriever = BM25Retriever.from_defaults(
        docstore=index.docstore,
        similarity_top_k=10
    )

    # Combine retrievers (num_queries=1 disables LLM query expansion)
    hybrid_retriever = QueryFusionRetriever(
        retrievers=[vector_retriever, bm25_retriever],
        similarity_top_k=5,
        num_queries=1
    )

    return RetrieverQueryEngine(retriever=hybrid_retriever)

```

---

## Reranking for Better Results

```python
# Requires: pip install llama-index-postprocessor-cohere-rerank
from llama_index.postprocessor.cohere_rerank import CohereRerank

def create_reranking_query_engine():
    """Add reranking for improved relevance"""
    
    # Cohere reranker
    reranker = CohereRerank(
        api_key=os.getenv("COHERE_API_KEY"),
        top_n=3
    )
    
    query_engine = index.as_query_engine(
        similarity_top_k=10,  # Retrieve more candidates
        node_postprocessors=[reranker]  # Rerank to top 3
    )
    
    return query_engine

```

---

## Evaluation & Metrics

```python
from llama_index.core.evaluation import (
    RelevancyEvaluator,
    FaithfulnessEvaluator
)

async def evaluate_rag_quality():
    """Evaluate RAG system quality"""
    
    # Test queries
    test_queries = [
        "Which card has best grocery rewards?",
        "Does Amex Gold work at Costco?",
        "What are Chase Sapphire Reserve travel benefits?"
    ]
    
    # Ground truth answers
    ground_truth = [
        "Citi Custom Cash offers 5% on groceries...",
        "No, American Express is not accepted at Costco warehouses...",
        "Chase Sapphire Reserve includes Priority Pass..."
    ]
    
    # Evaluators
    relevancy_evaluator = RelevancyEvaluator(llm=Settings.llm)
    faithfulness_evaluator = FaithfulnessEvaluator(llm=Settings.llm)
    
    results = []
    for query, truth in zip(test_queries, ground_truth):
        response = await query_engine.aquery(query)

        # Evaluate relevancy: do the answer and its retrieved context address the query?
        # aevaluate_response pulls the contexts from response.source_nodes.
        relevancy_result = await relevancy_evaluator.aevaluate_response(
            query=query,
            response=response
        )

        # Evaluate faithfulness: is the answer supported by the retrieved context?
        faithfulness_result = await faithfulness_evaluator.aevaluate_response(
            query=query,
            response=response
        )

        results.append({
            "query": query,
            "expected": truth,  # reference answer, kept for manual comparison
            "answer": str(response),
            "relevancy_score": relevancy_result.score,
            "faithfulness_score": faithfulness_result.score
        })
    
    return results

```

---

## Deployment

### 1. Build Docker Image

**File:** `Dockerfile`
```dockerfile
FROM python:3.11-slim

WORKDIR /app

# Install dependencies
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt

# Copy application
COPY . .

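# Index documents at build time. Importing the module builds the index, so
# calling load_and_index_documents() again would add every chunk twice.
# The build needs the API keys (OPENAI_API_KEY, GEMINI_API_KEY) to be available.
RUN python -c "import rewards_rag_server"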

# Expose port
EXPOSE 7860

# Run server
CMD ["uvicorn", "rewards_rag_server:app", "--host", "0.0.0.0", "--port", "7860"]
```

### 2. Deploy to Hugging Face Spaces

```bash
# Create Space
huggingface-cli repo create rewardpilot-rewards-rag --type space --space_sdk docker

# Push files
git add .
git commit -m "Deploy RAG server"
git push
```

---

## Performance Optimization

### 1. Caching Embeddings

```python
from functools import lru_cache

@lru_cache(maxsize=1000)
def get_embedding(text: str):
    """Cache embeddings for repeated queries"""
    return Settings.embed_model.get_text_embedding(text)
```
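
The effect of the cache can be checked without calling a real embedding API. In this sketch `fake_embed` is a hypothetical stand-in that counts its own invocations, and `cache_info()` reports hits and misses:

```python
from functools import lru_cache

calls = {"count": 0}


def fake_embed(text):
    """Stand-in for a paid embedding call; counts how often it actually runs."""
    calls["count"] += 1
    return [float(len(word)) for word in text.split()]


@lru_cache(maxsize=1000)
def cached_embedding(text):
    # Return a tuple so callers cannot mutate the cached value in place.
    return tuple(fake_embed(text))


first = cached_embedding("Does Amex Gold work at Costco?")
second = cached_embedding("Does Amex Gold work at Costco?")  # served from the cache
info = cached_embedding.cache_info()
print(calls["count"], info.hits, info.misses)
```

Note that `lru_cache` keys on the exact string, so queries differing only in case or whitespace are cached separately unless they are normalized first.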

### 2. Batch Processing

```python
async def batch_query(queries: list):
    """Process multiple queries in parallel"""
    import asyncio
    
    tasks = [query_engine.aquery(q) for q in queries]
    results = await asyncio.gather(*tasks)
    
    return results
```
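
Launching every query at once can run into provider rate limits. A common refinement, sketched here with a hypothetical `fake_aquery` in place of `query_engine.aquery`, is to bound concurrency with a semaphore while keeping results in input order:

```python
import asyncio


async def fake_aquery(question):
    """Stand-in for query_engine.aquery; sleeps briefly instead of calling an API."""
    await asyncio.sleep(0.01)
    return f"answer to: {question}"


async def bounded_batch(questions, limit=2):
    """Run queries concurrently, never more than `limit` at once, preserving order."""
    semaphore = asyncio.Semaphore(limit)

    async def run_one(question):
        async with semaphore:
            return await fake_aquery(question)

    # gather returns results in the order the awaitables were passed in
    return await asyncio.gather(*(run_one(q) for q in questions))


answers = asyncio.run(bounded_batch(["q1", "q2", "q3"], limit=2))
print(answers)
```

The `limit` value is an assumption to tune against the rate limits of the embedding and LLM providers in use.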

### 3. Index Optimization

```python
# Use smaller embedding model for speed
Settings.embed_model = OpenAIEmbedding(
    model="text-embedding-3-small",  # 1536 dims
    # vs text-embedding-3-large (3072 dims)
)

# Reduce chunk size so each retrieved chunk adds fewer tokens to the LLM prompt
# (smaller chunks also mean more chunks to embed and store)
Settings.chunk_size = 256  # vs 512
```
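
To see how chunk size and overlap affect the number of chunks, the following sketch applies a sliding window over words. It is a simplification: `SentenceSplitter` counts tokens and respects sentence boundaries, whereas the hypothetical `chunk_words` here counts whitespace-separated words.

```python
def chunk_words(words, chunk_size, overlap):
    """Split a word list into windows of chunk_size words sharing `overlap` words."""
    if not 0 <= overlap < chunk_size:
        raise ValueError("overlap must be non-negative and smaller than chunk_size")
    step = chunk_size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(words[start:start + chunk_size])
        if start + chunk_size >= len(words):
            break  # the last window already reaches the end of the document
    return chunks


words = [f"w{i}" for i in range(1000)]
large = chunk_words(words, chunk_size=512, overlap=50)
small = chunk_words(words, chunk_size=256, overlap=50)
print(len(large), len(small))
```

For this 1000-word document, halving the chunk size raises the chunk count from 3 to 5, because the fixed 50-word overlap takes up a larger share of each smaller window.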

---

## Monitoring

```python
import time
from prometheus_client import Counter, Histogram

# Metrics
query_counter = Counter('rag_queries_total', 'Total RAG queries')
query_duration = Histogram('rag_query_duration_seconds', 'RAG query duration')

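# Instrumented version of the /query handler. Use it in place of the earlier
# definition: FastAPI serves the first handler registered for a path.
@app.post("/query")
async def query_with_monitoring(request: QueryRequest):
    query_counter.inc()

    start_time = time.time()
    response = query_engine.query(request.query)
    duration = time.time() - start_time

    query_duration.observe(duration)

    # Return a JSON-serializable payload rather than the raw Response object
    return {"answer": str(response)}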
```
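
Where no Prometheus server is available, for example in local testing, the same two measurements can be approximated with the standard library. This is a minimal sketch: `QueryStats` is a hypothetical in-memory stand-in for the counter and histogram, not part of `prometheus_client`.

```python
import time
from contextlib import contextmanager


class QueryStats:
    """In-memory stand-in for a query counter and a duration histogram."""

    def __init__(self):
        self.count = 0
        self.durations = []

    @contextmanager
    def track(self):
        start = time.perf_counter()
        try:
            yield
        finally:
            # Record the query even if it raised, so failures are still counted.
            self.count += 1
            self.durations.append(time.perf_counter() - start)

    def max_duration(self):
        return max(self.durations) if self.durations else 0.0


stats = QueryStats()
for _ in range(3):
    with stats.track():
        time.sleep(0.001)  # placeholder for query_engine.query(...)

print(stats.count, stats.max_duration())
```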

---

**Related Documentation:**
- [MCP Server Implementation](./mcp_architecture.md)
- [Modal Deployment Guide](./modal_deployment.md)
- [Agent Reasoning Flow](./agent_reasoning.md)