---
title: Arabic Function Calling Leaderboard
emoji: 🏆
colorFrom: green
colorTo: blue
sdk: gradio
sdk_version: 4.44.0
app_file: app.py
pinned: true
license: apache-2.0
tags:
  - arabic
  - function-calling
  - leaderboard
  - llm-evaluation
---

# 🏆 Arabic Function Calling Leaderboard

لوحة تقييم استدعاء الدوال بالعربية

## Overview

The **Arabic Function Calling Leaderboard (AFCL)** evaluates Large Language Models on their ability to:

1. Understand Arabic queries (MSA + Dialects)
2. Select appropriate functions from available options
3. Extract correct arguments from Arabic text
4. Handle parallel and complex function calls
5. Detect when no function should be called

## Models Evaluated

- **Arabic-Native**: Jais, ALLaM, SILMA, AceGPT
- **Multilingual**: Qwen, Llama, Gemma, Mistral, Phi, BLOOMZ, Aya

## Dataset

📊 **Dataset**: [HeshamHaroon/Arabic_Function_Calling](https://huggingface.co/datasets/HeshamHaroon/Arabic_Function_Calling)

- **1,470 total samples** across 10 categories
- Simple, Multiple, Parallel, Parallel Multiple
- Irrelevance Detection
- Dialect Handling (Egyptian, Gulf, Levantine)

## Evaluation

The leaderboard automatically evaluates models using the HuggingFace Inference API when the Space starts.

## Citation

```bibtex
@misc{afcl2024,
  title={Arabic Function Calling Leaderboard},
  author={Hesham Haroon},
  year={2024},
  url={https://huggingface.co/spaces/HeshamHaroon/Arabic-Function-Calling-Leaderboard}
}
```
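
## Example: Scoring a Function Call (Sketch)

The abilities listed in the Overview (function selection, argument extraction, parallel calls, and irrelevance detection) can be illustrated with a small scoring check. This is a minimal sketch and not the leaderboard's actual scoring code: the call format (`{"name": ..., "arguments": ...}`), the helper names `calls_match` and `score_sample`, and the example functions `get_weather` and `get_time` are all hypothetical, and exact, order-insensitive matching is an assumption.

```python
from typing import Any, Dict, List

# Hypothetical call format: {"name": str, "arguments": dict}
Call = Dict[str, Any]


def calls_match(predicted: Call, expected: Call) -> bool:
    """True when the function name and every argument match exactly."""
    return (
        predicted.get("name") == expected.get("name")
        and predicted.get("arguments", {}) == expected.get("arguments", {})
    )


def score_sample(predicted: List[Call], expected: List[Call]) -> bool:
    """Order-insensitive exact match between predicted and expected calls.

    An empty `expected` list models irrelevance detection: the sample
    passes only if the model made no call at all.
    """
    if len(predicted) != len(expected):
        return False
    remaining = list(expected)
    for call in predicted:
        for i, target in enumerate(remaining):
            if calls_match(call, target):
                del remaining[i]  # each expected call may be matched once
                break
        else:
            return False  # this predicted call matched nothing
    return True


# Query (assumed example): "ما حالة الطقس في القاهرة؟"
expected_simple = [{"name": "get_weather", "arguments": {"city": "القاهرة"}}]
predicted_simple = [{"name": "get_weather", "arguments": {"city": "القاهرة"}}]
simple_ok = score_sample(predicted_simple, expected_simple)

# Parallel calls: the order of the calls should not matter in this sketch.
expected_parallel = [
    {"name": "get_weather", "arguments": {"city": "القاهرة"}},
    {"name": "get_time", "arguments": {"city": "دبي"}},
]
parallel_ok = score_sample(list(reversed(expected_parallel)), expected_parallel)

# Irrelevance: no function should be called, so any call is a failure.
irrelevance_ok = score_sample([], [])
spurious_call_ok = score_sample(predicted_simple, [])
```

Exact matching on arguments is the strictest possible choice; a real evaluator might normalize Arabic text (for example, diacritics or spelling variants) before comparing, which this sketch does not attempt.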