# ConicAI Coding LLM

## Model Details

### Model Description
ConicAI LLM Model is a parameter-efficient fine-tuned coding assistant built using LoRA on top of Qwen2.5-Coder. It is designed to generate, debug, and explain code with structured outputs.
- Developed by: GIRISH KUMAR DEWANGAN
- Model type: Causal Language Model (Code LLM)
- Language(s): Python, general programming
- Used for: Code generation, debugging, error fixing, and scoring outputs (evaluation, hallucination, and relevancy scores)
- License: Apache 2.0
- Finetuned from model: Qwen/Qwen2.5-Coder-0.5B-Instruct
### Model Sources
- Repository: https://huggingface.co/girish00/ConicAI_LLM_model
## Uses

### Direct Use
- Code generation
- Debugging
- Code explanation
- Learning programming
### Downstream Use
- Coding assistants
- AI-based education tools
- Developer productivity tools
### Out-of-Scope Use
- Security-critical systems
- Autonomous production systems
- High-risk environments
## Bias, Risks, and Limitations
- May generate incorrect logic
- Confidence scores are heuristic
- Output depends on prompt quality
- Limited dataset generalization
### Recommendations
- Always validate generated code
- Use structured prompts
- Avoid ambiguous instructions
## Structured Output Framework

The model produces outputs in a structured JSON format:
```json
{
  "code": "...",
  "explanation": "...",
  "confidence": 0.84,
  "relevancy_score": 0.82,
  "hallucination": false
}
```
This enables:
- Easy API integration
- Automated evaluation
- Better interpretability
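Since downstream consumers parse this JSON, a small defensive parser is useful. A minimal sketch, assuming the schema shown above; `parse_structured_output` and its plain-text fallback are illustrative helpers, not part of the released repository:

```python
import json

def parse_structured_output(raw: str) -> dict:
    """Parse the model's structured JSON output, with a plain-text fallback."""
    try:
        return json.loads(raw)
    except json.JSONDecodeError:
        # If the model emits extra text around the JSON, keep the raw string
        # as the code field and leave the scoring fields empty.
        return {"code": raw, "explanation": "", "confidence": None,
                "relevancy_score": None, "hallucination": None}

sample = ('{"code": "print(1)", "explanation": "prints 1", '
          '"confidence": 0.84, "relevancy_score": 0.82, "hallucination": false}')
parsed = parse_structured_output(sample)
print(parsed["confidence"])  # 0.84
```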
## How to Get Started with the Model
```python
# Install dependencies (Colab)
!pip -q install -U transformers peft accelerate huggingface_hub safetensors
!pip install --upgrade torchao

import json
import sys
import time

import torch
from google.colab import userdata
from huggingface_hub import login, snapshot_download
from peft import PeftConfig, PeftModel
from transformers import AutoTokenizer, AutoModelForCausalLM

HF_TOKEN = userdata.get('HF_TOKEN')
model = "girish00/ConicAI_LLM_model"
prompt = input("Enter your prompt: ")

# Download the adapter repository and import its helper functions
login(token=HF_TOKEN)
repo = snapshot_download(model, token=HF_TOKEN)
sys.path.append(repo)
from infer_local import build_instruction_prompt, build_structured_result

# Load the base model and attach the LoRA adapter
cfg = PeftConfig.from_pretrained(repo)
base = cfg.base_model_name_or_path
tokenizer = AutoTokenizer.from_pretrained(base)
base_model = AutoModelForCausalLM.from_pretrained(
    base,
    torch_dtype=torch.float16 if torch.cuda.is_available() else torch.float32,
    device_map="auto",
)
llm = PeftModel.from_pretrained(base_model, repo)
llm.eval()

# Generate with token-level scores so confidence can be estimated
inputs = tokenizer(build_instruction_prompt(prompt), return_tensors="pt").to(llm.device)
start = time.perf_counter()
with torch.no_grad():
    out = llm.generate(
        **inputs,
        max_new_tokens=320,
        output_scores=True,
        return_dict_in_generate=True,
        do_sample=False,
        pad_token_id=tokenizer.eos_token_id,
    )
latency = int((time.perf_counter() - start) * 1000)

gen_ids = out.sequences[0][inputs["input_ids"].shape[1]:].tolist()
text = tokenizer.decode(gen_ids, skip_special_tokens=True)

# Per-token confidence = softmax probability of each generated token
conf = []
for tid, score in zip(gen_ids, out.scores):
    probs = torch.softmax(score[0], dim=-1)
    conf.append(float(probs[tid].item()))

print(json.dumps(
    build_structured_result(
        prompt,
        text,
        latency,
        tokenizer=tokenizer,
        generated_ids=gen_ids,
        token_confidences=conf,
    ),
    indent=2,
))
```
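The script above collects a per-token probability for every generated token. How those values are folded into the single `confidence` field is up to `build_structured_result`; one common heuristic is a geometric mean of the token probabilities, sketched below (illustrative only, not necessarily what the repository's helper does):

```python
import math

def aggregate_confidence(token_probs):
    """Geometric mean of per-token probabilities: one illustrative way
    to turn token-level scores into a single heuristic confidence."""
    if not token_probs:
        return 0.0
    # Sum in log space to avoid underflow on long sequences.
    log_sum = sum(math.log(max(p, 1e-12)) for p in token_probs)
    return math.exp(log_sum / len(token_probs))

print(round(aggregate_confidence([0.9, 0.8, 0.95]), 3))  # 0.881
```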
## Benchmark Results

## Training Details

### Dataset
- Size: ~5K samples
- Instruction-based coding dataset
### Training Procedure
- Method: LoRA fine-tuning
- Framework: Transformers + PEFT
- Precision: FP16 / Mixed
### Training Hyperparameters
| Parameter | Value |
|---|---|
| Epochs | 1–3 |
| Batch Size | 2 |
| Learning Rate | 2e-4 |
| Max Sequence Length | 512 |
| LoRA Rank (r) | 8 |
| LoRA Alpha | 16 |
| LoRA Dropout | 0.05 |
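For reference, these hyperparameters map onto a PEFT `LoraConfig` roughly as follows. This is a sketch: `target_modules` is an assumption, since the card does not state which projection layers were adapted.

```python
from peft import LoraConfig

# Sketch of a LoRA configuration matching the table above.
# NOTE: target_modules is assumed; the adapted layers are not
# stated in the model card.
lora_config = LoraConfig(
    r=8,                                  # LoRA rank
    lora_alpha=16,                        # scaling factor
    lora_dropout=0.05,
    target_modules=["q_proj", "v_proj"],  # assumption
    task_type="CAUSAL_LM",
)
```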
### Inference Configuration
```python
max_new_tokens = 200
temperature = 0.2
top_p = 0.9
do_sample = True
```
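The `top_p = 0.9` setting enables nucleus sampling: at each step, only the smallest set of tokens whose cumulative probability reaches 0.9 is kept, and the next token is drawn from that renormalized set. A self-contained sketch of the filtering step (illustrative; the real implementation lives inside `generate`):

```python
def top_p_filter(probs, top_p=0.9):
    """Keep the smallest set of tokens whose cumulative probability
    reaches top_p, then renormalize the survivors."""
    ranked = sorted(enumerate(probs), key=lambda kv: kv[1], reverse=True)
    kept, total = [], 0.0
    for idx, p in ranked:
        kept.append((idx, p))
        total += p
        if total >= top_p:
            break
    norm = sum(p for _, p in kept)
    return {idx: p / norm for idx, p in kept}

filtered = top_p_filter([0.5, 0.3, 0.15, 0.05], top_p=0.9)
print(sorted(filtered))  # [0, 1, 2] — the token indices that survive
```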
## Evaluation

### Metrics
- Code correctness
- Syntax validity
- Relevancy score
- Hallucination rate
- Confidence score
- Latency
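Of these, syntax validity is the most mechanical to check. One way it could be implemented for Python outputs (an illustrative sketch, not the card's actual evaluation code):

```python
import ast

def syntax_valid(code: str) -> bool:
    """Count a Python snippet as syntactically valid if it parses
    without raising SyntaxError."""
    try:
        ast.parse(code)
        return True
    except SyntaxError:
        return False

print(syntax_valid("def f(x):\n    return x + 1"))  # True
print(syntax_valid("def f(x) return x"))            # False
```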
### Results Summary
- Higher correctness vs base model
- Lower hallucination rate
- Better structured outputs
## Technical Specifications

### Architecture
- Transformer-based causal LM
- LoRA adaptation
### Hardware

- GPU recommended for faster inference
- CPU supported
### Software
- Transformers
- PEFT
- PyTorch
## Environmental Impact
- Low compute due to LoRA
- Efficient fine-tuning
## Citation

BibTeX:

```bibtex
@misc{conicai_llm,
  author = {Girish Kumar Dewangan},
  title = {ConicAI Coding LLM},
  year = {2026},
  publisher = {Hugging Face}
}
```
## Model Card Authors
GIRISH KUMAR DEWANGAN
## Framework versions

- PEFT 0.19.0