rtferraz/domainTokenizer
arxiv: 9 papers
main
domainTokenizer/docs
1 contributor
History: 8 commits
rtferraz
Add e-commerce pre-training report - successful demo, behavioral clusters found, future improvements noted
2b3e3af
verified
10 days ago
adr
Add ADR-002: Dataset selection for Phase 3 demos - research findings, rationale, phased plan
16 days ago
reports
Add e-commerce pre-training report - successful demo, behavioral clusters found, future improvements noted
10 days ago
nubank_nuformer_analysis.md
29.9 kB
Add Nubank nuFormer reverse-engineering analysis - full pipeline reconstruction
16 days ago
phase2_implementation_report.md
19.2 kB
Update implementation report: add Phase 2D, update header to v0.4.0 / 139 tests, update cumulative summary and API
16 days ago
research_report.md
52.8 kB
Add comprehensive research report on domain-specific tokenization
16 days ago