How to use MikhailRepkin/news_classifier with Transformers:
```python
# Use a pipeline as a high-level helper
from transformers import pipeline

pipe = pipeline("text-classification", model="MikhailRepkin/news_classifier")

# Load model directly
from transformers import AutoTokenizer, AutoModelForSequenceClassification

tokenizer = AutoTokenizer.from_pretrained("MikhailRepkin/news_classifier")
model = AutoModelForSequenceClassification.from_pretrained("MikhailRepkin/news_classifier")
```
Model Details
Model Description
News_classifier is a fine-tuned model for binary classification (news / not news) of posts from various Russian-language Telegram channels. It can be integrated into a news aggregation service.
- Model type: Sentence RuBERT (Russian, cased, 12-layer, 768-hidden, 12-heads, 180M parameters)
- Language(s): Russian (ru)
- License: MIT
- Fine-tuned from model: DeepPavlov/rubert-base-cased-sentence
Dataset
- Russian-language Telegram posts
- train/valid/test: 2970/165/165 examples (a 90%/5%/5% split of 3,300 in total)
Training Details
- token max length: 512
- num labels: 2
- batch size: 16
- learning rate: 2e-5
- train epochs: 20
- weight decay: 0.01
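The hyperparameters listed above could be expressed with the Transformers `Trainer` API roughly as follows. This is a sketch, not the author's training script: the output directory `news_classifier_run` is a hypothetical name, mapping "batch size" to `per_device_train_batch_size` assumes single-device training without gradient accumulation, and the step counts assume the final partial batch of each epoch is kept.

```python
import math

# Values taken from the model card.
TRAIN_EXAMPLES = 2970
BATCH_SIZE = 16
LEARNING_RATE = 2e-5
NUM_EPOCHS = 20
WEIGHT_DECAY = 0.01
MAX_LENGTH = 512   # applied when tokenizing (truncation), not in TrainingArguments
NUM_LABELS = 2     # passed as num_labels when loading the classification model

# Optimizer steps implied by these values.
# Assumption: one device, no gradient accumulation, last partial batch kept.
steps_per_epoch = math.ceil(TRAIN_EXAMPLES / BATCH_SIZE)
total_steps = steps_per_epoch * NUM_EPOCHS


def build_training_arguments(output_dir="news_classifier_run"):
    """Build TrainingArguments from the card's values.

    The import is done lazily so the arithmetic above works
    even where transformers is not installed.
    """
    from transformers import TrainingArguments

    return TrainingArguments(
        output_dir=output_dir,  # hypothetical directory name
        per_device_train_batch_size=BATCH_SIZE,
        learning_rate=LEARNING_RATE,
        num_train_epochs=NUM_EPOCHS,
        weight_decay=WEIGHT_DECAY,
    )
```

Under these assumptions one epoch is 186 optimizer steps, so 20 epochs amount to 3,720 steps.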
Metrics:
- Matthews correlation coefficient (evaluation metric used during training): 0.89
- Accuracy: 0.95
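For readers unfamiliar with the Matthews correlation coefficient, the sketch below shows how it and accuracy are computed for a binary task. The label lists are toy data invented for illustration, not the model's predictions, and the helper names are hypothetical. Returning 0.0 when the denominator is zero is a common convention, assumed here.

```python
import math


def confusion_counts(y_true, y_pred):
    """Return (tp, tn, fp, fn) for binary labels, where 1 is the positive class."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 1)
    tn = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 0)
    fp = sum(1 for t, p in zip(y_true, y_pred) if t == 0 and p == 1)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t == 1 and p == 0)
    return tp, tn, fp, fn


def matthews_correlation(y_true, y_pred):
    """MCC = (TP*TN - FP*FN) / sqrt((TP+FP)(TP+FN)(TN+FP)(TN+FN))."""
    tp, tn, fp, fn = confusion_counts(y_true, y_pred)
    denominator = math.sqrt((tp + fp) * (tp + fn) * (tn + fp) * (tn + fn))
    if denominator == 0:
        return 0.0  # convention when any row or column of the confusion matrix is empty
    return (tp * tn - fp * fn) / denominator


def accuracy(y_true, y_pred):
    """Fraction of predictions that match the true labels."""
    return sum(1 for t, p in zip(y_true, y_pred) if t == p) / len(y_true)


# Toy example (1 = news, 0 = not news); these values are illustrative only.
toy_true = [1, 1, 1, 0, 0, 0]
toy_pred = [1, 1, 0, 0, 0, 1]
toy_mcc = matthews_correlation(toy_true, toy_pred)
toy_acc = accuracy(toy_true, toy_pred)
```

Unlike accuracy, the coefficient ranges from -1 to 1 and stays near 0 for a classifier that always predicts the majority class, which makes it informative when the two classes are imbalanced.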
Label Scheme
- LABEL_1 - news
- LABEL_0 - not news
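The label scheme above can be applied to the pipeline's output as in the following sketch. The helper names and the `min_score` threshold are hypothetical and not part of the model. Passing `truncation=True` and `max_length=512` to the pipeline call is an assumption, chosen to match the token limit used in training.

```python
# Mapping taken from the label scheme in this card.
LABEL_NAMES = {"LABEL_1": "news", "LABEL_0": "not news"}


def to_readable(prediction):
    """Replace the raw label in one pipeline result with its readable name."""
    return {"label": LABEL_NAMES[prediction["label"]], "score": prediction["score"]}


def keep_news(texts, predictions, min_score=0.5):
    """Return the texts classified as news with at least min_score confidence.

    min_score is a hypothetical threshold; tune it for your own service.
    """
    return [
        text
        for text, pred in zip(texts, predictions)
        if pred["label"] == "LABEL_1" and pred["score"] >= min_score
    ]


def classify(texts):
    """Run the classifier on a list of texts (downloads the model on first use)."""
    from transformers import pipeline

    pipe = pipeline("text-classification", model="MikhailRepkin/news_classifier")
    # Assumption: truncate long posts to the 512-token limit used in training.
    return pipe(texts, truncation=True, max_length=512)
```

In a news aggregation service, `keep_news(posts, classify(posts))` would then return only the posts the model labels as news.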