Wm-Grsa-Bilingual-Xgboost - Game Review Sentiment Analysis
Model Description
This model performs sentiment analysis on game reviews, classifying them into three categories:
- Positive: Favorable reviews
- Mixed: Neutral or mixed sentiment reviews
- Negative: Unfavorable reviews
Model Type: Wm-Grsa-Bilingual-Xgboost
Training Date: 2025-12-23
Performance
Test Set Metrics
| Metric | Score |
|---|---|
| Accuracy | 0.7793 |
| F1-Score (weighted) | 0.7901 |
| Precision (weighted) | 0.8085 |
| Recall (weighted) | 0.7793 |
Training Information
- Training Time: 274.84 seconds
- Training Samples: 629,884
- Validation Samples: 78,735
- Test Samples: 78,737
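The three counts above add up to 787,356 reviews. As a quick check of the split proportions, the sketch below computes each share from the reported numbers. The variable names are illustrative and not taken from the project code.

```python
# Sample counts as reported in the Training Information list
splits = {
    "train": 629_884,
    "validation": 78_735,
    "test": 78_737,
}

total = sum(splits.values())

# Share of the full dataset held by each split
fractions = {name: count / total for name, count in splits.items()}

for name, fraction in fractions.items():
    print(f"{name}: {fraction:.1%}")
```

This works out to roughly an 80/10/10 split between training, validation, and test data.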
Model Configuration
{
  "model_name": "XGBoost",
  "embedding_model": "Lajavaness/bilingual-embedding-small",
  "n_estimators": 5000,
  "max_depth": 4,
  "learning_rate": 0.1,
  "subsample": 0.8,
  "colsample_bytree": 0.8,
  "subset": 1.0
}
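As a rough illustration of how this configuration might map onto a classifier, the sketch below picks out the keys that are XGBoost hyperparameters and passes them to `XGBClassifier`. It assumes the model was built with xgboost's scikit-learn wrapper, which this card does not state. The helper names `xgb_params` and `build_classifier` are hypothetical. The remaining keys (`model_name`, `embedding_model`, `subset`) are left out because they are not XGBoost constructor arguments.

```python
import json

# Configuration copied from the Model Configuration section
CONFIG_JSON = """
{
  "model_name": "XGBoost",
  "embedding_model": "Lajavaness/bilingual-embedding-small",
  "n_estimators": 5000,
  "max_depth": 4,
  "learning_rate": 0.1,
  "subsample": 0.8,
  "colsample_bytree": 0.8,
  "subset": 1.0
}
"""

# Config keys that are also XGBClassifier constructor arguments
XGB_KEYS = (
    "n_estimators",
    "max_depth",
    "learning_rate",
    "subsample",
    "colsample_bytree",
)


def xgb_params(config):
    """Select only the XGBoost hyperparameters from the config dict."""
    return {key: config[key] for key in XGB_KEYS}


def build_classifier(config):
    """Build an untrained classifier (assumption: sklearn-style wrapper)."""
    from xgboost import XGBClassifier  # requires the xgboost package

    return XGBClassifier(**xgb_params(config))


config = json.loads(CONFIG_JSON)
print(xgb_params(config))
```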
Usage
Loading the Model
from pathlib import Path
import pickle

# Load the model components
model_dir = Path("path/to/model")
with open(model_dir / 'vectorizer.pkl', 'rb') as f:
    vectorizer = pickle.load(f)
with open(model_dir / 'classifier.pkl', 'rb') as f:
    classifier = pickle.load(f)
with open(model_dir / 'label_encoder.pkl', 'rb') as f:
    label_encoder = pickle.load(f)
Making Predictions
# Example reviews
reviews = [
    "This game is absolutely amazing! Best game I've played this year.",
    "It's okay, nothing special but not terrible either.",
    "Terrible game, waste of money and time."
]
# Transform and predict
X = vectorizer.transform(reviews)
predictions_encoded = classifier.predict(X)
predictions = label_encoder.inverse_transform(predictions_encoded)
print(predictions)
# Example output (actual labels depend on the trained model): ['positive', 'mixed', 'negative']
# Get probabilities
probabilities = classifier.predict_proba(X)
print(probabilities)
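The raw probability array is easier to read once each column is paired with its class name. The sketch below shows one way to do that. It assumes the columns of `predict_proba` follow the order of `label_encoder.classes_`, which is the usual case when the classifier was trained on the encoder's output. The helper name and the example numbers are hypothetical stand-ins, not real model output.

```python
def label_probabilities(probabilities, class_names):
    """Pair each row of class probabilities with its class name."""
    return [
        {name: float(p) for name, p in zip(class_names, row)}
        for row in probabilities
    ]


# Hypothetical values standing in for classifier.predict_proba(X)
# and label_encoder.classes_
example_probabilities = [
    [0.10, 0.15, 0.75],
    [0.30, 0.50, 0.20],
]
example_classes = ["mixed", "negative", "positive"]

labelled = label_probabilities(example_probabilities, example_classes)
for row in labelled:
    best = max(row, key=row.get)  # class with the highest probability
    print(best, row[best])
```

With the loaded objects from above, the call would be `label_probabilities(probabilities, label_encoder.classes_)`.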
Per-Class Performance
| Class | Precision | Recall | F1-Score | Support |
|---|---|---|---|---|
| Positive | 0.9192 | 0.8197 | 0.8666 | 45859 |
| Mixed | 0.4403 | 0.6047 | 0.5096 | 12697 |
| Negative | 0.7884 | 0.7972 | 0.7928 | 20181 |
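The overall scores in the Test Set Metrics table appear to be support-weighted averages of these per-class rows. The sketch below recomputes them from the table values as a consistency check. Small differences in the last digit come from the per-class numbers already being rounded to four decimals.

```python
# Per-class rows copied from the table above:
# (precision, recall, f1, support)
per_class = {
    "Positive": (0.9192, 0.8197, 0.8666, 45859),
    "Mixed": (0.4403, 0.6047, 0.5096, 12697),
    "Negative": (0.7884, 0.7972, 0.7928, 20181),
}

total_support = sum(row[3] for row in per_class.values())


def weighted_average(column):
    """Average one metric column, weighting each class by its support."""
    weighted_sum = sum(row[column] * row[3] for row in per_class.values())
    return weighted_sum / total_support


weighted = {
    "precision": weighted_average(0),
    "recall": weighted_average(1),
    "f1": weighted_average(2),
}

for name, value in weighted.items():
    print(f"{name}: {value:.4f}")
```

The supports sum to 78,737, matching the reported test set size, and the weighted recall equals the overall accuracy, as expected for a single-label classifier. The Mixed class, with the lowest precision and F1, is the main drag on the averages.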
Feature Importance
The model identifies important words/phrases for each sentiment class. See results.json for the complete feature importance analysis.
Limitations
- The model is trained specifically on game reviews and may not generalize well to other domains
- Performance may vary on reviews with sarcasm or nuanced sentiments
- The model treats text as bag-of-words and doesn't capture word order
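To make the last limitation concrete, the sketch below builds a bag-of-words count for two made-up reviews that praise and criticize opposite things. It uses a plain whitespace tokenizer as a simplifying assumption, not the project's actual vectorizer.

```python
from collections import Counter


def bag_of_words(text):
    """Lowercase, split on whitespace, and count tokens; order is discarded."""
    return Counter(text.lower().split())


# Two hypothetical reviews with opposite emphasis but identical word counts
review_a = "great story but terrible controls"
review_b = "terrible story but great controls"

same_bag = bag_of_words(review_a) == bag_of_words(review_b)
print(same_bag)  # True
```

Because both reviews produce the same counts, any model that sees only those counts cannot tell them apart.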
Training Details
This model was trained as part of a game review sentiment analysis project. For more information, see the project repository.
Files
- vectorizer.pkl: TF-IDF vectorizer
- classifier.pkl: Trained classifier
- label_encoder.pkl: Label encoder for sentiment classes
- config.json: Model configuration
- results.json: Complete training results and metrics
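Before loading anything, it can help to confirm that a downloaded model directory contains every file listed here. The following is a minimal sketch using only the standard library; `missing_files` is a hypothetical helper, not part of the project.

```python
from pathlib import Path

# File names as listed in the Files section
EXPECTED_FILES = (
    "vectorizer.pkl",
    "classifier.pkl",
    "label_encoder.pkl",
    "config.json",
    "results.json",
)


def missing_files(model_dir):
    """Return the expected file names that are not present in model_dir."""
    model_dir = Path(model_dir)
    return [name for name in EXPECTED_FILES if not (model_dir / name).is_file()]
```

Calling `missing_files("path/to/model")` returns an empty list when the directory is complete.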
Citation
If you use this model, please cite:
@misc{game_review_sentiment,
  author = {Game Review Sentiment Analysis Project},
  title = {Sentiment Analysis Model for Game Reviews},
  year = {2025},
  url = {https://huggingface.co/wm-grsa-bilingual-xgboost}
}