---
language: zh
license: apache-2.0
tags:
- sentiment-analysis
- chinese
- finance
- finbert
- crypto
- text-classification
- news
datasets:
- custom
metrics:
- accuracy
- f1
- precision
- recall
model-index:
- name: Chinese Financial Sentiment Analysis (Crypto)
  results:
  - task:
      type: text-classification
      name: Sentiment Analysis
    metrics:
    - type: accuracy
      value: 0.645
      name: Accuracy
    - type: f1
      value: 0.6365
      name: F1 Score
    - type: precision
      value: 0.6394
      name: Precision
    - type: recall
      value: 0.645
      name: Recall
---

# Chinese Financial Sentiment Analysis Model (Crypto Focus)

## Model Description

This model is fine-tuned from `yiyanghkust/finbert-tone-chinese` for sentiment analysis of Chinese cryptocurrency-related news and social media content. It classifies text into three sentiment categories: Positive, Neutral, and Negative.

## Training Data

- **Size**: 1,000 manually annotated Chinese financial news items
- **Source**: Cryptocurrency-related news and tweets
- **Annotation**: AI-assisted labeling followed by manual correction
- **Distribution**:
  - Positive: 420 (42.0%)
  - Neutral: 420 (42.0%)
  - Negative: 160 (16.0%)
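
Because the negative class makes up only 16% of the data, one common way to compensate during fine-tuning is inverse-frequency class weighting. The sketch below derives such weights from the counts listed above. It is an illustration only: nothing in this card indicates that the released model was trained with a weighted loss, and the helper name `inverse_frequency_weights` is hypothetical.

```python
# Class counts taken from the distribution listed above.
class_counts = {"positive": 420, "neutral": 420, "negative": 160}


def inverse_frequency_weights(counts):
    """Return n_total / (n_classes * n_class) per class, so rarer classes weigh more."""
    total = sum(counts.values())
    n_classes = len(counts)
    return {label: total / (n_classes * n) for label, n in counts.items()}


weights = inverse_frequency_weights(class_counts)
for label, weight in weights.items():
    print(f"{label}: {weight:.3f}")
```

Weights like these could be passed to a loss such as `torch.nn.CrossEntropyLoss(weight=...)`, provided their order matches the model's label indices.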

## Performance Metrics

Performance on a 200-sample test set:

| Metric | Value |
|--------|-------|
| Accuracy | 64.50% |
| F1 Score | 63.65% |
| Precision | 63.94% |
| Recall | 64.50% |
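
The reported recall equals the accuracy, which is what support-weighted averaging always produces, so the figures are consistent with weighted averages. The card does not state the averaging mode, so treat that as an inference. The sketch below shows how weighted metrics can be derived from a confusion matrix. The matrix is hypothetical and does not represent this model's results.

```python
# Hypothetical 3x3 confusion matrix (rows = true class, columns = predicted class).
# These numbers are made up for illustration; they are NOT this model's results.
confusion = [
    [58, 21, 5],   # true positive-sentiment examples
    [23, 54, 7],   # true neutral examples
    [6, 11, 15],   # true negative-sentiment examples
]


def weighted_metrics(cm):
    """Return (accuracy, precision, recall, f1), averaged by class support."""
    n = len(cm)
    total = sum(sum(row) for row in cm)
    accuracy = sum(cm[i][i] for i in range(n)) / total
    precision = recall = f1 = 0.0
    for i in range(n):
        support = sum(cm[i])                          # true examples of class i
        predicted = sum(cm[r][i] for r in range(n))   # predictions of class i
        p = cm[i][i] / predicted if predicted else 0.0
        r = cm[i][i] / support if support else 0.0
        f = 2 * p * r / (p + r) if (p + r) else 0.0
        weight = support / total
        precision += weight * p
        recall += weight * r
        f1 += weight * f
    return accuracy, precision, recall, f1


acc, prec, rec, f1_score = weighted_metrics(confusion)
print(f"accuracy={acc:.4f} precision={prec:.4f} recall={rec:.4f} f1={f1_score:.4f}")
```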

## Usage

### Quick Start

```python
from transformers import AutoTokenizer, AutoModelForSequenceClassification
import torch

# Load the model and tokenizer
model_name = "LocalOptimum/chinese-crypto-sentiment"
tokenizer = AutoTokenizer.from_pretrained(model_name)
model = AutoModelForSequenceClassification.from_pretrained(model_name)

# Text to analyze ("Bitcoin breaks $100,000, setting a new all-time high")
text = "比特币突破10万美元创历史新高"
inputs = tokenizer(text, return_tensors="pt", truncation=True, max_length=128)

# Predict: softmax turns the logits into class probabilities
with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_class = torch.argmax(predictions, dim=-1).item()

# Map the class index to a label (index order as documented in this card)
labels = ['positive', 'neutral', 'negative']
sentiment = labels[predicted_class]
confidence = predictions[0][predicted_class].item()

print(f"Sentiment: {sentiment}")
print(f"Confidence: {confidence:.4f}")
```
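
To make the prediction step explicit, the following pure-Python sketch mirrors what `softmax` and `argmax` do to the raw logits. The logit values are hypothetical, the helper name `classify_from_logits` is an illustration, and the label order is copied from the snippet above. Checking it against `model.config.id2label` on the loaded model is a reasonable precaution.

```python
import math

LABELS = ["positive", "neutral", "negative"]  # order copied from the Quick Start snippet


def classify_from_logits(logits):
    """Apply softmax to the logits, then return the top label and its probability."""
    shift = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - shift) for x in logits]
    total = sum(exps)
    probs = [e / total for e in exps]
    best = max(range(len(probs)), key=probs.__getitem__)
    return LABELS[best], probs[best]


# Hypothetical logits for a single input, not real model output.
label, confidence = classify_from_logits([2.1, 0.3, -1.2])
print(label, f"{confidence:.4f}")
```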

### Batch Processing

```python
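# Reuses `tokenizer`, `model`, and `torch` from the Quick Start snippet.
texts = [
    "币安获得阿布扎比监管授权",    # Binance obtains regulatory authorization in Abu Dhabi
    "以太坊完成Fusaka升级",        # Ethereum completes the Fusaka upgrade
    "某交易所遭攻击损失100万美元"   # An exchange loses $1 million in an attack
]

# padding=True pads every text to the longest one in the batch
inputs = tokenizer(texts, return_tensors="pt", truncation=True,
                   max_length=128, padding=True)

with torch.no_grad():
    outputs = model(**inputs)
    predictions = torch.nn.functional.softmax(outputs.logits, dim=-1)
    predicted_classes = torch.argmax(predictions, dim=-1)

labels = ['positive', 'neutral', 'negative']
for text, pred in zip(texts, predicted_classes):
    print(f"{text} -> {labels[pred.item()]}")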
```
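
For inputs larger than a handful of texts, it usually helps to feed the model in fixed-size chunks rather than one large tensor. The helper below is a minimal sketch of that idea. The name `iter_batches` and the default of 16 (borrowed from the training batch size) are assumptions, not part of the released code.

```python
def iter_batches(items, batch_size=16):
    """Yield consecutive slices of `items` with at most `batch_size` elements."""
    if batch_size <= 0:
        raise ValueError("batch_size must be positive")
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


# Hypothetical input: 40 placeholder headlines.
headlines = [f"headline {i}" for i in range(40)]
batch_sizes = [len(batch) for batch in iter_batches(headlines, batch_size=16)]
print(batch_sizes)
```

Each yielded batch can be passed to the tokenizer call shown above in place of `texts`.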

## Training Configuration

- **Base Model**: yiyanghkust/finbert-tone-chinese
- **Epochs**: 5
- **Batch Size**: 16
- **Learning Rate**: 2e-5
- **Max Sequence Length**: 128
- **Device**: NVIDIA GeForce RTX 3060 Laptop GPU
- **Training Time**: ~5 minutes
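
The hyperparameters above can be collected in a small config object, which also makes it easy to estimate the number of optimizer steps. This is a sketch under stated assumptions: the class `FineTuneConfig` is hypothetical, the 800-example training split is a guess (1,000 labeled items minus the 200-item test set, which the card does not confirm), and the formula assumes a single device with no gradient accumulation.

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class FineTuneConfig:
    """Hyperparameters as listed in this card."""
    base_model: str = "yiyanghkust/finbert-tone-chinese"
    epochs: int = 5
    batch_size: int = 16
    learning_rate: float = 2e-5
    max_length: int = 128


def total_optimizer_steps(num_train_examples, config):
    """Steps = ceil(examples / batch size) * epochs, with no gradient accumulation."""
    steps_per_epoch = math.ceil(num_train_examples / config.batch_size)
    return steps_per_epoch * config.epochs


config = FineTuneConfig()
# Assumption: 800 training examples; the actual split is not documented.
print(total_optimizer_steps(800, config))
```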

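## Use Cases

- ✅ Sentiment analysis of cryptocurrency news
- ✅ Social media sentiment monitoring
- ✅ Financial market sentiment indicators
- ✅ Real-time news sentiment tracking
- ✅ Supporting reference for investment decisions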
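
As one illustration of the market sentiment indicator use case, predicted labels can be aggregated into a single score. The formula below (share of positive minus share of negative) is a simple convention chosen for this sketch, not something defined by the model. The function name and the sample predictions are hypothetical.

```python
from collections import Counter


def sentiment_index(labels):
    """Return (positive share - negative share), a value in [-1, 1]."""
    if not labels:
        return 0.0
    counts = Counter(labels)
    return (counts["positive"] - counts["negative"]) / len(labels)


# Hypothetical predictions for one batch of headlines.
predicted = ["positive", "positive", "neutral", "negative", "positive"]
print(f"{sentiment_index(predicted):+.2f}")
```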

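## Limitations

- ⚠️ Primarily targets financial news in the cryptocurrency domain; performance may degrade in other financial domains
- ⚠️ Negative samples are relatively scarce (16%), so the model may be less sensitive to negative sentiment
- ⚠️ Accuracy may drop on short texts (fewer than 10 characters)
- ⚠️ Supports Simplified Chinese only
- ⚠️ The model does not replace human judgment; treat its output as a reference only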
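
Given these limitations, a downstream pipeline may want to route uncertain predictions to a human. The guard below is a minimal sketch. The 10-character floor comes from the short-text limitation above, while the 0.6 confidence threshold and the function name are arbitrary assumptions to be tuned per application.

```python
def needs_manual_review(text, confidence, min_chars=10, min_confidence=0.6):
    """Flag predictions on very short inputs or with low softmax confidence."""
    return len(text.strip()) < min_chars or confidence < min_confidence


# A 5-character text is flagged regardless of how confident the model is.
print(needs_manual_review("比特币大涨", 0.95))
# A longer headline with high confidence passes through.
print(needs_manual_review("比特币突破10万美元创历史新高", 0.90))
```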

## License

Apache-2.0

## Citation

If you use this model, please cite:

```bibtex
@misc{watchtower-sentiment-2025,
  title={Chinese Financial Sentiment Analysis Model (Crypto Focus)},
  author={Onefly},
  year={2025},
  howpublished={\url{https://huggingface.co/YOUR_USERNAME/sentiment-finetuned-1000}},
  note={Fine-tuned from yiyanghkust/finbert-tone-chinese}
}
```

## Base Model

This model is fine-tuned from:
- [yiyanghkust/finbert-tone-chinese](https://huggingface.co/yiyanghkust/finbert-tone-chinese)

Thanks to the original authors for their contribution!

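## Changelog

### v2.0 (2025-12-09)
- ✅ Expanded the training data to 1,000 items
- ✅ Corrected annotation errors to improve data quality
- ✅ Rebalanced the class distribution to improve model balance
- ✅ F1 score improved by about 2 percentage points (0.6165 → 0.6365)

### v1.0 (Initial Release)
- Initial version based on 500 annotated items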

## Contact

For questions or suggestions, feel free to open an issue or PR.

---

**Maintainer**: Onefly
**Last Updated**: 2025-12-09