---
license: mit
tags:
- graph-neural-networks
- gnn
- mathematical-proofs
- curriculum-learning
- pytorch
language:
- en
library_name: pytorch
---
# Merge2Docs Models
This repository contains pre-trained models for [Merge2Docs](https://github.com/pechang03/merge2docs), a research project that combines LLMs with graph algorithms for document merging.
## Models
Total size: **4.38 MB**
### sparse_hierarchical_net_v1.pth
- **Type**: GNN
- **Size**: 0.375 MB
- **Description**: Sparse hierarchical GNN for analyzing hierarchical document structure
- **Path**: `sparse_hierarchical_net_v1.pth`
### curriculum_ultimate_model.pth
- **Type**: Math/Curriculum
- **Size**: 4.0 MB
- **Description**: Curriculum learning model for mathematical proof routing
- **Path**: `curriculum_ultimate_model.pth`
## Usage
### Download models
```python
from huggingface_hub import hf_hub_download

# Download GNN model
gnn_model_path = hf_hub_download(
    repo_id="peterliu06/merge2docs-models",
    filename="sparse_hierarchical_net_v1.pth",
)

# Download curriculum model
math_model_path = hf_hub_download(
    repo_id="peterliu06/merge2docs-models",
    filename="curriculum_ultimate_model.pth",
)
```
### Load in PyTorch
```python
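import torch

# Load on CPU so no GPU is required. Depending on how each file was saved,
# torch.load returns either a state dict or a full pickled model; recent
# PyTorch versions default to weights_only=True, which rejects the latter.
# Load GNN model
gnn_model = torch.load(gnn_model_path, map_location="cpu")
# Load curriculum model
math_model = torch.load(math_model_path, map_location="cpu")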
```
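This README does not say whether each `.pth` file holds a bare state dict, a wrapped checkpoint, or a full pickled model, so it can help to inspect the loaded object before using it. The helper below is a minimal sketch: `summarize_checkpoint` is a hypothetical name, and the wrapper keys `state_dict` and `model_state_dict` are common conventions assumed here, not confirmed for these files.

```python
from collections.abc import Mapping


def _shapes(state):
    # Record the shape of every entry that looks like a tensor.
    return {name: tuple(v.shape) for name, v in state.items() if hasattr(v, "shape")}


def summarize_checkpoint(obj):
    """Classify a loaded checkpoint and list tensor shapes where possible."""
    # A full pickled model exposes a callable state_dict().
    if callable(getattr(obj, "state_dict", None)):
        return {"kind": "module", "shapes": _shapes(obj.state_dict())}
    if isinstance(obj, Mapping):
        # Some training scripts nest the weights under a wrapper key
        # (assumed names; adjust to whatever the file actually contains).
        for key in ("state_dict", "model_state_dict"):
            if isinstance(obj.get(key), Mapping):
                return {"kind": "wrapped_state_dict", "shapes": _shapes(obj[key])}
        return {"kind": "state_dict", "shapes": _shapes(obj)}
    return {"kind": "unknown", "shapes": {}}
```

Calling `summarize_checkpoint(torch.load(gnn_model_path, map_location="cpu"))` would then report which case applies. If the result is a state dict, the matching model class from the Merge2Docs code base is still needed to rebuild the network.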
## Model Details
### Sparse Hierarchical GNN (sparse_hierarchical_net_v1.pth)
A Graph Neural Network designed for hierarchical document structure analysis.
**Architecture:**
- Sparse graph representation
- Hierarchical message passing
- Optimized for document-level relationships
**Training:**
- Dataset: Document graph structures
- Framework: PyTorch + PyTorch Geometric
- Hardware: GPU-accelerated training
### Curriculum Ultimate Model (curriculum_ultimate_model.pth)
A curriculum learning-based model for mathematical proof routing and validation.
**Architecture:**
- Progressive difficulty learning
- Multi-task proof validation
- Curriculum-based training strategy
**Training:**
- Dataset: Mathematical proofs and reasoning tasks
- Framework: PyTorch
- Approach: Curriculum learning with increasing complexity
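As a generic illustration of curriculum learning with increasing complexity, the sketch below sorts examples by a difficulty score and exposes progressively larger pools. It is not the training procedure used for this model: `curriculum_stages`, the difficulty function, and the number of stages are all hypothetical.

```python
import math


def curriculum_stages(examples, difficulty, num_stages=3):
    """Yield training pools that grow from the easiest to the hardest examples."""
    ordered = sorted(examples, key=difficulty)
    for stage in range(1, num_stages + 1):
        # Each stage exposes a larger prefix of the difficulty-sorted data.
        cutoff = math.ceil(len(ordered) * stage / num_stages)
        yield ordered[:cutoff]


# Toy usage: string length stands in for proof difficulty.
for pool in curriculum_stages(["aaaa", "a", "aaa", "aa", "aaaaa"], len):
    print(len(pool), "examples in this stage")
```

A real schedule would train for some number of steps on each pool before moving to the next one.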
## Integration with Merge2Docs
To use these models with Merge2Docs, set the model paths in your `.env_m2d` file:
```bash
# Local paths to the models downloaded from Hugging Face
GNN_MODEL_PATH="./models/sparse_hierarchical_net_v1.pth"
MATH_MODEL_PATH="./models/curriculum_ultimate_model.pth"
```
Then download the models:
```bash
cd merge2docs
python scripts/download_models_from_huggingface.py
```
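After the download step, a quick check that both configured paths point at real files can catch a misconfigured `.env_m2d` early. This is a minimal sketch using only the standard library: `read_env_file` and `missing_models` are hypothetical helpers, not part of Merge2Docs, and the parser assumes simple `KEY="value"` lines like the ones shown above.

```python
from pathlib import Path


def read_env_file(text):
    """Parse simple KEY="value" lines, skipping blanks and comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_models(env, keys=("GNN_MODEL_PATH", "MATH_MODEL_PATH")):
    """Return the keys that are unset or do not point at an existing file."""
    return [k for k in keys if k not in env or not Path(env[k]).is_file()]
```

For example, `missing_models(read_env_file(Path(".env_m2d").read_text()))` returns an empty list when both model files are in place.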
## Citation
If you use these models in your research, please cite:
```bibtex
@software{merge2docs_models,
  author    = {Chang, Peter},
  title     = {Merge2Docs Pre-trained Models},
  year      = {2025},
  publisher = {Hugging Face},
  url       = {https://huggingface.co/peterliu06/merge2docs-models}
}
```
## License
MIT License. See [LICENSE](https://github.com/pechang03/merge2docs/blob/main/LICENSE).
## Project Links
- **GitHub**: https://github.com/pechang03/merge2docs
- **Documentation**: See project README
- **Issues**: https://github.com/pechang03/merge2docs/issues
## Model Updates
Models are periodically retrained and updated. Check this repository's commit history on Hugging Face for version information.
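Because the files can change between retraining runs, reproducible experiments may want to pin a specific revision. `hf_hub_download` accepts a `revision` argument (a branch name, tag, or commit hash). The sketch below wraps that call; `pinned_download_args` and `download_pinned` are hypothetical helper names, and the revision value is whatever commit you choose from the history.

```python
def pinned_download_args(filename, revision="main"):
    """Build keyword arguments for hf_hub_download pinned to one revision."""
    return {
        "repo_id": "peterliu06/merge2docs-models",
        "filename": filename,
        "revision": revision,  # branch, tag, or commit hash
    }


def download_pinned(filename, revision="main"):
    """Download one model file at a fixed revision and return its local path."""
    # Imported lazily so the argument helper works without the package installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(**pinned_download_args(filename, revision))
```

Recording the revision alongside experiment results makes it possible to fetch exactly the same weights later.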