---
license: mit
tags:
- graph-neural-networks
- gnn
- mathematical-proofs
- curriculum-learning
- pytorch
language:
- en
library_name: pytorch
---

# Merge2Docs Models

This repository contains pre-trained models for [Merge2Docs](https://github.com/pechang03/merge2docs), a research project that combines LLMs with graph algorithms for document merging.

## Models

Total size: **4.38 MB**

### sparse_hierarchical_net_v1.pth

- **Type**: GNN
- **Size**: 0.375 MB
- **Description**: Sparse hierarchical GNN for hierarchical document structure analysis
- **Path**: `sparse_hierarchical_net_v1.pth`

### curriculum_ultimate_model.pth

- **Type**: Math/Curriculum
- **Size**: 4.0 MB
- **Description**: Curriculum learning model for mathematical proof routing
- **Path**: `curriculum_ultimate_model.pth`

## Usage

### Download models

```python
from huggingface_hub import hf_hub_download

# Download GNN model
gnn_model_path = hf_hub_download(
    repo_id="peterliu06/merge2docs-models",
    filename="sparse_hierarchical_net_v1.pth"
)

# Download curriculum model
math_model_path = hf_hub_download(
    repo_id="peterliu06/merge2docs-models",
    filename="curriculum_ultimate_model.pth"
)
```

### Load in PyTorch

```python
import torch

# map_location="cpu" lets the files load on machines without a GPU.
# torch.load returns whatever object was saved; if a file holds a state_dict
# rather than a full module, pass it to the matching model's load_state_dict().

# Load GNN model
gnn_model = torch.load(gnn_model_path, map_location="cpu")

# Load curriculum model
math_model = torch.load(math_model_path, map_location="cpu")
```

## Model Details

### Sparse Hierarchical GNN (sparse_hierarchical_net_v1.pth)

A Graph Neural Network designed for hierarchical document structure analysis.

**Architecture:**
- Sparse graph representation
- Hierarchical message passing
- Optimized for document-level relationships

**Training:**
- Dataset: Document graph structures
- Framework: PyTorch + PyTorch Geometric
- Hardware: GPU-accelerated training

### Curriculum Ultimate Model (curriculum_ultimate_model.pth)

A curriculum-learning model for mathematical proof routing and validation.

**Architecture:**
- Progressive difficulty learning
- Multi-task proof validation
- Curriculum-based training strategy

**Training:**
- Dataset: Mathematical proofs and reasoning tasks
- Framework: PyTorch
- Approach: Curriculum learning with increasing complexity

## Integration with Merge2Docs

To use these models with Merge2Docs, update your `.env_m2d`:

```bash
# Paths to the model files downloaded from Hugging Face
GNN_MODEL_PATH="./models/sparse_hierarchical_net_v1.pth"
MATH_MODEL_PATH="./models/curriculum_ultimate_model.pth"
```

Then download the models:

```bash
cd merge2docs
python scripts/download_models_from_huggingface.py
```

## Citation

If you use these models in your research, please cite:

```bibtex
@software{merge2docs_models,
  author = {Chang, Peter},
  title = {Merge2Docs Pre-trained Models},
  year = {2025},
  publisher = {Hugging Face},
  url = {https://huggingface.co/peterliu06/merge2docs-models}
}
```

## License

MIT License - See [LICENSE](https://github.com/pechang03/merge2docs/blob/main/LICENSE)

## Project Links

- **GitHub**: https://github.com/pechang03/merge2docs
- **Documentation**: See project README
- **Issues**: https://github.com/pechang03/merge2docs/issues

## Model Updates

Models are periodically retrained and updated. Check the repository's commit history for version information.
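
Because the files may change between retrainings, it can help to pin downloads to a known revision so results stay reproducible. The following is a minimal sketch, not part of the Merge2Docs codebase: the helper names `download_kwargs` and `download_pinned` are hypothetical, and it relies only on the `revision` argument of `hf_hub_download`, which accepts a branch name, tag, or commit hash.

```python
from typing import Optional

# Repository ID as used in the download examples in this README.
REPO_ID = "peterliu06/merge2docs-models"


def download_kwargs(filename: str, revision: Optional[str] = None) -> dict:
    """Build keyword arguments for hf_hub_download (hypothetical helper)."""
    kwargs = {"repo_id": REPO_ID, "filename": filename}
    if revision is not None:
        # A branch name, tag, or full commit hash from the model repository.
        kwargs["revision"] = revision
    return kwargs


def download_pinned(filename: str, revision: Optional[str] = None) -> str:
    """Download one model file, optionally pinned to a revision."""
    # Imported lazily so the helper above works without huggingface_hub installed.
    from huggingface_hub import hf_hub_download

    return hf_hub_download(**download_kwargs(filename, revision))


# Example (requires network access); replace "main" with a commit hash
# taken from the repository history to pin an exact version:
# path = download_pinned("sparse_hierarchical_net_v1.pth", revision="main")
```

Pinning to a commit hash rather than a branch means a later retraining will not silently change the weights a downstream experiment loads.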