	---
	license: mit
	tags:
	- graph-neural-networks
	- gnn
	- mathematical-proofs
	- curriculum-learning
	- pytorch
	language:
	- en
	library_name: pytorch
	---

	# Merge2Docs Models

	This repository contains pre-trained models for [Merge2Docs](https://github.com/pechang03/merge2docs), a research project that combines LLMs with graph algorithms to merge documents.

	## Models

	Total size: 4.38 MB


	### sparse_hierarchical_net_v1.pth

	- Type: GNN
	- Size: 0.375 MB
	- Description: Sparse hierarchical GNN for analyzing hierarchical document structure
	- Path: `sparse_hierarchical_net_v1.pth`


	### curriculum_ultimate_model.pth

	- Type: Math/Curriculum
	- Size: 4.0 MB
	- Description: Curriculum learning model for mathematical proof routing
	- Path: `curriculum_ultimate_model.pth`


	## Usage

	### Download models

	```python
	from huggingface_hub import hf_hub_download

	# Download GNN model
	gnn_model_path = hf_hub_download(
	    repo_id="peterliu06/merge2docs-models",
	    filename="sparse_hierarchical_net_v1.pth",
	)

	# Download curriculum model
	math_model_path = hf_hub_download(
	    repo_id="peterliu06/merge2docs-models",
	    filename="curriculum_ultimate_model.pth",
	)
	```

	### Load in PyTorch

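	```python
	import torch

	# map_location="cpu" lets the checkpoints load on machines without a GPU.
	# Each file may hold a state_dict or a pickled module; inspect the result
	# before use.

	# Load GNN model
	gnn_model = torch.load(gnn_model_path, map_location="cpu")

	# Load curriculum model
	math_model = torch.load(math_model_path, map_location="cpu")
	```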
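
This card does not say whether each `.pth` file stores a bare `state_dict`, a wrapper dictionary, or a pickled module. The helper below is a minimal sketch for checking what `torch.load` returned before wiring it into a model. `summarize_checkpoint` is a hypothetical name, and the nested `"state_dict"` key is an assumed convention, not something this repository documents.

```python
from typing import Any, Dict


def summarize_checkpoint(obj: Any) -> Dict[str, Any]:
    """Classify a loaded checkpoint and list tensor shapes when it is dict-like."""
    if isinstance(obj, dict):
        # Assumption: some checkpoints nest the weights under "state_dict".
        state = obj.get("state_dict", obj)
        if not isinstance(state, dict):
            state = obj
        # Keep only entries that look like tensors (anything exposing .shape).
        shapes = {k: tuple(v.shape) for k, v in state.items() if hasattr(v, "shape")}
        return {"kind": "state_dict", "shapes": shapes}
    # Anything else is assumed to be a pickled module or other object.
    return {"kind": type(obj).__name__, "shapes": {}}
```

For example, `summarize_checkpoint(torch.load(gnn_model_path, map_location="cpu"))` reports the parameter names and shapes if the file is dictionary-based.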

	## Model Details

	### Sparse Hierarchical GNN (sparse_hierarchical_net_v1.pth)

	A Graph Neural Network designed for hierarchical document structure analysis.

	Architecture:
	- Sparse graph representation
	- Hierarchical message passing
	- Optimized for document-level relationships

	Training:
	- Dataset: Document graph structures
	- Framework: PyTorch + PyTorch Geometric
	- Hardware: GPU-accelerated training
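
To make the terms "sparse graph representation" and "hierarchical message passing" concrete, here is a generic, dependency-free sketch. It is not the architecture stored in `sparse_hierarchical_net_v1.pth`. The mean aggregation, the equal-weight update, and the names `message_pass` and `pool` are all illustrative assumptions.

```python
from collections import defaultdict
from typing import Dict, Hashable, List, Tuple

Edge = Tuple[int, int]  # (source node, target node)


def message_pass(features: Dict[int, List[float]], edges: List[Edge]) -> Dict[int, List[float]]:
    """One round of mean aggregation over an edge list.

    Only edges that exist are visited, which is what makes the representation sparse.
    """
    inbox = defaultdict(list)
    for src, dst in edges:
        inbox[dst].append(features[src])
    updated = {}
    for node, own in features.items():
        msgs = inbox.get(node)
        if not msgs:
            updated[node] = list(own)  # a node with no incoming edges keeps its features
            continue
        mean = [sum(col) / len(msgs) for col in zip(*msgs)]
        updated[node] = [(a + b) / 2 for a, b in zip(own, mean)]
    return updated


def pool(features: Dict[int, List[float]], parent_of: Dict[int, Hashable]) -> Dict[Hashable, List[float]]:
    """Average child features into their parent (e.g. paragraphs into a section)."""
    groups = defaultdict(list)
    for node, vec in features.items():
        groups[parent_of[node]].append(vec)
    return {p: [sum(col) / len(vecs) for col in zip(*vecs)] for p, vecs in groups.items()}
```

Alternating `message_pass` within a level and `pool` between levels is one common way to propagate information up a document hierarchy.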

	### Curriculum Ultimate Model (curriculum_ultimate_model.pth)

	A curriculum learning-based model for mathematical proof routing and validation.

	Architecture:
	- Progressive difficulty learning
	- Multi-task proof validation
	- Curriculum-based training strategy

	Training:
	- Dataset: Mathematical proofs and reasoning tasks
	- Framework: PyTorch
	- Approach: Curriculum learning with increasing complexity
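
"Curriculum learning with increasing complexity" can be pictured as training on a pool that starts with the easiest samples and grows stage by stage. The sketch below shows that general idea only. The difficulty scores, the equal-sized stages, and the name `curriculum_stages` are assumptions for illustration and do not describe how `curriculum_ultimate_model.pth` was actually trained.

```python
from typing import Iterator, List, Sequence, Tuple

Sample = Tuple[str, float]  # (sample id, difficulty score; higher means harder)


def curriculum_stages(samples: Sequence[Sample], num_stages: int) -> Iterator[List[str]]:
    """Yield growing training pools, easiest samples first."""
    ordered = sorted(samples, key=lambda s: s[1])
    for stage in range(1, num_stages + 1):
        # Ceiling division so the final stage always covers every sample.
        cutoff = (len(ordered) * stage + num_stages - 1) // num_stages
        yield [sample_id for sample_id, _ in ordered[:cutoff]]
```

Each yielded list would feed one training phase, so later phases revisit easy samples alongside harder ones.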

	## Integration with Merge2Docs

	To use these models with Merge2Docs, update your `.env_m2d`:

	```bash
	# Local paths to the models downloaded from Hugging Face
	GNN_MODEL_PATH="./models/sparse_hierarchical_net_v1.pth"
	MATH_MODEL_PATH="./models/curriculum_ultimate_model.pth"
	```

	Then download the models:

	```bash
	cd merge2docs
	python scripts/download_models_from_huggingface.py
	```
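
After the download, it can help to confirm that the paths in `.env_m2d` point at real files. The following is a minimal sketch using only the standard library. It assumes `.env_m2d` holds simple `KEY="value"` lines as shown above. `read_env_file` and `missing_models` are hypothetical helpers, not part of Merge2Docs.

```python
from pathlib import Path
from typing import Dict, List, Sequence


def read_env_file(path: str) -> Dict[str, str]:
    """Parse simple KEY="value" lines, skipping blank lines and # comments."""
    values = {}
    for line in Path(path).read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def missing_models(
    env: Dict[str, str],
    keys: Sequence[str] = ("GNN_MODEL_PATH", "MATH_MODEL_PATH"),
) -> List[str]:
    """Return the keys that are unset or whose file does not exist on disk."""
    return [k for k in keys if k not in env or not Path(env[k]).is_file()]
```

Calling `missing_models(read_env_file(".env_m2d"))` returns an empty list when both model files are in place.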

	## Citation

	If you use these models in your research, please cite:

	```bibtex
	@software{merge2docs_models,
	  author = {Chang, Peter},
	  title = {Merge2Docs Pre-trained Models},
	  year = {2025},
	  publisher = {Hugging Face},
	  url = {https://huggingface.co/peterliu06/merge2docs-models}
	}
	```

	## License

	MIT License - See [LICENSE](https://github.com/pechang03/merge2docs/blob/main/LICENSE)

	## Project Links

	- GitHub: https://github.com/pechang03/merge2docs
	- Documentation: See project README
	- Issues: https://github.com/pechang03/merge2docs/issues

	## Model Updates

	Models are periodically retrained and updated. Check the git history for version information.
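
Because the files can change between retrainings, pinning a revision keeps results reproducible. The sketch below passes the `revision` argument of `hf_hub_download` for every file. `fetch_models` and its `download` parameter are hypothetical, and `"main"` is only a placeholder default. Replace it with a full commit hash taken from the repository history.

```python
from typing import Callable, Dict, Optional

REPO_ID = "peterliu06/merge2docs-models"
MODEL_FILES = ("sparse_hierarchical_net_v1.pth", "curriculum_ultimate_model.pth")


def fetch_models(
    revision: str = "main",
    download: Optional[Callable[..., str]] = None,
) -> Dict[str, str]:
    """Download every model file at one fixed revision; return filename -> local path."""
    if download is None:
        # Imported lazily so the helper can be exercised without the library installed.
        from huggingface_hub import hf_hub_download as download
    return {
        name: download(repo_id=REPO_ID, filename=name, revision=revision)
        for name in MODEL_FILES
    }
```

The optional `download` argument exists only so the function can be tested with a stand-in. In normal use, call `fetch_models(revision="<full commit hash>")`.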