128-dim GraphConv model trained for edge prediction with a dot-product head
This model was trained using Napistu-Torch, a PyTorch framework for training graph neural networks on biological pathway networks.
The dataset used for training is the 8-source "Octopus" human consensus network, which integrates pathway data from STRING, OmniPath, Reactome, and others. The network encompasses ~50K genes, metabolites, and complexes connected by ~8M interactions.
Task
This model performs edge prediction on biological pathway networks. Given node embeddings, the model predicts the likelihood of edges (interactions) between biological entities such as genes, proteins, and metabolites. This is useful for:
- Discovering novel biological interactions
- Validating experimentally observed interactions
- Completing incomplete pathway databases
- Predicting functional relationships between genes/proteins
The model scores candidate edges from learned embeddings of their source and target nodes, optionally incorporating relation types for relation-aware prediction.
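The scoring idea described above can be sketched in plain Python. This is a minimal illustration, not the Napistu-Torch implementation: the function names and toy embeddings are hypothetical, and passing the raw score through a sigmoid is an assumption (a common choice for dot-product heads, but not confirmed by this card).

```python
import math


def dot_product_score(z_src, z_dst):
    """Raw edge score: inner product of the two node embeddings."""
    if len(z_src) != len(z_dst):
        raise ValueError("embeddings must have the same dimension")
    return sum(a * b for a, b in zip(z_src, z_dst))


def edge_probability(z_src, z_dst):
    """Map the raw score to (0, 1) with a logistic sigmoid (assumed, see lead-in)."""
    return 1.0 / (1.0 + math.exp(-dot_product_score(z_src, z_dst)))


# Toy 4-dimensional embeddings; the real model uses 128 dimensions.
embeddings = {
    "gene_a": [0.9, 0.1, -0.3, 0.5],
    "gene_b": [0.8, 0.2, -0.1, 0.4],
    "metabolite_x": [-0.7, 0.3, 0.6, -0.2],
}

p_similar = edge_probability(embeddings["gene_a"], embeddings["gene_b"])
p_dissimilar = edge_probability(embeddings["gene_a"], embeddings["metabolite_x"])
print(f"gene_a -> gene_b: {p_similar:.3f}")
print(f"gene_a -> metabolite_x: {p_dissimilar:.3f}")
```

A plain dot product is symmetric in its two arguments, so this sketch assigns the same score to an edge in either direction.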
Model Description
- Encoder
  - Type: graph_conv
  - Hidden Channels: 128
  - Number of Layers: 3
  - Dropout: 0.2
  - Edge Encoder: yes (dim=32)
- Head
  - Type: dot_product
  - Relation-Aware: see config.json
Training Date: 2025-12-04
For detailed experiment and training settings see this repository's config.json file.
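To make the encoder type listed above concrete, here is a single message-passing step in plain Python. It assumes that graph_conv corresponds to the GraphConv operator from PyTorch Geometric, where each node combines its own features with the sum of its neighbors' features through two separate weight matrices. The sketch omits bias terms, edge weights, nonlinearities, and dropout, and all names are hypothetical.

```python
def matvec(weight, vec):
    """Multiply a (rows x cols) matrix by a vector of length cols."""
    return [sum(w * x for w, x in zip(row, vec)) for row in weight]


def graph_conv_layer(features, edges, w_self, w_neigh):
    """One step of x_i' = W_self x_i + W_neigh * sum_j x_j over edges j -> i."""
    dim = len(features[0])
    # Sum the features of all source nodes pointing at each target node.
    aggregated = [[0.0] * dim for _ in features]
    for src, dst in edges:
        for k in range(dim):
            aggregated[dst][k] += features[src][k]
    out = []
    for own, neigh in zip(features, aggregated):
        self_term = matvec(w_self, own)
        neigh_term = matvec(w_neigh, neigh)
        out.append([a + b for a, b in zip(self_term, neigh_term)])
    return out


# Three nodes with 2-dimensional features and three directed edges.
features = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
edges = [(0, 1), (2, 1), (1, 0)]
identity = [[1.0, 0.0], [0.0, 1.0]]

updated = graph_conv_layer(features, edges, identity, identity)
print(updated)
```

The model described here would stack three such layers with 128 hidden channels; the toy sizes above are only for readability.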
Performance
| Metric | Value |
|---|---|
| Validation AUC | 0.7957 |
| Test AUC | 0.7964 |
| Validation AP | 0.7938 |
| Test AP | 0.7947 |
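For readers unfamiliar with the two metrics in the table, the sketch below computes them from their standard definitions in plain Python. How Napistu-Torch samples negative edges and computes these values is not stated in this card, so the labels and scores here are made-up toy data, and the average-precision function assumes there are no tied scores.

```python
def roc_auc(labels, scores):
    """Probability that a random positive outscores a random negative (ties count half)."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative example")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


def average_precision(labels, scores):
    """Mean of the precision values measured at the rank of each positive."""
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    hits = 0
    precisions = []
    for rank, (_, label) in enumerate(ranked, start=1):
        if label == 1:
            hits += 1
            precisions.append(hits / rank)
    if not precisions:
        raise ValueError("need at least one positive example")
    return sum(precisions) / len(precisions)


# Toy data: 1 = true edge, 0 = sampled non-edge.
labels = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.3, 0.2]

auc = roc_auc(labels, scores)
ap = average_precision(labels, scores)
print(f"AUC={auc:.4f} AP={ap:.4f}")
```

Both metrics depend only on the ranking of the scores, not on their absolute values.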
Links
- W&B Run
- GitHub Repository
- Read the Docs
- Napistu Wiki
Usage
1. Setup Environment
To reproduce the environment used for training, run the following commands:
pip install torch==2.8.0
pip install torch-scatter torch-sparse -f https://data.pyg.org/whl/torch-2.8.0+cpu.html
pip install 'napistu==0.8.2'
pip install 'napistu-torch[pyg,lightning]==0.2.13'
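After installing, it can be worth confirming that the pinned versions are the ones actually present. The snippet below is a small stdlib-only check, not part of Napistu-Torch; the helper name is hypothetical, and it assumes the installed distribution names match the names used in the pip commands above.

```python
from importlib.metadata import PackageNotFoundError, version

# Versions pinned in the pip commands above.
PINNED = {"torch": "2.8.0", "napistu": "0.8.2", "napistu-torch": "0.2.13"}


def check_pins(pins):
    """Return {package: (expected, installed_or_None)} for every mismatch."""
    mismatches = {}
    for name, expected in pins.items():
        try:
            installed = version(name)
        except PackageNotFoundError:
            installed = None
        # A local build tag such as "2.8.0+cpu" still counts as a match.
        if installed is None or installed.split("+")[0] != expected:
            mismatches[name] = (expected, installed)
    return mismatches


problems = check_pins(PINNED)
for name, (expected, installed) in problems.items():
    print(f"{name}: expected {expected}, found {installed or 'not installed'}")
if not problems:
    print("All pinned packages match.")
```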
2. Setup Data Store
First, download the Octopus consensus network data to create a local NapistuDataStore:
from napistu_torch.load.gcs import gcs_model_to_store

# Download data and create store
napistu_data_store = gcs_model_to_store(
    napistu_data_dir="path/to/napistu_data",
    store_dir="path/to/store",
    asset_name="human_consensus",
    # Pin to stable version for reproducibility
    asset_version="20250923",
)
3. Load Pretrained Model from HuggingFace Hub
from napistu_torch.ml.hugging_face import HuggingFaceLoader
# Load checkpoint
loader = HuggingFaceLoader("seanhacks/edge_prediction_dotprod_128e")
checkpoint = loader.load_checkpoint()
# Load config to reproduce experiment
experiment_config = loader.load_config()
4. Use Pretrained Model for Training
You can use this pretrained model as initialization for training via the CLI:
# Create a training config that uses the pretrained model
cat > my_config.yaml << EOF
name: my_finetuned_model
model:
  use_pretrained_model: true
  pretrained_model_source: huggingface
  pretrained_model_path: seanhacks/edge_prediction_dotprod_128e
  pretrained_model_freeze_encoder_weights: false  # Allow fine-tuning
data:
  sbml_dfs_path: path/to/sbml_dfs.pkl
  napistu_graph_path: path/to/graph.pkl
  napistu_data_name: edge_prediction
training:
  epochs: 100
  lr: 0.001
EOF
# Train with pretrained weights
napistu-torch train my_config.yaml
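When several configs are needed, for example to compare a frozen encoder against a fine-tuned one, the same file can be rendered from Python instead of a heredoc. This is a convenience sketch only: the helper name and its parameters are hypothetical, and the keys are copied from the config above rather than from the Napistu-Torch config schema.

```python
def render_finetune_config(name, sbml_dfs_path, napistu_graph_path,
                           freeze_encoder=False, epochs=100, lr=0.001):
    """Render a training config with the same keys as the heredoc example."""
    lines = [
        f"name: {name}",
        "model:",
        "  use_pretrained_model: true",
        "  pretrained_model_source: huggingface",
        "  pretrained_model_path: seanhacks/edge_prediction_dotprod_128e",
        # YAML booleans are lowercase, unlike Python's True/False.
        f"  pretrained_model_freeze_encoder_weights: {str(freeze_encoder).lower()}",
        "data:",
        f"  sbml_dfs_path: {sbml_dfs_path}",
        f"  napistu_graph_path: {napistu_graph_path}",
        "  napistu_data_name: edge_prediction",
        "training:",
        f"  epochs: {epochs}",
        f"  lr: {lr}",
    ]
    return "\n".join(lines) + "\n"


config_text = render_finetune_config(
    name="my_frozen_model",
    sbml_dfs_path="path/to/sbml_dfs.pkl",
    napistu_graph_path="path/to/graph.pkl",
    freeze_encoder=True,
)
print(config_text)
```

The rendered text can then be written to a file and passed to `napistu-torch train` as shown above.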
Citation
If you use this model, please cite:
@software{napistu_torch,
title = {Napistu-Torch: Graph Neural Networks for Biological Pathway Analysis},
author = {Hackett, Sean R.},
url = {https://github.com/napistu/Napistu-Torch},
year = {2025},
note = {Model: graph_conv-dot_product_h128_l3_edge_prediction}
}
License
MIT License - See LICENSE for details.