wide-big-reduced

This model is a GPN (Genomic Pre-trained Network) trained on the sbuedenb/big_beetle_dataset dataset. It achieves the following results on the evaluation set:

  • Loss: 1.1583
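
To put the loss in context, the sketch below converts it to a perplexity and compares it with a uniform guess over four nucleotides. It assumes the reported loss is a mean cross-entropy in nats over A, C, G and T, which the card does not state explicitly.

```python
import math

# Evaluation loss reported in this card.
eval_loss = 1.1583

# Assumption: the loss is a mean cross-entropy in nats.
perplexity = math.exp(eval_loss)

# Cross-entropy of a uniform guess over the four nucleotides A, C, G, T.
uniform_loss = math.log(4)

print(f"perplexity: {perplexity:.2f}")
print(f"uniform baseline loss: {uniform_loss:.4f}")
```

Under that assumption the perplexity comes out at roughly 3.18, against 4 for a uniform guess.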

Model description

This model deviates from the default GPN in the following parameters:

first_kernel_size=9 (this is the default; listed only to make the kernel size explicit)
rest_kernel_size=9
dilation_max=81
dilation_cycle=5
dilation_base=3
num_hidden_layers=20

--config_overrides "first_kernel_size=9,rest_kernel_size=9,dilation_max=81,dilation_cycle=5,dilation_base=3,num_hidden_layers=20"
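
The dilation parameters above can be turned into a per-layer schedule and an approximate receptive field. This is a minimal sketch, not code from the GPN repository: it assumes layer i uses dilation min(dilation_max, dilation_base ** (i % dilation_cycle)) and that each hidden layer contains one dilated convolution. The function names are hypothetical.

```python
def dilation_schedule(num_hidden_layers, dilation_base, dilation_cycle, dilation_max):
    """Per-layer dilation, assuming min(dilation_max, base ** (i % cycle))."""
    return [
        min(dilation_max, dilation_base ** (i % dilation_cycle))
        for i in range(num_hidden_layers)
    ]


def receptive_field(dilations, first_kernel_size, rest_kernel_size):
    """Receptive field in bases, assuming one dilated convolution per layer."""
    field = 1
    for i, dilation in enumerate(dilations):
        kernel = first_kernel_size if i == 0 else rest_kernel_size
        # A convolution with kernel k and dilation d widens the field by (k - 1) * d.
        field += (kernel - 1) * dilation
    return field


dilations = dilation_schedule(
    num_hidden_layers=20, dilation_base=3, dilation_cycle=5, dilation_max=81
)
field = receptive_field(dilations, first_kernel_size=9, rest_kernel_size=9)

print(dilations[:5])  # one full cycle
print(field)
```

Under these assumptions one cycle is 1, 3, 9, 27, 81, so dilation_max=81 never clips the schedule, and the 20 layers (four cycles) span 3873 positions.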

Intended uses & limitations

This model is intended for DNA sequence analysis of insects in the infraorder Cucujiformia (beetles).

Training and evaluation data

The dataset was created from 12 NCBI reference genomes from Cucujiformia.

Training procedure

Training ran for 240000 steps with a linear LR schedule, followed by 30000 steps with a cosine LR schedule.

Training hyperparameters

The following hyperparameters were used for the first 240000 training steps:

  • learning_rate: 0.001
  • train_batch_size: 256
  • eval_batch_size: 256
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 1024
  • total_eval_batch_size: 1024
  • optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  • lr_scheduler_type: linear
  • training_steps: 240000
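
The batch-size figures in the list above are related by simple arithmetic, sketched here. The gradient accumulation factor is an assumption (the card does not list one); a value of 1 is the only one consistent with the reported total.

```python
train_batch_size = 256   # per-device batch size from the card
num_devices = 4          # from the card
gradient_accumulation_steps = 1  # assumption: not listed in the card

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps

# Sequences processed during the first 240000 training steps.
training_steps = 240_000
sequences_seen = total_train_batch_size * training_steps

print(total_train_batch_size)
print(sequences_seen)
```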

The following hyperparameters were used for the last 30000 training steps (training_steps is the cumulative total):

  • learning_rate: 0.001
  • train_batch_size: 256
  • eval_batch_size: 256
  • seed: 42
  • distributed_type: multi-GPU
  • num_devices: 4
  • total_train_batch_size: 1024
  • total_eval_batch_size: 1024
  • optimizer: Use adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 and optimizer_args=No additional optimizer arguments
  • lr_scheduler_type: cosine
  • training_steps: 270000
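
For readers unfamiliar with the two scheduler types, the sketch below writes out the usual linear and half-cosine decay formulas. It is an illustration under assumptions: zero warmup steps (the card lists none), and no claim about whether the second phase continued the schedule from step 240000 or restarted it, which the card does not state. The function names are hypothetical.

```python
import math

BASE_LR = 1e-3  # learning_rate from the card


def linear_lr(step, total_steps, base_lr=BASE_LR):
    """Linear decay from base_lr to 0 over total_steps (no warmup assumed)."""
    return base_lr * max(0.0, (total_steps - step) / total_steps)


def cosine_lr(step, total_steps, base_lr=BASE_LR):
    """Half-cosine decay from base_lr to 0 over total_steps (no warmup assumed)."""
    progress = step / total_steps
    return base_lr * max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


# Sample both curves at a few points of their respective horizons.
for step in (0, 120_000, 240_000):
    print("linear", step, linear_lr(step, total_steps=240_000))
for step in (0, 135_000, 270_000):
    print("cosine", step, cosine_lr(step, total_steps=270_000))
```

Both curves start at the base learning rate and reach zero at their final step; they differ only in the shape of the decay in between.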

Training results

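Columns, in order: training loss, epoch, step, validation loss. The Step column is cumulative; the Epoch column restarts after steps 120000, 180000, and 240000.

Training Loss Epoch Step Validation Loss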
1.2423 0.0083 1000 1.2538
1.2322 0.0167 2000 1.2448
1.2251 0.025 3000 1.2364
1.2212 0.0333 4000 1.2350
1.218 0.0417 5000 1.2322
1.2153 0.05 6000 1.2298
1.2134 0.0583 7000 1.2292
1.209 0.0667 8000 1.2252
1.2051 0.075 9000 1.2245
1.2016 0.0833 10000 1.2215
1.1981 0.0917 11000 1.2191
1.1953 0.1 12000 1.2172
1.1918 0.1083 13000 1.2158
1.1897 0.1167 14000 1.2146
1.1874 0.125 15000 1.2132
1.1841 0.1333 16000 1.2110
1.1817 0.1417 17000 1.2104
1.1798 0.15 18000 1.2098
1.178 0.1583 19000 1.2083
1.1754 0.1667 20000 1.2071
1.1747 0.175 21000 1.2062
1.1729 0.1833 22000 1.2055
1.1714 0.1917 23000 1.2050
1.1696 0.2 24000 1.2042
1.1682 0.2083 25000 1.2026
1.1668 0.2167 26000 1.2026
1.165 0.225 27000 1.2007
1.1645 0.2333 28000 1.2004
1.1638 0.2417 29000 1.2001
1.1613 0.25 30000 1.2006
1.1615 0.2583 31000 1.1983
1.1606 0.2667 32000 1.1979
1.1598 0.275 33000 1.1991
1.1585 0.2833 34000 1.1971
1.1575 0.2917 35000 1.1976
1.1563 0.3 36000 1.1964
1.1569 0.3083 37000 1.1954
1.1553 0.3167 38000 1.1953
1.1545 0.325 39000 1.1946
1.1537 0.3333 40000 1.1941
1.1523 0.3417 41000 1.1930
1.1522 0.35 42000 1.1930
1.1508 0.3583 43000 1.1926
1.15 0.3667 44000 1.1912
1.1489 0.375 45000 1.1920
1.1482 0.3833 46000 1.1915
1.1486 0.3917 47000 1.1917
1.1467 0.4 48000 1.1895
1.1474 0.4083 49000 1.1915
1.1463 0.4167 50000 1.1894
1.1457 0.425 51000 1.1898
1.1453 0.4333 52000 1.1888
1.1441 0.4417 53000 1.1887
1.1437 0.45 54000 1.1863
1.1425 0.4583 55000 1.1880
1.143 0.4667 56000 1.1873
1.1415 0.475 57000 1.1869
1.1419 0.4833 58000 1.1865
1.1415 0.4917 59000 1.1876
1.1403 0.5 60000 1.1859
1.1394 0.5083 61000 1.1855
1.14 0.5167 62000 1.1870
1.1388 0.525 63000 1.1859
1.1387 0.5333 64000 1.1852
1.1365 0.5417 65000 1.1840
1.1376 0.55 66000 1.1848
1.1369 0.5583 67000 1.1851
1.1362 0.5667 68000 1.1845
1.1364 0.575 69000 1.1835
1.1365 0.5833 70000 1.1833
1.1347 0.5917 71000 1.1832
1.1355 0.6 72000 1.1831
1.1342 0.6083 73000 1.1824
1.134 0.6167 74000 1.1824
1.1339 0.625 75000 1.1819
1.1323 0.6333 76000 1.1820
1.1322 0.6417 77000 1.1816
1.1325 0.65 78000 1.1820
1.1304 0.6583 79000 1.1810
1.1313 0.6667 80000 1.1811
1.1305 0.675 81000 1.1814
1.1309 0.6833 82000 1.1813
1.1294 0.6917 83000 1.1796
1.1299 0.7 84000 1.1816
1.1299 0.7083 85000 1.1796
1.1292 0.7167 86000 1.1793
1.1287 0.725 87000 1.1800
1.1278 0.7333 88000 1.1796
1.1277 0.7417 89000 1.1786
1.1265 0.75 90000 1.1796
1.128 0.7583 91000 1.1787
1.1266 0.7667 92000 1.1787
1.1271 0.775 93000 1.1789
1.1274 0.7833 94000 1.1784
1.1254 0.7917 95000 1.1779
1.1255 0.8 96000 1.1781
1.1252 0.8083 97000 1.1780
1.1255 0.8167 98000 1.1779
1.1249 0.825 99000 1.1776
1.1231 0.8333 100000 1.1767
1.1254 0.8417 101000 1.1770
1.1239 0.85 102000 1.1779
1.1227 0.8583 103000 1.1780
1.1233 0.8667 104000 1.1761
1.1242 0.875 105000 1.1768
1.1226 0.8833 106000 1.1760
1.1228 0.8917 107000 1.1763
1.1223 0.9 108000 1.1757
1.1224 0.9083 109000 1.1763
1.1221 0.9167 110000 1.1760
1.1209 0.925 111000 1.1760
1.1204 0.9333 112000 1.1758
1.1209 0.9417 113000 1.1753
1.1199 0.95 114000 1.1758
1.1203 0.9583 115000 1.1755
1.1196 0.9667 116000 1.1752
1.1207 0.975 117000 1.1754
1.1193 0.9833 118000 1.1747
1.1196 0.9917 119000 1.1748
1.1198 1.0 120000 1.1752
1.1184 0.0056 121000 1.1748
1.1198 0.0111 122000 1.1729
1.1183 0.0167 123000 1.1739
1.1186 0.0222 124000 1.1744
1.1185 0.0278 125000 1.1740
1.1168 0.0333 126000 1.1742
1.1184 0.0389 127000 1.1743
1.1166 0.0444 128000 1.1732
1.1161 0.05 129000 1.1726
1.1162 0.0556 130000 1.1730
1.1161 0.0611 131000 1.1746
1.1178 0.0667 132000 1.1736
1.115 0.0722 133000 1.1742
1.1157 0.0778 134000 1.1734
1.1159 0.0833 135000 1.1723
1.1153 0.0889 136000 1.1725
1.1154 0.0944 137000 1.1736
1.1143 0.1 138000 1.1717
1.1145 0.1056 139000 1.1725
1.1138 0.1111 140000 1.1728
1.1136 0.1167 141000 1.1721
1.1143 0.1222 142000 1.1719
1.1143 0.1278 143000 1.1721
1.1145 0.1333 144000 1.1707
1.1133 0.1389 145000 1.1729
1.1127 0.1444 146000 1.1715
1.1125 0.15 147000 1.1714
1.1126 0.1556 148000 1.1722
1.1132 0.1611 149000 1.1710
1.1114 0.1667 150000 1.1703
1.1134 0.1722 151000 1.1713
1.112 0.1778 152000 1.1713
1.1124 0.1833 153000 1.1717
1.112 0.1889 154000 1.1707
1.1131 0.1944 155000 1.1713
1.1114 0.2 156000 1.1699
1.1132 0.2056 157000 1.1713
1.1116 0.2111 158000 1.1712
1.1121 0.2167 159000 1.1705
1.1117 0.2222 160000 1.1699
1.1099 0.2278 161000 1.1693
1.1108 0.2333 162000 1.1702
1.1102 0.2389 163000 1.1699
1.1099 0.2444 164000 1.1697
1.1098 0.25 165000 1.1700
1.1091 0.2556 166000 1.1698
1.1117 0.2611 167000 1.1700
1.1094 0.2667 168000 1.1700
1.1099 0.2722 169000 1.1691
1.1098 0.2778 170000 1.1690
1.1097 0.2833 171000 1.1709
1.11 0.2889 172000 1.1707
1.1095 0.2944 173000 1.1690
1.108 0.3 174000 1.1687
1.108 0.3056 175000 1.1696
1.1087 0.3111 176000 1.1688
1.1074 0.3167 177000 1.1689
1.1085 0.3222 178000 1.1699
1.1089 0.3278 179000 1.1697
1.1082 0.3333 180000 1.1691
1.1073 0.0042 181000 1.1684
1.1087 0.0083 182000 1.1677
1.1075 0.0125 183000 1.1673
1.108 0.0167 184000 1.1678
1.1082 0.0208 185000 1.1687
1.1063 0.025 186000 1.1689
1.1073 0.0292 187000 1.1678
1.1073 0.0333 188000 1.1675
1.1062 0.0375 189000 1.1678
1.1062 0.0417 190000 1.1686
1.1054 0.0458 191000 1.1679
1.1079 0.05 192000 1.1679
1.1061 0.0542 193000 1.1680
1.106 0.0583 194000 1.1685
1.1061 0.0625 195000 1.1672
1.1064 0.0667 196000 1.1672
1.1064 0.0708 197000 1.1673
1.1047 0.075 198000 1.1671
1.1055 0.0792 199000 1.1674
1.1049 0.0833 200000 1.1678
1.1053 0.0875 201000 1.1675
1.1046 0.0917 202000 1.1666
1.1051 0.0958 203000 1.1672
1.1058 0.1 204000 1.1659
1.1044 0.1042 205000 1.1667
1.1035 0.1083 206000 1.1677
1.1038 0.1125 207000 1.1671
1.1042 0.1167 208000 1.1657
1.1044 0.1208 209000 1.1664
1.1028 0.125 210000 1.1663
1.1046 0.1292 211000 1.1654
1.1031 0.1333 212000 1.1662
1.104 0.1375 213000 1.1673
1.1045 0.1417 214000 1.1666
1.1045 0.1458 215000 1.1665
1.1027 0.15 216000 1.1648
1.1047 0.1542 217000 1.1660
1.1035 0.1583 218000 1.1659
1.1038 0.1625 219000 1.1672
1.1037 0.1667 220000 1.1665
1.1026 0.1708 221000 1.1653
1.1035 0.175 222000 1.1655
1.1033 0.1792 223000 1.1648
1.1018 0.1833 224000 1.1650
1.1024 0.1875 225000 1.1654
1.1017 0.1917 226000 1.1651
1.1036 0.1958 227000 1.1655
1.1021 0.2 228000 1.1667
1.102 0.2042 229000 1.1660
1.1029 0.2083 230000 1.1664
1.102 0.2125 231000 1.1661
1.1026 0.2167 232000 1.1651
1.1019 0.2208 233000 1.1644
1.1014 0.225 234000 1.1651
1.1011 0.2292 235000 1.1658
1.1018 0.2333 236000 1.1658
1.1008 0.2375 237000 1.1653
1.1018 0.2417 238000 1.1651
1.1025 0.2458 239000 1.1652
1.1008 0.25 240000 1.1654
1.0963 0.0037 241000 1.1616
1.0963 0.0074 242000 1.1611
1.0926 0.0111 243000 1.1597
1.0941 0.0148 244000 1.1594
1.094 0.0185 245000 1.1591
1.0917 0.0222 246000 1.1596
1.0915 0.0259 247000 1.1591
1.0922 0.0296 248000 1.1589
1.0913 0.0333 249000 1.1586
1.091 0.0370 250000 1.1584
1.0896 0.0407 251000 1.1583
1.0931 0.0444 252000 1.1586
1.0914 0.0481 253000 1.1582
1.09 0.0518 254000 1.1592
1.0907 0.0556 255000 1.1584
1.0898 0.0593 256000 1.1585
1.0917 0.0630 257000 1.1577
1.0898 0.0667 258000 1.1583
1.0894 0.0704 259000 1.1576
1.0888 0.0741 260000 1.1591
1.0906 0.0778 261000 1.1578
1.0895 0.0815 262000 1.1574
1.0902 0.0852 263000 1.1580
1.0906 0.0889 264000 1.1579
1.088 0.0926 265000 1.1584
1.0896 0.0963 266000 1.1584
1.0894 0.1000 267000 1.1581
1.0895 0.1037 268000 1.1577
1.0894 0.1074 269000 1.1574
1.0891 0.1111 270000 1.1573

Framework versions

  • Transformers 4.51.3
  • Pytorch 2.7.0+cu126
  • Datasets 3.6.0.dev0
  • Tokenizers 0.21.1

Model size: 94.7M params (Safetensors, tensor type F32)

Dataset used to train sbuedenb/beetle-gpn-wide-reduced: sbuedenb/big_beetle_dataset