wide-big-reduced
This model is a GPN (Genomic Pre-trained Network) trained on the sbuedenb/big_beetle_dataset dataset. It achieves the following results on the evaluation set:
- Loss: 1.1583
Model description
This model deviates from the default GPN configuration in the following parameters:
- first_kernel_size=9 (this is the default; listed only to make the kernel size explicit)
- rest_kernel_size=9
- dilation_max=81
- dilation_cycle=5
- dilation_base=3
- num_hidden_layers=20
The corresponding command-line option:
--config_overrides "first_kernel_size=9,rest_kernel_size=9,dilation_max=81,dilation_cycle=5,dilation_base=3,num_hidden_layers=20"
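To make the effect of these overrides concrete, here is a minimal sketch that parses the override string and derives a per-layer dilation schedule and an approximate receptive field. It assumes that layer i uses dilation_base ** (i % dilation_cycle) capped at dilation_max, and that each layer contains a single dilated convolution; the helper names (parse_overrides, dilation_schedule, receptive_field) are hypothetical and not part of GPN.

```python
# Values copied from the --config_overrides string in this card.
overrides = (
    "first_kernel_size=9,rest_kernel_size=9,dilation_max=81,"
    "dilation_cycle=5,dilation_base=3,num_hidden_layers=20"
)


def parse_overrides(text):
    """Turn 'a=1,b=2' into {'a': 1, 'b': 2}; every value here is an integer."""
    return {key: int(value) for key, value in (item.split("=") for item in text.split(","))}


def dilation_schedule(cfg):
    # Assumption: the dilation grows geometrically within a cycle, restarts
    # every dilation_cycle layers, and is capped at dilation_max.
    return [
        min(cfg["dilation_max"], cfg["dilation_base"] ** (i % cfg["dilation_cycle"]))
        for i in range(cfg["num_hidden_layers"])
    ]


def receptive_field(cfg, schedule):
    # Assumption: one dilated convolution per layer; each one widens the
    # context by (kernel_size - 1) * dilation positions. Both kernel sizes
    # are 9 here, so rest_kernel_size is used for every layer.
    return 1 + sum((cfg["rest_kernel_size"] - 1) * d for d in schedule)


cfg = parse_overrides(overrides)
schedule = dilation_schedule(cfg)
print(schedule)
print(receptive_field(cfg, schedule))
```

Under these assumptions the schedule cycles through 1, 3, 9, 27, 81 four times, which gives a receptive field of 3873 positions. The actual value depends on how the GPN layers are implemented.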
Intended uses & limitations
This model is intended for analyzing DNA sequences from beetles of the infraorder Cucujiformia.
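A possible way to load the model for such analysis is sketched below. It assumes the gpn package is installed and that importing gpn.model registers the GPN architecture with transformers, as in other GPN checkpoints. The function name load_gpn and its repo_id parameter are hypothetical; this card does not state the repository ID, so it must be supplied by the caller.

```python
def load_gpn(repo_id):
    """Load the tokenizer and masked-LM weights for a GPN checkpoint.

    repo_id is the Hugging Face repository ID or a local path (not given in
    this card, so the caller has to provide it).
    """
    # Imports are local so the function can be defined without the packages.
    import gpn.model  # noqa: F401  (assumed to register GPN with transformers)
    from transformers import AutoModelForMaskedLM, AutoTokenizer

    tokenizer = AutoTokenizer.from_pretrained(repo_id)
    model = AutoModelForMaskedLM.from_pretrained(repo_id)
    model.eval()  # inference mode: disables dropout
    return tokenizer, model
```

The returned model can then be used like any other masked language model in transformers, for example to score masked nucleotides in a sequence window.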
Training and evaluation data
The dataset was created from 12 NCBI reference genomes from Cucujiformia.
Training procedure
Training ran for 240000 steps with a linear learning-rate schedule, followed by 30000 steps with a cosine learning-rate schedule.
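The shape of the two schedules can be sketched with the usual linear-decay and cosine-decay multipliers. This is an illustration only: it assumes no warmup, decay to zero, and that each phase is treated as a standalone schedule. The card does not state how the two phases were joined or which horizon the cosine phase used, so the step counts passed below are assumptions.

```python
import math


def linear_multiplier(step, total_steps, warmup_steps=0):
    """Linear warmup (optional) followed by linear decay to zero."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    return max(0.0, (total_steps - step) / max(1, total_steps - warmup_steps))


def cosine_multiplier(step, total_steps, warmup_steps=0):
    """Linear warmup (optional) followed by half-cosine decay to zero."""
    if step < warmup_steps:
        return step / max(1, warmup_steps)
    progress = (step - warmup_steps) / max(1, total_steps - warmup_steps)
    return max(0.0, 0.5 * (1.0 + math.cos(math.pi * progress)))


base_lr = 0.001  # learning_rate listed in this card

# Assumed horizons: 240000 steps for the linear phase, 30000 for the cosine phase.
for step in (0, 120000, 240000):
    print("linear", step, base_lr * linear_multiplier(step, 240000))
for step in (0, 15000, 30000):
    print("cosine", step, base_lr * cosine_multiplier(step, 30000))
```

The learning rate at any step is the base learning rate times the multiplier, so both schedules start at 0.001 and differ only in how quickly they fall.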
Training hyperparameters
The following hyperparameters were used during the first 240000 training steps:
- learning_rate: 0.001
- train_batch_size: 256
- eval_batch_size: 256
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 1024
- total_eval_batch_size: 1024
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: linear
- training_steps: 240000
The following hyperparameters were used during the last 30000 training steps:
- learning_rate: 0.001
- train_batch_size: 256
- eval_batch_size: 256
- seed: 42
- distributed_type: multi-GPU
- num_devices: 4
- total_train_batch_size: 1024
- total_eval_batch_size: 1024
- optimizer: adamw_torch with betas=(0.9,0.999) and epsilon=1e-08 (no additional optimizer arguments)
- lr_scheduler_type: cosine
- training_steps: 270000
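The batch-size figures above are related by simple arithmetic, sketched here. The gradient accumulation factor is not listed in this card and is assumed to be 1, which is consistent with the reported total of 1024.

```python
train_batch_size = 256           # per device, from the list above
num_devices = 4                  # from the list above
gradient_accumulation_steps = 1  # assumption: not listed in this card

total_train_batch_size = train_batch_size * num_devices * gradient_accumulation_steps
print(total_train_batch_size)

# Training examples processed over the whole run (240000 + 30000 steps).
total_steps = 240000 + 30000
examples_seen = total_train_batch_size * total_steps
print(examples_seen)
```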
Training results
| Training Loss | Epoch | Step | Validation Loss |
|---|---|---|---|
| 1.2423 | 0.0083 | 1000 | 1.2538 |
| 1.2322 | 0.0167 | 2000 | 1.2448 |
| 1.2251 | 0.025 | 3000 | 1.2364 |
| 1.2212 | 0.0333 | 4000 | 1.2350 |
| 1.218 | 0.0417 | 5000 | 1.2322 |
| 1.2153 | 0.05 | 6000 | 1.2298 |
| 1.2134 | 0.0583 | 7000 | 1.2292 |
| 1.209 | 0.0667 | 8000 | 1.2252 |
| 1.2051 | 0.075 | 9000 | 1.2245 |
| 1.2016 | 0.0833 | 10000 | 1.2215 |
| 1.1981 | 0.0917 | 11000 | 1.2191 |
| 1.1953 | 0.1 | 12000 | 1.2172 |
| 1.1918 | 0.1083 | 13000 | 1.2158 |
| 1.1897 | 0.1167 | 14000 | 1.2146 |
| 1.1874 | 0.125 | 15000 | 1.2132 |
| 1.1841 | 0.1333 | 16000 | 1.2110 |
| 1.1817 | 0.1417 | 17000 | 1.2104 |
| 1.1798 | 0.15 | 18000 | 1.2098 |
| 1.178 | 0.1583 | 19000 | 1.2083 |
| 1.1754 | 0.1667 | 20000 | 1.2071 |
| 1.1747 | 0.175 | 21000 | 1.2062 |
| 1.1729 | 0.1833 | 22000 | 1.2055 |
| 1.1714 | 0.1917 | 23000 | 1.2050 |
| 1.1696 | 0.2 | 24000 | 1.2042 |
| 1.1682 | 0.2083 | 25000 | 1.2026 |
| 1.1668 | 0.2167 | 26000 | 1.2026 |
| 1.165 | 0.225 | 27000 | 1.2007 |
| 1.1645 | 0.2333 | 28000 | 1.2004 |
| 1.1638 | 0.2417 | 29000 | 1.2001 |
| 1.1613 | 0.25 | 30000 | 1.2006 |
| 1.1615 | 0.2583 | 31000 | 1.1983 |
| 1.1606 | 0.2667 | 32000 | 1.1979 |
| 1.1598 | 0.275 | 33000 | 1.1991 |
| 1.1585 | 0.2833 | 34000 | 1.1971 |
| 1.1575 | 0.2917 | 35000 | 1.1976 |
| 1.1563 | 0.3 | 36000 | 1.1964 |
| 1.1569 | 0.3083 | 37000 | 1.1954 |
| 1.1553 | 0.3167 | 38000 | 1.1953 |
| 1.1545 | 0.325 | 39000 | 1.1946 |
| 1.1537 | 0.3333 | 40000 | 1.1941 |
| 1.1523 | 0.3417 | 41000 | 1.1930 |
| 1.1522 | 0.35 | 42000 | 1.1930 |
| 1.1508 | 0.3583 | 43000 | 1.1926 |
| 1.15 | 0.3667 | 44000 | 1.1912 |
| 1.1489 | 0.375 | 45000 | 1.1920 |
| 1.1482 | 0.3833 | 46000 | 1.1915 |
| 1.1486 | 0.3917 | 47000 | 1.1917 |
| 1.1467 | 0.4 | 48000 | 1.1895 |
| 1.1474 | 0.4083 | 49000 | 1.1915 |
| 1.1463 | 0.4167 | 50000 | 1.1894 |
| 1.1457 | 0.425 | 51000 | 1.1898 |
| 1.1453 | 0.4333 | 52000 | 1.1888 |
| 1.1441 | 0.4417 | 53000 | 1.1887 |
| 1.1437 | 0.45 | 54000 | 1.1863 |
| 1.1425 | 0.4583 | 55000 | 1.1880 |
| 1.143 | 0.4667 | 56000 | 1.1873 |
| 1.1415 | 0.475 | 57000 | 1.1869 |
| 1.1419 | 0.4833 | 58000 | 1.1865 |
| 1.1415 | 0.4917 | 59000 | 1.1876 |
| 1.1403 | 0.5 | 60000 | 1.1859 |
| 1.1394 | 0.5083 | 61000 | 1.1855 |
| 1.14 | 0.5167 | 62000 | 1.1870 |
| 1.1388 | 0.525 | 63000 | 1.1859 |
| 1.1387 | 0.5333 | 64000 | 1.1852 |
| 1.1365 | 0.5417 | 65000 | 1.1840 |
| 1.1376 | 0.55 | 66000 | 1.1848 |
| 1.1369 | 0.5583 | 67000 | 1.1851 |
| 1.1362 | 0.5667 | 68000 | 1.1845 |
| 1.1364 | 0.575 | 69000 | 1.1835 |
| 1.1365 | 0.5833 | 70000 | 1.1833 |
| 1.1347 | 0.5917 | 71000 | 1.1832 |
| 1.1355 | 0.6 | 72000 | 1.1831 |
| 1.1342 | 0.6083 | 73000 | 1.1824 |
| 1.134 | 0.6167 | 74000 | 1.1824 |
| 1.1339 | 0.625 | 75000 | 1.1819 |
| 1.1323 | 0.6333 | 76000 | 1.1820 |
| 1.1322 | 0.6417 | 77000 | 1.1816 |
| 1.1325 | 0.65 | 78000 | 1.1820 |
| 1.1304 | 0.6583 | 79000 | 1.1810 |
| 1.1313 | 0.6667 | 80000 | 1.1811 |
| 1.1305 | 0.675 | 81000 | 1.1814 |
| 1.1309 | 0.6833 | 82000 | 1.1813 |
| 1.1294 | 0.6917 | 83000 | 1.1796 |
| 1.1299 | 0.7 | 84000 | 1.1816 |
| 1.1299 | 0.7083 | 85000 | 1.1796 |
| 1.1292 | 0.7167 | 86000 | 1.1793 |
| 1.1287 | 0.725 | 87000 | 1.1800 |
| 1.1278 | 0.7333 | 88000 | 1.1796 |
| 1.1277 | 0.7417 | 89000 | 1.1786 |
| 1.1265 | 0.75 | 90000 | 1.1796 |
| 1.128 | 0.7583 | 91000 | 1.1787 |
| 1.1266 | 0.7667 | 92000 | 1.1787 |
| 1.1271 | 0.775 | 93000 | 1.1789 |
| 1.1274 | 0.7833 | 94000 | 1.1784 |
| 1.1254 | 0.7917 | 95000 | 1.1779 |
| 1.1255 | 0.8 | 96000 | 1.1781 |
| 1.1252 | 0.8083 | 97000 | 1.1780 |
| 1.1255 | 0.8167 | 98000 | 1.1779 |
| 1.1249 | 0.825 | 99000 | 1.1776 |
| 1.1231 | 0.8333 | 100000 | 1.1767 |
| 1.1254 | 0.8417 | 101000 | 1.1770 |
| 1.1239 | 0.85 | 102000 | 1.1779 |
| 1.1227 | 0.8583 | 103000 | 1.1780 |
| 1.1233 | 0.8667 | 104000 | 1.1761 |
| 1.1242 | 0.875 | 105000 | 1.1768 |
| 1.1226 | 0.8833 | 106000 | 1.1760 |
| 1.1228 | 0.8917 | 107000 | 1.1763 |
| 1.1223 | 0.9 | 108000 | 1.1757 |
| 1.1224 | 0.9083 | 109000 | 1.1763 |
| 1.1221 | 0.9167 | 110000 | 1.1760 |
| 1.1209 | 0.925 | 111000 | 1.1760 |
| 1.1204 | 0.9333 | 112000 | 1.1758 |
| 1.1209 | 0.9417 | 113000 | 1.1753 |
| 1.1199 | 0.95 | 114000 | 1.1758 |
| 1.1203 | 0.9583 | 115000 | 1.1755 |
| 1.1196 | 0.9667 | 116000 | 1.1752 |
| 1.1207 | 0.975 | 117000 | 1.1754 |
| 1.1193 | 0.9833 | 118000 | 1.1747 |
| 1.1196 | 0.9917 | 119000 | 1.1748 |
| 1.1198 | 1.0 | 120000 | 1.1752 |
| 1.1184 | 0.0056 | 121000 | 1.1748 |
| 1.1198 | 0.0111 | 122000 | 1.1729 |
| 1.1183 | 0.0167 | 123000 | 1.1739 |
| 1.1186 | 0.0222 | 124000 | 1.1744 |
| 1.1185 | 0.0278 | 125000 | 1.1740 |
| 1.1168 | 0.0333 | 126000 | 1.1742 |
| 1.1184 | 0.0389 | 127000 | 1.1743 |
| 1.1166 | 0.0444 | 128000 | 1.1732 |
| 1.1161 | 0.05 | 129000 | 1.1726 |
| 1.1162 | 0.0556 | 130000 | 1.1730 |
| 1.1161 | 0.0611 | 131000 | 1.1746 |
| 1.1178 | 0.0667 | 132000 | 1.1736 |
| 1.115 | 0.0722 | 133000 | 1.1742 |
| 1.1157 | 0.0778 | 134000 | 1.1734 |
| 1.1159 | 0.0833 | 135000 | 1.1723 |
| 1.1153 | 0.0889 | 136000 | 1.1725 |
| 1.1154 | 0.0944 | 137000 | 1.1736 |
| 1.1143 | 0.1 | 138000 | 1.1717 |
| 1.1145 | 0.1056 | 139000 | 1.1725 |
| 1.1138 | 0.1111 | 140000 | 1.1728 |
| 1.1136 | 0.1167 | 141000 | 1.1721 |
| 1.1143 | 0.1222 | 142000 | 1.1719 |
| 1.1143 | 0.1278 | 143000 | 1.1721 |
| 1.1145 | 0.1333 | 144000 | 1.1707 |
| 1.1133 | 0.1389 | 145000 | 1.1729 |
| 1.1127 | 0.1444 | 146000 | 1.1715 |
| 1.1125 | 0.15 | 147000 | 1.1714 |
| 1.1126 | 0.1556 | 148000 | 1.1722 |
| 1.1132 | 0.1611 | 149000 | 1.1710 |
| 1.1114 | 0.1667 | 150000 | 1.1703 |
| 1.1134 | 0.1722 | 151000 | 1.1713 |
| 1.112 | 0.1778 | 152000 | 1.1713 |
| 1.1124 | 0.1833 | 153000 | 1.1717 |
| 1.112 | 0.1889 | 154000 | 1.1707 |
| 1.1131 | 0.1944 | 155000 | 1.1713 |
| 1.1114 | 0.2 | 156000 | 1.1699 |
| 1.1132 | 0.2056 | 157000 | 1.1713 |
| 1.1116 | 0.2111 | 158000 | 1.1712 |
| 1.1121 | 0.2167 | 159000 | 1.1705 |
| 1.1117 | 0.2222 | 160000 | 1.1699 |
| 1.1099 | 0.2278 | 161000 | 1.1693 |
| 1.1108 | 0.2333 | 162000 | 1.1702 |
| 1.1102 | 0.2389 | 163000 | 1.1699 |
| 1.1099 | 0.2444 | 164000 | 1.1697 |
| 1.1098 | 0.25 | 165000 | 1.1700 |
| 1.1091 | 0.2556 | 166000 | 1.1698 |
| 1.1117 | 0.2611 | 167000 | 1.1700 |
| 1.1094 | 0.2667 | 168000 | 1.1700 |
| 1.1099 | 0.2722 | 169000 | 1.1691 |
| 1.1098 | 0.2778 | 170000 | 1.1690 |
| 1.1097 | 0.2833 | 171000 | 1.1709 |
| 1.11 | 0.2889 | 172000 | 1.1707 |
| 1.1095 | 0.2944 | 173000 | 1.1690 |
| 1.108 | 0.3 | 174000 | 1.1687 |
| 1.108 | 0.3056 | 175000 | 1.1696 |
| 1.1087 | 0.3111 | 176000 | 1.1688 |
| 1.1074 | 0.3167 | 177000 | 1.1689 |
| 1.1085 | 0.3222 | 178000 | 1.1699 |
| 1.1089 | 0.3278 | 179000 | 1.1697 |
| 1.1082 | 0.3333 | 180000 | 1.1691 |
| 1.1073 | 0.0042 | 181000 | 1.1684 |
| 1.1087 | 0.0083 | 182000 | 1.1677 |
| 1.1075 | 0.0125 | 183000 | 1.1673 |
| 1.108 | 0.0167 | 184000 | 1.1678 |
| 1.1082 | 0.0208 | 185000 | 1.1687 |
| 1.1063 | 0.025 | 186000 | 1.1689 |
| 1.1073 | 0.0292 | 187000 | 1.1678 |
| 1.1073 | 0.0333 | 188000 | 1.1675 |
| 1.1062 | 0.0375 | 189000 | 1.1678 |
| 1.1062 | 0.0417 | 190000 | 1.1686 |
| 1.1054 | 0.0458 | 191000 | 1.1679 |
| 1.1079 | 0.05 | 192000 | 1.1679 |
| 1.1061 | 0.0542 | 193000 | 1.1680 |
| 1.106 | 0.0583 | 194000 | 1.1685 |
| 1.1061 | 0.0625 | 195000 | 1.1672 |
| 1.1064 | 0.0667 | 196000 | 1.1672 |
| 1.1064 | 0.0708 | 197000 | 1.1673 |
| 1.1047 | 0.075 | 198000 | 1.1671 |
| 1.1055 | 0.0792 | 199000 | 1.1674 |
| 1.1049 | 0.0833 | 200000 | 1.1678 |
| 1.1053 | 0.0875 | 201000 | 1.1675 |
| 1.1046 | 0.0917 | 202000 | 1.1666 |
| 1.1051 | 0.0958 | 203000 | 1.1672 |
| 1.1058 | 0.1 | 204000 | 1.1659 |
| 1.1044 | 0.1042 | 205000 | 1.1667 |
| 1.1035 | 0.1083 | 206000 | 1.1677 |
| 1.1038 | 0.1125 | 207000 | 1.1671 |
| 1.1042 | 0.1167 | 208000 | 1.1657 |
| 1.1044 | 0.1208 | 209000 | 1.1664 |
| 1.1028 | 0.125 | 210000 | 1.1663 |
| 1.1046 | 0.1292 | 211000 | 1.1654 |
| 1.1031 | 0.1333 | 212000 | 1.1662 |
| 1.104 | 0.1375 | 213000 | 1.1673 |
| 1.1045 | 0.1417 | 214000 | 1.1666 |
| 1.1045 | 0.1458 | 215000 | 1.1665 |
| 1.1027 | 0.15 | 216000 | 1.1648 |
| 1.1047 | 0.1542 | 217000 | 1.1660 |
| 1.1035 | 0.1583 | 218000 | 1.1659 |
| 1.1038 | 0.1625 | 219000 | 1.1672 |
| 1.1037 | 0.1667 | 220000 | 1.1665 |
| 1.1026 | 0.1708 | 221000 | 1.1653 |
| 1.1035 | 0.175 | 222000 | 1.1655 |
| 1.1033 | 0.1792 | 223000 | 1.1648 |
| 1.1018 | 0.1833 | 224000 | 1.1650 |
| 1.1024 | 0.1875 | 225000 | 1.1654 |
| 1.1017 | 0.1917 | 226000 | 1.1651 |
| 1.1036 | 0.1958 | 227000 | 1.1655 |
| 1.1021 | 0.2 | 228000 | 1.1667 |
| 1.102 | 0.2042 | 229000 | 1.1660 |
| 1.1029 | 0.2083 | 230000 | 1.1664 |
| 1.102 | 0.2125 | 231000 | 1.1661 |
| 1.1026 | 0.2167 | 232000 | 1.1651 |
| 1.1019 | 0.2208 | 233000 | 1.1644 |
| 1.1014 | 0.225 | 234000 | 1.1651 |
| 1.1011 | 0.2292 | 235000 | 1.1658 |
| 1.1018 | 0.2333 | 236000 | 1.1658 |
| 1.1008 | 0.2375 | 237000 | 1.1653 |
| 1.1018 | 0.2417 | 238000 | 1.1651 |
| 1.1025 | 0.2458 | 239000 | 1.1652 |
| 1.1008 | 0.25 | 240000 | 1.1654 |
| 1.0963 | 0.0037 | 241000 | 1.1616 |
| 1.0963 | 0.0074 | 242000 | 1.1611 |
| 1.0926 | 0.0111 | 243000 | 1.1597 |
| 1.0941 | 0.0148 | 244000 | 1.1594 |
| 1.094 | 0.0185 | 245000 | 1.1591 |
| 1.0917 | 0.0222 | 246000 | 1.1596 |
| 1.0915 | 0.0259 | 247000 | 1.1591 |
| 1.0922 | 0.0296 | 248000 | 1.1589 |
| 1.0913 | 0.0333 | 249000 | 1.1586 |
| 1.091 | 0.0370 | 250000 | 1.1584 |
| 1.0896 | 0.0407 | 251000 | 1.1583 |
| 1.0931 | 0.0444 | 252000 | 1.1586 |
| 1.0914 | 0.0481 | 253000 | 1.1582 |
| 1.09 | 0.0518 | 254000 | 1.1592 |
| 1.0907 | 0.0556 | 255000 | 1.1584 |
| 1.0898 | 0.0593 | 256000 | 1.1585 |
| 1.0917 | 0.0630 | 257000 | 1.1577 |
| 1.0898 | 0.0667 | 258000 | 1.1583 |
| 1.0894 | 0.0704 | 259000 | 1.1576 |
| 1.0888 | 0.0741 | 260000 | 1.1591 |
| 1.0906 | 0.0778 | 261000 | 1.1578 |
| 1.0895 | 0.0815 | 262000 | 1.1574 |
| 1.0902 | 0.0852 | 263000 | 1.1580 |
| 1.0906 | 0.0889 | 264000 | 1.1579 |
| 1.088 | 0.0926 | 265000 | 1.1584 |
| 1.0896 | 0.0963 | 266000 | 1.1584 |
| 1.0894 | 0.1000 | 267000 | 1.1581 |
| 1.0895 | 0.1037 | 268000 | 1.1577 |
| 1.0894 | 0.1074 | 269000 | 1.1574 |
| 1.0891 | 0.1111 | 270000 | 1.1573 |
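To put the loss values in context, the sketch below converts the reported evaluation loss into a perplexity and compares it with a uniform guess over four nucleotides. It assumes the loss is a mean cross-entropy in nats and that predictions are effectively over A, C, G, and T; neither is stated explicitly in this card.

```python
import math

eval_loss = 1.1583           # evaluation loss reported at the top of this card
uniform_loss = math.log(4)   # cross-entropy of a uniform guess over 4 nucleotides

perplexity = math.exp(eval_loss)
print(f"perplexity: {perplexity:.4f}")
print(f"uniform baseline loss: {uniform_loss:.4f}")
print(f"improvement over uniform (nats): {uniform_loss - eval_loss:.4f}")
```

A perplexity below 4 means the model is, on average, less uncertain than a uniform guess over the four bases.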
Framework versions
- Transformers 4.51.3
- Pytorch 2.7.0+cu126
- Datasets 3.6.0.dev0
- Tokenizers 0.21.1